
Thread: Databases

  1. #1
    ha-ha-hoo Julius_Van_Der_Beak's Avatar
    Join Date
    Jul 2008
    MBTI
    INTP
    Enneagram
    5w6 sp/so
    Socionics
    LII None
    Posts
    11,760

    Default Databases

    I feel like it's easy to overlook how ubiquitous they all are. Aren't all these posts in a database? It's just fascinating to me because we interact with this graphical overlay, but the "meat" of this place probably looks entirely different.
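    Just as a guess at the shape of that "meat" - sketched here with Python's built-in sqlite3 module, with table and column names invented for illustration - a post might be little more than a row:

        import sqlite3

        # An in-memory database stands in for the forum's real server.
        conn = sqlite3.connect(":memory:")
        conn.execute(
            """CREATE TABLE posts (
                   post_id   INTEGER PRIMARY KEY,   -- unique id for each post
                   author    TEXT NOT NULL,
                   posted_at TEXT NOT NULL,
                   body      TEXT NOT NULL
               )"""
        )
        conn.execute(
            "INSERT INTO posts (author, posted_at, body) VALUES (?, ?, ?)",
            ("Julius_Van_Der_Beak", "2020-05-01 12:00",
             "Aren't all these posts in a database?"),
        )

        # The graphical overlay we interact with is just a rendering of rows like this.
        for row in conn.execute("SELECT post_id, author, body FROM posts"):
            print(row)
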
    A path is made by walking on it.

    -Zhuangzi


    Likes JAVO liked this post

  2. #2
    Senior Member Oberon's Avatar
    Join Date
    Feb 2019
    MBTI
    *NT*
    Posts
    170

    Default

    Databases are fascinating in a way. I took a database class at community college - aced it. Some background - I've been torn between computer science, law, business, and dream analysis (Jungian analytical psychology). I went the business route, but every year I re-assess. So I took a database class (I know, I know - TMI).

    Anyways...some big points (abstractions).

    Data tables are connected to other tables by keys: a primary key in one table is referenced as a foreign key in another. Should databases ever be merged into one mega-database, theoretically your name and email address, along with some other personal data, could serve as a unique key. With that key, everything you ever did on the internet could be retrieved with a simple query.

    Although that is quite impossible at the moment...who knows.

    I don't really care though, got nothing to hide but embarrassing photos and random poems.
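    Mechanically, the linking idea is mundane. A toy sketch in Python's sqlite3, with invented tables, where an email address acts as the shared key across "databases":

        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.executescript("""
            CREATE TABLE users     (email TEXT PRIMARY KEY, name TEXT);
            CREATE TABLE purchases (email TEXT, item TEXT);  -- email is the shared key
            CREATE TABLE posts     (email TEXT, body TEXT);

            INSERT INTO users     VALUES ('you@example.com', 'You');
            INSERT INTO purchases VALUES ('you@example.com', 'photo album');
            INSERT INTO posts     VALUES ('you@example.com', 'a random poem');
        """)

        # One simple query, keyed on the email, pulls everything together.
        rows = conn.execute("""
            SELECT u.name, p.item, s.body
            FROM users u
            JOIN purchases p ON p.email = u.email
            JOIN posts s     ON s.email = u.email
            WHERE u.email = 'you@example.com'
        """).fetchall()
        print(rows)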

    Do you study databases? SQL? Python?
    "I dream in stereo, the stereo sounds strange, I know that if you hide, it doesn't go away, if you get out of bed.....my little dark age."
    Likes JAVO liked this post

  3. #3
    Liberator Coriolis's Avatar
    Join Date
    Apr 2010
    MBTI
    INTJ
    Enneagram
    5w6 sp/sx
    Posts
    24,338

    Default

    Quote Originally Posted by Oberon View Post
    Data tables are connected to other tables by keys: a primary key in one table is referenced as a foreign key in another. Should databases ever be merged into one mega-database, theoretically your name and email address, along with some other personal data, could serve as a unique key. With that key, everything you ever did on the internet could be retrieved with a simple query.

    Although that is quite impossible at the moment...who knows.

    I don't really care though, got nothing to hide but embarrassing photos and random poems.
    That's what Martin Niemöller thought, too.

    Quote Originally Posted by Oberon View Post
    Do you study databases? SQL? Python?
    I study Python, but it is a programming language, not a database.

    I've been called a criminal, a terrorist, and a threat to the known universe. But everything you were told is a lie. The truth is, they've taken our freedom, our home, and our future. The time has come for all humanity to take a stand...
    Likes JAVO, Oberon liked this post

  4. #4
    accept no imitations JAVO's Avatar
    Join Date
    Apr 2007
    MBTI
    eNTP
    Enneagram
    5w4
    Posts
    7,273

    Default

    Data streams have taken an increasing role as both data collection and sharing have increased. They allow systems and applications to receive data in real time instead of waiting for a nightly batch job to deliver it. For example, a patient has a disease which is fatal unless treated promptly, but the data about the collected specimens is critical too, as it contains information on specimen quality, percentage of malignant or dead cells, and genetic profiles of the disease (which reveal vulnerabilities to treatment options). You don't want the specimen analysis delayed for technology reasons, but at the same time, without technological coordination, the information from multiple specimens quickly becomes difficult to track and integrate into something useful for treating the disease. The technology becomes even more important because the critical specimen analysis is done at centralized locations, receiving specimens from hundreds of institutions, some of which are in other countries.


    Old Database-Centered Method

    1. Lab processes/analyzes the specimen
    2. Technicians enter data into a database
    3. The specimen might then have to go to another lab at the same institution for more processing
    4. That other lab might use a different system and database
    5. A nightly job sends the data to each institution, or possibly several jobs from multiple databases/systems
    6. Hopefully that institution integrates the data with their system as part of their nightly job process
    7. The data about the specimen is available the next day (a rough sketch of such a nightly job follows this list)
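
    A crude sketch of what step 5's nightly job might amount to, in Python with sqlite3 and invented table, column, and file names (real systems would use vendor ETL tooling):

        import csv
        import sqlite3

        # Stand-in for the lab's database, pre-loaded with one result.
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE specimen_results "
                     "(specimen_id TEXT, malignant_pct REAL, entered_on TEXT)")
        conn.execute("INSERT INTO specimen_results "
                     "VALUES ('S-001', 12.5, date('now', '-1 day'))")

        # The nightly job: pull everything entered since the last run...
        rows = conn.execute(
            "SELECT specimen_id, malignant_pct FROM specimen_results "
            "WHERE entered_on >= date('now', '-1 day')").fetchall()

        # ...and drop a flat file for the receiving institution to import overnight.
        with open("export_for_institution.csv", "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["specimen_id", "malignant_pct"])
            writer.writerows(rows)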



    New Data Stream-Centered Method

    As far as I know, this approach has just started to be used within the past few years.

    1. Lab processes/analyzes the specimen
    2. Technicians enter data into a database
    3. The specimen might then have to go to another lab at the same institution for more processing
    4. That other lab might use a different system and database
    5. A near real-time data stream producer detects the information in the database and sends the data to each institution within seconds or minutes of being entered.
    6. A data stream consumer at the patient's institution is subscribed to updates from the processing/analysis labs.
    7. It receives the specimen data, and integrates it immediately into the hospital's system.
    8. Result: the patient gets treated 1-3 days sooner, and with a treatment approach based on the latest medical knowledge and genetic analysis technology, not just whatever happens to be available at the local hospital.
    9. The data is still retained in a database at both institutions, but the communication of the data update has been moved away from the database to the data stream--using the right tool for the job. (A toy sketch of this push model follows.)
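
    To make the contrast concrete, here's a toy in-process version of steps 5-7 using Python's standard queue module; in a real deployment something like Kafka sits between the two sides, but the push-instead-of-poll shape is the same:

        import queue
        import threading

        stream = queue.Queue()  # stands in for the message broker between institutions

        def producer():
            # Step 5: a new database entry is detected and published immediately.
            stream.put({"specimen_id": "S-001", "malignant_pct": 12.5})

        def consumer():
            # Steps 6-7: the patient's institution is subscribed and integrates on arrival.
            update = stream.get()
            print("Integrating into hospital system:", update)

        threading.Thread(target=consumer).start()
        threading.Thread(target=producer).start()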

  5. #5
    Senior Member Oberon's Avatar
    Join Date
    Feb 2019
    MBTI
    *NT*
    Posts
    170

    Default

    Quote Originally Posted by Coriolis View Post
    That's what Martin Niemöller thought, too.


    I study Python, but it is a programming language, not a database.

    Nice! Are you still studying it, or already using it in earnest and just improving?

    I know Python isn't really a database language...but you can automate queries with it. You could also use PL/SQL if you're on an Oracle product, or T-SQL, the Microsoft equivalent.
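
    For instance, a small sqlite3 sketch with made-up names; the same parameterized-query pattern carries over to Oracle or SQL Server through drivers like cx_Oracle or pyodbc:

        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)",
                         [("east", 100.0), ("west", 250.0), ("east", 75.0)])

        def total_for(region):
            # A parameterized query: Python supplies the value, the database does the work.
            (total,) = conn.execute(
                "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE region = ?",
                (region,)).fetchone()
            return total

        # Automating the same report for every region instead of running it by hand.
        for region in ("east", "west"):
            print(region, total_for(region))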

    I'm learning Python slowly as a hobby at the moment, hoping to find some use for it in what I do, but the gap is very large. Most sites tout Python's use in finance or business, but the truth is the people who run these departments do not like employees automating things. They would rather pay consultants to design massive new software, or pair up with companies like SAP to automate it according to their own paradigms. That way they get the credit for it.
    "I dream in stereo, the stereo sounds strange, I know that if you hide, it doesn't go away, if you get out of bed.....my little dark age."

  6. #6
    Senior Member Oberon's Avatar
    Join Date
    Feb 2019
    MBTI
    *NT*
    Posts
    170

    Default

    Quote Originally Posted by JAVO View Post
    Data streams have taken an increasing role as both data collection and sharing have increased. They allow systems and applications to receive data in real time instead of waiting for a nightly batch job to deliver it. For example, a patient has a disease which is fatal unless treated promptly, but the data about the collected specimens is critical too, as it contains information on specimen quality, percentage of malignant or dead cells, and genetic profiles of the disease (which reveal vulnerabilities to treatment options). You don't want the specimen analysis delayed for technology reasons, but at the same time, without technological coordination, the information from multiple specimens quickly becomes difficult to track and integrate into something useful for treating the disease. The technology becomes even more important because the critical specimen analysis is done at centralized locations, receiving specimens from hundreds of institutions, some of which are in other countries.


    Old Database-Centered Method

    1. Lab processes/analyzes the specimen
    2. Technicians enter data into a database
    3. The specimen might then have to go to another lab at the same institution for more processing
    4. That other lab might use a different system and database
    5. A nightly job sends the data to each institution, or possibly several jobs from multiple databases/systems
    6. Hopefully that institution integrates the data with their system as part of their nightly job process
    7. The data about the specimen is available the next day



    New Data Stream-Centered Method

    As far as I know, this approach has just started to be used within the past few years.

    1. Lab processes/analyzes the specimen
    2. Technicians enter data into a database
    3. The specimen might then have to go to another lab at the same institution for more processing
    4. That other lab might use a different system and database
    5. A near real-time data stream producer detects the information in the database and sends the data to each institution within seconds or minutes of being entered.
    6. A data stream consumer at the patient's institution is subscribed to updates from the processing/analysis labs.
    7. It receives the specimen data, and integrates it immediately into the hospital's system.
    8. Result: the patient gets treated 1-3 days sooner, and with a treatment approach based on the latest medical knowledge and genetic analysis technology, not just whatever happens to be available at the local hospital.
    9. The data is still retained in a database at both institutions, but the communication of the data update has been moved away from the database to the data stream--using the right tool for the job.
    Who produces the stream? Is it an automated process within the database administration software, an automated process outside it in a different system, or a human working with the latter on an ongoing basis?

    The only problem I see with it, really, is that you risk errors propagating. The reason batch processing exists, besides being a necessary step in the evolution of the technology, is that it serves as an internal control. All public companies are required by law to have internal controls in order to be traded on the market. Therefore, bypassing batch processing without a control in place to ensure data integrity would be tantamount to inviting massive fines and having your company delisted from public trading.

    There could be another control, of course, such as demonstrating a very low error rate, but that would still require independent technology auditors to tear open your systems and prove it, which would, of course, offset any economic gains from the new process. Hence we have automated driving but won't see it widely for another fifty years - for the legal reasons noted above, as well as business practicality.
    "I dream in stereo, the stereo sounds strange, I know that if you hide, it doesn't go away, if you get out of bed.....my little dark age."

  7. #7
    accept no imitations JAVO's Avatar
    Join Date
    Apr 2007
    MBTI
    eNTP
    Enneagram
    5w4
    Posts
    7,273

    Default

    Quote Originally Posted by Oberon View Post
    Who produces the stream? Is it an automated process within the database administration software, an automated process outside it in a different system, or a human working with the latter on an ongoing basis?
    I'm not sure if any databases have this built in. Generally it's a process specifically written to look for changes in the database. Apache Camel is often used to do this, sending messages to an Apache Kafka server.
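
    Purely as an illustration of that look-for-changes idea (not Camel's actual configuration), a minimal polling producer in Python might track the highest row id it has already published:

        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, payload TEXT)")
        conn.execute("INSERT INTO results (payload) VALUES ('new specimen data')")

        last_seen = 0  # highest row id already sent to the stream

        def publish(row):
            print("sending to stream:", row)  # stand-in for a Kafka producer call

        # One polling pass; a real producer repeats this on a short timer.
        for row_id, payload in conn.execute(
                "SELECT id, payload FROM results WHERE id > ? ORDER BY id",
                (last_seen,)):
            publish((row_id, payload))
            last_seen = row_id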

    The only problem I see with it, really, is that you risk errors propagating. The reason batch processing exists, besides being a necessary step in the evolution of the technology, is that it serves as an internal control. All public companies are required by law to have internal controls in order to be traded on the market. Therefore, bypassing batch processing without a control in place to ensure data integrity would be tantamount to inviting massive fines and having your company delisted from public trading.

    There could be another control, of course, such as demonstrating a very low error rate, but that would still require independent technology auditors to tear open your systems and prove it, which would, of course, offset any economic gains from the new process. Hence we have automated driving but won't see it widely for another fifty years - for the legal reasons noted above, as well as business practicality.
    Yep, errors resulting from unexpected conditions still have to be dealt with. Sometimes that's through an automated alert which triggers a human to investigate. In many cases, the offending message can be automatically put on hold and then triggered for resend by the human once the error condition has been fixed. Another approach is to treat or mark real-time stream data as preliminary, with the understanding that it's for heads-up style interpretation, or to be used where acting on preliminary data is better than not acting on any data.
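
    A bare-bones version of that hold-and-retry pattern, with invented names:

        held = []  # messages set aside after a failure, awaiting human attention

        def integrate(message):
            if "specimen_id" not in message:  # stand-in for any unexpected condition
                raise ValueError("missing specimen_id")
            print("integrated:", message)

        def handle(message):
            try:
                integrate(message)
            except ValueError as err:
                held.append(message)
                print("alert:", err, "- message held for review")

        handle({"specimen_id": "S-001"})  # integrates normally
        handle({"malignant_pct": 12.5})   # fails: alerted and held

        # Once a human fixes the error condition, held messages are resent.
        for message in held:
            message["specimen_id"] = "S-002"  # the manual fix, in real life
            integrate(message)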
    Likes Oberon liked this post

  8. #8
    You have a choice! 21%'s Avatar
    Join Date
    May 2009
    MBTI
    INFJ
    Enneagram
    4w5
    Posts
    3,142

    Default

    I love databases. Love data. Love statistics. Love the insights you can glean from data. Big data analyst is my dream job, because beyond all those numbers and text descriptions sits an individual human -- a customer, a student, a someone behind a computer screen. The data links us. I want to see you. I want to understand you, and a thousand other yous with some variations. It's ultimately impossible, but it's fascinating.
    4w5 sp/sx EII
    Fear less, love more, be the change you want to see in the world.
    Likes JAVO liked this post

  9. #9
    Senior Member
    Join Date
    Jun 2009
    Posts
    27,492

    Default

    Quote Originally Posted by 21% View Post
    I love databases. Love data. Love statistics. Love insights you can glean from data. Big data analyst is my dream job, because beyond all those numbers and text descriptions, sits an individual human -- a customer, a student, a someone behind a computer screen. The data links us. I want to see you. I want to understand you, and a thousand other yous with some variations. It's ultimately impossible, but it's fascinating.
    I think/feel that too quite a lot; it's struck me, reading a lot of recent publishing, how much of it is just reportage of data analysis.

    Although a lot of it really interests me in terms of how it is collected and what questions people are trying to answer with it etc. There's a lot of confirmation bias and a lot of research errors and that's before you ever get remotely close to the average reader or street level observer like myself.

    It's something I never understood the different times I was at university. We did have statistics classes, but it was largely so that the degrees, diplomas, and masters I did could claim to be sciences - or so I think; maybe that's cynical. There was only one (the masters) that I thought was created/designed with rigor anyway.

    Largely it was about using SPSS, and I was lucky that it worked properly when I filled in the fields. I was very aware at the time that I hadn't properly learned and mastered it, as I could not have used it privately to satisfy my curiosity. I did learn a lot of good things about the shortfalls of quantitative research, though.

    It's something that, if I were independently wealthy and could dedicate myself to learning, I'd dedicate a few modules in my personal curriculum to.

  10. #10
    ha-ha-hoo Julius_Van_Der_Beak's Avatar
    Join Date
    Jul 2008
    MBTI
    INTP
    Enneagram
    5w6 sp/so
    Socionics
    LII None
    Posts
    11,760

    Default

    Quote Originally Posted by Oberon View Post
    Databases are fascinating in a way. I took a database class at community college - aced it. Some background - I've been torn between computer science, law, business, and dream analysis (Jungian analytical psychology). I went the business route, but every year I re-assess. So I took a database class (I know, I know - TMI).

    Anyways...some big points (abstractions).

    Data tables are connected to other tables by keys: a primary key in one table is referenced as a foreign key in another. Should databases ever be merged into one mega-database, theoretically your name and email address, along with some other personal data, could serve as a unique key. With that key, everything you ever did on the internet could be retrieved with a simple query.

    Although that is quite impossible at the moment...who knows.

    I don't really care though, got nothing to hide but embarrassing photos and random poems.

    Do you study databases? SQL? Python?
    Well, I know a bit about both. Right now I'm trying to learn a bit more about NoSQL databases. Specifically, Mongo.

    What's interesting to me is how simple they are, and how long databases have actually been with us. It's like something from an earlier era of computing that worked well enough that people just continued to use it. I mean, I suppose NoSQL is shaking things up a bit, but I can't see every organization adopting it, and a lot of people are probably going to hang on to SQL.
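
    First steps with the pymongo driver look something like this (assumes a MongoDB server on localhost; the database and field names are invented):

        from pymongo import MongoClient

        client = MongoClient("mongodb://localhost:27017")
        db = client.forum

        # No schema declared up front: documents go in as plain dictionaries.
        db.posts.insert_one({
            "author": "Julius_Van_Der_Beak",
            "body": "Aren't all these posts in a database?",
            "likes": ["JAVO"],
        })

        # Queries are expressed as documents, too.
        post = db.posts.find_one({"author": "Julius_Van_Der_Beak"})
        print(post["body"])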
    A path is made by walking on it.

    -Zhuangzi


    Likes JAVO liked this post
