Thread: Big Data

  1. #21
    meh Salomé's Avatar
    Join Date
    Sep 2008
    MBTI
    INTP
    Enneagram
    5w4 sx/sp
    Posts
    10,540

    Data, Big or small, is not Information.

    Hope this helps.
    Quote Originally Posted by Ivy View Post
    Gosh, the world looks so small from up here on my high horse of menstruation.

  2. #22
    null Jonny's Avatar
    Join Date
    Sep 2009
    MBTI
    FREE
    Posts
    2,486

    Quote Originally Posted by Salomé View Post
    Data, Big or small, is not Information.

    Hope this helps.
    Thanks. Although I was aware of the distinction between what that site calls data and information, I was unaware that we'd agreed that the term data exclusively refers to information before being organized and presented in a useful manner, and the term information exclusively refers to information after being organized and presented in a useful manner. I'll use those terms going forward.

    Edit:
    I do find the examples on that site to be sort of silly though. I bet a student's parents would argue that her test score is information.

  3. #23
    null Jonny's Avatar
    Join Date
    Sep 2009
    MBTI
    FREE
    Posts
    2,486

    @Salomé

    By this site's definition, Big Data would be information, since it's "Captured Data and Knowledge". Oh, do I love semantics. We should all get together in a big room and argue endlessly about what every word should mean. I bet the SJs would love it.

    Also, per Merriam-Webster:

    data -

    1) factual information (as measurements or statistics) used as a basis for reasoning, discussion, or calculation

    2) information output by a sensing device or organ that includes both useful and irrelevant or redundant information and must be processed to be meaningful

    3) information in numerical form that can be digitally transmitted or processed

  4. #24
    meh Salomé's Avatar
    Join Date
    Sep 2008
    MBTI
    INTP
    Enneagram
    5w4 sx/sp
    Posts
    10,540

    Quote Originally Posted by Jonnyboy View Post
    Thanks. Although I was aware of the distinction between what that site calls data and information, I was unaware that we'd agreed that the term data exclusively refers to information before being organized and presented in a useful manner, and the term information exclusively refers to information after being organized and presented in a useful manner. I'll use those terms going forward.
    I'm not sure who "we" are...but in the context of this thread, I think it's a safe assumption.

    You're right in that the terms are often used interchangeably - which is fundamental to my criticism...

    I think there is some kind of misconception that all you have to do is load up a supercomputer with a shit ton of raw data and something useful will magically appear out the other side, courtesy of "artificial intelligence". Something with inherent validity. Something that needn't be understood to be useful. When you get into the nuts and bolts of the thing, that fantasy falls apart in a gazillion ways.
    Quote Originally Posted by Ivy View Post
    Gosh, the world looks so small from up here on my high horse of menstruation.

  5. #25
    null Jonny's Avatar
    Join Date
    Sep 2009
    MBTI
    FREE
    Posts
    2,486

    I agree completely. As for the "we", I meant the collective "we"; i.e. English speakers. I hadn't seen the distinction made before, although to the extent that what you say is true (that people believe in that fantasy), I think it's a very useful one to be made. Sort of like the distinction between theory and hypothesis; man does that irk me when people misuse those. As I said, going forward I'll be using the terms data and information as you do.

  6. #26

    Quote Originally Posted by Jonnyboy View Post
    @ygolo I'll address your claims point by point:


    The Big Data community itself is a source of the problem.

    I agree. The problem isn't Big Data, but how people use it. I stated this in the video.
    Hehe. I suppose this sort of thing is typical of discussions where we don't agree about what terms mean. You'd think with a bunch of INTPs this wouldn't be an issue.

    The usage of the words "Big Data" is quite confusing.

    I thought what you were referring to was that there are people outside the Big Data community overhyping what it can do, but that people involved in the actual process don't do this.

    I pointed out that the claims that people find ridiculous come directly from many of Big Data's leaders. I am using the words "Big Data" as the community of people involved in using heavy computing power, statistical inference, and lots of data to make conclusions. There is a "culture", as well as a set of credos that characterize Big Data efforts.

    This is what I was criticizing for the rest of my post. This is probably the second biggest category of criticisms that critics of Big Data are trying to point out, the biggest being issues of security and privacy.

    The dangers of bits sitting on disk somewhere are quite different...and from your response to @Salomé, it doesn't seem like this was what you meant by "Big Data" either? You don't seem to be using these words in a way that I am familiar with. Even quoting the Merriam-Webster definitions did not clear things up for me. You actually seem to be using the words to mean "models", which I have to say is not typical of the ethos I've seen in the community.

    Quote Originally Posted by Jonnyboy View Post
    You are assuming you are able to adjust quickly enough to new correlations showing up.

    Am I? I don't recall assuming that.
    Well, in your car example: What if orange no longer became a custom color, but you didn't realize it till you bought the car based on color? At this point, the data has done its damage. I probably should have used the word "one" instead of "you", but I've slipped in that American usage of "you".

    Quote Originally Posted by Jonnyboy View Post
    Statistics work on groups not individuals.

    I agree wholeheartedly. I don't pour my milk and cereal into a shoe if I have a bowl available, and I don't use statistics to make claims about an individual when I can simply talk with the person. As I stated in the video, Big Data is good insofar as we use it to guide our intuition about the world.
    It is good that you believe you avoid using statistics to make decisions about individuals when you have actionable information about individual cases.

    However, I have not found this to be common among those who've been steeped in the use of statistics for decision making. People in the Big Data community have a ready excuse..."It isn't different this time."

    I understand the folly of gamblers who believe they have a "feeling", and then get taken in. But this is a very different thing from knowing about a physical asymmetry in a roulette table. Lumping together a gambler's "feeling" and an understanding of the underlying mechanism is something I find common among people who only work with statistics.

    Of course, there are many in the social sciences who claim to have this mechanistic understanding when they don't. So the skepticism seems warranted. But like Nate Silver did when he incorporated the observations of baseball scouts into his model, the good forecasters have to distinguish between BS and actual mechanistic understanding.

    Quote Originally Posted by Jonnyboy View Post
    All statistics assume a prior model, even rank based statistics.

    I agree. This is rooted in the very way we understand the world. We cannot think or comprehend without constructing our own mental models of our environment, so how could we ever hope to utilize data without doing the same?
    Well, I figured you knew this. But many people still hold on to a frequentist interpretation of statistics that has no mechanistic basis.

    Quote Originally Posted by Jonnyboy View Post
    You have more information from common sense than you realize, and therefore blindly assuming a default prior could be very wrong.

    I think everyone does, and it would be foolhardy to take a shot in the dark when setting assumptions. But, this is entirely pragmatic in nature; we simply do not have time to test every possibility that could ever exist. We have to start somewhere, and using prior knowledge to guide us often saves time.
    I have not found that everyone does this. The checks on common sense are much simpler than people think, but many people just set their priors with the defaults that come in their software package.

    Call this "pragmatic" if you want, but I find it irresponsible.

    If your weather models violate thermodynamics, or your market models dictate faster-than-light information transfer, I think there are some problems with the models beyond simple overfitting. You are ignoring the distilled knowledge coming from centuries of observations that are "outside of sample".
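
    To make the point about default priors concrete, here is a minimal sketch. It assumes a simple Beta-Binomial setup, and the numbers are invented for illustration (a tiny sample of 3 successes in 3 trials, and a hypothetical "common sense" base rate of about 10%). It is not taken from any particular software package; the function name posterior_mean is my own.

```python
from fractions import Fraction


def posterior_mean(prior_a, prior_b, successes, failures):
    """Posterior mean of a success rate under a Beta(prior_a, prior_b) prior.

    With binomial data the Beta prior is conjugate, so the posterior is
    Beta(prior_a + successes, prior_b + failures).
    """
    a = prior_a + successes
    b = prior_b + failures
    return Fraction(a, a + b)


# Hypothetical tiny sample: 3 successes, 0 failures.
successes, failures = 3, 0

# "Default" flat prior, Beta(1, 1): treats every rate as equally plausible.
default_estimate = posterior_mean(1, 1, successes, failures)

# Informed prior, Beta(10, 90): encodes an assumed base rate of about 10%.
informed_estimate = posterior_mean(10, 90, successes, failures)

print(float(default_estimate))   # 4/5
print(float(informed_estimate))  # 13/103
```

    Same data, very different conclusions: the flat prior says the rate is about 80%, while the prior that encodes what you already knew barely moves off 10%. Which one is right depends entirely on whether that outside knowledge is real, which is exactly the judgment the defaults let you skip.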

    EDIT:
    It took me a while to write that post. So some of it may be irrelevant.

    Accept the past. Live for the present. Look forward to the future.
    Robot Fusion
    "As our island of knowledge grows, so does the shore of our ignorance." John Wheeler
    "[A] scientist looking at nonscientific problems is just as dumb as the next guy." Richard Feynman
    "[P]etabytes of [] data is not the same thing as understanding emergent mechanisms and structures." Jim Crutchfield
