How to Grow a Mind: Statistics, Structure, and Abstraction
11 March 2011
by Joshua B. Tenenbaum, Charles Kemp, Thomas L. Griffiths, and Noah D. Goodman
In coming to understand the world—in learning concepts, acquiring language, and grasping causal relations—our minds make inferences that appear to go far beyond the data available. How do we do it? This review describes recent approaches to reverse-engineering human learning and cognitive development and, in parallel, engineering more humanlike machine learning systems. Computational models that perform probabilistic inference over hierarchies of flexibly structured representations can address some of the deepest questions about the nature and origins of human thought: How does abstract knowledge guide learning and reasoning from sparse data? What forms does our knowledge take, across different domains and tasks? And how is that abstract knowledge itself acquired?

The Challenge: How Does the Mind Get So Much from So Little?
For scientists studying how humans come to understand their world, the central challenge is this: How do our minds get so much from so little? We build rich causal models, make strong generalizations, and construct powerful abstractions, whereas the input data are sparse, noisy, and ambiguous—in every way far too limited. A massive mismatch looms between the information coming in through our senses and the outputs of cognition.
Consider the situation of a child learning the meanings of words. Any parent knows, and scientists have confirmed, that typical 2-year-olds can learn how to use a new word such as “horse” or “hairbrush” from seeing just a few examples. We know they grasp the meaning, not just the sound, because they generalize: They use the word appropriately (if not always perfectly) in new situations. Viewed as a computation on sensory input data, this is a remarkable feat. Within the infinite landscape of all possible objects, there is an infinite but still highly constrained subset that can be called “horses” and another for “hairbrushes.” How does a child grasp the boundaries of these subsets from seeing just one or a few examples of each? Adults face the challenge of learning entirely novel object concepts less often, but they can be just as good at it.
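To make this feat concrete, here is a minimal sketch of Bayesian concept learning with the “size principle”: if examples are assumed to be sampled from the true concept, then smaller hypotheses consistent with the data gain likelihood rapidly as examples accumulate. The toy hypothesis space below (nested sets of animals) and the uniform prior are purely illustrative assumptions, not the article's model of any particular dataset.

```python
# Minimal sketch of Bayesian concept learning with the size principle.
# Hypothetical nested hypothesis space: each candidate word meaning is a set.
hypotheses = {
    "dalmatians": {"dalmatian1", "dalmatian2", "dalmatian3"},
    "dogs":       {"dalmatian1", "dalmatian2", "dalmatian3",
                   "poodle1", "terrier1"},
    "animals":    {"dalmatian1", "dalmatian2", "dalmatian3",
                   "poodle1", "terrier1", "cat1", "horse1", "cow1"},
}

# Uniform prior over hypotheses (a real learner's prior would be richer).
prior = {h: 1.0 / len(hypotheses) for h in hypotheses}

def posterior(examples):
    """Posterior over candidate meanings after seeing labeled examples."""
    scores = {}
    for h, members in hypotheses.items():
        if all(x in members for x in examples):
            # Size principle: each consistent example contributes
            # likelihood 1/|h|, so small hypotheses win quickly.
            scores[h] = prior[h] * (1.0 / len(members)) ** len(examples)
        else:
            scores[h] = 0.0
    total = sum(scores.values())
    return {h: s / total for h, s in scores.items()}

print(posterior(["dalmatian1"]))                                # ambiguous
print(posterior(["dalmatian1", "dalmatian2", "dalmatian3"]))    # decisive
```

With one example the posterior is split between “dalmatians” and “dogs”; with three, the narrowest consistent concept dominates, mirroring the rapid generalization children show.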
Generalization from sparse data is central in learning many aspects of language, such as syntactic constructions or morphological rules. It arises most starkly in causal learning: Every statistics class teaches that correlation does not imply causation, yet children routinely infer causal links from just a handful of events, far too small a sample to compute even a reliable correlation! Perhaps the deepest accomplishment of cognitive development is the construction of larger-scale systems of knowledge: intuitive theories of physics, psychology, or biology or rule systems for social structure or moral judgment. Building these systems takes years, much longer than learning a single new word or concept, but on this scale too the final product of learning far outstrips the data observed.
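The same logic extends to causal learning. The sketch below compares a “causal link” hypothesis against a “no link” hypothesis after only five events; the baseline rate, causal power, and noisy-OR parameterization are assumptions chosen for illustration, though noisy-OR likelihoods figure prominently in this literature.

```python
# Minimal sketch of Bayesian causal inference from a handful of events.
BASELINE = 0.1      # assumed P(effect) when the cause is absent or inert
CAUSAL_POWER = 0.9  # assumed extra probability of the effect given the cause

def p_effect(cause_present, causal_link):
    """Noisy-OR: the effect arises via the baseline or (if linked) the cause."""
    if causal_link and cause_present:
        return 1.0 - (1.0 - BASELINE) * (1.0 - CAUSAL_POWER)
    return BASELINE

def likelihood(trials, causal_link):
    """Probability of the observed (cause, effect) trials under one hypothesis."""
    p = 1.0
    for cause, effect in trials:
        pe = p_effect(cause, causal_link)
        p *= pe if effect else (1.0 - pe)
    return p

# Five events: the effect occurs exactly when the cause is present.
trials = [(True, True), (True, True), (False, False),
          (True, True), (False, False)]

odds = likelihood(trials, True) / likelihood(trials, False)
print(f"Posterior odds for a causal link (uniform prior): {odds:.0f}:1")
```

Five events are far too few to estimate a correlation reliably, yet under these assumptions they already yield posterior odds of roughly 750:1 in favor of a causal link.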
Philosophers have inquired into these puzzles for over two thousand years, most famously as “the problem of induction,” from Plato and Aristotle through Hume, Whewell, and Mill to Carnap, Quine, Goodman, and others in the 20th century. Only recently have these questions become accessible to science and engineering by viewing inductive learning as a species of computational problems and the human mind as a natural computer evolved for solving them. The proposed solutions are, in broad strokes, just what philosophers since Plato have suggested. If the mind goes beyond the data given, another source of information must make up the difference. Some more abstract background knowledge must generate and delimit the hypotheses learners consider, or meaningful generalization would be impossible. Psychologists and linguists speak of “constraints”; machine learning and artificial intelligence researchers, “inductive bias”; statisticians, “priors.”
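In the framework reviewed here, that background knowledge enters formally as the prior in Bayes’ rule, which scores each candidate hypothesis h in a space H against the observed data d:

$$
P(h \mid d) = \frac{P(d \mid h)\,P(h)}{\sum_{h' \in H} P(d \mid h')\,P(h')}
$$

The prior P(h) encodes the constraints or inductive bias; the likelihood P(d | h) measures how well each hypothesis predicts the data; and their product, renormalized over the hypothesis space, yields the learner’s degree of belief in h.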
This article reviews recent models of human learning and cognitive development arising at the intersection of these fields. What has come to be known as the “Bayesian” or “probabilistic” approach to reverse-engineering the mind has been heavily influenced by the engineering successes of Bayesian artificial intelligence and machine learning over the past two decades and, in return, has begun to inspire more powerful and more humanlike approaches to machine learning. As with “connectionist” or “neural network” models of cognition in the 1980s (the last moment when all these fields converged on a common paradigm for understanding the mind), the labels “Bayesian” or “probabilistic” are merely placeholders for a set of interrelated principles and theoretical claims. The key ideas can be thought of as proposals for how to answer three central questions:
- How does abstract knowledge guide learning and inference from sparse data?
- What forms does abstract knowledge take, across different domains and tasks?
- How is abstract knowledge itself acquired?