Neural Network Models
One benefit of attempting neural analyses of explanation is that it becomes possible to incorporate multimodal aspects of cognitive processing that tend to be ignored by deductive, schematic, and probabilistic perspectives. In medicine, for example, doctors and researchers may employ visual hypotheses (say, about the shape and location of a tumor) to explain observations that can be represented using sight, touch, and smell as well as words. Moreover, the process of abductive inference has emotional inputs and outputs, because it is usually initiated when an observation is found to be surprising or puzzling, and it often results in a sense of pleasure or satisfaction when a satisfactory hypothesis is used to generate an explanation. Here is an outline of this process:
[Figure: The process of abductive inference]
The framework employed here proposes three basic principles of neural computation:
- Neural representations are defined by a combination of nonlinear encoding and linear decoding.
- Transformations of neural representations are linearly decoded functions of variables that are represented by a neural population.
- Neural dynamics are characterized by considering neural representations as control theoretic state variables.
These principles are applied to a particular neural system by identifying the interconnectivity of its subsystems, neuron response functions, neuron tuning curves, subsystem functional relations, and overall system behavior. The complexity of a representation is constrained by the dimensionality of the neural population that represents it. In rough terms, a single dimension in such a representation can correspond to one discrete "aspect" of that representation (e.g., speed and direction are the dimensional components of the vector quantity velocity). A hierarchy of representational complexity thus follows from neural activity defined in terms of one-dimensional scalars; vectors, with a finite but arbitrarily large number of dimensions; or functions, which are essentially continuous indexings of vector elements, thus ranging over infinite dimensional spaces. This framework provides for arbitrary computations to be performed in biologically realistic neural populations, and has been successfully applied to phenomena as diverse as lamprey locomotion, path integration by rats, and the Wason card selection task. The Wason task model, in particular, is structured very similarly to the model of abductive inference discussed. Both employ
holographic reduced representations, a high-dimensional form of distributed representation.
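Before turning to HRRs in detail, the first two representational principles above (nonlinear encoding, linear decoding) can be sketched for a one-dimensional scalar. This is a minimal illustration, not a reproduction of the cited models: the rectified-linear neuron model, the random gains and biases, and the population size are all assumptions made here for the sketch.

```python
# Minimal sketch of nonlinear encoding / linear decoding of a scalar.
# Neuron model and parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_neurons = 50
x = np.linspace(-1, 1, 200)              # scalar variable to represent

# Each neuron gets a random preferred direction, gain, and bias.
encoders = rng.choice([-1.0, 1.0], n_neurons)
gains = rng.uniform(0.5, 2.0, n_neurons)
biases = rng.uniform(-1.0, 1.0, n_neurons)

def rates(x):
    """Nonlinear encoding: rectified-linear tuning curves."""
    drive = gains[:, None] * encoders[:, None] * x[None, :] + biases[:, None]
    return np.maximum(drive, 0.0)        # shape (n_neurons, len(x))

A = rates(x)
# Linear decoding: least-squares decoders map firing rates back to x.
decoders, *_ = np.linalg.lstsq(A.T, x, rcond=None)
x_hat = A.T @ decoders

print("RMS decoding error:", np.sqrt(np.mean((x - x_hat) ** 2)))
```

The decoding error shrinks as neurons are added, which is one reason the representational capacity of a population constrains the complexity of what it can represent.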
Holographic reduced representations (HRRs) combine the neurological plausibility of distributed representations with the ability to maintain complex, embedded structural relations in a computationally efficient manner. This ability is common in symbolic models and is often singled out as deficient in distributed connectionist frameworks. HRRs have the important advantage of fixed dimensionality: the combination of two n-dimensional HRRs produces another n-dimensional HRR, rather than the 2n or even n² dimensionality one would obtain using tensor products. This avoids the explosive computational resource requirements of using tensor products to represent arbitrary, complex structural relationships. HRRs are constructed through multiplicative circular convolution (denoted by ⊛) and are decoded by the approximate inverse operation, circular correlation (denoted by #). In general, if C = A ⊛ B is encoded, then C # A ≈ B and C # B ≈ A. The approximate nature of the unbinding process introduces a degree of noise, proportional to the complexity of the HRR encoding in question and inversely proportional to the dimensionality of the HRR. As noise tolerance is a requirement of any neurologically plausible model, this loss of representational information is acceptable, and a "cleanup" method that recognizes encoded HRR vectors using the dot product can be used to find the vector that best fits what was decoded. Note that HRRs may also be combined by simple superposition (i.e., addition): P = Q ⊛ R + X ⊛ Y, where P # R ≈ Q, P # X ≈ Y, and so on. The operations required for convolution and correlation can be implemented in a recurrent connectionist network.
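These operations can be sketched directly, assuming Plate's standard FFT implementation of circular convolution; the dimensionality, random seed, and vocabulary names here are illustrative assumptions.

```python
# Sketch of HRR binding, unbinding, and dot-product cleanup.
import numpy as np

rng = np.random.default_rng(1)
n = 512   # HRR dimensionality; higher dimensions mean less unbinding noise

def hrr(n):
    """Random HRR vector with elements drawn from N(0, 1/n)."""
    return rng.normal(0.0, 1.0 / np.sqrt(n), n)

def conv(a, b):
    """Circular convolution (binding): result stays n-dimensional."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def corr(a, b):
    """Circular correlation (approximate unbinding)."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

def cleanup(noisy, vocab):
    """Return the vocabulary item whose vector best matches the noisy decode."""
    return max(vocab, key=lambda k: np.dot(vocab[k], noisy))

vocab = {name: hrr(n) for name in ["A", "B", "Q", "R", "X", "Y"]}
C = conv(vocab["A"], vocab["B"])                              # C = A (*) B
P = conv(vocab["Q"], vocab["R"]) + conv(vocab["X"], vocab["Y"])  # superposition

print(cleanup(corr(vocab["A"], C), vocab))   # decodes to approximately B
print(cleanup(corr(vocab["R"], P), vocab))   # decodes to approximately Q
```

Note that the superposed trace P still decodes correctly, at the cost of extra noise from the unrelated bound pair, which is why cleanup against a fixed vocabulary is needed.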
In brief, the new model of abductive inference involves several large, high-dimensional populations that represent the data stored via HRRs and learned HRR transformations (the main output of the model), and a smaller population representing emotional valence information (abduction only requires an emotion scale running from surprise to satisfaction, and hence a single dimension, represented by as few as 100 neurons, suffices to capture emotional changes). The model is initialized with a base set of causal encodings consisting of 100-dimensional HRRs combined in the form

antecedent ⊛ 'a' + relation ⊛ causes + consequent ⊛ 'b',

as well as HRRs that represent the successful explanation of a target 'x' (expl ⊛ 'x'). For the purposes of this model, only six different "filler" values were used, representing three causal rules ('a' causes 'b', 'c' causes 'd', and 'e' causes 'f'). The populations used have between 2000 and 3200 neurons each and are 100- or 200-dimensional, which is at the lower end of what is required for accurate HRR cleanup. More rules and filler values would require larger and higher-dimensional neural populations, an expansion that is unnecessary for a simple demonstration of abduction using biologically plausible neurons.
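The rule encoding just described can be sketched as role/filler bindings in plain HRR algebra. The sketch below uses an FFT implementation of circular convolution; the 512-dimensional vectors and the variable names are illustrative assumptions rather than the model's actual 100-dimensional neural populations.

```python
# Sketch of one causal-rule encoding in the form
#   antecedent (*) 'a' + relation (*) causes + consequent (*) 'b'
# Dimensionality and seed are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
n = 512   # illustrative; the model itself uses 100-dimensional HRRs

def hrr():
    return rng.normal(0.0, 1.0 / np.sqrt(n), n)

def conv(a, b):
    """Circular convolution (binding)."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def corr(a, b):
    """Circular correlation (approximate unbinding)."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

roles = {r: hrr() for r in ["antecedent", "relation", "consequent", "expl"]}
fillers = {f: hrr() for f in ["a", "b", "c", "d", "e", "f", "causes"]}

rule_ab = (conv(roles["antecedent"], fillers["a"])
           + conv(roles["relation"], fillers["causes"])
           + conv(roles["consequent"], fillers["b"]))

# Unbinding the consequent role should clean up to 'b' via dot products.
decoded = corr(roles["consequent"], rule_ab)
scores = {f: float(np.dot(fillers[f], decoded)) for f in fillers}
print(max(scores, key=scores.get))
```

The dot-product scores for the non-target fillers stay near zero, which is what makes threshold-based cleanup feasible at this dimensionality.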
Following detection of a surprising 'b', which could be an event, proposition, or any sensory or cognitive data that can be represented via neurons, the change in emotional valence spurs activity in the output population towards generating a hypothesized explanation. This process employs several neural populations (representing the memorized rules and the HRR convolution/correlation operations) to find an antecedent involved in a causal relationship that has 'b' as the consequent. In terms of HRRs, this means producing (rule # antecedent) for [(rule # relation ≈ causes) and (rule # consequent ≈ 'b')]. This production is accomplished in the 2000-neuron, 100-dimensional output population by means of associative learning through recurrent connectivity and connection weight updating. As activity in this population settles, an HRR cleanup operation is performed to obtain the result of the learned transformation. Specifically, an answer is "chosen" if the cleanup result matches one encoded value significantly more than any of the others (i.e., is above some reasonable threshold value). After the successful generation of an explanatory hypothesis, the emotional valence signal is reversed from surprise (which drove the search for an explanation) to what can be considered pleasure or satisfaction derived from having arrived at a plausible explanation. This in turn induces the output population to produce a representation corresponding to the successful dispatch of the explanandum 'b': namely, the HRR expl_b = expl ⊛ 'b'. Upon settling, the model can thus be said to have accepted the hypothesized cause obtained in the previous stage as a valid explanation for the target 'b'. Settling completes the abductive inference: emotional valence returns to a neutral level, which suspends learning in the output population and returns population firing to basal levels of activity.
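The hypothesis-generation step can be sketched algebraically. The model itself performs this transformation through associative learning in a spiking output population; the sketch below instead applies the HRR operations directly, with an illustrative dot-product cleanup threshold, so all names and parameter values here are assumptions for the sketch.

```python
# Algebraic sketch of hypothesis generation: find a memorized rule whose
# consequent cleans up to the surprising observation, then unbind its
# antecedent. The actual model learns this as a neural transformation.
import numpy as np

rng = np.random.default_rng(3)
n = 512   # illustrative dimensionality (the model uses 100)

def hrr():
    return rng.normal(0.0, 1.0 / np.sqrt(n), n)

def conv(a, b):
    """Circular convolution (binding)."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def corr(a, b):
    """Circular correlation (approximate unbinding)."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

roles = {r: hrr() for r in ["antecedent", "relation", "consequent"]}
fillers = {f: hrr() for f in ["a", "b", "c", "d", "e", "f", "causes"]}

def rule(ant, cons):
    """Encode 'ant causes cons' as a superposition of role/filler bindings."""
    return (conv(roles["antecedent"], fillers[ant])
            + conv(roles["relation"], fillers["causes"])
            + conv(roles["consequent"], fillers[cons]))

rules = [rule("a", "b"), rule("c", "d"), rule("e", "f")]

def cleanup(noisy, threshold=0.3):
    """Dot-product cleanup: accept the best match only above a threshold."""
    best = max(fillers, key=lambda k: np.dot(fillers[k], noisy))
    return best if np.dot(fillers[best], noisy) > threshold else None

def abduce(observation):
    """Return a hypothesized cause for a surprising observation, if any."""
    for r in rules:
        if cleanup(corr(roles["consequent"], r)) == observation:
            return cleanup(corr(roles["antecedent"], r))
    return None

print(abduce("b"))
```

Given a surprising 'b', the sketch selects the rule "'a' causes 'b'" and returns 'a' as the hypothesized cause, mirroring the settled state of the output population described above.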
The basic process of abduction outlined previously maps very well onto the results obtained from the model. The output population generates a valid hypothesis when surprised (since "'a' causes 'b'" is the best memorized rule available to handle a surprising 'b'), and the reversal of emotional valence corresponds to acceptance of the hypothesis, and hence the successful explanation of 'b'. In sum, the model of abduction outlined here demonstrates how emotion can influence neural activity underlying a cognitive process. Emotional valence acts as a context gate that determines whether the output neural ensemble must conduct a search for some explanation of a surprising input, or whether a generated hypothesis needs to be evaluated as a suitable explanation for that input.
Models of Scientific Explanation