A Bayesian unconscious?

Two urns: one contains 10 black and 30 white balls, the other 20 of each color. A ball is drawn at random from one of the urns, itself chosen at random. The ball is white. What is the probability that it came from the first urn? To answer this question, a famous theorem attributed to the 18th-century mathematician Thomas Bayes is useful. Schematically, it makes it possible to update a probability in light of observations. Far from being anecdotal, this theorem has important applications: in medicine, for example, to analyze test results while taking false positives into account, or in neuroscience and artificial intelligence to understand how a human or a robot can learn from uncertain information.
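As a sanity check on the urn problem (the answer works out to 0.6), the computation can be sketched in a few lines of Python:

```python
# Worked version of the urn problem above.
# Urn 1: 10 black and 30 white balls; urn 2: 20 of each color.
prior_urn1 = prior_urn2 = 0.5        # each urn is equally likely to be chosen
p_white_given_urn1 = 30 / 40         # 0.75
p_white_given_urn2 = 20 / 40         # 0.50

# Bayes' theorem: P(urn1 | white) = P(white | urn1) * P(urn1) / P(white)
p_white = p_white_given_urn1 * prior_urn1 + p_white_given_urn2 * prior_urn2
posterior_urn1 = p_white_given_urn1 * prior_urn1 / p_white
print(posterior_urn1)  # 0.6
```

Drawing a white ball is evidence for the first urn, since white balls are more common there, and the posterior rises accordingly from 0.5 to 0.6.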

Indeed, Bayesian theory could fundamentally change our understanding of brain function. At the heart of this transformation is the Bayesian brain hypothesis, a new theory in neuroscience. The foundations of this hypothesis date back to the work of Hermann von Helmholtz in the mid-19th century. Studying visual perception, the Prussian physiologist discovered that our brain is not a passive information-processing system but rather an inference generator, predicting its sensory input at every moment. For Helmholtz, the brain is thus a “predictive machine” making predictions that he called “unconscious inferences”, which interfere with our conscious processes. This hypothesis was then gradually enriched with numerous mathematical formulations and experimental findings, leading to the principle of predictive coding and that of free energy. What are these about?

Predict and Minimize

Predictive coding theory suggests that the brain is constantly developing models of its environment, known as “beliefs,” and then using these models to predict future sensory inputs. When a mismatch appears between the predictions and the stimuli, the brain generates a prediction error and uses this new information to modify its beliefs, and thus improve its future predictions. In a sense, this theory assumes that our brains “internalize” the causal structure of the world — that is, how it works — to predict how our sensations are generated.
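This predict-compare-update loop can be sketched in a few lines of Python. This is a toy illustration, not a model from the article: the hidden cause, noise level, and learning rate are all arbitrary values chosen for the example.

```python
import random

random.seed(0)  # make the illustration reproducible

true_cause = 4.0     # hidden state of the world (hypothetical value)
belief = 0.0         # the brain's current model of that state
learning_rate = 0.1

for _ in range(200):
    sensory_input = true_cause + random.gauss(0, 0.5)  # noisy observation
    prediction_error = sensory_input - belief          # mismatch signal
    belief += learning_rate * prediction_error         # update the belief

print(round(belief, 2))  # close to the true cause of 4.0
```

Each iteration, the belief moves a fraction of the way toward the observation, so over many noisy samples it settles near the true hidden cause.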

“Technically, Bayesianism is a process of determining the best hypothesis explaining the observed states”

The free energy principle, introduced in the early 2000s by British neuroscientist Karl Friston, offers a mathematical formulation of predictive coding. It posits that prediction error produces uncertainty for the brain, and associates this degree of uncertainty with free energy, a concept derived from thermodynamics. According to Friston, the main function of the brain is then to reduce this free energy by minimizing prediction errors. To do so, it can modify either its models of the world to match the sensory stimuli, that is to say “update” its beliefs, or the sensory inputs themselves, by acting differently on the world, in particular by performing an action.

An example. In the Amazon jungle, at dusk, your visual system detects a fleeting orange patch in the foliage: you can reduce the uncertainty associated with this perception by generating a belief about your surroundings (“I believe there is a jaguar behind the bushes”) or by taking action to refine your knowledge of the environment (“I move closer to the vegetation to see what is hidden there”). In both cases, the uncertainty associated with the sensory input decreases. In this two-way relationship, the world provides the sensory data that forms the basis of inference, and the brain acts on the world to alter the sensory flow. The principle of free energy minimization then constitutes a unifying process, able to explain perception, learning and decision making alike in an uncertain world.

In a given environment, based on beliefs, our brain makes predictions about possible sensory inputs and programs actions based on them. The discrepancy between its predictions and reality generates prediction errors which are used to update beliefs.

Bayesian algorithms in the brain?

Bayesianism comes into its own here. Indeed, the principles of predictive coding and free energy minimization can be implemented “algorithmically” by applying the precepts of Bayesian inference. Technically, this is a process of determining the best hypothesis explaining the observed states. In the brain, it could help invert the problem by generating a model of the causes of sensory input, used to estimate what that input should look like in reality. We thus move from a simple inductive process that reads the cause directly off the sensory input, inflexible in a situation of uncertainty, to an inference of the cause of the sensory input based on its plausibility.

A little bit of probabilities

The origin of Bayesianism lies in the posthumous article by the English reverend and mathematician Thomas Bayes (1702–1761), “An Essay towards Solving a Problem in the Doctrine of Chances”. This work, published by his friend Richard Price in 1763, sets out what would become “Bayes’ theorem”:
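In modern notation (reconstructed here, since the original figure did not survive), the theorem reads:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```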

Let A and B be two events. Bayes’ theorem allows us to determine the probability of A given B, provided we know the probabilities of A, of B, and of B given A. In other words, this theorem describes the optimal method for updating a belief under conditions of uncertainty. It is widely used in artificial intelligence, behavioral engineering, neuroeconomics, the social sciences, experimental psychology, and cognitive neuroscience to describe or model behavior.

The Bayesian brain hypothesis thus assumes that the brain can decipher the causes of sensory input by constantly combining probabilistic information from its sensory organs with its predictions about the possible causes of that information. This reverse processing, from the result back to its origin, reduces the ambiguity inherent in sensory information. The brain can then discriminate more finely between situations where the same sensory inputs arise from different causes, or where different causes produce the same sensory inputs.

Bayesian inference also makes it possible to model the effect of uncertainty on updating beliefs after a prediction error. When the information you perceive is uncertain, it is better to keep your old beliefs than to overhaul all your models of the world. Conversely, when we receive new, very precise information that contradicts what we think, our beliefs can evolve to adapt effectively to the change. In the end, we obtain a flexible system that adapts to the statistical structure of the world as we experience it. But does this simplified model reflect how our brains actually work?
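This effect of uncertainty can be sketched with a standard conjugate Gaussian update, in which the posterior weights the prior and the observation by their precisions (inverse variances). The numbers below are illustrative, not from the article:

```python
# Precision-weighted Bayesian belief updating with Gaussians:
# noisy evidence shifts the belief less than precise evidence.
def update(prior_mean, prior_var, obs, obs_var):
    precision = 1 / prior_var + 1 / obs_var
    post_var = 1 / precision
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    return post_mean, post_var

belief = (0.0, 1.0)                                # prior: mean 0, variance 1
precise = update(*belief, obs=2.0, obs_var=0.1)    # sharp, reliable evidence
vague   = update(*belief, obs=2.0, obs_var=10.0)   # noisy, unreliable evidence
print(round(precise[0], 2), round(vague[0], 2))    # ~1.82 vs ~0.18
```

The same observation (2.0) pulls the belief almost all the way when it is precise, and barely at all when it is vague, which is exactly the flexibility described above.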

A hierarchy of beliefs

It would be too easy! Our world is too complex for such a simple algorithm. The sensory signals we perceive come from an ever-changing dynamic environment, with a multiplicity of interlocking causal structures. To take this into account, Bayesian brain theory suggests that our nervous system processes different sensory signals simultaneously, in a “hierarchical” fashion. Rather than comparing a single probability to the sensory evidence, our brains manipulate a hierarchy of beliefs at different spatiotemporal, logical, and abstraction levels.

Let’s go back to the jungle. The belief that a jaguar is hidden under the foliage is produced by the conjunction of a multitude of nested statistical models (the probability that a fawn-colored animal in the jungle is a jaguar, the probability that an animal hiding before pouncing is a jaguar, the danger posed by the uncertainty about the presence of a possible jaguar…). Generating inferences, i.e. hypotheses about the world, from a prediction hierarchy spanning broad temporal, spatial and causal scales helps to reduce the massive ambiguity of the multiple sensory inputs, and to obtain a more or less optimal representation of the environment. From a simple neurocognitive process, Bayesian inference then becomes a powerful adaptive mechanism.

A coded hierarchy in the nervous system

However elegant it is in theory, is Bayesian theory supported by the neural organization of the brain? Several studies have suggested that inferences are made by a hierarchical assembly of top-down neural connections that implement predictions, and bottom-up connections that transmit prediction errors. These predictions and errors could in particular be encoded by particular neurons associated with a vast network of dendrites and synapses: the pyramidal cells.

In this hierarchical predictive model, top-down predictions from higher levels of the hierarchy are used to reduce prediction errors from lower levels. At each level, only prediction errors that could not be explained by the predictions are passed to the next level, and can then be used to optimize beliefs about the world.

These new predictions are then transmitted to lower levels through descending neural connections, to generate new inferences. In this way, any new unexpected experience is compared to previous knowledge through a vast hierarchical network associating different temporal, causal and spatial scales. This hierarchical predictive coding process mediated by neuronal signaling would thus constitute the keystone of all cortical information processing.
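A toy two-level version of this message passing, purely illustrative and not a model from the article, might look like the following. Each level's state both explains the level below and is predicted by the level above; the updates are gradient descent on the sum of squared prediction errors.

```python
# Minimal two-level hierarchical predictive coding sketch.
# Level 0 predicts the sensory input; level 1 predicts level 0.
# Only residual errors propagate upward; predictions flow downward.
rate = 0.05
levels = [0.0, 0.0]          # [low-level state, high-level state]
sensory_input = 1.0          # arbitrary stimulus value for the example

for _ in range(500):
    err0 = sensory_input - levels[0]   # sensory error at the lowest level
    err1 = levels[0] - levels[1]       # residual passed up to the next level
    # each level adjusts its state to reduce the errors it participates in:
    levels[0] += rate * (err0 - err1)  # pulled by the data and by its prior
    levels[1] += rate * err1           # absorbs what level 0 cannot explain

print([round(x, 2) for x in levels])   # both states settle near 1.0
```

Once the hierarchy has settled, every level "explains" its input and the error signals vanish, which is the minimization described in the text.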

A hierarchy of representations

Several works suggest that the hierarchy of beliefs is correlated with the complexity of representations. The higher we go, the more abstract and global the encoded representations, and the greater the level of logical implication. The lower we go, the finer the level of detail, but the smaller the prediction space, until we arrive at extremely precise predictive models. The high-level brain areas devoted to complex representations thus send predictive signals to the primary sensory areas, partly determining the expected causal and perceptual patterns.

“For the philosopher Jakob Hohwy, the brain would thus be a “mirror of nature”, simulating at every moment the causal structure of the world.”

High-level causal models would allow the brain to make predictions over a long time frame or over complex causal associations. Conversely, low-level models in the hierarchy predict the nature of simple sensory inputs accurately, but at best only a few seconds in advance. The brain can only deduce the consequences of its actions if it can model the future. This ability to make predictions over long time scales is therefore essential for defining the actions to be carried out. The hierarchical structure allows it to make predictions about things that have never happened and might never happen: a form of simulation of a counterfactual world.

For the philosopher Jakob Hohwy, the brain would thus be a “mirror of nature”, simulating at every moment the causal structure of the world according to the predictions of its statistical characteristics. Predictive coding and Bayesian brain theory could thus offer a global conception of our cerebral functioning, with an explanatory scope as important as that of psychoanalysis in the previous century.

The Bayesian unconscious

One of the most important aspects of the predictive network is the notion of logical implication. It designates the phenomenon by which a given belief can entail a set of others. We can illustrate this with the notion of transitivity from the Dutch philosopher Bas van Fraassen: if I believe that a vase breaks when it falls on the ground, then I also believe that it breaks when it is thrown against a wall, or when struck with a hammer… In this predictive model, the hierarchy of representations defines complex relations of logical implication which ensure a global coherence within our network of beliefs. So, if I believe that a jaguar has four legs, I also believe that it does not have five legs, nor six, and so on through an endless number of associated beliefs. Although I am not aware of these implicit beliefs, they exist by virtue of the logical structure of my belief network, constituting a form of “Bayesian” unconscious.

In hierarchical Bayesian models, top-down predictions (green arrows) from upper cortical levels (N) are compared (in pink) with inputs from lower levels (E) and with external stimuli (in red). The resulting prediction errors (blue arrows) move up the cortical hierarchy and help adjust the predictions (black arrows) at each level until the system can reduce them.
© Marie Marty

This process is however complex to model, because a single experience or a single belief can often generate an infinite number of relations of logical implication. Our brain must, at all times, sort through all possible hypotheses based on past experience. For example, if I believe that all copper objects conduct electricity, I can rationally believe that all future copper objects I encounter will conduct electricity. On the other hand, if I believe that all the objects placed on my desk this morning conduct electricity, this does not mean that I should believe that all the objects on my desk in the coming months will conduct electricity, or that everything that conducts electricity is on my desk.

Our brain must therefore represent precisely the type of logical implication maintained by each of the internal models that it maintains within its hierarchy of beliefs. This distinction is all the more important since most of these beliefs are not conscious and yet actively interact with perceptions, thoughts and actions. The hierarchy of beliefs that we have detailed thus represents a “matrix” determining cognition, a model which strangely recalls the concept of the unconscious mobilized by Freudian psychoanalysis.

From Freud to Bayes

When defining his conscious–unconscious–preconscious triptych, Sigmund Freud follows in the footsteps of illustrious predecessors (Thomas Laycock, Wilhelm Wundt…). He postulates that unconscious cognitive processes can causally determine conscious processes and that conscious processes can become unconscious; that is, anything outside the scope of consciousness at a given moment remains silently active and can continue to interfere with conscious processes.

The causal influence of the unconscious on the conscious has since been illustrated by a great deal of experimental evidence. However, a multitude of epistemological factors caused a split between neurocognitive theories of the unconscious and Freudian theory, favoring the emergence of alternative concepts such as the “cognitive unconscious”, of which the Bayesian formulation is one variant.

Despite this dissonance, a number of points of convergence have been highlighted between the Freudian and the cognitive unconscious. In the hierarchical Bayesian brain hypothesis, the unconscious is “structural” in the sense that it relies on the functional architecture of the brain, that is, the organization and connectivity of neurons and synapses. The representations encoded there are not intended to become directly conscious, but they nevertheless participate in consciousness at different levels of the cortical hierarchy. They constantly feed consciousness with a multitude of hierarchical predictions simulating the causal, logical and temporal structure of the world. They thus directly influence perception and decision-making, without our necessarily being aware of this influence. Mirroring the Freudian unconscious, these unconscious representations can become conscious during perceptual inference, and conscious representations can become unconscious during the hierarchical updating of beliefs.

So what is the impact of consciousness on these unconscious processes encoded in the functional anatomy of the cortex? As we move through our environment, our brain inscribes the statistical structure of the world into its functional anatomy, in order to then simulate the causal framework of its environment. Changes in synaptic connectivity allow us to encode this experience by altering our beliefs. Rather than seeing consciousness as an instance of control regulating unconscious phenomena, we can then imagine it as a catalyst of predictive expectations and sensory inputs, defining the phenomenological structure of new associations to be encoded. Through it, by capitalizing on the legacy of our past and on a long-term simulation of our future, we minimize the uncertainty of the world.



