6. Competition

What seems to me the most promising approach is based on the notion of competition. I know that some will see competition as an implementation of sampling, but the key difference, as I have been saying, is that competition routinely happens without any need for a perceptual decision. One kind—not the only kind—of competition is involved in the 'global workspace' model of consciousness [40] (figure 3). The outer ring indicates the sensory surfaces of the body. Circles are neural systems and lines are links between them. Filled circles are activated systems and thick lines are activated links. Activated neural coalitions compete with one another to trigger recurrent (reverberatory) activity, symbolized by the ovals circling strongly activated networks. Sufficiently activated networks trigger recurrent activity in cognitive areas in the centre of the diagram, and these in turn feed back to the sensory activations, maintaining the sensory excitation until displaced by a new dominant coalition.

Figure 3. Schematic diagram of the global workspace. I am grateful to Stan Dehaene for supplying this drawing. Dark pointers added.

Not everyone accepts the global workspace theory as a theory of consciousness (including me), but it does serve to illustrate one kind (again, not the only kind) of competition among sensory activations that in many circumstances is 'winner-takes-all', with the losers precluded from consciousness. (I have argued that recurrent activations confined to the back of the head can be conscious without triggering central activation. Because of local recurrence, these are 'winners' in a local competition without triggering global workspace activation [41]. Strong recurrent activations in the back of the head normally trigger 'ignition', in which a winning neural coalition in the back of the head spreads into recurrent activations in frontal areas that in turn feed back to sensory areas (figure 3). As Dehaene et al. [42] have shown, such locally recurrent activations can be produced reliably with a strong stimulus and strong distraction of attention. As I am concerned in this paper with normal perception, I will ignore my disagreement with the model here.)

Another case in which losing representations are precluded from consciousness is rivalry, both rivalry that can be experienced with one or both eyes (as with the Necker cube and standard figure/ground stimuli) and binocular rivalry. In rivalry, alternative representations compete for dominance because of mechanisms of reciprocal inhibition. The losing representations rise again when the dominant perceptions are weakened by adaptation. One evolutionary explanation for reciprocal inhibition is that vision has to cope with damage to the eye in which there is some distorted registration that must be inhibited in favour of a dominant percept [43]. In this winner-takes-all competition, the mechanism is competition and dominance. Although the explanation of rivalry in terms of reciprocal inhibition and adaptation is very well confirmed [44], there are Bayesian accounts, including accounts based on sampling, that have some utility in predicting some specific details of the dominance of rivalrous stimuli.
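To make the reciprocal-inhibition-plus-adaptation picture concrete, here is a minimal simulation sketch in Python. It is a toy of my own, not the model of [44] or any published account: the parameter values (drive, inhibition weight w_inh, adaptation rate) are illustrative choices, and the small noise term stands in for the neural noise that some accounts invoke. The point is just that mutual suppression plus slow adaptation of the winner yields alternating dominance with irregular durations, with no probabilities represented anywhere.

    import numpy as np

    rng = np.random.default_rng(0)
    drive = np.array([1.0, 1.0])    # equal stimulus strength to the two eyes
    adapt = np.array([0.0, 0.0])    # slow adaptation of each representation
    w_inh = 0.3                     # reciprocal inhibition from the dominant percept
    dom = 0                         # index of the currently dominant representation
    history = []

    for t in range(5000):
        # Each representation's effective strength: its drive, minus its own
        # adaptation, minus inhibition from the rival, plus a little noise.
        inhib = np.where(np.arange(2) == dom, 0.0, w_inh)
        strength = drive - adapt - inhib + 0.02 * rng.standard_normal(2)
        dom = int(np.argmax(strength))                      # winner-takes-all
        adapt[dom] += 0.002                                 # the winner adapts...
        adapt[1 - dom] = max(0.0, adapt[1 - dom] - 0.002)   # ...the loser recovers
        history.append(dom)

    switches = int(np.sum(np.abs(np.diff(history)) > 0))
    print("switches:", switches)
    print("mean dominance duration:", len(history) / (switches + 1), "steps")

The winner's adaptation erodes its advantage until the suppressed representation 'rises again', exactly the cycle described above; the noise makes individual dominance durations unpredictable even though the mechanism is deterministic competition.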
Some of these accounts take neural noise to be the factor that triggers switches [45], whereas others take the driving factor in triggering switches to be predictions in the frontal cortex [43,46,47]. In addition, there are Bayesian approaches to adaptation itself [48]. The sampling accounts in this application avoid the problem for sampling mentioned earlier, that of explaining perceptual decision rather than perception, because the rivalrous states are first and foremost rivalrous perceptions rather than perceptual decisions, and they also obtain when there is no task [49]. However, what is most obviously probabilistic about rivalry is the transitions between perceptions, because one cannot predict the time or length of one episode of dominance on the basis of those that preceded it [50]. Sampling accounts can model probabilistic transitions among non-probabilistic representations rather than probabilistic representations [51]. One could say that Bayesian theories of perceptual transitions involve 'implicit' probabilistic representation [51], but it is explicit perceptual representation that leads to the question of the title of this article. Further, Bayesian models do not supplant models that appeal to adaptation and competition (reciprocal inhibition), but rather provide a framework for integrating rivalry with other perceptual phenomena [52].

Interestingly, in some perceptual situations, not only is the losing representation suppressed—its putatively probabilistic aspect is repressed too. Hakwan Lau and Megan Peters and their colleagues recorded from intracranial electrodes in epilepsy patients as they were viewing noisy stimuli that could be either faces or houses. They found that face/house decisions were based on the strength of both face and house representations, but that confidence judgements did not take into account the strength of the decision-incongruent representation [53]. The face representation can beat out the house representation by a slight margin, but only the strength of the face representation is involved in determining confidence. This result is compatible with probabilistic representation but suggests limitations on it.¹ Like Trump, consciousness likes winners, not losers, even if the losers are almost on a par with the winners, probabilistically speaking.

In a binocular rivalry set-up, if the two pictures are locally compatible, the perception will reflect a merger of the pictures. For example, a male and a female face that are locally compatible will be seen as a combined androgynous face [55]. (Local compatibility has to do with whether small patches are lighter or darker than the background. If they are locally of opposite polarity (one patch lighter, one darker), then they are incompatible.) So, in a binocular set-up, competing images can lead to rivalry or to merging, depending on local compatibility. Merging can sometimes involve patches of both stimuli. These two modes are often contrasted, with only rivalry being classified as 'winner-takes-all'. But merging can also be considered a kind of winner-takes-all process, different from rivalry, in which the male and female faces are losers and the androgynous face is the winner.

A similar process occurs in perception of motion direction. When neurons representing opposite directions are stimulated, the result is that one direction wins. When neurons represent different but not opposite directions, there is a kind of vector averaging process [8]. In both cases, varying representations give way to a single winner. Using electrodes to stimulate area MT in monkey cortex, Nichols and Newsome were able to show that when there are activations representing directions that differ by more than 140°, one direction is completely suppressed (as in rivalry), whereas when they differ by less than 140°, the result is a perception that averages the vectors.
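The two resolution modes can be captured in a single rule. The sketch below is a schematic rendering of the Nichols and Newsome finding as just described, not their analysis code: the 140° cutoff comes from the text, and the strength weights w1 and w2 are an illustrative assumption of mine.

    import numpy as np

    def resolve_motion(dir1_deg, dir2_deg, w1=1.0, w2=1.0):
        """Resolve two competing direction signals: widely separated
        directions compete winner-takes-all; nearby directions are
        vector-averaged, weighted by activation strength."""
        d1, d2 = np.deg2rad([dir1_deg, dir2_deg])
        # smallest angular separation between the two directions
        sep = np.rad2deg(np.arccos(np.cos(d1 - d2)))
        if sep > 140.0:                       # winner-takes-all: one is suppressed
            return dir1_deg if w1 >= w2 else dir2_deg
        # vector average of the two direction signals
        v = w1 * np.array([np.cos(d1), np.sin(d1)]) + \
            w2 * np.array([np.cos(d2), np.sin(d2)])
        return np.rad2deg(np.arctan2(v[1], v[0])) % 360.0

    print(resolve_motion(10, 50))    # close directions: averaged to 30.0
    print(resolve_motion(0, 170))    # near-opposite: winner-takes-all, 0

Either way the output is a single direction: a determinate winner, not a distribution over directions.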
The proposal, then, is that we should think about population codes in terms of competition for dominance. What is conscious is the result of competition, the competing representations prior to that resolution being unconscious. Competing representations resolve either with the weak dying out so that the strong can live, or with merging or averaging. The big advantage of the framework of competition over sampling is that competition does not require a perceptual decision.

Why are the losing representations not represented in consciousness? It may be that consciousness requires a minimal level of strength of activation or local recurrent circuits, both of which have independent support [56,57]. The global workspace and higher-order accounts are alternatives. My hope is that getting clear about the role of competition in perception will help to guide research on this question.

Advocates of sampling may say that competition is just an implementation of sampling and that losing representations are just representations that represent low probabilities. Further, the strongest activations do not always win, and that could be used to suggest probabilistic representation. But although the strongest, most skilled and heaviest wrestler probably will win, that does not show that the wrestlers represent probabilities or that representations of probabilities are involved in wrestling matches. One event can be more probable than another without any representation of probabilities. To reiterate: the competition framework does not require the imposition of cognitive categories and so is distinct from the sampling framework. I do not deny that competition can be understood in probabilistic terms. Winning a competition could be described as a probabilistic decision, and any detailed model of competition could be described in probabilistic terms. My conclusion is not that the probabilistic view is false, but that it should be understood instrumentally rather than as describing actual probabilistic representations. As I will argue in the next section, the best attitude towards the Bayesian formalism is an 'as if' or instrumentalist attitude, and that attitude is very common in Bayesian writing.

There are a number of experimental studies that purport to show probabilistic representations in human visual cortex. van Bergen et al. [34] start by conceding, of the probabilistic hypothesis, that 'direct neural evidence supporting this hypothesis is currently lacking'. They purport to remedy this situation. They showed subjects randomly oriented grids while doing brain scans (fMRI), focusing on early visual cortical areas (V1, V2, V3). Subjects were required to rotate a bar to match the orientations they saw, giving the experimenters a behavioural measure of precision of response. Using fMRI, the experimenters were able to decode the orientations subjects were seeing, and they measured the 'cortical uncertainty' of orientations in an individual perception.

Figure 4. A graph of decoded orientation versus actual orientation of the stimulus. Thanks to Weiji Ma for this figure. See [34], Supplementary Material 1.
The measure they adopted is not easy to describe in a non-technical way, but what is easy to describe is the way they chose among various candidates: by looking for the measure of uncertainty in an individual perception that correlated best with variation from perception to perception. This variation is depicted in figure 4. The width of the distribution in the graph of actual orientation versus decoded orientation is a measure of 'cortical uncertainty' over time.

There are three results, all involving the notion of 'cortical uncertainty'. My response to those results is that what they call 'cortical uncertainty' is equally well described as 'degree of cortical competition'. One reason that the dots in figure 4 are scattered instead of clustered tightly is that neurons respond to many different orientations, creating many competing representations for each stimulus. Of course, degree of cortical competition can be regarded as an implicit representation of uncertainty. But merely implicit probabilistic representation does not give rise to the puzzle of the title of this article.

The first of van Bergen et al.'s three results has to do with something called the 'oblique effect', a phenomenon not mentioned here so far (and one that has nothing to do with the discussion of orientations earlier). The phrase 'oblique effect' refers to the phenomenon that subjects are more accurate in reporting cardinal (horizontal and vertical) orientations than oblique orientations. The van Bergen result is that their measure of cortical uncertainty was higher for oblique than for cardinal grids. They say cortical uncertainty explains behavioural uncertainty. However, degree of cortical competition gives essentially the same explanation, but without commitment to probabilistic representation. van Bergen et al. also showed (this is the second result) that when they presented the same orientation repeatedly, subjects' behavioural precision was predicted by the cortical uncertainty. Again, this fact can be seen as behavioural precision predicted by degree of cortical competition.

The third result is the most impressive. They argue that the visual system tracks its own uncertainty. It is well known that subjects' orientation judgements are biased towards oblique and against cardinal orientations (a different sort of oblique effect). They found that when cortical uncertainty was high, the bias towards oblique orientations was stronger than when cortical uncertainty was low, suggesting that the visual system monitors its own uncertainty on a trial-by-trial basis, relying more on bias when cortical uncertainty is high. But 'monitoring competition' and 'monitoring uncertainty' can be descriptions of the same facts.
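The logic of the trial-by-trial analysis can be illustrated with synthetic data. The sketch below is not van Bergen et al.'s decoder; it simply assumes, for illustration, that each trial has its own noise level that contaminates both the decoded orientation and the behavioural report. Nothing in the simulation represents a probability, yet the per-trial noise level predicts behavioural error, which is the shape of the second result.

    import numpy as np

    rng = np.random.default_rng(1)
    n_trials = 500
    theta = rng.uniform(0.0, 180.0, n_trials)   # presented orientations (degrees)

    # Hypothetical per-trial noise level, standing in for whatever cortical
    # state makes one perception more variable than another. On the
    # competition reading this is 'degree of cortical competition'; on the
    # Bayesian reading, 'cortical uncertainty'. Same quantity, two labels.
    sigma = rng.uniform(2.0, 12.0, n_trials)

    decoded = theta + sigma * rng.standard_normal(n_trials)  # figure 4 scatter
    report = theta + sigma * rng.standard_normal(n_trials)   # behavioural match

    print(f"spread of decoded about presented: {np.std(decoded - theta):.2f} deg")
    r = np.corrcoef(sigma, np.abs(report - theta))[0, 1]
    print(f"per-trial correlation of 'uncertainty' with behavioural error: {r:.2f}")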
Similar points apply to the observation that the weight given to different senses when they are integrated in perception depends on the relative reliability of those senses and how quickly it can be computed [58,59]. 'Monitoring reliability' and 'monitoring competition' can refer to the same process. van Bergen et al. conclude that this is 'strong empirical support for probabilistic models of perception' (p. 1729), but their results do not distinguish between instrumentalist and realist construals of this claim.

There are independent grounds for caution in concluding that perceptual representations are probabilistic or that uncertainty is represented in perception. A useful corrective comes from a recent study of pea plants showing that the growth of pea plant roots involves sensitivity to variation in nutrients [60]. Individual pea plants had their roots separated into different pots, as indicated in figure 5. The conditions could be rich (lots of nutrients) or poor, and variable (i.e. fluctuating) or constant. In rich conditions, the plants grew a larger mass of roots in the constant pot; in poor conditions, the plants grew a larger mass of roots in the variable pot. As the authors note, the plants were risk prone in poor and risk averse in rich conditions, fitting the predictions of risk sensitivity theory.

Figure 5. This is the 'graphical abstract' for [60]. Reproduced with permission from Elsevier. (Online version in colour.)

Were the plants monitoring the uncertainty in nutrients reaching their roots? The plants have no nervous system, and no one has found anything that could be called a representation of uncertainty. Any talk of plants 'monitoring' uncertainty would have to be regarded as 'as if' talk unless there is evidence to the contrary. I suggest we should take a similar attitude towards the sensitivity to uncertainty shown in the van Bergen study: it should be understood in an 'as if' framework unless we have evidence for a more realistic interpretation. The conclusion of Dener et al. (p. 1766) fits with my methodological suggestion:

Plants' risk sensitivity reinforces the oft-repeated assertion that complex adaptive strategies do not require complex cognition (adaptive strategies may be complex for us to understand, without necessarily being complex for organisms to implement). Bacteria …, fungi …, and plants generate flexible and impressively complex responses through 'decision' processes embedded in their physiological architecture, implementing adaptive responses that work well under a limited set of ecological circumstances (i.e. that are ecologically rational).

In sum, sensitivity to uncertainty does not require representation of anything, including uncertainty.

7. Why Bayesian approaches do not require probabilistic representation

One argument for probabilistic representation in perception is that Bayesian models of perception have been highly successful and that they (putatively) presuppose probabilistic representation. I will argue that on the most plausible construal of Bayesian models, they do not presuppose probabilistic representation. Bayesian accounts of visual perception compute the probability density functions of various configurations of stimuli in the environment on the basis of prior probabilities of those environmental configurations and likelihoods of visual 'data' if those environmental configurations obtain. (Visual data are often taken to be activations in early vision.) Bayes' theorem states that the probability of a hypothesis about the environment (e.g. that there is a certain distribution of colours on a surface) given visual data is proportional to the prior probability of that hypothesis multiplied by the probability of the visual data given the hypothesis. If h is the environmental hypothesis, e is the evidence from visual data and p(h|e) is the probability of h given e, then p(h|e) is proportional to p(e|h)p(h), where p(e|h) is the 'likelihood' (of the visual data given the environmental hypothesis) and p(h) is the prior probability of the environmental hypothesis. (An equivalence rather than a statement of proportionality requires a normalizing factor, so that the probabilities sum to 1.)
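Rendered numerically, the theorem is a pointwise multiply-and-normalize over a set of environmental hypotheses. The sketch below uses a discretized grid of orientation hypotheses with made-up numbers (a 35° stimulus, 10° of sensory noise, a mildly non-uniform prior of my own devising); it shows the arithmetic that the realist takes to be literally implemented and the instrumentalist takes to be mere description.

    import numpy as np

    h = np.arange(0.0, 180.0, 1.0)   # hypotheses: surface orientation in degrees

    # Prior p(h): an arbitrary non-uniform prior for illustration, with a
    # little extra mass near the cardinal orientations 0 and 90 degrees.
    prior = 1.0 + 0.5 * np.cos(np.deg2rad(2.0 * h)) ** 2
    prior /= prior.sum()

    # Likelihood p(e|h): how probable the visual data are under each
    # hypothesis; here a noisy measurement of 35 degrees, sigma 10 degrees.
    measured, sigma = 35.0, 10.0
    likelihood = np.exp(-0.5 * ((h - measured) / sigma) ** 2)

    # Bayes: p(h|e) is proportional to p(e|h) * p(h); dividing by the sum is
    # the normalizing factor that makes the posterior sum to 1.
    posterior = likelihood * prior
    posterior /= posterior.sum()

    print(f"posterior peak at {h[np.argmax(posterior)]:.0f} deg, "
          f"total mass {posterior.sum():.2f}")
    # In iterated updating, this posterior would serve as the prior for the
    # next batch of visual data, as described in the text that follows.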
In Bayesian updating, the system uses the previous probability of the environmental hypothesis as the prior in changing the hypothesis about the environment in response to new visual data. So Bayesian updating requires multiplying one's current prior probability estimate by one's current estimate of the likelihood to get the probability of the environmental hypothesis given current stimulation. Then the posterior probability of the environmental hypothesis becomes the new prior. The most plausible versions of these theories are hierarchical, in that the visual system is divided into stages with distinct priors and likelihoods at each stage. In the 'predictive coding' version of the account, predictions in the form of priors are sent down the visual hierarchy (i.e. towards the sense organs) while error signals (the prediction minus the data) are sent upwards [52,61].

What would show that something that deserves to be called Bayesian inference actually occurs in perception? In the most straightforward implementation, there would have to be perceptual representations of prior probabilities for alternative hypotheses, perceptual representations of likelihoods, and some process that involves something that could be described as multiplication of these values. (Additional arithmetic complexity would be added by utility functions that compare the utility of the various environmental hypotheses.)

It is common for those who emphasize Bayesian processes in perception to appeal to global optimality. Many perceptual processes are Bayes optimal, but many are not. As Rahnev & Denison [62] note in a review of suboptimal processes in perception, there is an extensive literature documenting suboptimal performance. In any case, Bayes optimality is neutral between instrumentalist and realist construals. Often, Bayesian theories of perception are presented as computations in an ideal observer, an observer who uses Bayesian principles to optimally compute what is in the environment on the basis of visual data. Ideal observer theories are instrumentalist in that they are not committed to the representation in real visual systems of priors or likelihoods or their multiplication within the system. Bayesian models, construed from the ideal observer point of view, do not licence attributions of probabilistic representation [63,64]. For example, Maloney & Mamassian [65] show how non-Bayesian reinforcement learning can result in behaviour that comports well with an ideal Bayesian observer. Sanborn & Chater [66] argue that an approximation process that samples from representations but does not compute over probabilities would mimic standard probabilistic fallacies in reasoning. They suggest implementing the sampling process in a connectionist network of a sort that would not plausibly support probabilistic representation.

However, some Bayesian accounts are more 'realist' about priors and likelihoods (and utilities), taking them to be represented explicitly in perceptual systems. A major problem with realist theories in which Bayesian inference literally takes place in the brain is that the kind of Bayesian computations that would have to be done are known to be computationally intractable [66]. So any realist version of Bayesianism will have to tell us what exactly is supposed to be involved in the computations.
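One concrete version of the Sanborn & Chater point: a system can behave approximately Bayesian by drawing a few samples from a local stochastic process that only ever compares unnormalized 'scores', never computing or storing the probabilities themselves. The Metropolis-style sketch below is my illustration, not their model; with few samples it shows the systematic quirks such accounts emphasize, and with many samples it converges to the Bayesian answer. A flat prior and a noisy 35° measurement are assumed for simplicity.

    import numpy as np

    rng = np.random.default_rng(2)

    def score(x):
        # Unnormalized posterior for a noisy 35-degree measurement under a
        # flat prior. The sampler below only ever uses *ratios* of these
        # scores, so no normalized probability is ever represented.
        return np.exp(-0.5 * ((x - 35.0) / 10.0) ** 2)

    def metropolis(n_samples, x0=90.0, step=15.0):
        xs, x = [], x0
        for _ in range(n_samples):
            proposal = x + step * rng.standard_normal()
            if rng.random() < score(proposal) / score(x):  # accept by ratio
                x = proposal
            xs.append(x)
        return np.array(xs)

    for n in (5, 50, 5000):
        est = metropolis(n).mean()
        print(f"{n:5d} samples: estimate {est:6.2f} (exact posterior mean 35.00)")

With five samples the estimate is biased by its starting point; behaviour driven by a handful of samples looks 'suboptimal' in just the way the reasoning-fallacy literature describes, without any explicit probabilistic representation doing the work.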
Michael Rescorla argues for a realist version of Bayesianism in which priors are explicitly represented [64,67]. He does not say explicitly that likelihoods are explicitly represented or that the multiplication of one by the other is real. Rescorla's argument is based on the fact that we have good Bayesian models of how priors evolve in response to changing environmental conditions. For example, such models predict that if one exposes a subject to stimulation in which luminance and stiffness are correlated, the priors will change so that stiff objects are seen as more luminant. And this prediction is borne out. Further, the 'light comes from overhead' prior can be changed by experience. Overall, he says, a realist interpretation yields explanatory generalizations that would be missed on an instrumentalist interpretation. The principle is that the best explanation of successful prediction is that the entities referred to in the theories that generate the prediction really exist and, to a first approximation, really have the properties ascribed to them in the theory [68]. The specific application here is that our ability to predict how priors will change supports the hypothesis that priors are really represented in perception.

I find this argument unconvincing because whatever it is about the computations of a system that simulates the effect of represented priors (for example, the proposal by Sanborn & Chater) might also be able to simulate the effect of change of priors. Without a comparison of different mechanisms that can accomplish the same goal, the argument for realism is weak. Further, perception is an inherently noisy process, in part because the neural processing is characterized by random fluctuations. The representations must be regarded as approximate. But what is the difference between approximate implementation of Bayesian inference and behaving roughly as if Bayesian inference is being implemented [7,64]? Until this question is answered, the jury is out on the dispute between realist and anti-realist views.

Recent debates about Bayesianism in perception have appealed to David Marr's famous three levels of description of perception. The top level, the computational level, specifies the problem computationally, whereas the next level down, the algorithmic level, specifies how the input and output are represented and what processes are supposed to move from the input to the output. To use one of Marr's examples, in the characterization of a cash register, the computational level would be arithmetic. One variant of the algorithmic level would specify a base-10 numerical system using Arabic numerals plus the techniques that elementary school students learn, such as adding the least significant digits first. An alternative to this type of algorithm and representation might use binary representation and an algorithm involving AND and XOR gates [69]. The lowest level, the implementation level, asks how the algorithms are implemented in hardware. In an old-fashioned cash register, implementation would involve gears; in older computer implementations of binary arithmetic, magnetic cores that can be in either one of two states [70].

Many prominent Bayesians say that most Bayesians are working at the computational level. For example, Griffiths et al. [2]: 'Most Bayesian models of cognition are defined at Marr's [71] "computational level," characterizing the problem people are solving and its ideal solution. Such models make no direct claims about cognitive processes—what Marr termed the "algorithmic level"'. In sum, the Bayesian perspective is powerful, but it does not require a realist or algorithmic interpretation.
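Marr's cash-register example can itself be put in code. The sketch below is my rendering, not Marr's: one computational-level specification (addition) and two algorithmic-level realizations, the schoolbook base-10 procedure and a ripple-carry adder built from AND and XOR gates (a full adder also needs an OR for the carry). They compute the same function while representing and processing the inputs quite differently, which is the point of the levels distinction.

    # Computational level: the function computed is simply addition.
    def computational_spec(x, y):
        return x + y

    # Algorithmic level, variant 1: base-10 digits, least significant first,
    # with carries -- the schoolbook method mentioned in the text.
    def add_decimal(x, y):
        digits, carry = [], 0
        xs, ys = str(x)[::-1], str(y)[::-1]
        for i in range(max(len(xs), len(ys))):
            s = (int(xs[i]) if i < len(xs) else 0) \
              + (int(ys[i]) if i < len(ys) else 0) + carry
            digits.append(str(s % 10))
            carry = s // 10
        if carry:
            digits.append(str(carry))
        return int("".join(reversed(digits)))

    # Algorithmic level, variant 2: binary representation, processed by
    # logic gates in a ripple-carry adder.
    def add_binary(x, y):
        result, carry = 0, 0
        for bit in range(34):                    # enough bits for this demo
            a = (x >> bit) & 1
            b = (y >> bit) & 1
            s = a ^ b ^ carry                    # XOR gates give the sum bit
            carry = (a & b) | (carry & (a ^ b))  # AND/OR gates give the carry
            result |= s << bit
        return result

    assert add_decimal(1234, 987) == add_binary(1234, 987) \
        == computational_spec(1234, 987) == 2221

A computational-level claim fixes only what the assert checks; it is silent on which of the two algorithms, if either, is running underneath.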
Instrumentalist versions of Bayesianism, understood as giving explanations at Marr's computational level, are well supported and are not committed to probabilistic representations.

8. Conclusion

My proposal is that competition among unconscious representations yields conscious representations through winner-takes-all processes of elimination or merging. The competition framework does not require any particular task or cognitive categorization and in that respect is better than the sampling framework. The process can be considered Bayesian, but only on an instrumentalist interpretation pitched at Marr's computational level rather than the algorithmic level.

Perhaps the strongest challenge to my account is Bayesian sampling accounts of competition, especially the use of sampling models to predict some of the details of binocular rivalry [45,46]. However, (i) the conflict between different Bayesian models (noise versus predictions as the driving factor), (ii) the fact that probabilistic transitions in rivalry do not require probabilistic representations, (iii) the point made in connection with pea plants, and (iv) the strong considerations in favour of the computational rather than the algorithmic level in Bayesian explanations counter the challenge.

To head off one misinterpretation, I am not suggesting that there is a single stage of processing (a 'Cartesian theatre' [72]) where competition is resolved. The competition at any stage of the visual hierarchy may be resolved at the same stage or at a higher stage, but that does not entail that there is a single stage at which everything is resolved.

In sum, my answer to the question 'If perception is probabilistic, why doesn't it seem probabilistic?' is that we would do well to think of probabilities in perception instrumentally, avoiding the realist interpretations that motivate the question of the title.

Data accessibility. This article has no additional data.

Competing interests. I declare I have no competing interests.

Funding. I received no funding for this study.

Acknowledgement. I am grateful to Susan Carey, Rachel Denison, Randy Gallistel, Steven Gross, Thomas Icard, Weiji Ma, Ian Phillips, Stephan Pohl and an anonymous reviewer for discussion and comments on a previous draft.

References

1. Ma WJ, Beck JM, Latham PE, Pouget A. 2006 Bayesian inference with probabilistic population codes. Nat. Neurosci. 9, 1432–1438. (doi:10.1038/nn1790)
2. Griffiths T, Chater N, Norris D, Pouget A. 2012 How the Bayesians got their beliefs (and what those beliefs actually are): comment on Bowers and Davis (2012). Psychol. Bull. 138, 415–422. (doi:10.1037/a0026884)

