How Should We Understand the Distinction

In paradigm cases, what happens when a perceptual representation is conceptualized is that it is broadcast in the global workspace. “On top of a deep hierarchy of specialized modules, a ‘global neuronal workspace,’ with limited capacity, evolved to select a piece of information, hold it over time, and share it across modules” (Dehaene et al., 2017, p. 489). When a piece of information is held and shared in the global workspace, the perceptual information is enclosed in a cognitive envelope. Further, the perceptual information in the cognitive envelope is itself transformed as explained earlier.

One reflection of the fact that working memory representations that contain perceptual materials are more abstract than perceptual representations is a difference in “tolerance.” Tolerance is a term used in the memory literature to describe whether the subject in a memory experiment regards an object as the same as one that was seen earlier. Visual long-term memory in humans is famously tolerant, especially in comparison to artificial intelligence programs, which have a great deal of difficulty recognizing an object as the same one seen earlier but from a different vantage point (Schurgin and Flombaum, 2018). Schurgin and Flombaum showed that visual working memory is very tolerant, indeed substantially more tolerant than visual long-term memory. But perceptual representations are viewpoint-specific.

An indication that the relevant features of object representations that are exploited in these experiments are cognitive aspects of the representations is that the links adverted to via the term “match” above may involve inference. The sound of a piano is said to “match” the picture of the piano. The sound of a dog barking is said to “match” the picture of the dog. Likewise for a “match” between a sound and a picture of a train. Matching in this sense is inferential rather than perceptual.
Jordan et al. are aware of this possibility, and they tried to hamper one form of inference by asking the subjects to memorize four digits presented before each trial. After the subjects gave the matching response, they were to repeat the four digits. This was supposed to interfere with a strategy of coding the pictures verbally. But the matching can be inferential even if that inference is not accomplished in a verbal system. The subject does not have to state the premise and conclusion explicitly for the process to be inferential. Jordan et al. end up seeming to favor the hypothesis that I am suggesting, that the result concerns the working memory aspect of object files rather than their perceptual aspects:

“Alternatively, object file representations may not be intimately tied to any particular sensory modality. In this sense, object files should not be conceived of as visual or auditory, but rather as abstract amodal representations. Although no evidence to date can conclusively tease apart these alternatives, the existence of nonvisual object processing … may support the latter hypothesis. Such multisensory information could be bound in working memory via the episodic buffer’s linking of visual and verbal material.” (Jordan et al., 2010, p. 501)

Jordan et al. seem to be thinking that the results reflect abstract amodal aspects of working memory rather than perception.

Another type of evidence for discursive perceptual object representations presented by Quilty-Dunn (2020b) involves transsaccadic memory. A saccade is a fast, ballistic movement of the eye, usually occurring two to three times per second. Visual processing is greatly reduced during a saccade, so the visual system must rely on memory to encode which objects in the scene after the saccade are the same as the ones in the scene before the saccade.
If I am watching a horse race, my visual system must keep track of which horse is which as I saccade back and forth between them. There are indications that the same kind of object files that figure in the OSPB also have a role in the transsaccadic memory representations that are involved in tracking objects and guiding eye movements to them (Schut et al., 2017). As I understand him, Quilty-Dunn takes this as evidence that the object representations that are indexed by the OSPB are perceptual.

However, there is ample evidence that transsaccadic memory representations are working memory representations. For example, Irwin (1992) did an analog of the Sperling experiment for transsaccadic memory. In Sperling’s experiment, subjects could recall only 3 or 4 items from an array of 12, but they could also recall 3 or 4 from any given row if cued after the stimulus had disappeared. Their iconic memory capacity was therefore roughly 3 × 3.5, i.e., 10.5 letters (three rows at about 3.5 letters recalled per cued row). In Irwin’s transsaccadic memory version, subjects saw an array of letters at one fixation but were not given the cue until after they had moved their eyes to the new location. The result was that their memory capacity was about a third of that revealed in Sperling’s experiment. That suggests that the kind of memory involved is working memory, since a capacity of that size is typical of working memory performance with letters as stimuli. Irwin found that a mask presented within 40 ms of the stimulus had a significant impact, but there was no effect at periods longer than 40 ms (120 ms and 950 ms), suggesting that a visual icon is present but only very briefly, being wiped out by the saccade. (In the Sperling phenomenon, iconic memory lasts 200–300 ms.) Irwin concludes (p. 311), “It appears that transsaccadic memory retains visual aspects of a stimulus but perhaps for a brief time only.” Irwin and Andrews (1996) used a different procedure with similar results.
Subjects saw an array of 6–10 colored letters in the center of the visual field together with a peripheral target to which they were supposed to move their eyes. The subjects then saccaded to the peripheral target, at which time the central array disappeared and the peripheral target was replaced by an indicator of one of the positions that had been occupied by a letter. Subjects were supposed to report the letter and its color. The subjects could only do this via memory of the presaccade fixation, so this task uses transsaccadic memory. They could report the letter and its color for only three to four locations, the typical signature of working memory with this kind of stimulus.

The fact that transsaccadic memory contains only some perceptual elements is widely appreciated. For example, Gordon et al. (2008) describe the Irwin and Andrews experiment as follows (p. 667):

“Contrary to what would be expected if transsaccadic memory had a very high capacity, Irwin and Andrews found that the subjects could report the color and identity of only 3–4 of the letters in the array. Interestingly, this capacity was very similar to that reported by Irwin (1992), who required subjects to report letter identity alone. Irwin and Andrews concluded that transsaccadic memory consists primarily of integrated object representations (which may include a number of object features), along with residual activity in the feature maps that underlie sensory processing. Subsequent work in which more complex stimuli were used also suggests that transsaccadic memory consists primarily of representations of a small number of objects in the scene.”

The point by Gordon et al.
that the results of Irwin and Andrews (1996) and Irwin (1992) both arrive at the limit of three to four, even though one involved reporting two properties and the other reporting just one property, comports with a well-known property of working memory, namely that its limit of three to four (with certain kinds of stimuli) is a matter of three to four items, independently of the number of features of those items. (To avoid misunderstanding, recall that the three- to four-item limit in working memory applies only with certain kinds of stimuli, including alphanumeric characters.)

There is also evidence of long-term memory involvement in transsaccadic memory. Hollingworth and Henderson (2002) did an experiment in which subjects viewed naturalistic scenes while their fixations were being tracked with an eye-tracker. In one of their experiments, subjects were given a change-detection task. The experimenters decided on one of the objects in the scene as the target object. When subjects happened to fixate on it for more than 90 ms, their attention was drawn to another part of the scene, and later a green square appeared, obscuring the object. Subjects had been instructed to fixate the green square and then decide which of two scenes contained the original object. Subjects were more than 80% correct even though numerous fixations had intervened between the original fixation and the fixation of the green square. The average number of intervening fixations was 4.6, and even with 9 fixations there was no sign of decreasing accuracy. The upshot is that there is a form of transsaccadic memory that integrates over multiple fixations. In other experiments, subjects retained object files for as long as 30 minutes. The authors conclude that there can be what they call “long-term memory object files.”

As I understand Quilty-Dunn, he takes these transsaccadic memory results to indicate that the perceptual object representations before the saccade were not iconic.
Here is his discussion of the analog of the Sperling experiment for iconic memory (Quilty-Dunn, 2020b, p. 826):

“Unlike in the Sperling experiments, however, participants only showed storage of three or four letters—the same limit for discursive object representations. This result falsifies the claim that icons are used in deriving object correspondence across saccades. … Since object correspondence needs to be computed by the visual system (and not merely by some post-perceptual process—cf. Block ms.), then there must be non-iconic representations in the visual system.”

But an alternative interpretation, bolstered by the masking experiment just described in which perceptual information lasts only 40 ms, suggests the opposite: that the perceptual object representations before the saccade were iconic and that those iconic aspects do not survive the saccade very well. The upshot would be that transsaccadic memory is a form of working memory, or even long-term memory, with remnants of perception. So, it cannot be used in this way to show that perception is noniconic and conceptual.

I have not yet addressed what many take to be the strongest argument for discursive format for perceptual object representations. In the multiple object tracking experiments mentioned earlier, subjects can track a number of objects despite radical changes in properties. This fact has been taken to suggest “syntactic separation” of the element that tracks from the feature representations. For example, E. J. Green and Jake Quilty-Dunn (Green and Quilty-Dunn, 2021, p. 672) say “object files involve explicit indexes, akin to demonstratives. There is strong reason to believe that these indexes are syntactically separate from any feature representations used to attribute features to the object … For example, indexes are plausibly maintained across changes in the feature representations held in an object file.
Subjects can reliably track objects in MOT despite significant changes in colour, shape, and size during a trial …”

My response should be clear by now: what they say may be true of the object files of working memory, but that does not show that this conclusion applies to the object files of perception, i.e., to perceptual object representations.

So should we just restrict the term “object file” to working memory representations and not use it to refer to perceptual object representations? I am afraid that the term “object file” (indeed, the concept of an object file as a perceptual object representation) is firmly ensconced in the perception literature, which is why in the first sentence of this article I quoted the definition of an object file from the Encyclopedia of Perception. The cleanest terminological revision would be to drop the term altogether.

To sum up the argument of this paper: I mentioned three important differences between the perceptual information in the so-called “object files” of perception and the “object files” of working memory (and of course, also singular thought using working memory representations). These differences are capacity, fundamental computations, and task-specific computations. And of course the working memory (and subsequent thought) representations are enclosed in a cognitive envelope, unlike the representations of perception. I also argued that there is reason to believe that visual object representations are iconic and that evidence to the contrary can be explained away as dependent on working memory representations that enclose remnants of perception in a cognitive envelope. The term “object file” is used to apply to kinds of representations that are fundamentally different from one another, and so the term is a source of confusion. We would be better off without it.

Note

1 Steven Gross has described evidence that visual representation of letters is abstract enough to be common to lower- and uppercase letters.
His article and my reply are forthcoming in Analysis Reviews.