How we perceive and recognise objects. The computational theory of perception
What can be said about the cells in V1 (primary visual cortex)?
---> these neurons have relatively small and precise receptive fields
(p. 85 perceptual textbook)
What are the 3 theories of perception?
1. Constructivist
2. Ecological
3. Computational
What is the constructivist theory? Who was it proposed by?
---> Perception is inferred using cues and clues
What is computational theory? Who was it proposed by?
---> Information-processing; everything is there in the image but needs to be made explicit
What is the definition of middle vision?
---> to successfully combine features into objects
(p. 92 perceptual textbook)
What is middle vision? How do we do this?
How we do this
- Finding edges and contours, and grouping by common fate, texture segmentation, similarity, proximity, parallelism, symmetry and synchrony.
- What you know to be true is also important: TOP-DOWN mechanisms
Middle vision carves up the retinal image into large-scale objects. This leads to the global superiority effect
What is the global superiority effect?
---> this effect is consistent with the assumption that the first goal of middle vision is to carve the retinal image into large-scale objects
(p. 104 perceptual textbook)
What are the 5 principles that middle-vision uses to achieve its goals?
---> using Gestalt principles (similarity, proximity, parallelism, symmetry)
2. Split apart features that are not included in the object
---> using edge-finding processes that divide regions from each other
---> figure-ground mechanisms separate objects from the background
3. Use own knowledge about the object
4. Avoid accidents
---> avoid interpretations that require assuming highly specific, accidental combinations of features or accidental viewpoints
5. Seek consensus and avoid ambiguity
---> eliminate all but one of the possible interpretations, thereby resolving the ambiguity and delivering a single solution to the perceptual problem
(p. 104 - 105 perceptual textbook)
What is meant by naive template theory? What is a limitation of this theory?
Limitation
- too many templates are required ---> if we needed a new template for every letter (for example) we would run out of brain
(p. 108 & 109 perceptual textbook)
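The "too many templates" limitation can be illustrated with a minimal sketch. It assumes a template is a small grid of characters and that matching means exact equality; the names `TEMPLATE_T` and `matches_template` are hypothetical and not taken from the textbook.

```python
# A stored template for the letter 'T' on a 3 x 4 grid ('#' = ink, '.' = blank).
TEMPLATE_T = (
    "###.",
    ".#..",
    ".#..",
)

def matches_template(image, template):
    """Naive template matching: the image must equal the template exactly."""
    return tuple(image) == tuple(template)

# The same 'T' in its stored position, and shifted one column to the right.
stored_position = ("###.", ".#..", ".#..")
shifted_right = (".###", "..#.", "..#.")

match_stored = matches_template(stored_position, TEMPLATE_T)
match_shifted = matches_template(shifted_right, TEMPLATE_T)
```

Under these assumptions the shifted 'T' fails to match, so every new position, size or font would need its own template.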
What is a better theory compared to naive template theory in describing how we recognize objects?
i.e. the letter 'A' is being matched to its structural description
(p. 109 perceptual textbook)
What is a key component of structural theory?
- Marr = generalised cylinders ---> these cylinders could be scaled to represent differently shaped parts
- Biederman = geometric ions (geons)
These structural descriptions should be viewpoint-invariant
---> they should be equally recognizable from many different vantage points
(p. 109 & 110 perceptual textbook)
What is meant by viewpoint invariance?
(p. 110 perceptual textbook)
What is a problem with structural description theories?
(p. 110 perceptual textbook)
What is meant by object constancy?
Regarding object constancy, what have mental rotation studies found?
---> suggests the participants had to mentally rotate the objects back to the upright views they had stored in their memory
(Tarr & Pinker, 1990) - more info p. 110 perceptual textbook
What are the pathways for higher level visual processing?
1. Ventral stream - involved in recognising the object
2. Dorsal stream - involved in making appropriate movements to that object
As we move down to the temporal lobe (through the ventral stream) what happens to the receptive fields?
(p. 88 perceptual textbook)
What do lesions to the dorsal stream result in?
---> a deficit in making visually guided movements towards objects
What do lesions to the ventral stream result in?
---> patients are unable to recognise objects
Who investigated the relationship between the temporal lobe and object recognition?
---> large sections of the temporal lobe were lesioned in monkeys
Results
The monkeys behaved as though they could see but did not know what they were seeing
(later work found that the inferotemporal cortex of the temporal lobe is particularly important in the visual problems of these monkeys)
(p. 88 perceptual textbook)
What did Gross et al investigate?
Results
Cells in the inferotemporal cortex were discovered to have receptive fields that could spread over half or more of the monkey's field of view
Activating these cells:
- usual stimuli e.g. spots and lines didn't work well
- silhouette of a monkey hand worked well for some cells
- monkey faces excited other cells
---> are some cells specialised for certain objects?
(p. 88 perceptual textbook)
After the findings of Gross et al, what did Barlow (1972) propose?
---> small receptive fields and simple features of visual cortex are combined with greater complexity as one moves from striate cortex to inferotemporal cortex, eventually culminating in a cell that fires when you see a specific object
(p. 88 perceptual textbook)
What did Quiroga et al (2005) discover?
(p. 106 perceptual textbook)
Are the receptive fields of neurons in the inferior temporal lobe large or small?
---> Large: they can see a lot of the world, which gives a more global view
What do neurons in the inferior temporal lobe do?
- They respond to complex forms
- Different shapes elicit strong responses from other inferior temporal lobe neurons
What is something that we need to consider when asking how objects are represented?
How does the visual system achieve object constancy?
- Many neurons in the temporal lobe are responsive to one view of an object
- However, there are some neurons that respond to many different views (view-invariant). Presumably these neurons receive inputs from neurons that are only selective for one view
- Size invariant neurons respond to an object no matter what size it is on the retina. Location invariant neurons respond regardless of the location of an object in the visual field
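The suggestion that view-invariant neurons receive inputs from view-selective neurons can be sketched as follows. This is only an illustration: the Gaussian tuning curve, the 30-degree tuning width, the four preferred views and the use of a maximum to pool the inputs are all assumptions, not claims from the notes.

```python
import math

def view_tuned_response(view_deg, preferred_deg, width_deg=30.0):
    """Response of a hypothetical neuron selective for one view of an object."""
    # Smallest signed angle between the shown view and the preferred view.
    diff = (view_deg - preferred_deg + 180) % 360 - 180
    return math.exp(-(diff ** 2) / (2 * width_deg ** 2))

def view_invariant_response(view_deg, preferred_views):
    """Pool over view-selective inputs (assumed here to be a maximum)."""
    return max(view_tuned_response(view_deg, p) for p in preferred_views)

preferred = [0, 90, 180, 270]  # assumed preferred views of the input neurons
shown_views = (0, 90, 180)

single = [view_tuned_response(v, 0) for v in shown_views]
pooled = [view_invariant_response(v, preferred) for v in shown_views]
```

In this sketch the single view-selective unit responds strongly only near its preferred view, while the pooled unit responds strongly to every view that one of its inputs prefers.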
What hemisphere are neurons that respond to faces found in?
Are specific neurons responsible for each object we can recognize (e.g. do we have a neuron that responds specifically to our grandmother)? Or are specific stimuli represented by the pattern of firing of many neurons?
novel) but this plasticity is view-dependent
---> It would seem that cells fire in synchrony in response to all aspects of that person, their names and memories of that person
(see slides 2 & 3 lecture notes)
The inferotemporal cortex maintains close connections with brain parts involved in memory formation (notably the hippocampus). Why is this important?
(p. 88 perceptual textbook)
Who demonstrated that cells in the inferotemporal cortex have plasticity?
---> trained monkeys to recognise novel objects
---> they found that inferotemporal cortex neurons responded with high firing rates to those objects
---> however, this only happened when the objects were seen from viewpoints similar to those from which they had been learned
- Human experiment on p. 89 - 90 perceptual textbook
Are different regions of the temporal lobe specialized?
- fMRI studies have revealed that pictures of faces selectively activate the fusiform gyrus region (FFA) of the temporal lobe.
---> This area shows a higher response to faces than to other objects.
- Other regions of the temporal lobe show selectivity for inanimate objects (LOC), buildings (PPA) and human body parts (EBA).
- However, the idea that regions of visual cortex are specialized has been challenged.
---> An alternative hypothesis is that all regions of the temporal lobe contribute to our perception of any object or face.
What may some prosopagnosia patients also suffer from?
- There are some cases of prosopagnosia where intact object recognition remains and vice versa (double dissociation)
e.g. patient W.J., who had prosopagnosia, could still name his own sheep!
Do we perceive faces and objects in a different way?
- When subjects are presented with images of objects in a study phase, they are just as good at recognising the objects in the test phase whether the whole object or only part of the object is presented
- Faces, by contrast, are recognised less well from their component parts alone
Are there other aspects of facial processing?
Some brain lesions can selectively affect emotional aspects of facial processing, but leave recognition intact.
Fregoli’s delusion = a condition in which sufferers assume that strangers are familiar
Capgras syndrome = the emotional recognition system is underactive rather than overactive. Capgras patients can recognize faces they know, but feel no emotional attachment. They also have difficulty decoding facial emotion
What is the argument regarding how neurons become specialized?
What evidence favours the argument that neurons become specialized due to nature (as opposed to nurture)?
---> At only a few weeks of age they have a preference to view faces longer than jumbled faces or other objects
What can be said about humans and recognizing faces from their own/different races?
- Recent brain imaging experiments have found that same-race faces elicit more activity in brain regions linked to face recognition such as the fusiform gyrus region.
- However, this effect was stronger in European Americans compared to African Americans. This suggests that our perception of faces has a learned component
What experiment investigated whether the fusiform gyrus region is specialized for faces?
- A recent fMRI study looked at the effect of recognising a certain type of non-face stimulus known as Greebles.
- Initially, before training, the response in the fusiform gyrus to different types of Greeble is much smaller than the response to faces. However, after a period of training in which the subjects have to name and identify different Greebles, the response is similar to that elicited by faces.
- Other studies have shown that the fusiform gyrus also responds to cars and birds in people who are experts in recognizing these categories of object
---> there is a learning component
What are the 5 stages of Marr's model of vision?
1. Grey Level Representation
2. Raw Primal Sketch (low level vision)
3. Primal Sketch (middle vision)
4. 2 1/2-D Sketch (middle vision)
5. 3-D Model (object recognition)
Regarding Marr's model of vision, describe the Grey Level Representation (1st stage).
Procedure:
- Measure light intensity in each of a large number of small regions of the image called pixels
- This results in a 2-D array of light intensity values
Mechanism: Each pixel and its intensity value corresponds to a photoreceptor and its receptor potential
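The procedure above can be sketched in a few lines. This assumes the incoming light is given as a grid of numbers and that each "pixel" is the average over a small square region; the function name `grey_level_representation` and the region size are hypothetical.

```python
def grey_level_representation(light, region=2):
    """Average light intensity over region x region patches ("pixels")."""
    rows, cols = len(light), len(light[0])
    pixels = []
    for r in range(0, rows - region + 1, region):
        pixel_row = []
        for c in range(0, cols - region + 1, region):
            patch = [light[r + i][c + j]
                     for i in range(region) for j in range(region)]
            pixel_row.append(sum(patch) / len(patch))
        pixels.append(pixel_row)
    return pixels

# A made-up 4 x 4 grid of light intensities.
light = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 2, 2],
    [4, 4, 2, 2],
]
pixels = grey_level_representation(light)
```

The result is the 2-D array of light intensity values described above, here a 2 x 2 array.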
Regarding Marr's model of vision, describe the Raw Primal Sketch (2nd stage).
Procedure: Look for patterns in the light intensity changes to denote segments of lines, line ends, edges, circles and ellipses
Mechanism: Compatible with the function of the variety of orientation specific cells of V1
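Looking for changes in light intensity can be illustrated with a deliberately simplified sketch. It compares each pixel with its right-hand neighbour and marks a change when the difference reaches a threshold. The function name, the threshold and the neighbour-difference rule are assumptions for illustration, not the operator Marr actually proposed.

```python
def find_intensity_changes(pixels, threshold=3):
    """Return (row, column) positions where intensity jumps to the next column."""
    changes = []
    for r, row in enumerate(pixels):
        for c in range(len(row) - 1):
            if abs(row[c + 1] - row[c]) >= threshold:
                changes.append((r, c))
    return changes

# A dark region on the left and a bright region on the right.
pixels = [
    [1, 1, 9, 9],
    [1, 1, 9, 9],
    [1, 1, 9, 9],
]
changes = find_intensity_changes(pixels)
```

Here the marked positions line up in one column, which a later stage could treat as a vertical edge segment.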
Regarding Marr's model of vision, describe the Primal Sketch (3rd stage).
Procedure: Further grouping of edge segments, bars, terminations and blobs from the raw primal sketch using Gestalt laws of organisation (similarity, common fate, good continuation, closure, relative size, surroundedness, orientation, symmetry and proximity)
Mechanism: Complex cells in V2 may be organised to link line ends
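Grouping by proximity, one of the Gestalt laws listed above, can be sketched for elements laid out along a line. This assumes elements are described only by a position and that neighbours closer than a chosen gap belong together; `group_by_proximity` and `max_gap` are hypothetical names.

```python
def group_by_proximity(positions, max_gap=2):
    """Group positions so that neighbours no more than max_gap apart share a group."""
    groups = []
    for x in sorted(positions):
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)   # close to the previous element: same group
        else:
            groups.append([x])     # too far away: start a new group
    return groups

groups = group_by_proximity([1, 2, 3, 10, 11, 20])
```

With these made-up positions the elements fall into three groups, separated wherever the gap exceeds `max_gap`.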
Regarding Marr's model of vision, describe the 2 1/2-D Sketch (4th stage).
Procedure:
- Integrates multiple depth cues (not unlike constructivist)
- Stereopsis, motion analysis, contour, texture, shading information is used
Mechanism: Top down, optic array processing
The 2½-D Sketch is from the viewer’s perspective, so the 3-D image is implicit at this stage. Marr argued that in order to recognise objects, they must be transformed into object-centred representations.