Professor of Physiology and Biophysics, University of Washington, United States
Introduction: Luminance and chromatic light edges play a critical role in human vision. In primary visual cortex (V1), edges are encoded by simple and complex cells. Some simple cells encode luminance edges and others encode chromatic edges. Simple cell outputs are received by complex cells. Some complex cells are sensitive to both luminance and chromatic edges, while others are sensitive only to luminance edges. Very few are sensitive only to chromatic edges [1]. Why the color tuning distribution of complex cells differs from that of the simple cells that underlie them is not immediately obvious.
One possibility is that the color tuning of complex cells facilitates efficient object recognition in natural images. To test this hypothesis, we experimented on convolutional neural networks (CNNs) trained for object recognition. Because CNNs perform a single task, any features we observe are likely useful for that task. We focused on CNN units that we identified as the closest analogues of V1 complex cells. For each unit, we analyzed collections of colorful stimuli that drove a common response. We found that many units had color tuning similar to that of V1 complex cells.
These observations are consistent with the idea that a subset of V1 complex cells and CNN units combine luminance and chrominance in a manner that is well suited for object recognition. The paucity of purely chromatic complex cells in V1, and their counterparts in CNNs, suggests that such signals are relatively unhelpful for object recognition. These results provide valuable insight into the mechanisms underlying human visual perception.
Materials and Methods: Analysis of many CNN units, particularly those similar to complex cells, is complicated by the nonlinear relationship between an input image and their activations. One approach is to characterize these units as an electrophysiologist characterizes a neuron. We replicated previous V1 experiments [2] on six CNNs that span a wide range of architectures and were all trained to recognize objects in the ImageNet dataset.
We presented CNNs with colorful grating stimuli and analyzed collections of stimuli that drove a common response from a unit. The colors of these “isoresponse stimuli” are represented in a three-dimensional color space where each axis represents the value of a color channel. In this space, isoresponse stimuli form surfaces that provide clues to the computations that units perform. Isoresponse surfaces were classified as either quadratic or planar. Quadratic surfaces (ellipsoids and hyperboloids) indicate a unit’s sensitivity to all or some colors. In contrast, planar surfaces indicate sensitivity to a single color direction orthogonal to the plane.
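The classification step described above can be illustrated with a minimal sketch. It assumes that isoresponse points are expressed relative to the background color, so that the surface is centered on the origin and can be written as x^T A x = 1, and it classifies the surface by the signs of the eigenvalues of A. The function names, the least-squares fit, and the tolerance are illustrative assumptions, not the fitting procedure used in this study.

```python
import numpy as np

def fit_quadric(points):
    """Least-squares fit of x^T A x = 1 to an (N, 3) array of color-space points.

    Assumes the surface is centered on the origin (hypothetical simplification).
    """
    x, y, z = points.T
    # One column per unique entry of the symmetric 3x3 matrix A.
    design = np.column_stack([x * x, y * y, z * z, 2 * x * y, 2 * x * z, 2 * y * z])
    coeffs, *_ = np.linalg.lstsq(design, np.ones(len(points)), rcond=None)
    a, b, c, d, e, f = coeffs
    return np.array([[a, d, e], [d, b, f], [e, f, c]])

def classify_surface(A, tol=1e-6):
    """Label a centered quadric by the signs of the eigenvalues of A."""
    eig = np.linalg.eigvalsh(A)
    scale = np.max(np.abs(eig))
    n_pos = int(np.sum(eig > tol * scale))
    n_neg = int(np.sum(eig < -tol * scale))
    if n_pos == 3:
        return "ellipsoid"    # response grows in every color direction
    if n_pos >= 1 and n_neg >= 1:
        return "hyperboloid"  # some directions raise the response, others lower it
    if n_pos == 1 and n_neg == 0:
        return "planar"       # pair of parallel planes: one sensitive direction
    return "other"
```

For a planar result, the eigenvector of A with the nonzero eigenvalue is the normal to the planes, that is, the single color direction to which the unit is sensitive.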
Results, Conclusions, and Discussions:
Results
The isoresponse surfaces of CNN complex units are similar to surfaces associated with V1 complex cells. Some surfaces were ellipsoidal or hyperboloidal, similar to those associated with complex cells jointly sensitive to luminance and chrominance. Other surfaces were planar, indicating sensitivity to a single color direction orthogonal to the plane (Fig. 1). Nearly all planar surfaces were orthogonal to the luminance direction. Thus, as in V1 complex cells, almost all complex units sensitive to a single color direction are sensitive to luminance.
Conclusions
The distribution of V1 complex cell color tuning is generally appropriate for mediating object recognition, at least as implemented by CNNs.
Discussion
The color tuning distribution of complex cells may arise out of common interactions between light and objects. Object boundaries will often produce superimposed luminance and chromatic edges, which could be detected by complex cells jointly sensitive to luminance and chrominance. Conversely, luminance-only complex cells may be useful for achieving lighting invariance. All complex cells are defined by their invariance to the contrast polarity of an edge. As a light source moves from one side of an object to the other, the contrast polarity of a luminance edge can flip whereas chromatic edges usually change little. Therefore, an equivalent chromatic-only complex cell may not be as useful.
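The polarity invariance that defines complex cells can be illustrated with a minimal one-dimensional sketch. It uses the textbook energy model (the sum of squared outputs of an even and an odd filter) as a stand-in for a complex cell and a half-wave rectified linear filter as a stand-in for a simple cell. The filter parameters and function names are illustrative assumptions and do not describe the CNN units analyzed here.

```python
import numpy as np

def edge(n=64, polarity=1.0):
    """Step edge centered on the stimulus; polarity=-1 flips its contrast."""
    x = np.arange(n) - n / 2 + 0.5
    return polarity * np.sign(x)

def gabor_pair(n=64, sigma=8.0, freq=1 / 16):
    """Even- and odd-symmetric Gabor filters (hypothetical parameters)."""
    x = np.arange(n) - n / 2 + 0.5
    envelope = np.exp(-x ** 2 / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * freq * x), envelope * np.sin(2 * np.pi * freq * x)

def simple_response(stim, filt):
    """Half-wave rectified linear response: depends on contrast polarity."""
    return max(0.0, float(stim @ filt))

def complex_response(stim, even, odd):
    """Energy-model response: squaring removes the sign of the contrast."""
    return float((stim @ even) ** 2 + (stim @ odd) ** 2)

even, odd = gabor_pair()
for polarity in (1.0, -1.0):
    stim = edge(polarity=polarity)
    print(polarity, simple_response(stim, odd), complex_response(stim, even, odd))
```

In this sketch the simple-cell stand-in responds to only one polarity of the edge, whereas the energy-model response is identical for both, which is the property that would let a luminance-sensitive complex cell tolerate a polarity flip caused by a change in lighting.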
Understanding the shared characteristics of V1 and CNNs can help elucidate the mechanisms underlying object recognition. The preferential sensitivity for luminance shared by V1 and CNNs may explain how and why luminance contributes more than chrominance to form vision in humans. This trait could be used to guide the design of object recognition algorithms to match the robustness of the human visual system. Furthermore, understanding the relationship between luminance and chromatic signals in the early visual system and perception may also be useful for developing better prostheses to restore vision. These prostheses could bypass the early visual system by encoding stimuli in ways suitable for tasks such as object recognition.
References:
[1] Hubel, D. H. and M. S. Livingstone (1990). "Color and contrast sensitivity in the lateral geniculate body and primary visual cortex of the macaque monkey." J Neurosci 10(7): 2223-2237.
[2] Horwitz, G. D. and C. A. Hass (2012). "Nonlinear analysis of macaque V1 color tuning reveals cardinal directions for cortical color processing." Nat Neurosci 15(6): 913-919.