Conflicting Resolutions is an artistic interpretation of digital standards for implementing IP camera-based surveillance systems. The three-part visualization displays the three categories of monitoring human behavior (detection, recognition, and identification) outlined in Axis Communications' Perfect Pixel Count: Meeting your operational requirements (1). Further, Conflicting Resolutions focuses on the approximate-ness of modeling human behavior and the subtle biases (2), particularly racial biases, that are inherent in even our most advanced contemporary technologies.
Produced by: Luke Demarest
In 2014, Axis Communications published a manual for upgrading analog surveillance systems to modern digital equipment. It introduced a method, the Pixel Density Model, for meeting operational requirements in the absence of industry standards. A change from analog technology should coincide with a change from analog standards. The method marks a transition from lens qualities and human height to horizontal pixel densities and the human face. "The variances in face widths are less than those of body lengths or widths, which results in a smaller margin of error. The average human face is 16 centimeters wide (= 6.3 inches wide)." (1) Following suggested operational requirements from the Swedish National Laboratory of Forensic Science (SKL), and supported by its own test results, Axis Communications chose 80 pixels across the face as the requirement for facial identification in challenging conditions. Axis Communications made the point that our standards for representing and understanding humans change as our technology progresses.
As systems move to pixels and faces, three classifications are apparent:
- Detection: Register if a human is present. Necessary minimum resolution of 4x4 pixels across an average 16 cm face.
- Recognition: Register if the same human is detected over time. Necessary minimum resolution of 20x20 pixels across an average 16 cm face.
- Identification: Register if a specific human is present 'beyond a reasonable doubt'. Necessary minimum resolution of 80x80 pixels across an average 16 cm face in poor environments.
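The three thresholds above can be restated as horizontal pixel densities by dividing the pixels across the face by the 16 cm face width. The following is a minimal sketch of that arithmetic, not code from the Axis manual or from the artwork; the names `Requirement`, `requiredPixelsPerMeter`, and `maxSceneWidthMeters` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Average face width quoted from the Axis manual: 16 cm.
constexpr double kFaceWidthMeters = 0.16;

// Pixels across the face for each classification, as listed above.
// (Hypothetical enum; the values come from the list, the names do not.)
enum class Requirement { Detection = 4, Recognition = 20, Identification = 80 };

// Horizontal pixel density (pixels per meter) needed at the subject.
constexpr double requiredPixelsPerMeter(Requirement r) {
    return static_cast<int>(r) / kFaceWidthMeters;
}

// Widest scene (in meters) that a sensor with `horizontalPixels` columns
// can cover while still meeting the requirement across the whole width.
double maxSceneWidthMeters(int horizontalPixels, Requirement r) {
    assert(horizontalPixels > 0);
    return horizontalPixels / requiredPixelsPerMeter(r);
}
```

Under these assumptions, detection works out to 25 pixels per meter, recognition to 125, and identification in poor environments to 500. A camera with 1920 horizontal pixels could therefore cover a scene roughly 76.8 m wide for detection, but only about 3.84 m wide for identification.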
So much of science and the arts is based on simplifications and approximations. The same goes for surveillance systems. As the manual states, a surveillance system assists in "creating simplified models of a complex reality". (1) However, we often don't see the degrees of approximation and instead see the image, the screen, and the camera as arbiters of truth, regardless of the resolution or the conflict it brings. This becomes problematic when approximation occurs to a broader degree than we expect.
One recent example, from 2016, shows how facial recognition software has been integrated into surveillance systems:
"Everything we know about the face recognition systems the FBI and police use suggests the software has a built-in racial bias. That isn’t on purpose—it’s an artifact of how the systems are designed, and the data they are trained on. But it is problematic. Law enforcement agencies are relying more and more on such tools to aid in criminal investigations, increasing the risk that something could go wrong... And while state-of-the-art face matching systems can be nearly 95 percent accurate on mugshot databases, those photos are taken under controlled conditions with generally cooperative subjects. Images taken under less-than-ideal circumstances, like bad lighting, or that capture unusual poses and facial expressions, can lead to errors." (2)
Conflicting Resolutions aims to highlight the approximate-ness of our systems and encourage a humble and careful approach to classification. "There is no such thing as primitive man. There are primitive resources." - Le Corbusier (3) The visualization portrays humans wrongly convicted by means of various technologies in the three stages of classification (detection, recognition, and identification), asking the viewer to draw the line between the vague and the abstract.
The code, written in C++ with openFrameworks, cycles through 9 visualizations based on pixel data from images of wrongly convicted humans documented by the Innocence Project. (4) The 9 visualizations are projection mapped to a 3-boxed cell structure via the ofxPiMapper openFrameworks addon. First, the code animates and dissects the pixel data into 4x4 detection grids. Second, the code animates and dissects the pixel data into 20x20 recognition grids. Lastly, the code animates and dissects the pixel data into 80x80 identification grids. Throughout the visualization, human approximations are created and destroyed, inching closer to reality.
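As a rough illustration of what dissecting pixel data into a 4x4, 20x20, or 80x80 grid can mean, the sketch below averages a grayscale image down to a fixed grid size. It is not the artwork's source code and does not use openFrameworks or ofxPiMapper; the `GrayImage` struct and the `downsampleToGrid` function are hypothetical, and block averaging is an assumed technique, since the original text does not say how each grid cell is computed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical grayscale image: row-major, one byte per pixel.
struct GrayImage {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;  // size == width * height
};

// Reduce `src` to a gridSize x gridSize image by averaging the source
// pixels that fall into each grid cell (assumed technique, see above).
GrayImage downsampleToGrid(const GrayImage& src, int gridSize) {
    assert(gridSize > 0 && src.width >= gridSize && src.height >= gridSize);
    assert(src.pixels.size() ==
           static_cast<std::size_t>(src.width) * src.height);

    GrayImage out;
    out.width = gridSize;
    out.height = gridSize;
    out.pixels.resize(static_cast<std::size_t>(gridSize) * gridSize);

    for (int gy = 0; gy < gridSize; ++gy) {
        // Integer cell bounds; every cell covers at least one source row.
        const int y0 = gy * src.height / gridSize;
        const int y1 = (gy + 1) * src.height / gridSize;
        for (int gx = 0; gx < gridSize; ++gx) {
            const int x0 = gx * src.width / gridSize;
            const int x1 = (gx + 1) * src.width / gridSize;

            // Sum all source pixels inside this cell.
            unsigned long long sum = 0;
            for (int y = y0; y < y1; ++y) {
                for (int x = x0; x < x1; ++x) {
                    sum += src.pixels[static_cast<std::size_t>(y) * src.width + x];
                }
            }
            const unsigned long long count =
                static_cast<unsigned long long>(x1 - x0) * (y1 - y0);
            out.pixels[static_cast<std::size_t>(gy) * gridSize + gx] =
                static_cast<std::uint8_t>(sum / count);
        }
    }
    return out;
}
```

Calling `downsampleToGrid(image, 4)`, `downsampleToGrid(image, 20)`, and `downsampleToGrid(image, 80)` on the same source would give the three levels of approximation described above, each one retaining more of the original face than the last.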
In contrast to the important civilian ramifications, Neal Stephenson, in a humorous light, notes a similar technological change and how it affects other modes of human representation in the political system in his 1994 novel Interface (5):
Aaron: "Gee, it's really a shame that..."
Ogle: "That our political system revolves around such trivial matters...That's how it is, and how it will be until high-definition television becomes the norm."
Aaron: "Then what will happen?"
Ogle: "All of the politicians currently in power will be voted out of office and we will have a completely new power structure. Because high-definition television has a flat gamma curve and higher resolution, and people who look good on today's television will look bad on HDTV and voters will respond accordingly. Their oversized pores will be visible, the red veins in their noses from drinking too much, the artificiality of their TV-friendly hairdos will make them all look, on HDTV, like country-and-western singers. A new generation of politicians will take over and they will all look like movie stars, because HDTV will be a great deal like film, and movie stars know how to look good on film."
- (1) https://www.axis.com/files/feature_articles/ar_perfect_pixel_count_55971_en_1402_lo.pdf, pages 4-5
- (3) Towards a New Architecture (Dover Architecture) by Le Corbusier (1985)
- (5) http://www.gtfo.org/temp/Stephenson,%20Neal%20&%20Frederick%20George%20-%20Interface.pdf, page 60