The following is a transcript of a paper presented at the Graduate Philosophy Conference at the University of Western England in April 2014. Audio is available for download from the UWE Philosophy website.
Today, I am going to discuss the deployment of the female voice in automated and digital technology and begin to explore what this might indicate with regard to gender binaries, labour practices and feminist techno theory in the contemporary digital sphere. I believe the voice can act as a valuable point of access into the current status of feminist ideological endeavours since it has occupied such a prominent place in ongoing mainstream and journalistic debates around female empowerment. Metaphors of voice and voicelessness have also been crucial in the development of a post-structuralist feminist project, espoused by theorists such as Luce Irigaray and Hélène Cixous, which considers the act of coming to language and coming to voice as the decisive mode of countering phallocentric dominance. This rhetoric also crucially permeates the early techno-feminist writing of Donna Haraway, who stated with regard to her germinal vision of the cyborg that “This is a dream not of a common language, but of a powerful infidel heteroglossia. It is an imagination of a feminist speaking in tongues to strike fear into the circuits of the super-savers of the new right.” Therefore, with so much metaphoric weight placed upon the voice, it is perhaps worth exploring such statements in material terms: who is speaking in contemporary digital culture? Who is being addressed on behalf of whom? And what does she sound like?
I’m going to begin with Haraway and the Cyborg Manifesto. The text, first published in 1983 and extensively worked over by feminist theorists since, puts forward the assertion that the proliferation of biotechnologies and digital communication technologies has created an arena in which “we are all chimeras, theorized and fabricated hybrids of machine and organism, in short we are cyborgs”. The cyborg no longer insists that bodies end at the skin, but rather that the penetration, modification and dispersal of bodies through the machine is an inextricable aspect of embodiment. It operates as a material and historically specific social reality, yet also a creature that Haraway dreams will be realized in, and aid in the realization of, a post-gender future. The cyborg’s hybridity would demand an unravelling of hierarchical definitions of gender, species and subjectivity. This is the techno-utopian inheritance that feminist scholars continually contend with, even now when, thirty years after the text’s original publication, the encumbered nature of this vision is evident.
Indeed, human interactions with technology have continued to proliferate at an astounding rate since the 1980s. The increasingly ubiquitous presence of smartphones and portable tablets creates an environment in which a vast number of individuals have become so reliant upon these devices that they come to operate as a form of prosthetic tool. With the swipe of a finger or spoken command, the subject transcends the limitations of their embodied presence, communicating across vast distances and fusing their brains with the reticular operations of global info networks. Technology is thus increasingly becoming an intimate component of our persons.
However, in observing the affective conditions of human life in this technologized sphere, political theorist Franco “Bifo” Berardi has proposed that cybernetic intertwining, in what he terms “an infinite game of mirrors” between humans and technology, does not create radical and invulnerable hybrids, but rather leaves the human subject to suffer in the ineradicable gap between machine and flesh. He argues that the hidden ultimate ambition of software production is the wiring of the human mind in accordance with the rapid-fire operations of digital networks that uphold dominant systems of info-capital. The result is the installation of “a cyber-panopticon inserted in the fleshy circuits of human subjectivity.” (35) The systems of surveillance and speed that permeate digital info networks thus become driving forces in embodied existence.
In order to articulate the conflicted nature of this subjective formulation, Berardi distinguishes between cyberspace and cyber-time; the former can be accelerated without limits while the latter is essentially tied to human experience, to the body and the brain, which remain tethered to certain biological limits. (40) These limits become stressed as the dispersive nature of info-labour in the digital age is such that corporate structures increasingly demand that humans work in accordance with this accelerated cyber-sprawl. As it becomes increasingly uncommon for human time to be purchased in regular packages of 8 hours a day, workers are subject to what Berardi terms the “fractalization of time.” They are enslaved by the demand for constant availability as labour time is broken down into increasingly smaller and more chaotic intervals. Unlike Haraway’s radical and re-inventive vision, interactions with digital technology become a seeming necessity to participate in, rather than transcend, the conditions of late capitalist production – conditions that have been generated according to technology itself. We must remain constantly attached to our machines so that we can work more like them, often to the detriment of our personal lives and mental health.
The appropriation of the human voice is perhaps one of the primary means through which digital technologies attempt to bridge or naturalize this discrepancy between the limitations of human flesh and the limitless expanses of cyberspace. In order for this fractalized form of labour to flourish, technology must appeal on an intimate level to its users; it must appear friendly and helpful in the service of the ambitions that techno-capital implants for its own upholding. And when the network presents itself to its users in a coherent anthropomorphic form, it does not give itself a face, but a voice. This is perhaps due to the malleable, yet strange and conflicting qualities that comprise the voice as such. Although often overlooked since it occupies such a basic place in our day-to-day communications, upon reflection the voice is a rather bizarre entity, a slippery device of embodiment and subjectivity. In every act of vocalisation, a boundary is transgressed: the voice separates itself from the body, yet continually refers back to its presence. It floats independently and immaterially in space yet seems to carry some intimate part of the speaker with it, as if constantly vacillating between inside and outside, material and immaterial, and subject and object. Referring to this elusive and enthralling quality of the voice as it is made audible in music, Roland Barthes coined the term “the grain,” describing it as “the very friction between the music and something else …” (185) It is that which provides the listener with direct access to the body of the speaker or singer, surpassing any expressive or subjective intent and touching instead upon some physical, or even erotic, charge of the secret interior of their flesh. 
This powerful and curiously evocative potential is perhaps what lends the voice both its strength and its susceptibility: it seems to contain an indelible and material human quality of the speaker, yet as soon as it leaves their mouth it is no longer their domain; it is open to capture.
Some of these more fluid and evasive qualities of the voice can be considered to bear an affinity to Haraway’s cyborg politics. In addition to her aforementioned conviction in a tactical “feminist speaking in tongues,” the liminal and transgressive nature of the voice as such indicates that it might provide a vehicle for the expansive presence of the cyborg body. Since the act of vocalization is necessarily a gesture of extending some aspect of the bodily interior into space, it is in its very essence implicated in a gesture that creates a body that is stretched beyond the limits of its skin. Indeed, the voice is also one of the primary channels through which the body engages with the communication technologies that re-craft it, such as telephones, recording devices and voice recognition software.
This prominent relationship between the voice and technology has been critically taken up by performers and composers, possibly most famously by Laurie Anderson. Her iconic album “Big Science” was released prior to the Cyborg Manifesto in 1982, but indicates a preoccupation with similar problems and potentialities regarding the interaction between the body – particularly the female body – and technology. Throughout the album, Anderson employs what she terms “audio masks,” as technological filters allow her to play numerous different characters, across age, gender, and species. Her voice morphs into a saxophone, a computer, an orchestra, a synth and bagpipes. I’m just going to play a very brief clip from “O Superman” which is probably her most famous piece and I’m sure many of you will be familiar with it. In this song, her staccato breath creates the beat as if her body is a metronome. On the melodic line, her voice is electrified and multiplied. She sings with herself up and down the octave, refracting her voice into soprano, alto, tenor and bass lines which sing together in a machinic harmony. By singing into technological circuits, her body becomes processed in such a way that it is contained in every gesture, creating a truly cybernetic production.
However, Anderson’s engagement of the inventive potential of the cyborg voice is by no means universal. The cybernetic voice has in fact long been a presence in recorded music; ever since the advent of multi-track recording allowed singers to sing along with themselves, the voice has become an electronically produced composite of itself. In some respects, as with Anderson, this might grant the singer additional agency as a new species of digital auteur. However, the majority of these vocal realizations do not partake in the noisiness and pollution of cyborg politics, but rather become smoothed of their vocal excesses in order to increase their communicative impact. They become auto-tuned, composite entities of expressive perfection. The same can be said of the automated voices employed in digital technologies that sustain the banal transactions of daily life such as telephone banking systems, self-checkout machines, and public transit announcements. Although it is not always the case, a disproportionately high number of these voices are female. The mass deployment of serene female voices by corporate and governmental bodies suggests that they are relying upon longstanding presumed affiliations between femininity and passivity to placate the masses as they filter through airports and wait on hold. These anonymous and digitally deployed women provide the emollient for the implementation of systems of exchange, surveillance and control. Gender remains the only marker of identity, the only shred of the original embodied subject, provided by these ageless, whitewashed and vaguely middle class voices.
Perhaps the most evolved and striking contemporary example of automated vocal software is Siri, the personal assistant app in the iPhone. Users are ostensibly able to speak any question or direction into their phone and Siri will speak back: placing calls, adding dates to calendars, recommending restaurants and providing weather forecasts. It has been promoted by Apple as a sort of genie in the iPhone, with the slogan “Your wish is its command.” Siri speaks in the first person, knowing its owners by name and greeting them with a standardized “What can I help you with?” Siri is described by her creators as a “kinder, gentler HAL,” referring of course to the malicious computer in 2001: A Space Odyssey. Siri is instead a friendly assistant who strives to make her human users efficient, so that “we don’t have to think so much and work so hard.” Siri does indeed require less interactive effort than ever before on the part of her users; unlike a search engine which would require the manual pressing of buttons, and several steps of research and action to complete a task, Siri is described as a “do-engine;” it can accomplish all of this nearly instantaneously with one vocal cue. Although the promise might be to create more leisure time, the result is that life is increasingly accelerated into cyber-time. Siri facilitates the immersion of the human subject into the rapid-fire and chaotic flows of the digital info-network by anthropomorphising the internet into a single unified persona with whom one can have a conversation.
As Siri becomes more widely available in international markets, it is given different voices with regionally specific accents. Indeed, while originally gendered male and female respectively, both the British and American versions of the software now allow users to select the gender of their personal assistant. However, in its original conception, Siri’s femininity was an ineradicable quality of the program. The name Siri is in fact an actual Norwegian woman’s name, and was selected by its inventor Dag Kittlaus for its meaning, which is “beautiful woman who leads you to victory.” Siri thus capitalizes on the trope of feminine secondariness and servitude, insinuating a selfless Girl Friday type of loyalty to her users. It is this dynamic that becomes the central focus in Spike Jonze’s recent film Her, which I feel I would be remiss not to mention given its recent commercial and critical success, and which could fuel an entire discussion about the voice as a marker of subjectivity. Jonze’s vision of artificially intelligent voice technology is in fact an only slightly more mature vision than that originally crafted by the creators of Siri, who have expressed their disappointment in the limited way that the software has been realized after it was sold to Apple in 2010. This Siri would have had a broader vocabulary, been able to perform more functions and been equipped with a set of “personality packs” from which the user could pick and choose in order to craft their own perfect companion.
Simply laid out, the film follows Theodore, played by Joaquin Phoenix, who is emotionally drained in the midst of a divorce and in a state of social and professional inertia. In an effort ostensibly to make his life more organized, Theodore purchases the latest artificially intelligent operating system. Samantha, voiced by Scarlett Johansson, thus springs to life and to his rescue. Since her organism operates according to the immaterial and reticular conditions of digital technology, the fractalized model of time poses no problem to her and is in fact an inherent assumption of her being. She is therefore perhaps the perfect model of the worker that humans strain and struggle to be, as she is unflinchingly able to provide the constant availability that is increasingly presumed of the labour force in contemporary capitalism. The fluidity and malleability permitted to her by her cybernetic makeup is such that she enacts a degree of attentiveness and servitude that no human could possibly ever perform, either professionally or romantically. Samantha good-naturedly teases and cajoles Theodore out of bed in the morning, perkily pushes him to unclutter his hard drive of old memories, and whisperingly reminds him to respond to emails from his divorce attorney. The expressive and sensitive qualities of her voice provide the emollient to unstick Theodore, pushing him forward to greater professional and personal achievements. She even encourages him to go on dates with women and when those don’t go well, she steps in to become his romantic partner, ensuring his successful participation in heterosexual rituals of courtship and monogamy. The film thus hyperbolically manifests the co-optation of the erotic aspects of the aforementioned grain of the voice.
In addition to the accoutrements of benign gentleness and sensitivity equated with her femininity, the individuated qualities of Samantha’s voice compel Theodore into becoming invested in that erotic something, that undisclosed interiority housed within the voice that she effectively simulates. He thus becomes coddlingly enmeshed in the forward-moving flows of cyber-capital.
In this regard, Samantha can perhaps be considered a personification of what contemporary political theorist Jodi Dean terms communicative capitalism. Dean’s post-Lacanian reading of the relationship between humans and the internet suggests that “contemporary communications media capture their users in intensive and extensive networks of enjoyment, production, and surveillance.” Indeed we spend an inordinate amount of time and energy investing aspects of ourselves in the digital network. Rituals crafted through communications media foster a dangerous cult of excessive reflexivity, as we come to take pleasure in reimagining ourselves according to the options provided to us by social media platforms. Creativity and resistance are thus captured and diverted into narcissistic circuits. As the Internet increasingly becomes a site for the investment of libidinal desires and self-actualization, human subjectivity becomes increasingly mingled with the neoliberal interests that permeate the digital sphere. This system thrives on the deployment of affect, seeking to make users feel deeply so that the strength of their emotional investments comes to reinforce the network itself. Samantha operates in much the same way, cultivating Theodore’s emotional and libidinal investment in her and utilizing his devotion and pleasure in sharing himself with her to shore up her project to push him along according to capitalist programmes of progress.
In Siri’s present form, however, these libidinal investments are perhaps not quite so consuming, since the vocal technology behind it is not yet as convincing or evolved as Samantha. This voice is inflected with awkward rhythms, mispronunciations and misunderstandings, revealing its cybernetic structure to its users. The combined human and algorithmic operations that go into making Siri are still audible, situating the voice in what is referred to as an uncanny valley, the unnerving feeling produced by anthropomorphic automata as they bear a discomfiting likeness to the human subjects they seek to emulate. The inescapable fact remains that Siri’s voice belongs to someone else. Apple never disclosed who provided the voice behind the original Siri, but in 2013 an American voice actress named Susan Bennett came forward claiming the voice was hers.
According to Bennett, she spent months recording nonsensical phrases and phonemes which were later processed by technicians who adjusted her pitch and speed, plucking out vowels, consonants and diphthongs to be reassembled in a process called concatenation. The material conditions of producing automated voices thus propose yet another modified formulation of the voice within digital culture. The voice becomes the object of labour for the actor in a manner that is entirely divorced from agency; Bennett has absolutely no control over the grammatical structures and messages her voice will come to convey. Her performative action is, to use Saussure’s terminology, an instance of parole without langue – of speech without meaning. This voice strangely still houses some shred of Bennett’s body, and yet is entirely divorced from her agency and subjectivity. One could also argue that the nature of the concatenated voice thus embodies in a material sense Berardi’s notion of fractalization: it is broken down into increasingly complex and minuscule units only to be re-stitched by software in such a manner that it becomes isolated from the embodied experience of its original human emitter.
While audible jumps in pitch and rhythmic skips might be considered as flaws which vocal software developers seek to overcome, they make a deliberate and critical appearance in electronic music. In conclusion, then, as a means of beginning to propose a constructive counter to the corruption of the cybernetic voice, I want to briefly look at the work of Holly Herndon, a contemporary American vocalist and composer. This is a track called “Chorus” and it was released in January of this year.
While Herndon has cited Haraway’s cyborg as an influence on her musical thinking, her work is markedly different from electronic music produced in the 1980s such as Anderson’s. Herndon’s voice has also been refracted, but not into clear harmonic chords. The metronomic breath is replaced by staggering and asphyxiated vocal rhythms. Although, as was previously mentioned, nearly all recorded voices are subject to a practice like concatenation, the singing subject that Herndon reassembles is incoherent. As is emphasized by the music video, Herndon presents a vision of herself that is pixelated, like a file that has been converted and resized too many times; flesh becomes scrambled as it passes through the network. Only fragments of sentences can be deciphered in a product that valorizes vocalization outside of the dominant communicative voice, working with respirations and glitches – the excesses smoothed out by the capitalist drive towards streamlined efficiency. It is perhaps these terms of pixelatedness and fractalization, evasiveness rather than extension, that might inform a new feminist techno-politics. Through recombinations that audibly retain their chimerical structure, the radical potential of the cybernetic voice might thrive.