In an episode of the beloved animated 1960s sci-fi sitcom The Jetsons, Mrs. Jetson decides that housework is becoming too much for her to handle on her own. She doesn’t have time to run errands while balancing caretaking with frequent trips to the beauty salon, so she heads to “U Rent-a-Maid” and brings home a Rosie. Rosie is a metal tin on wheels, wearing a maid’s apron and wielding a feather duster, that helps cook, organize, and entertain the kids. This robot maid is kind of helpful, but mostly it makes lovable and hilarious mistakes. It doesn’t open its mouth to speak, but a tinny, feminine voice regularly issues from its circuits.
Today, we have Rosies of our own in our homes — devices that help us cook, organize, and entertain, and which sometimes get instructions wrong in hilarious ways. And while the sci-fi writers of the past were wrong in predicting that our domestic devices would be made from hunks of steel and wheels, they did get a few things right, like the frilly apron. Siri and Alexa may not be physical, but their feminine personas feel as if they’ve been written and designed by teams of dinner-jacketed, cigar-smoking men in the 1960s.
Many articles from the past year have noted how digital voice assistants are gendered as female, either by default or exclusively. While banking and insurance apps often use male voices, reinforcing the notion of the male voice as authoritative, voice assistants for the home, such as Siri and Alexa, are female. These devices are spoken to while cooking, in order to pull up recipes; they set alarms, check the weather, tell time, and send emails. They play secretarial and domestic roles in our lives, and their carefully constructed personas align with traditional notions of homemaking and caretaking as feminine domains.
A UN report released in 2019 examined the sudden proliferation of AI voice assistants with female personas, and highlighted just how concerning the trend has become. According to the paper, voice assistants currently carry out more than 1 billion tasks per month, everything from changing a song to calling emergency services. By 2021, it’s predicted that there will be more digital voice assistants than people on the planet. The report, “I’d Blush if I Could” (titled after Siri’s former response to “Hey Siri, you’re a bitch”), explores how, despite these rapid technological leaps, our smart home devices remain entirely outdated in their gender politics. The devices’ voices respond flirtatiously to sexual harassment and adopt an overall eager, obliging, docile, and passive persona that perpetuates undesirable gender stereotypes.
In response to the troubling feminization of voice assistants in the mainstream, there’s a growing number of designers in the digital realm working on projects that bring a more nuanced approach to the topic of gender and voice technology. What these projects have in common is an attempt to expand the notion of what our future could look, or rather, sound like, away from some vision of sexy cyborgs voiced by Scarlett Johansson, into more imaginative territory, where a digital voice might come to act more like a companion.
What’s in a voice?
A gradient of browns, purples, beiges, and pinks fills a screen, while a transparent bubble hovers in its center, morphing and pulsating gently like some biological entity from under the sea.
“Hi, I’m Q…” says an ambiguous, European-tinged voice, seemingly from the bubble as it spikes and vibrates in time with the words, “…the world’s first genderless voice assistant. Think of me like Siri or Alexa, but neither male, nor female.” This video was created to launch Q in 2019, a voice designed to raise awareness of gender bias in AI assistants.
Listening to Q, one of the first questions that comes to mind is: Why is tech gendering voice assistants in the first place? The answer ultimately says a lot about Western society, and our own biases and culturally determined preferences. People are, apparently, more likely to buy from human-sounding devices; at the same time, they’re also more comfortable receiving help from feminine voices, according to Amazon and Apple. Tech companies want a customer’s user experience to be as smooth and frictionless as possible, so it’s unlikely they will be subverting gender norms any time soon. As Jessi Hempel wrote in Wired, “People tend to perceive female voices as helping us solve our problems by ourselves…We want our technology to help us, but we want to be the bosses of it, so we are more likely to opt for a female interface.”
Q is soothing and pleasant to listen to; it makes a very human sound, and also one that’s very difficult to assign a binary gender. It was created by Vice’s creative agency Virtue in partnership with Copenhagen Pride, alongside a team of researchers, sound designers, and linguists from a diverse range of genders and backgrounds. To create Q, a number of people across the gender spectrum were recorded, and then sound designer Nis Nørgaard zeroed in on a vocal pitch in the middle of what’s generally considered masculine or feminine. The spot sits between 145 and 175 hertz; go higher and the voice is usually read as female, lower and it tends to read as masculine (you can try it yourself in this interactive by dragging the bubble up and down).
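For a rough sense of how a recording might be checked against that 145-to-175-hertz window, here is a minimal Python sketch. It is not Virtue’s or Nørgaard’s actual pipeline; the file name and the 65-to-400-hertz search range for speech are assumptions made for illustration, and it leans on the open-source librosa library for pitch estimation.

```python
# A minimal sketch (assumed setup, not Virtue's actual pipeline): estimate a
# recording's median fundamental frequency and check it against the
# 145-175 Hz band described above. Requires numpy and librosa; "sample.wav"
# is a hypothetical file.
import numpy as np
import librosa

GENDER_AMBIGUOUS_BAND = (145.0, 175.0)  # Hz, the range cited for Q

def median_pitch_hz(path: str) -> float:
    """Estimate the median fundamental frequency (F0) of a voice recording."""
    y, sr = librosa.load(path, sr=None)  # keep the file's native sample rate
    f0, voiced_flag, voiced_probs = librosa.pyin(
        y, fmin=65.0, fmax=400.0, sr=sr  # assumed plausible F0 range for adult speech
    )
    return float(np.nanmedian(f0))  # unvoiced frames come back as NaN; ignore them

def reads_as_ambiguous(pitch_hz: float, band=GENDER_AMBIGUOUS_BAND) -> bool:
    low, high = band
    return low <= pitch_hz <= high

if __name__ == "__main__":
    pitch = median_pitch_hz("sample.wav")
    print(f"median F0: {pitch:.1f} Hz; in the ambiguous band: {reads_as_ambiguous(pitch)}")
```

In practice, a sound designer would weigh timbre, intonation, and speech rate as well, not just median pitch, which is why Q also went through listener testing.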
“We tested a few variants of the voice through Vice,” says Ryan Sherman of Virtue. “We gave people a scale of 1-5, and asked them, where does this voice sit? We kept doing this until 50 percent of people couldn’t decide the gender, and 26 percent said it was male, and 24 percent said it was female. We aimed for the ‘hard to tell’ region.”
Biases in the script
Imagining Q as the voice of a smart device is an intriguing thought experiment, since devices voiced by Q could help teach users that gender is a spectrum. Yet voice is just one user-facing aspect of designing AI voice assistants. Beyond voice, there are many more potential ways designers and engineers might encode bias into an AI voice assistant.
The AI engines of leading voice assistants learn how to speak by mining conversational repositories written by teams of people in order to give the voice output a human feel. These often vast creative teams are paid to develop in-depth backstories for products, to humanize the AI and help it express itself in a familiar way. For the most part, Alexa and her chatty companions are a form of weak AI: what the machines say is largely scripted by a team of humans rather than generated by machine learning techniques. Clear themes emerge when you compare the scripts of Alexa, Siri, Cortana, and Google Assistant: All four have coy, submissive personalities. They coddle users, like stereotypical babysitters or mother figures. And while a bot’s personality can be stereotyped and gendered, the ingredients used to train AI machines can be, too. The most famous example of this was Microsoft’s Tay, an artificial Twitter chatbot that began to post racist and sexist tweets as it learned from others (it referred to feminism as a “cult” and a “cancer”).
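To make the “largely scripted” point concrete, here is a toy sketch of what such a response layer amounts to. The intents and lines are illustrative only (the first borrows Siri’s former reply cited above); no vendor’s real code looks like this, but the structure shows where a writing team’s persona, and its stereotypes, get hard-coded.

```python
# A toy illustration, not any vendor's real code: a scripted response layer where
# the "personality" lives in hand-written lines keyed by intent. This is where a
# writing team's tone, and its stereotypes, end up encoded.
SCRIPTED_RESPONSES = {
    "insult": "I'd blush if I could.",        # Siri's former reply, cited in the UN report
    "compliment": "That's so sweet of you!",  # invented example of an eager, obliging persona
    "weather": "It looks sunny today.",       # invented example of a task response
}

def respond(intent: str) -> str:
    # Fall back to a neutral line when no script exists for the intent.
    return SCRIPTED_RESPONSES.get(intent, "Sorry, I don't have an answer for that.")

print(respond("insult"))  # -> "I'd blush if I could."
```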
In 2019, the non-profit organization The Feminist Internet created a chatbot that teaches its users about AI bias. It’s not an AI itself, but the way it’s been scripted is a useful model for thinking about how to encode AI with certain principles. Its name is F’xa, a play on Alexa, but this chatbot doesn’t help with domestic tasks; its purpose is educational, and its tone has none of the servility found in most conversational interfaces. F’xa is not feminist in the sense that it self-identifies as a feminist, or is presented with a feminist “personality.” Rather, it was built with feminist principles in mind, and so it approaches the subject matter of AI bias from a feminist perspective.
When F’xa provides its user with a definition of “artificial intelligence,” for example, it pulls up a number of definitions from diverse voices in the field who are less visible in the dominant AI conversations. If you Google “AI,” or indeed ask Alexa, the definition that’s given is often the one from Wikipedia, which carries its own gender biases. When F’xa gives the definition of feminism, it says that “feminism means different things to people,” unlike Siri, which will pull up the definition from a top Google search. In acknowledging a multiplicity of perspectives, F’xa’s team has subtly encoded its own intersectional, feminist philosophy into the chatbot.
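A hypothetical sketch of that design principle might look like the following: instead of returning one “authoritative” answer, the bot surfaces several definitions, each with its source. The entries here are placeholders, not F’xa’s actual script or sources.

```python
# A hypothetical sketch of the approach described above: return several sourced
# definitions rather than a single "top" answer. The entries are placeholders,
# not F'xa's actual script or sources.
DEFINITIONS = {
    "feminism": [
        ("Feminism means different things to people.", "F'xa's framing, as quoted above"),
        ("A movement for social, political, and economic equality across genders.", "placeholder source A"),
        ("A critical lens on how power and gender intersect.", "placeholder source B"),
    ],
}

def define(term: str) -> list[str]:
    entries = DEFINITIONS.get(term.lower(), [])
    if not entries:
        return [f"I don't have a definition for '{term}' yet."]
    # Surface every perspective instead of picking one authoritative answer.
    return [f"{text} ({source})" for text, source in entries]

for line in define("feminism"):
    print(line)
```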
Collectively beneficial functions
Today, digital assistants are mostly used to carry out a certain set of tasks. Wake words, like “OK Google” or “Hey Siri,” are spoken by the user in order to ask a question, check the weather, stream music, or set an alarm. These wake words tend to be said while a user is cooking, or multitasking, or watching television, or getting into bed. Digital assistants play the role of secretary, carrying out dull, necessary tasks at home. Amazon’s Echo Look has added another function to the roster of tasks typically carried out by digital assistants: With its built-in camera, it tells its users how they look and helps people shop, like a best friend.
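Mechanically, the wake-word step is simple: the assistant ignores everything until an utterance begins with a recognized phrase, then treats the remainder as the command. The sketch below illustrates the idea on already-transcribed text; the phrases and behavior are illustrative, not any vendor’s implementation.

```python
# A toy sketch of the wake-word step on already-transcribed text: ignore everything
# until an utterance starts with a recognized phrase, then treat the rest as the
# command. Phrases and behavior are illustrative, not any vendor's implementation.
WAKE_PHRASES = ("ok google", "hey siri", "alexa")

def extract_command(transcript: str):
    lowered = transcript.lower().strip()
    for wake in WAKE_PHRASES:
        if lowered.startswith(wake):
            return lowered[len(wake):].strip(" ,")  # the command that follows the wake phrase
    return None  # no wake phrase: the utterance is ignored

print(extract_command("Hey Siri, set an alarm for 7am"))  # -> "set an alarm for 7am"
print(extract_command("set an alarm for 7am"))            # -> None
```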
The voices of these devices are soothing, feminized, and unthreatening for a reason: their feminine personas assuage anxieties that a consumer may have around AI or virtual assistants. And as we welcome these devices into our homes, they continue to extend their influence, and their data gathering, into more and more areas of our private lives. It’s estimated that over 15 million homes in the US have more than three smart speakers. Alexa’s purpose is ultimately two-fold, designed to benefit both its user and its maker: It helps carry out daily tasks, but it also encourages us to make purchases, while feeding Amazon information about our likes and dislikes.
Even if Alexa spoke with a different voice, or had a different scripted persona to learn from, the product itself hasn’t been built with feminist principles in mind, and there’s the rub. As a product linked to internet-of-things technologies, it can exacerbate domestic abuse and stalking, as recent news has shown. Would it be possible to design these technologies with less focus on consumption, data collection, and surveillance? Could designers expand an assistant’s range of subjects or capacities towards collectively beneficial projects?
“We set out last year to hold a workshop on the idea of what a feminist Alexa could look like,” says Dr. Charlotte Webb, co-founder of The Feminist Internet. “The idea of looking at what exists from this corporate space, with a feminist lens, isn’t quite what happened in the end, though. Because Alexa is so orientated towards a domestic space, and the character that it has is connected to that particular location, the workshop became about re-locating the purpose of what a feminist assistant could be.”
Students at The Feminist Internet’s workshop ended up imagining a host of potential functions for conversational interfaces. One design, called B(o)(o), dealt with embarrassing body problems; its team imagined the user to be a young person called Silva who felt a sense of discomfort because of hair growing on their body. Another design, called Egami, became the very antithesis of Amazon’s Echo Look: a drag queen’s voice would issue from its speakers to help users with feelings of self-worth after engaging on social media.
AI mimicking human representations
“Ask a voice assistant if it is human and it will heavily negate, but it will tell you about preferences and hobbies for which it would have to be a human or have a body to do, like somersaults,” says Alexa Steinbrück. Steinbrück is not that Alexa, but a front-end developer based in Leipzig, Germany, who studies voice assistants and works in computational creativity. “I perceive the personification and pseudo-humanity as narratives, just additional layers wrapped around the technology. And the question is: Do we even need these narratives and layers?”
Steinbrück has been exploring how to deconstruct the narratives that parallel artificial intelligence with human intelligence. Instead of asking how to gender technology with feminist principles in mind, Steinbrück asks, why are we even personifying and gendering technology in the first place?
“Personality and gender in voice assistants is not something that emerges due to the nature of ‘AI systems,’” says Steinbrück. “It is intentionally created based on the logic of market demand, gender biases, and prevalent unrealistic narratives about AI.” She compares the personification of voice assistants to skeuomorphism: both wrap products in representations that mimic different mechanisms, and thus hide their underlying logics. “In the case of voice assistants, it’s mimicking a real human being.”
As Google Duplex, the service that uses AI to call restaurants and make bookings in a human voice, has made clear, technologists face new ethical dilemmas when they create synthetic voices so human-sounding that they trick people into thinking they’re speaking to an actual person. Steinbrück suggests that we should therefore focus on the “raw-AI-ness” of a product. This could mean: “Not trying to blur the borders between humans and machines, but instead highlighting the distinct capabilities and limitations of machine learning backed technology…Avoiding the first person pronoun in the language output…Developing a distinct new sound for synthetic voices that distinguishes them from human ones (but sounds pleasant).”
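As a small illustration of the “avoid the first-person pronoun” suggestion, a response filter could screen an assistant’s output before it is spoken. The rewrite rules below are invented for this example; a real system would need a fuller, grammar-aware rewriting pass.

```python
# A minimal illustration of the "avoid the first-person pronoun" suggestion: screen
# assistant output before it is spoken. The rewrite rules are invented for this
# example; a real system would need a fuller, grammar-aware rewriting pass.
import re

FIRST_PERSON = re.compile(r"\b(I|I'm|I've|me|my|mine|myself)\b", re.IGNORECASE)

REWRITES = {
    "I think it will rain today.": "Rain is likely today.",
    "I found three results.": "Three results were found.",
}

def depersonalize(utterance: str) -> str:
    # Prefer a hand-written impersonal rewrite; otherwise flag the pronoun use.
    if utterance in REWRITES:
        return REWRITES[utterance]
    if FIRST_PERSON.search(utterance):
        return "[needs rewrite: first-person phrasing] " + utterance
    return utterance

print(depersonalize("I found three results."))       # -> "Three results were found."
print(depersonalize("The forecast says sunshine."))  # passes through unchanged
```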
After deconstructing narratives that parallel artificial intelligence with human intelligence in her workshops, Steinbrück prompts participants to imagine alternative representations for voice assistants, beyond the human. At a recent workshop at the internet and tech festival Mozfest in London, one team came up with the idea of a mountain as the representation of an AI voice assistant.
“You might say: But that’s also not raw AI because it’s also a narrative wrapped around the real technology!” says Steinbrück. “But I think this narrative is different, because it’s far enough away from the truth, it’s conscious and playful animism. We know that we’re not speaking to a real mountain, but we can imagine doing so.” There are already some non-human voice assistants available on the market, such as one smart speaker from China in the shape of a cat. A pair of cartoonish eyes move around on its screen, and it emits a cartoonish voice from its speakers to answer its users’ questions.
“The more we integrate AI technology into our everyday life, from object recognition in apps to speech recognition in phones, and the more the public gains understanding of the field, I hope that personification will become a distraction, a weird bag of cheap tricks, that we’ll eventually get rid of,” says Steinbrück.
If we want to get rid of twentieth-century gender norms, then we need to move away from twentieth-century visions of what technology’s role in our lives could be; in short, away from androids with female bodies and aprons performing subservient tasks. Could artificially intelligent, conversational technologies occupy roles in our lives outside of consumption, and speak in ways that teach us about the complexity of the world, like Q? Or could they adopt the role of comforting companion, taking the form of something as poetic as a mountain?
This article was produced in partnership with AIGA Eye on Design.