A bark is worth more than a thousand words. Three researchers have shown that dogs' barks and growls carry information about their emotions. The key, then, may lie less in tail wagging than in the sounds dogs make. And a tool based on artificial intelligence (AI) could be capable of deciphering those noises. Something far from simple.
Artem Abzaliev took on the challenge while he was a doctoral student in Computer Science and Engineering at the University of Michigan. His work was supervised by Rada Mihalcea, a professor of Computer Science and Engineering at the same university, and by Humberto Pérez Espinosa, a collaborator from the National Institute of Astrophysics, Optics, and Electronics (INAOE) in Mexico. The research began in 2015, and last year the results were published on arXiv, the preprint repository hosted by Cornell University.
Seventy-four dogs of different ages and sexes took part in the study, mainly Chihuahuas, French Poodles, and Schnauzers. A team led by Pérez Espinosa recorded their barks by exposing the animals to various stimuli in controlled settings involving the pets' owners or an experimenter.
The joy or anger behind a bark
"The protocol was designed and validated by experts in animal behavior. Emotional situations were created through specific scenarios, such as ringing the doorbell loudly, simulating an attack on the owner, speaking affectionately to the dog, interacting with toys, getting ready for a walk, or leaving the dog tied to a tree," Pérez Espinosa details to Crónica.
The dogs' reactions to these stimuli were recorded and segmented to build a database. The collected material was used to train AI tools to classify and decode canine vocalizations. As a starting point, the team used a speech representation model created by Meta, Wav2Vec2, which was trained on 960 hours of human speech.
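The article does not include the team's code, but the general recipe, fine-tuning a pretrained Wav2Vec2 checkpoint so it classifies short audio clips, can be sketched with the Hugging Face transformers library. The label names, file name, and hyperparameters below are illustrative placeholders, not details taken from the study.

```python
# Minimal sketch (not the study's code): fine-tuning a pretrained Wav2Vec2
# checkpoint to classify short audio clips into bark contexts.
# The label names, file name, and hyperparameters are placeholders.
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForSequenceClassification

labels = ["play", "stranger", "attack_on_owner", "walk", "isolation"]  # hypothetical contexts

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForSequenceClassification.from_pretrained(
    "facebook/wav2vec2-base-960h", num_labels=len(labels)
)

def load_clip(path: str):
    """Load an audio file, downmix to mono, and resample to the 16 kHz Wav2Vec2 expects."""
    waveform, sample_rate = torchaudio.load(path)
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)
    return waveform.mean(dim=0).numpy()

# A single training step on one (clip, label) pair, for illustration only.
inputs = extractor(load_clip("bark_001.wav"), sampling_rate=16_000, return_tensors="pt")
target = torch.tensor([labels.index("stranger")])

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
loss = model(**inputs, labels=target).loss
loss.backward()
optimizer.step()
print(f"training loss: {loss.item():.4f}")
```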
Since there was no existing tool that could serve as a source of canine data, they relied on Wav2Vec2. "We adjusted this model to a new set of canine vocalization data. Two approaches were tested: a model trained from scratch using only dog barks, and another pretrained on human speech and then fine-tuned with canine vocalizations," explains the INAOE researcher.
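To make the two approaches concrete (again, a generic sketch with the transformers library, not the researchers' code), the same architecture can be initialized either from random weights or from the checkpoint pretrained on human speech:

```python
# Sketch of the two initializations described above (illustrative only):
# (a) the same Wav2Vec2 architecture started from random weights, and
# (b) the checkpoint pretrained on 960 hours of human speech.
from transformers import Wav2Vec2Config, Wav2Vec2ForSequenceClassification

num_labels = 5  # placeholder: number of bark categories

# (a) From scratch: random initialization of the architecture.
config = Wav2Vec2Config.from_pretrained("facebook/wav2vec2-base-960h", num_labels=num_labels)
scratch_model = Wav2Vec2ForSequenceClassification(config)

# (b) Transfer learning: reuse the human-speech weights, then fine-tune.
pretrained_model = Wav2Vec2ForSequenceClassification.from_pretrained(
    "facebook/wav2vec2-base-960h", num_labels=num_labels
)

# Both models are then trained and evaluated on the same canine dataset.
```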
They found that the second approach worked best. "Our study shows that Wav2Vec2, originally trained on human speech, outperforms models trained from scratch in tasks such as dog recognition and breed identification," Pérez Espinosa points out. Furthermore, "it improves contextualization, allowing the model to better associate barks with their situational meanings. Human speech processing techniques can be effectively adapted to understand animal communication."
In this way, the team leveraged existing speech technologies, such as speech-to-text conversion and translation. These tools can pick up nuances in the voice, such as accent, pitch, and tone, and turn that information into something a computer can decipher. For example, the technology can identify the words being spoken and recognize who is speaking.
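As a generic illustration of that kind of acoustic analysis (librosa is an assumption here; the article does not say which tools the researchers used), pitch and short-time energy can be extracted from a recording like this:

```python
# Generic illustration (librosa is an assumption, not the study's tooling):
# extracting pitch and short-time energy, two of the vocal nuances mentioned
# above, from a recording. The file name is a placeholder.
import librosa
import numpy as np

waveform, sr = librosa.load("bark_001.wav", sr=16_000)

# Fundamental frequency (pitch), estimated frame by frame.
f0, voiced_flag, voiced_prob = librosa.pyin(waveform, fmin=80, fmax=2000, sr=sr)

# Root-mean-square energy as a rough proxy for loudness and tone.
rms = librosa.feature.rms(y=waveform)[0]

print(f"mean pitch: {np.nanmean(f0):.1f} Hz")
print(f"mean RMS energy: {rms.mean():.4f}")
```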
Starting from emotion recognition technology in humans, Pérez Espinosa continues, "allowed us to explore parallels between human and canine vocal expressions, and develop a more structured methodology for interpreting dogs' emotions through their vocalizations."
Humberto Pérez Espinosa, collaborator at INAOE in Mexico and one of the research authors. (Photo provided)
In conclusion, the study showed that valuable information can be extracted from a dog's bark. "It is possible to predict a dog's breed based on its vocalizations... Additionally, the bark's context can be identified, linking vocalizations to specific situations, such as aggressive or normal barking at a stranger," says the Mexican expert. He also notes that information such as sex and age can be obtained. Most importantly, this research is a starting point for training new systems focused on animal communication.
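Downstream, a system of this kind could be queried on a new recording to predict breed, context, sex, or age group, depending on which labels it was trained on. The snippet below is a hypothetical usage example; the checkpoint name and audio file are placeholders, not artifacts of the study.

```python
# Hypothetical usage of a fine-tuned bark classifier saved locally; the
# checkpoint name and audio file are placeholders, not from the study.
from transformers import pipeline

classifier = pipeline("audio-classification", model="my-dogbark-classifier")
for prediction in classifier("new_bark.wav"):
    print(f"{prediction['label']}: {prediction['score']:.2f}")
```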
Although the findings are promising, the INAOE researcher clarifies that "understanding canine emotions requires considering additional factors", such as "body language and visual signals (tail movements and facial expressions), chemical and tactile signals, physiological data (heart rate and cortisol levels), and behavioral context (interactions with humans or other dogs)."
Pérez Espinosa highlights that "AI models could be trained to infer emotional states such as stress, anxiety, or excitement based on acoustic patterns." Furthermore, "AI could help detect health problems, identifying vocal anomalies related to pain or respiratory conditions." This research has the potential to improve animals' lives, helping their caregivers understand their emotional state and needs.