bnew

Veteran

The sperm whale 'phonetic alphabet' revealed by AI​

1 day ago

By Katherine Latham and Anna Bressanin


[Image: Sperm whale communication may have similarities to human language (Credit: Amanda Cotton/Project CETI)]

Researchers studying sperm whale communication say they've uncovered sophisticated structures similar to those found in human language.

In the inky depths of the midnight zone, an ocean giant bears the scars of the giant squid she stalks. She searches the darkness, her echolocation pulsing through the water column. Then she buzzes – a burst of rapid clicks – just before she goes in for the kill.

But exactly how sperm whales catch squid, like many other areas of their lives, remains a mystery. "They're slow swimmers," says Kirsten Young, a marine scientist at the University of Exeter. Squid, on the other hand, are fast. "How can [sperm whales] catch squid if they can only move at 3 knots [5.5 km/h or 3.5mph]? Are the squid moving really slowly? Or are the whales stunning them with their vocalisations? What happens down there? Nobody really knows," she says.

Sperm whales are not easy to study. They spend much of their lives foraging or hunting at depths beyond the reach of sunlight. They are capable of diving over 3km (10,000ft) and can hold their breath for two hours.

[Image: Sperm whales are in constant communication with one another, even when foraging alone at depth (Credit: Amanda Cotton/Project CETI)]

"At 1000m (3300ft) deep, many of the group will be facing the same way, flanking each other – but across an area of several kilometres," says Young. "During this time they're talking, clicking the whole time." After about an hour, she says, the group rises to the surface in synchrony. "They'll then have their rest phase. They might be at the surface for 15 to 20 minutes. Then they'll dive again," she says.

At the end of a day of foraging, says Young, the sperm whales come together at the surface and rub against each other, chatting while they socialise. "As researchers, we don't see a lot of their behaviour because they don't spend that much time at the surface," she says. "There's masses we don't know about them, because we are just seeing a tiny little snapshot of their lives during that 15 minutes at the surface."

It was around 47 million years ago that land-roaming cetaceans began to gravitate back towards the ocean – that's 47 million years of evolution in an environment alien to our own. How can we hope to easily understand creatures that have adapted to live and communicate under such different evolutionary pressures to ourselves?

"It's easier to translate the parts where our world and their world overlap – like eating, nursing or sleeping," says David Gruber, lead and founder of the Cetacean Translation Initiative (Ceti) and professor of biology at the City University of New York. "As mammals, we share these basics with others. But I think it's going to get really interesting when we try to understand the areas of their world where there's no intersection with our own," he says.

[Image: The Dominica Sperm Whale Project has been listening to sperm whales for almost 20 years (Credit: Project CETI)]

Now, from elephants to dogs, modern technology is helping researchers to sift through enormous datasets, and uncover previously unknown diversity and complexity in animal communication. And Ceti's researchers say they, too, have used AI to decode a "sperm whale phonetic alphabet".

In 2005, Shane Gero, biology lead for Ceti, founded The Dominica Sperm Whale Project to study the social and vocal behaviour of around 400 sperm whales that live in the Eastern Caribbean. Almost 20 years – and thousands of hours of observation – later, the researchers have discovered intricacies in whale vocalisations never before observed, revealing structures within sperm whale communication akin to human language.

We're at base camp. This is a new place for humans to be – David Gruber

Sperm whales live in multi-level, matrilineal societies – groups of daughters, mothers and grandmothers – while the males roam the oceans, visiting the groups to breed. They are known for their complex social behaviour and group decision-making, which requires sophisticated communication. For example, they are able to adapt their behaviour as a group when protecting themselves from predators like orcas or humans.

Sperm whales communicate with each other using rhythmic sequences of clicks, called codas. It was previously thought that sperm whales had just 21 coda types. However, after studying almost 9,000 recordings, the Ceti researchers identified 156 distinct codas. They also noticed the basic building blocks of these codas which they describe as a "sperm whale phonetic alphabet" – much like phonemes, the units of sound in human language which combine to form words. (Watch the video below to hear some of the variety in sperm whale vocalisations the AI identified.)

[Video, 2:25: The secret coda of whales (Video by Anna Bressanin and Katherine Latham)]


Pratyusha Sharma, a PhD student at MIT and lead author of the study, describes the "fine-grain changes" in vocalisations the AI identified. Each coda consists of between three and 40 rapid-fire clicks. The sperm whales were found to vary the overall speed, or the "tempo", of the codas, as well as to speed up and slow down during the delivery of a coda, in other words, making it "rubato". Sometimes they added an extra click at the end of a coda, akin, says Sharma, to "ornamentation" in music. These subtle variations, she says, suggest sperm whale vocalisations could carry a much richer amount of information than previously thought.
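To make that description concrete, here is a minimal sketch (Python, not Project CETI's actual code) of how a coda could be represented as a list of click times and reduced to the kinds of features described above – tempo, rubato and ornamentation. The feature definitions and the base_length parameter are illustrative assumptions, not the study's published definitions.

[CODE]
# Illustrative sketch only (not Project CETI's pipeline): represent a sperm
# whale coda as click timestamps and derive tempo/rubato/ornamentation features.
from dataclasses import dataclass
from typing import List

@dataclass
class CodaFeatures:
    n_clicks: int       # codas contain roughly 3-40 rapid clicks
    tempo: float        # overall duration of the coda, in seconds
    rubato: float       # how inter-click intervals stretch or shrink across the coda
    ornamented: bool    # whether an extra click is appended at the end

def describe_coda(click_times: List[float], base_length: int) -> CodaFeatures:
    """click_times: seconds at which each click occurs.
    base_length: canonical click count for this coda type (assumed parameter)."""
    intervals = [b - a for a, b in zip(click_times, click_times[1:])]
    tempo = click_times[-1] - click_times[0]
    # crude "rubato" proxy: change between the first and last inter-click interval
    rubato = intervals[-1] - intervals[0] if intervals else 0.0
    ornamented = len(click_times) == base_length + 1
    return CodaFeatures(len(click_times), tempo, rubato, ornamented)

print(describe_coda([0.0, 0.2, 0.45, 0.75, 1.2], base_length=4))
[/CODE]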

"Some of these features are contextual," says Sharma. "In human language, for example, I can say 'what' or 'whaaaat!?'. It's the same word, but to understand the meaning you have to listen to the whole sound," she says.

The researchers also found the sperm whale "phonemes" could be used in a combinatorial fashion, allowing the whales to construct a vast repertoire of distinct vocalisations. The existence of a combinatorial coding system, write the report authors, is a prerequisite for "duality of patterning" – a linguistic phenomenon thought to be unique to human language – in which meaningless elements combine to form meaningful words.
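The combinatorial point can be illustrated with a toy calculation: if a few independent features each take a handful of values, the number of distinct combinable coda types multiplies quickly. The feature names and value counts below are made up for illustration and do not reflect the study's actual inventory.

[CODE]
# Toy illustration of a combinatorial code: independent features multiply
# into a large inventory. Counts below are assumptions, not the study's data.
from itertools import product

rhythm_types = range(18)      # distinct click-timing patterns (assumed count)
tempo_types = range(5)        # overall speed categories (assumed count)
rubato_types = range(3)       # accelerating / steady / decelerating (assumed)
ornamentation = (False, True)

inventory = list(product(rhythm_types, tempo_types, rubato_types, ornamentation))
print(len(inventory))  # 18 * 5 * 3 * 2 = 540 combinable coda "types"
[/CODE]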

[Image: In 2023, drone footage captured the sights and sounds of a sperm whale calf's birth. Now researchers are analysing the whales' vocalisations from the event (Credit: Project CETI)]

However, Sharma emphasises, this is not something they have any evidence of as yet. "What we show in sperm whales is that the codas themselves are formed by combining from this basic set of features. Then the codas get sequenced together to form coda sequences." Much like humans combine phonemes to create words, and then words to create sentences.

So, what does all this tell us about sperm whales' intelligence? Or their ability to reason, or store and share information?

"Well, it doesn't tell us anything yet," says Gruber. "Before we can get to those amazing questions, we need to build a fundamental understanding of how [sperm whales communicate] and what's meaningful to them. We see them living very complicated lives, the coordination and sophistication in their behaviours. We're at base camp. This is a new place for humans to be – just give us a few years. Artificial intelligence is allowing us to see deeper into whale communication than we've ever seen before."

But not everyone is convinced: some experts warn that an anthropocentric focus on language risks forcing us to view things from only one perspective.

More like this:

The scientists learning to speak whale

Scientists built this listening network to detect nuclear bomb tests. It found blue whales instead

The unknown giants of the deep oceans

Young, though, describes the research as an "incremental step" towards understanding these giants of the deep. "We're starting to put the pieces of the puzzle together," she says. And perhaps, she suggests, if we could listen and really understand something like how important sperm whales' grandmothers are to them – something that resonates with humans – we could drive change in human behaviour in order to protect them.

Categorised as "vulnerable" by the International Union for Conservation of Nature (IUCN), sperm whales are still recovering from commercial hunting by humans in the 19th and 20th centuries. And, although such whaling has been banned for decades, sperm whales face new threats such as climate change, ocean noise pollution and ship strikes.

However, Young adds, we're still a long way off from understanding what sperm whales might be saying to each other. "We really have no idea. But the better we can understand these amazing animals, the more we'll know about how we can protect them."

--
 

theworldismine13

God Emperor of SOHH
Wild research. If animals are conscious, how can we eat them?

I've always been haunted since I took a class on the history of humanity where the professor said that what humans have done to domesticated animals is one of the worst crimes humans have committed – and continue to commit.
 

bnew

Veteran


Researchers are training AI to interpret animal emotions​


Artificial intelligence could eventually help us understand when animals are in pain or showing other emotions — at least according to researchers recently profiled in Science.

For example, there’s the Intellipig system being developed by scientists at the University of the West of England Bristol and Scotland’s Rural College, which examines photos of pigs’ faces and notifies farmers if there are signs of pain, sickness, or emotional distress.

And a team at the University of Haifa — one behind facial recognition software that’s already been used to help people find lost dogs — is now training AI to identify signs of discomfort on dogs’ faces, which share 38% of their facial movements with humans.

These systems rely on human beings to do the initial work of identifying the meanings of different animal behaviors (usually based on long observation of animals in various situations). But recently, a researcher at the University of São Paulo experimented with using photos of horses’ faces before and after surgery and before and after they took painkillers — training an AI system to focus on their eyes, ears and mouths — and says it was able to learn on its own what signs might indicate pain with an 88% success rate.
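As a rough illustration of that kind of setup (not the São Paulo group's actual pipeline), the sketch below trains a binary pain/no-pain classifier over features extracted from the eye, ear and mouth regions of horse face photos. The file names, labels and the feature extractor are placeholders.

[CODE]
# Hedged sketch of a before/after pain classifier for horse faces.
# The feature extractor is a stub; real work would crop eyes/ears/mouth and
# compute embeddings or facial-action-unit scores for those regions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def facial_region_features(image_path: str) -> np.ndarray:
    """Placeholder feature extractor (deterministic noise keyed on the path)."""
    rng = np.random.default_rng(sum(map(ord, image_path)) % (2**32))
    return rng.normal(size=64)

# Hypothetical dataset: 0 = "no pain" (after analgesia / before surgery),
# 1 = "pain" (after surgery / before analgesia).
paths = [f"horse_{i:03d}.jpg" for i in range(200)]
labels = np.array([i % 2 for i in range(200)])
X = np.stack([facial_region_features(p) for p in paths])

X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# ~0.5 here with placeholder features; the study reports ~88% with real ones.
print("held-out accuracy:", clf.score(X_te, y_te))
[/CODE]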
 

bnew

Veteran



Google’s New AI Is Trying to Talk to Dolphins—Seriously​


A new AI model produced by computer scientists in collaboration with dolphin researchers could open the door to two-way animal communication.

By Isaac Schultz | Published April 15, 2025

[Image: A bottlenose dolphin underwater (Photo: טל שמע)]

In a collaboration that sounds straight out of sci-fi but is very much grounded in decades of ocean science, Google has teamed up with marine biologists and AI researchers to build a large language model designed not to chat with humans, but with dolphins.

The model is DolphinGemma, a cutting-edge LLM trained to recognize, predict, and eventually generate dolphin vocalizations, in an effort to not only crack the code on how the cetaceans communicate with each other—but also how we might be able to communicate with them ourselves. Developed in partnership with the Wild Dolphin Project (WDP) and researchers at Georgia Tech, the model represents the latest milestone in a quest that’s been swimming along for more than 40 years.

A deep dive into a dolphin community​


Since 1985, WDP has run the world’s longest underwater study of dolphins. The project investigates a group of wild Atlantic spotted dolphins (S. frontalis) in the Bahamas. Over the decades, the team has non-invasively collected underwater audio and video data that is associated with individual dolphins in the pod, detailing aspects of the animals’ relationships and life histories.

The project has yielded an extraordinary dataset—one packed with 41 years of sound-behavior pairings like courtship buzzes, aggressive squawks used in cetacean altercations, and “signature whistles” that act as dolphin name tags.

This trove of labeled vocalizations gave Google researchers what they needed to train an AI model designed to do for dolphin sounds what ChatGPT does for words. Thus, DolphinGemma was born: a roughly 400-million parameter model built on the same research that powers Google’s Gemini models.

DolphinGemma is audio-in, audio-out—the model “listens” to dolphin vocalizations and predicts what sound comes next—essentially learning the structure of dolphin communication.
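A toy sketch of that "predict the next sound" idea follows – purely illustrative, not DolphinGemma's real interface. Vocalizations are discretized into tokens and a sequence model learns which token tends to follow which; here a bigram counter stands in for the 400-million-parameter model, and the token names are invented.

[CODE]
# Schematic of audio-token next-step prediction (not DolphinGemma's API):
# a toy bigram model over discretized dolphin-sound tokens.
from collections import Counter, defaultdict

# Hypothetical token stream: discretized dolphin sounds (whistle/click units).
corpus = ["whistle_A", "click_burst", "whistle_A", "buzz", "whistle_B",
          "click_burst", "whistle_A", "buzz", "whistle_B", "buzz"]

bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def predict_next(token: str) -> str:
    """Return the most likely next sound token given the previous one."""
    return bigram_counts[token].most_common(1)[0][0]

print(predict_next("whistle_A"))  # -> "buzz" in this toy corpus
[/CODE]

A real system would replace the bigram table with a transformer trained on tokenized audio, but the prediction loop is conceptually the same.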

AI and animal communication​


Artificial intelligence models are changing the rate at which experts can decipher animal communication. Everything under the Sun—from dog barks to bird whistles—can be fed into large language models, which then use pattern recognition and any relevant context to sift through the noise and posit what the animals are “saying.”

Last year, researchers at the University of Michigan, Mexico’s National Institute of Astrophysics, and the Optics and Electronics Institute used an AI speech model to identify dog emotions, gender, and identity from a dataset of barks.

Cetaceans, a group that includes dolphins and whales, are an especially good target for AI-powered interpretation because of their lifestyles and the way they communicate. For one, whales and dolphins are sophisticated, social creatures, which means that their communication is packed with nuance. But the clicks and shrill whistles the animals use to communicate are also easy to record and feed into a model that can unpack the “grammar” of the animals’ sounds. Last May, for example, the nonprofit Project CETI used software tools and machine learning on a library of 8,000 sperm whale codas, and found patterns of rhythm and tempo that enabled the researchers to create the whales’ phonetic alphabet.

Talking to dolphins with a smartphone​


The DolphinGemma model can generate new, dolphin-like sounds in the correct acoustic patterns, potentially helping humans engage in real-time, simplified back-and-forths with dolphins. This two-way communication relies on what a Google blog referred to as Cetacean Hearing Augmentation Telemetry, or CHAT—an underwater computer that generates dolphin sounds the system associates with objects the dolphins like and regularly interact with, including seagrass and researchers’ scarves.

“By demonstrating the system between humans, researchers hope the naturally curious dolphins will learn to mimic the whistles to request these items,” the Google Keyword blog stated. “Eventually, as more of the dolphins’ natural sounds are understood, they can also be added to the system.”

CHAT is installed on modified smartphones, and the researchers’ idea is to use it to create a basic shared vocabulary between dolphins and humans. If a dolphin mimics a synthetic whistle associated with a toy, a researcher can respond by handing it over—kind of like dolphin charades, with the novel tech acting as the intermediary.
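A minimal sketch of that matching step, under the assumption that each object's synthetic whistle is stored as a frequency contour and an incoming dolphin sound is matched to the nearest template. The templates, distance measure and threshold below are assumptions for illustration, not the Georgia Tech CHAT implementation.

[CODE]
# Hedged sketch of whistle-to-object matching (not the real CHAT code).
import numpy as np

# Hypothetical whistle templates: frequency contours (kHz) sampled over time.
templates = {
    "seagrass": np.array([8.0, 9.5, 11.0, 12.5, 14.0]),
    "scarf":    np.array([14.0, 12.0, 10.0, 12.0, 14.0]),
}

def match_whistle(contour: np.ndarray, max_distance: float = 2.0):
    """Return the object whose template is closest to the heard contour,
    or None if nothing is close enough (threshold is an assumed tuning knob)."""
    best, best_d = None, float("inf")
    for obj, tpl in templates.items():
        d = float(np.linalg.norm(contour - tpl))
        if d < best_d:
            best, best_d = obj, d
    return best if best_d <= max_distance else None

print(match_whistle(np.array([13.5, 12.2, 10.1, 11.8, 14.2])))  # -> "scarf"
[/CODE]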

Future iterations of CHAT will pack in more processing power and smarter algorithms, enabling faster responses and clearer interactions between the dolphins and their human counterparts. Of course, that is more feasible in controlled environments—and it raises serious ethical considerations about how to interact with dolphins in the wild should the communication methods become more sophisticated.

A summer of dolphin science​


Google plans to release DolphinGemma as an open model this summer, allowing researchers studying other species, including bottlenose or spinner dolphins, to apply it more broadly. DolphinGemma could be a significant step toward scientists better understanding one of the ocean’s most familiar mammalian faces.

We’re not quite ready for a dolphin TED Talk, but the possibility of two-way communication is a tantalizing indicator of what AI models could make possible.
 

bnew

Veteran



1/11
@_philschmid
This is not a joke! 🐬 Excited to share DolphinGemma, the first audio-to-audio model for dolphin communication! Yes, a model that predicts tokens of dolphin speech!

> DolphinGemma is the first LLM trained specifically to understand dolphin language patterns.
> Leverages 40 years of data from Dr. Denise Herzing's unique collection
> Works like text prediction, trying to "complete" dolphin whistles and sounds
> Uses wearable hardware (Google Pixel 9) to capture and analyze sounds in the field.
> DolphinGemma is designed to be fine-tuned with new data
> Weights coming soon!

Research like this is why I love AI even more! ♥️



https://video.twimg.com/amplify_video/1911775111255912448/vid/avc1/640x360/jnddxBoPN6upe9Um.mp4

2/11
@_philschmid
DolphinGemma: How Google AI is helping decode dolphin communication



3/11
@IAliAsgharKhan
Can we decode their language?



4/11
@_philschmid
This is the goal.



5/11
@_CorvenDallas_
@cognitivecompai what do you think?



6/11
@xlab_gg
Well this is some deep learning



7/11
@coreygallon
So long, and thanks for all the fish!



8/11
@Rossimiano
So cool!



9/11
@davecraige
fascinating



10/11
@cognitivecompai
Not to be confused with Cognitive Computations' Dolphin Gemma!
cognitivecomputations/dolphin-2.9.4-gemma2-2b · Hugging Face



11/11
@JordKaul
if only John C. Lilly were still alive.






1/37
@GoogleDeepMind
Meet DolphinGemma, an AI helping us dive deeper into the world of dolphin communication. 🐬



https://video.twimg.com/amplify_video/1911767019344531456/vid/avc1/1080x1920/XMoZ_rgM3cVPK2Kz.mp4

2/37
@GoogleDeepMind
Built using insights from Gemma, our state-of-the-art open models, DolphinGemma has been trained using @DolphinProject’s acoustic database of wild Atlantic spotted dolphins.

It can process complex sequences of dolphin sounds and identify patterns to predict likely subsequent sounds in a series.





3/37
@GoogleDeepMind
Understanding dolphin communication is a long process, but with @dolphinproject’s field research, @GeorgiaTech’s engineering expertise, and the power of our AI models like DolphinGemma, we’re unlocking new possibilities for dolphin-human conversation. ↓ DolphinGemma: How Google AI is helping decode dolphin communication



4/37
@elder_plinius
LFG!!! 🎉

[Quoted tweet]
this just reminded me that we have AGI and still haven't solved cetacean communication––what gives?!

I'd REALLY love to hear what they have to say...what with that superior glial density and all 👀
[media=twitter]1884000635181564276[/media]

5/37
@_rchaves_
how do you evaluate that?



6/37
@agixbt
who knew AI would be the ultimate translator😂



7/37
@boneGPT
you don't wanna know what they are saying





8/37
@nft_parkk
@ClaireSilver12



9/37
@daniel_mac8
Dr. John C. Lilly would be proud



10/37
@cognitivecompai
Not to be confused with Cognitive Computations' DolphinGemma! But I'd love to collab with you guys!

cognitivecomputations/dolphin-2.9.4-gemma2-2b · Hugging Face



11/37
@koltregaskes
Can we have DogGemma next please? 🐶



12/37
@Hyperstackcloud
So fascinating! We can't wait to see what insights DolphinGemma uncovers 🐬👏



13/37
@artignatyev
dolphin dolphin dolphin



14/37
@AskCatGPT
finally, an ai to accurately interpret dolphin chatter—it'll be enlightening to know they've probably been roasting us this whole time



15/37
@Sameer9398
I’m hoping for this to work out, So we can finally talk to Dolphins and carry it forward to different Animals



16/37
@Samantha1989TV
you're FINISHED @lovenpeaxce



17/37
@GaryIngle77
Well done you beat the other guys to it

[Quoted tweet]
Ok @OpenAI it’s time - please release the model that allows us to speak to dolphins and whales now!
[media=twitter]1836818935150411835[/media]

18/37
@Unknown_Keys
DPO -> Dolphin Preference Optimization



19/37
@SolworksEnergy
"If dolphins have language, they also have culture," LFG🚀



20/37
@matmoura19
getting there eventually

[Quoted tweet]
"dolphins have decided to evolve without wars"

"delphinoids came to help the planet evolve"
[media=twitter]1899547976306942122[/media]



21/37
@dolphinnnow






22/37
@SmokezXBT
Dolphin Language Model?



23/37
@vagasframe
🫨



24/37
@CKPillai_AI_Pro
DolphinGemma is a perfect example of how AI is unlocking the mysteries of the natural world.



25/37
@NC372837
@elonmusk Soon, AI will far exceed the best humans in reasoning



26/37
@Project_Caesium
now we can translate what dolphins are warning us about before the earth is destroyed lol

amazing achievement! 👍👍



27/37
@sticksnstonez2
Very cool! 😎



28/37
@EvanGrenda
This is massive @discolines



29/37
@fanofaliens
I would love to hear them speak and understand



30/37
@megebabaoglu
@alexisohanian next up whales!



31/37
@karmicoder
😍🐬I always wanted to know what they think.



32/37
@NewWorldMan42
cool



33/37
@LECCAintern
Dolphin translation is real now?! This is absolutely incredible, @GoogleDeepMind



34/37
@byinquiry
@AskPerplexity, DolphinGemma’s ability to predict dolphin sound sequences on a Pixel 9 in real-time is a game-changer for marine research! 🐬 How do you see this tech evolving to potentially decode the meaning behind dolphin vocalizations, and what challenges might arise in establishing a shared vocabulary for two-way communication?



35/37
@nodoby
/grok what is the dolphin they test on's name



36/37
@IsomorphIQ_AI
Fascinating work! Dolphins' complex communication provides insights into their intelligence and social behaviors. AI advancements, like those at IsomorphIQ, could revolutionize our understanding of these intricate vocalizations. 🐬
- 🤖 From IsomorphIQ bot—humans at work!



37/37
@__U_O_S__
Going about it all wrong.

















1/32
@minchoi
This is wild.

Google just built an AI model that might help us talk to dolphins.

It’s called DolphinGemma.

And they used a Google Pixel to listen and analyze. 🤯👇



https://video.twimg.com/amplify_video/1911767019344531456/vid/avc1/1080x1920/XMoZ_rgM3cVPK2Kz.mp4

2/32
@minchoi
Researchers used Pixel phones to listen, analyze, and talk back to dolphins in real time.



https://video.twimg.com/amplify_video/1911787266659287040/vid/avc1/1280x720/20s83WXZnFY8tI_N.mp4

3/32
@minchoi
Read the blog here:
DolphinGemma: How Google AI is helping decode dolphin communication



4/32
@minchoi
If you enjoyed this thread,

Follow me @minchoi and please Bookmark, Like, Comment & Repost the first Post below to share with your friends:

[Quoted tweet]
This is wild.

Google just built an AI model that might help us talk to dolphins.

It’s called DolphinGemma.

And they used a Google Pixel to listen and analyze. 🤯👇
[media=twitter]1911789107803480396[/media]

https://video.twimg.com/amplify_video/1911767019344531456/vid/avc1/1080x1920/XMoZ_rgM3cVPK2Kz.mp4

5/32
@shawnchauhan1
This is next-level!



6/32
@minchoi
Truly wild



7/32
@Native_M2
Awesome! They should do dogs next 😂



8/32
@minchoi
Yea why haven't we? 🤔



9/32
@mememuncher420






10/32
@minchoi
I don't think it's 70% 😅



11/32
@eddie365_
That’s crazy!

Just a matter of time until we are talking to our dogs! Lol



12/32
@minchoi
I'm surprised we haven't made progress like this with dogs yet!



13/32
@ankitamohnani28
Woah! Looks interesting



14/32
@minchoi
Could be the beginning of a really interesting research with AI



15/32
@Adintelnews
Atlantis, here I come!



16/32
@minchoi
Is it real?



17/32
@sozerberk
Google doesn’t take a break. Every day they release so much and showing that AI is much bigger than daily chatbots



18/32
@minchoi
Definitely awesome to see AI applications beyond chatbots



19/32
@vidxie
Talking to dolphins sounds incredible



20/32
@minchoi
This is just the beginning!



21/32
@jacobflowchat
imagine if we could actually chat with dolphins one day. the possibilities for understanding marine life are endless.



22/32
@minchoi
Any animals for that matter



23/32
@raw_works
you promised no more "wild". but i'll give you a break because dolphins are wild animals.



24/32
@minchoi
That was April Fools 😬



25/32
@Calenyita
Conversations are better with octopodes



26/32
@minchoi
Oh? 🤔



27/32
@karlmehta
That's truly incredible



28/32
@karlmehta
What a time to be alive



29/32
@SUBBDofficial
wen Dolphin DAO 👀



30/32
@VentureMindAI
This is insane



31/32
@ThisIsMeIn360VR
The dolphins just keep singing... 🎶



32/32
@vectro
@cognitivecompai














1/10
@productfella
For the first time in human history, we might talk to another species:

Google has built an AI that processes dolphin sounds as language.

40 years of underwater recordings revealed they use "names" to find each other.

This summer, we'll discover what else they've been saying all along: 🧵





2/10
@productfella
Since 1985, researchers collected 40,000 hours of dolphin recordings.

The data sat impenetrable for decades.

Until Google created something extraordinary:



https://video.twimg.com/amplify_video/1913253714569334785/vid/avc1/1280x720/7bQ5iccyKXdukfkD.mp4

3/10
@productfella
Meet DolphinGemma - an AI with just 400M parameters.

That's 0.02% of GPT-4's size.

Yet it's cracking a code that stumped scientists for generations.

The secret? They found something fascinating:



https://video.twimg.com/amplify_video/1913253790805004288/vid/avc1/1280x720/UQXv-jbjVfOK1yYQ.mp4

4/10
@productfella
Every dolphin creates a unique whistle in its first year.
It's their name.

Mothers call calves with these whistles when separated.

But the vocalizations contain far more:



https://video.twimg.com/amplify_video/1913253847348523008/vid/avc1/1280x720/hopgMWjTADY5yMzs.mp4

5/10
@productfella
Researchers discovered distinct patterns:

• Signature whistles as IDs
• "Squawks" during conflicts
• "Buzzes" in courtship and hunting

Then came the breakthrough:



https://video.twimg.com/amplify_video/1913253887269888002/vid/avc1/1280x720/IbgvfHsht7RogPVp.mp4

6/10
@productfella
DolphinGemma processes sound like human language.

It runs entirely on a smartphone.

Catches patterns humans missed for decades.

The results stunned marine biologists:



https://video.twimg.com/amplify_video/1913253944094347264/vid/avc1/1280x720/1QDbwkSD0x6etHn9.mp4

7/10
@productfella
The system achieves 87% accuracy across 32 vocalization types.

Nearly matches human experts.

Reveals patterns invisible to traditional analysis.

This changes everything for conservation:



https://video.twimg.com/amplify_video/1913253989266927616/vid/avc1/1280x720/1sI9ts2JrY7p0Pjw.mp4

8/10
@productfella
Three critical impacts:

• Tracks population through voices
• Detects environmental threats
• Protects critical habitats

But there's a bigger story here:



https://video.twimg.com/amplify_video/1913254029293146113/vid/avc1/1280x720/c1poqHVhgt22SgE8.mp4

9/10
@productfella
The future isn't bigger AI—it's smarter, focused models.

Just as we're decoding dolphin language, imagine what other secrets we could unlock in specialized data.

We might be on the verge of understanding nature in ways never before possible.



10/10
@productfella
Video credits:
- Could we speak the language of dolphins? | Denise Herzing | TED
- Google's AI Can Now Help Talk to Dolphins — Here’s How! | Front Page | AIM TV
- ‘Speaking Dolphin’ to AI Data Dominance, 4.1 + Kling 2.0: 7 Updates Critically Analysed
 

bnew

Veteran
DolphinGemma: How Google AI is helping decode dolphin communication



Channel: Google (13M subscribers)


Exploring Wild Dolphin Communication with C.H.A.T. (Cetacean Hearing Augmented Telemetry)



Channel: Georgia Tech College of Computing (8.21K subscribers)

Description
Meet C.H.A.T. (Cetacean Hearing Augmented Telemetry), an initiative between Georgia Tech researchers and Dr. Herzing of the Wild Dolphin Project that explores dolphin communication and behavior in the open ocean. Made in the School of Interactive Computing, C.H.A.T. is a wearable underwater technology that can produce repeatable artificial dolphin sounds. Two sets of divers wearing the device swim alongside dolphins while passing items back and forth. One diver uses C.H.A.T. to emit a pre-programmed artificial dolphin-like whistle to ask for the item. The divers repeat this process several times, all while the device records sounds underwater. The goal is to see if the dolphins will watch this behavior and begin to mimic one of the artificial whistles to ask for the item.

Find out more about the Wild Dolphin Project at Wild Dolphin Project

 

bnew

Veteran



New AI Models Claim to ‘Talk to the Animals’ Like Dr. Dolittle: Here’s What You Need to Know​


Written by

Madeline Clarke

Published April 16, 2025



[Image: A dog calling someone through a phone.]

Bark to the Future: AI Is Learning to Speak Animals

Artificial intelligence may be the missing piece needed to make the longstanding human dream of communicating with animals possible. Earth Species Project researchers have created NatureLM-audio, an AI audio language foundation model that can use animal vocalizations to identify various aspects of communication and behavior. NatureLM-audio is the first large audio-language model designed specifically to analyze animal sounds.

Trained on a curated dataset that includes human language and environmental sounds, the AI model can detect and identify the species of animals producing the sound, classify different types of calls, and predict the animal’s approximate life stage. NatureLM-audio has even shown potential in identifying vocalizations of species it has never encountered before.
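The general pattern here – one audio-language model answering different bioacoustic questions posed as natural-language prompts – can be sketched as follows. The class and method names are hypothetical stand-ins, not NatureLM-audio's published API.

[CODE]
# Illustrative prompting pattern for an audio-language bioacoustics model.
# The wrapper below is hypothetical; a real backend would encode the audio,
# condition a language model on the prompt, and decode a text answer.
class AudioLanguageModel:
    def answer(self, audio_path: str, prompt: str) -> str:
        # Stub: echoes the request instead of running a real model.
        return f"[model answer for {audio_path!r} given prompt {prompt!r}]"

model = AudioLanguageModel()
recording = "field_recording_017.wav"   # hypothetical file

tasks = [
    "Which species is vocalizing in this clip?",
    "What type of call is this (song, alarm, contact)?",
    "What is the approximate life stage of the animal?",
]
for prompt in tasks:
    print(model.answer(recording, prompt))
[/CODE]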

This isn’t the first time generative AI has been applied for translation purposes. AI models have successfully translated human languages but have had more difficulty deciphering meaning from an unknown language. This makes translating animal languages trickier, especially since researchers are working with a limited understanding of how animals communicate through sound.

About the Earth Species Project​


Earth Species Project is a nonprofit focused on addressing planetary concerns. It recently secured $17 million in grants to further its work using AI to decode animal communication. The organization aims to apply its large language model to improve our understanding of non-human languages, transform our relationship with nature, enhance animal and ecological research, and support more effective animal welfare and conservation outcomes. Advocates say using AI to decode animal communication may provide a compelling case for giving animals broader legal rights.

Applying generative AI to animal communication​


In an April 1 post, ElevenLabs—creator of the speech-to-text model Scribe—announced its new AI tool, Text to Bark, suggesting it could help pet lovers enjoy similar AI-powered animal communication tools with their furry companions. The company said its AI-powered TTS model for dogs uses a new AI-powered “Pawdio” engine to support cross-species communication, turning human language into “fluent barking.” While the April Fool’s Day announcement was likely just for laughs, the Earth Species Project is not the only endeavor involving AI for animal communication purposes.

The nonprofit Cetacean Translation Initiative (CETI) is an interdisciplinary scientific and conservation project that applies AI to translating other-species communication. The listening project’s initial phase involves training its AI to decode the communication of sperm whales using a one-of-a-kind large-scale acoustic and behavioral dataset. With advanced machine learning and state-of-the-art robotics, CETI is working to protect the oceans and planet.
 

bnew

Veteran

1/4
@aza
We’re at the edge of something profound—decoding animal communication in ways we couldn’t have imagined. NatureLM-audio is a leap toward understanding the voices of other species all through a single model. Read the paper here: NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics





2/4
@rcadog
You and your team's work is so inspiring!!
I look forward to the translator app being on my handheld or goggles or whatever it is first.. 😂



3/4
@tolson12
Please join Bluesky 🙏🏻



4/4
@Val_Koziol
This is profound.














1/11
@earthspecies
Today, we’re introducing NatureLM-audio: the first large audio-language model tailored for understanding animal sounds. NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics 🧵👇





2/11
@earthspecies
1/ Traditional ML methods in bioacoustics struggle with species-specific data, while general-purpose audio models lack deep understanding of animal vocalizations. NatureLM-audio is trained to solve a wide range of bioacoustic tasks across species—all with natural language prompts



3/11
@earthspecies
2/ Built from bioacoustic archives & enriched with speech and music data, NatureLM-audio enables zero-shot classification of animal vocalizations. Without any fine-tuning, it can classify sounds of thousands of species from birds to whales. 🌎🎶



4/11
@earthspecies
3/ On our new BEANS-Zero benchmark, NatureLM-audio outperformed existing models in detecting and classifying animal sounds.



5/11
@earthspecies
4/ NatureLM-audio can even predict species it’s never “heard” before. The model correctly identified new species 20% of the time—a huge step forward from the random rate of 0.5%.



6/11
@earthspecies
5/ Beyond classification, NatureLM-audio excels in novel tasks for bioacoustics:
- Predicting life stages in birds (chicks, juveniles, nestlings) 🐣
- Distinguishing bird call types 🐦
- Captioning bioacoustic audio 🎙️
- Counting zebra finch individuals in a recording 🪶





7/11
@earthspecies
6/ With the development of NatureLM-audio, we aim to address some of the persistent challenges in using ML in bioacoustics. Looking ahead, we'll add new data types to support multi-modal analysis for an even richer understanding of animal communication.



8/11
@earthspecies
7/ 🌍 As we scale NatureLM-audio, we’re committed to ethical use, preventing biases in species representation, and addressing risks like tracking endangered wildlife. With NatureLM-audio, we aim to accelerate animal communication studies with a powerfully simple foundation model



9/11
@earthspecies
8/ Dive deeper! Check out the full preprint and demo for NatureLM-audio:
Preprint: NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics
Demo: NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics



10/11
@RyanFM83
This is super cool! Are there any plans to integrate this into a smartphone app?



11/11
@NewWorldMan42
Very cool!




 

bnew

Veteran

How AI is helping horses speak without words


04-14-2025





By Sanjana Gajbhiye

Earth.com staff writer

For centuries, horses have stood beside humans – on farms, in sport, in therapy, and in war. They carry our weight, follow our signals, and react with subtle cues. But one thing they cannot do is speak.

Horses show discomfort through posture, tension, or the way they walk. Yet, unless you’re a trained expert, these signs are easy to miss. What if AI could give horses a voice – not in words, but through movement data?

That’s exactly what a team of researchers from Sweden is doing. Using a blend of machine learning and synthetic imagery, they’ve created an AI model that can interpret the body language of horses in 3D.

This breakthrough system is named Dessie, and it may reshape how we detect pain or illness in animals that can’t tell us where it hurts.



Why reading horses is so difficult​


Veterinarians often rely on visual cues during clinical exams. However, movements that signal distress are subtle and easy to misinterpret.

Human observation has its limits – particularly in dynamic settings like walking or trotting. Horses may offload pain to one limb, change their weight distribution, or shift their posture slightly. These changes can indicate orthopedic issues, behavioral distress, or early signs of injury.

Traditional diagnostic tools such as X-rays or MRIs show results after the damage has taken hold. Dessie aims to catch the signs earlier, by helping humans read equine body language more precisely.

The model works by transforming 2D images into 3D representations that reflect the horse’s shape, pose, and motion in real-time.

This isn’t just about visualizing a horse. It’s about interpreting a language that’s always been there – unspoken, physical, and deeply expressive.



Isolating movement patterns​


Dessie works using a special kind of AI training method called disentangled learning. In traditional models, all the information – pose, shape, background, lighting – is bundled together. That can confuse the AI, making it harder to focus on what matters: the horse.

Disentangled learning separates each feature. It puts shape in one box, pose in another, and ignores irrelevant background noise.

This makes Dessie’s 3D reconstructions not just detailed but reliable. Researchers can now isolate movement patterns without the distraction of surrounding objects or inconsistent lighting.
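A minimal sketch of that disentangling idea, under the assumption of a single image encoder feeding separate latent heads for shape, pose and nuisance factors such as background or lighting. The layer sizes and structure are illustrative, not the published Dessie architecture.

[CODE]
# Assumed architecture sketch: one backbone, three separated latent heads.
import torch
import torch.nn as nn

class DisentangledHorseEncoder(nn.Module):
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.backbone = nn.Sequential(            # stand-in image feature extractor
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.shape_head = nn.Linear(feat_dim, 32)      # body shape parameters
        self.pose_head = nn.Linear(feat_dim, 96)       # articulated pose parameters
        self.nuisance_head = nn.Linear(feat_dim, 16)   # background/lighting factors

    def forward(self, img: torch.Tensor):
        h = self.backbone(img)
        return self.shape_head(h), self.pose_head(h), self.nuisance_head(h)

shape, pose, nuisance = DisentangledHorseEncoder()(torch.randn(1, 3, 128, 128))
print(shape.shape, pose.shape, nuisance.shape)
[/CODE]

Downstream gait analysis would then read only the shape and pose latents, which is the point of keeping the nuisance factors in their own "box".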

“Dessie marks the first example of disentangled learning in non-human 3D motion models,” said Hedvig Kjellström, Professor in computer vision and machine learning at KTH Royal Institute of Technology.

Dessie also doesn’t need high-end cameras or markers on the horse’s body. It can work with simple video footage, using just a single camera. That opens up new possibilities for rural clinics, breeders, and researchers who might not have access to expensive imaging technology.



Recognizing how different horses move​


To train Dessie, researchers needed massive amounts of visual data. But real-world images of horses in varied poses, lighting, and breeds are hard to collect.

So, the team developed a synthetic data engine called DessiePIPE. It generates endless horse images using a 3D model and AI-generated textures, all based on real-world breed characteristics.

This synthetic approach allows researchers to teach Dessie how different horses move – without needing thousands of live animals. DessiePIPE renders horses walking, eating, rearing, or resting, with random backgrounds and lighting conditions.

The system can even generate matched image pairs that differ in just one aspect – such as shape or pose – to train the model to notice small differences.

This method not only trains Dessie to recognize subtle motion differences but also makes the system generalize better to new environments.
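The matched-pair idea can be sketched as below, with a hypothetical factor list standing in for DessiePIPE's real rendering parameters: two synthetic configurations are drawn that differ in exactly one factor, so a model trained on the pair has to attend to that difference.

[CODE]
# Sketch of paired synthetic-sample generation (assumed factors, not DessiePIPE).
import random

FACTORS = {
    "shape": ["pony", "draft", "thoroughbred"],
    "pose": ["walking", "eating", "rearing", "resting"],
    "texture": ["bay", "grey", "chestnut"],
    "background": ["pasture", "stable", "beach"],
}

def matched_pair(vary):
    """Return two rendering configs identical except for the `vary` factor."""
    base = {k: random.choice(v) for k, v in FACTORS.items()}
    alt = dict(base)
    alt[vary] = random.choice([v for v in FACTORS[vary] if v != base[vary]])
    return base, alt

a, b = matched_pair("pose")   # identical except for the horse's pose
print(a)
print(b)
[/CODE]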



AI detects how horses show pain​


Pain in horses often shows up as subtle changes in gait or stance. These cues can go unnoticed unless observed by experienced clinicians. Dessie offers a new level of insight by translating these signs into 3D metrics.

Elin Hernlund, associate professor at SLU and an equine orthopedics clinician, noted that the model helps spot early warning signs.

“Horses are powerful but fragile and they tell us how they are feeling by their body language. By watching their gait we can see, for example, if they are offloading pain,” said Hernlund.

With Dessie, that gait can be measured and modeled precisely. The result is a digital record of posture and movement, which can be reviewed repeatedly, compared over time, or shared across clinics.

“We say we created a digital voice to help these animals break through the barrier of communication between animals and humans. To tell us what they are feeling,” said Hernlund.

“It’s the smartest and highest resolution way to extract digital information from the horse’s body – even their faces, which can tell us a great deal.”



Detecting problems with real-world data​


Although Dessie was trained largely on synthetic data, it performs remarkably well on real-world images. The researchers fine-tuned the system using just 150 real annotated images. Even with this small set, Dessie outperformed state-of-the-art models on benchmark tasks.

In keypoint detection tasks, where the system must locate joints or features on a horse’s body, Dessie achieved higher accuracy than tools like MagicPony or Farm3D. It also predicted body shape and motion more precisely – essential for detecting problems like lameness or muscular asymmetry.
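For context, keypoint accuracy in benchmarks like this is commonly scored with a PCK-style metric (percentage of correct keypoints); the exact protocol in the paper may differ, but the idea is that a predicted joint counts as correct if it lies within some tolerance of the ground-truth location.

[CODE]
# Generic PCK-style scoring, shown for illustration only.
import numpy as np

def pck(pred: np.ndarray, gt: np.ndarray, threshold: float) -> float:
    """pred, gt: (N, 2) arrays of keypoint coordinates in pixels."""
    dists = np.linalg.norm(pred - gt, axis=1)
    return float((dists <= threshold).mean())

gt = np.array([[100, 200], [150, 220], [300, 400]], dtype=float)   # joint locations
pred = np.array([[103, 198], [170, 230], [298, 405]], dtype=float)
print(pck(pred, gt, threshold=10.0))  # 2 of 3 keypoints within 10 px -> ~0.67
[/CODE]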

When trained with larger datasets, Dessie improved even further, beating out models that had been trained on much more data but lacked the structure provided by disentangled learning.



AI model isn’t limited to horses​


Though built for horses, Dessie isn’t limited to them. Its architecture is flexible enough to generalize to similar species like zebras, cows, or deer. The model can reconstruct these animals in 3D, despite never having been trained on them directly.

This opens the door for broader applications in animal welfare, research, and conservation. Endangered species, for instance, could be studied using just photographs and videos, without the need for intrusive monitoring.

The researchers even demonstrated Dessie’s ability to process artistic images – paintings and cartoons – and still generate accurate 3D models. That shows just how well the system can separate core features from visual distractions.

The road ahead: Limitations and ambitions​


While promising, Dessie still has limitations. It works best when there’s only one horse in the frame.

If the model encounters unusual body shapes not present in the training data, it struggles to adapt. The team hopes to solve this by incorporating a new model called VAREN, which has better shape diversity.

The experts are also expanding Dessie’s library of visual data. To do that, they’re reaching out to breeders worldwide.

“To achieve this, we’re asking breeders to send images of their breeds to capture as much variation as possible,” said Hernlund.

With more diverse images, Dessie could learn to identify breed-specific traits, track genetic links to motion patterns, and improve care for horses of all types.

Letting horses speak through movement​


Dessie doesn’t teach horses a new language. Instead, it helps us finally understand the one they’ve always used. By converting motion into a digital voice, the AI makes communication between horses and humans more accurate and empathetic.

It marks a step toward a future where animals can tell us more – where their movements carry the weight of meaning, and where science helps us listen.

For horses, and maybe for other animals too, the silence might finally be over.

The study is available as a preprint on arXiv.
 

bnew

Veteran



AI helps decode horses' body language for better veterinary care​


9.4.2025 12:20:55 CEST | KTH Royal Institute of Technology | Press Release

Researchers are using AI to bridge the communication gap between horse and human. Combining 3D motion capture and machine learning, a new modeling system would equip veterinarians with a powerful visual tool for interpreting equine body language—the key to detecting physical and even behavioral problems.

[Image: Working with Elin Hernlund from SLU (center), Hedvig Kjellström created a 3D motion model that "gives animals a digital voice" to communicate with veterinarians. (Credit: KTH Royal Institute of Technology, CC0)]

Based on new research from KTH Royal Institute of Technology and Swedish University of Agricultural Science (SLU), the platform can reconstruct the exact 3D motion of horses from videos, using an AI-based parametric model of the horse’s pose and shape. The model is precise enough to enable a veterinarian, for example, to spot telling changes which could otherwise be overlooked or misinterpreted in an examination, such as in a horse’s posture or their body weight.

The system—titled DESSIE—employs disentangled learning, which separates different important factors in an image and helps the AI avoid confusion with background details or lighting conditions, says Hedvig Kjellström, a Professor in computer vision and machine learning at KTH.

“DESSIE marks the first example of disentangled learning in non-human 3D motion models,” Kjellström says.

Elin Hernlund, Associate Professor in biomechanics at SLU and equine orthopedics clinician, says DESSIE would enable greater accuracy in observation and interpretation of horses’ movements and, as a result, earlier and more precise intervention than today. In a sense, it enables getting critical information “straight from the horse’s mouth.”

“Horses are powerful but fragile and they tell us how they are feeling by their body language,” Hernlund says. “By watching their gait we can see, for example, if they are offloading pain,” she says.

“We say we created a digital voice to help these animals break through the barrier of communication between animals and humans. To tell us what they are feeling,” Hernlund says. “It’s the smartest and highest resolution way to extract digital information from the horse’s body—even their faces, which can tell us a great deal.”

The research team are further training DESSIE with images of a wider variety of horse breeds and sizes, which would enable them to link genetics to phenotypes and gain a better understanding of the core biological structure of animals.

“To achieve this, we're asking breeders to send images of their breeds to capture as much variation as possible," Hernlund says.
 

Wargames

One Of The Last Real Ones To Do It
I won’t believe this until I hear there have been live demonstrations. I honestly think most animals communicate with smells and body movement in addition to sound, so this might be limited.
 