Experimental composer Holly Herndon built an AI voice clone that anyone can use

March 3, 2026

4 min read

This musician built an AI clone of her voice so everyone can sing like her

Experimental composer Holly Herndon says this technology is not here to replace artists and that the future of creativity belongs to collective intelligence

By Deni Ellis Béchard, edited by Eric Sullivan

Holly Herndon at the Serpentine North Gallery in London in October 2024.

Matthew Chattle/Future Editions via Getty Images

Holly Herndon hears the future of music in data. Herndon came to electronic music after singing in churches and choirs in East Tennessee. She earned a master’s degree from Mills College and a doctorate from the Center for Computer Research in Music and Acoustics at Stanford University.

When she began experimenting with machine learning in 2015, the results seemed “raucous,” but she remembers seeing “the diamond in the rough.” Today, those experiments have evolved into personalized models that allow anyone to sing like her.

Scientific American spoke to Herndon about training her AI models and her belief that creativity has always been collective and that AI simply makes it visible.

[An edited transcript of the interview follows.]

You describe your work as “protocol art”. What does this mean?

In the 20th century, the place of media generation – the paper and pen where music was written – was the artistic act. With protocol art, the creative act occurs before the media generation. It’s about creating a set of rules and conditions under which art is created.

We are really interested in training our own models. I always say “we” because I work with my partner, Mat Dryhurst. We treat each step of the model-making process as a moment of creative intervention. Creating the dataset is part of the artwork. I often write training music: music not necessarily for human ears but for a computer to learn something from.

Can you give me an example of what this looks like in practice?

We currently have an exhibition in Berlin. We were inspired by Hildegard von Bingen, a medieval composer. We wanted to pretend that polyphony had existed during her lifetime. We started with a model of her compositions and added sets of rules so that it could generate polyphony in her style. We took these outputs, rearranged them and gave them to human singers to perform. Then we created a huge installation in which artists sing and invite the public to train with us.

It’s not about saying “write me a pop song with a guitar”. It’s about using this technology to bring humans together to create art in real space.

Most commercial AI models are trained on data scraped from the Internet. Why do you insist on building your own models?

As an electronic musician, I’ve never been one to sample: I’ve always created my own sonic palettes. When we started, before Suno and before all this, we had to create our own dataset. It felt natural to me, like creating my own samples or digital instruments.

One critique of products [like Suno] is that they have a very “average” sound, as if trained on everything, or on the most average material. My models sound unique because I create the training data myself. I also think there are incentives under the hood in Suno that limit it to three-minute songs with a verse-chorus structure. There are guardrails that make the whole thing annoying. I would like them to loosen certain constraints.

Has a model ever surprised you?

We made a project called Holly+ around 2021, a clone of my particular voice. We worked with Voctro Labs to train a voice model that works in real time so people can sing using my voice. This was a game changer.

If it works in real time, other people can perform an identity in real time. When we tested it, my partner, who is British, was singing through it. I heard my voice with a British accent. It was so strange I had to leave the room: he was singing as me. It was one of the biggest realizations of how weird and cool this stuff can get.

I think it will take five to ten years for everything to be seamless. But once we can transform our bodies in real time, imagine creating a whale voice model and then a hybrid soprano-whale. When you sing high, it becomes opera; when you sing low, you’re more like a whale or Barry White. We are no longer tied to our own larynxes.

Where do you think we will be in 10 years?

A lot of the fear around this technology actually has to do with the way the Internet works today: the attention economy, the difficulty of making a living as a creator. My partner always says, “Scrolling is for robots and walking is for humans.”

Our most optimistic vision is to use agents to handle all the bullshit and filter things out, bringing us together in the real world. That’s why our projects involve people meeting IRL and doing things together. Some of my smartest developer friends code with multiple agents while cooking or hiking with their toddler. Things could be really beautiful if we imagined and built them that way.

Does this technology change your definition of creativity?

This whole AI thing might force us to consider ourselves as perhaps not the only creative actors in the universe. It doesn’t have to be scary, it could be beautiful and liberating.

Creativity happens in swarms, in community. AI is just collective intelligence: aggregated human intelligence. The artistic model of the 20th century is tied to an individual genius who touches an object and gives it value. This is the reverse of that. I am on team collective intelligence.
