Opinion | ‘Artificial Intelligence’? No, Collective Intelligence. (2024)

ezra klein

From New York Times Opinion, this is “The Ezra Klein Show.”

[MUSIC PLAYING]

We are already awash in crappy A.I. content. Some of it is crappy commercial A.I. content that wants to sell you things. Some of it is crappy A.I. art. And it got me interested amidst all this complaining. What does it mean right now to be making good A.I. art? And so I read this profile of the A.I. artist and musician Holly Herndon in “The New Yorker,” and then separately, this DJ I met mentioned her work to me. So I thought I should check this out.

And so I went and listened to her 2019 album, “PROTO,” which was done alongside an A.I. voice trained on her voice and others. And I was walking to work when the song “Fear, Uncertainty, Doubt” came on.

[HOLLY HERNDON, “FEAR, UNCERTAINTY, DOUBT”]

And I just stopped walking.

[HOLLY HERNDON, “FEAR, UNCERTAINTY, DOUBT”]

What makes so much A.I. art so bad, in my opinion, is that it’s so generic. These are generative systems. We keep calling them generative. But generative is so — when we use that term, it usually means it helped you get somewhere new. But these systems are mimics. They help you go somewhere old. They can help us write or draw or compose like anyone else. But I find it much harder when using them to become more like yourself. And most of what I see coming out of people using them, it’s all riffing on others in this very obvious way.

What I like about Herndon’s art is she uses A.I. to become weirder, stranger, more uncanny, more personal. It’s going in the exact opposite direction. And some of her art questions the entire way these systems work. She and her partner, Mat Dryhurst, did this project at the Whitney Biennial this year, where they created an image generator based on images of Herndon, or at least what the A.I. system seemed to think she looked like, which is she’s got this very striking copper hair. And so the way it understood her was really around this striking copper hair. She is, as she put it, a haircut.

And so they manipulated these images and they made this A.I. system where anybody can generate any image in the style of what A.I. systems think Holly Herndon is. So you can generate an image of a house, and it’ll have this long flowing copper hair. And it’ll tag itself as an image of Holly Herndon. And because it’s on the Whitney Biennial, these images have a certain authority in the way these A.I. scrapers work.

And so as they are scraping the internet for images in the future, she is potentially poisoning their idea of what she is. She is taking control over the A.I.’s idea of Holly Herndon. I find that fascinating, A.I. art that is acting as a kind of sabotage of A.I. systems and the lack of voice we have in how we appear in them. Along with a bunch of collaborators, Herndon has a lot of projects trying to blaze a trail and do not just good A.I. art, but fair economics and ethics. And so I wanted to have her on the show to talk about it. As always, my email, ezrakleinshow@nytimes.com. [MUSIC PLAYING]

Holly Herndon, welcome to the show.

holly herndon

Thanks. It’s great to be here.

ezra klein

So something I find fascinating about you is that you grew up singing in church choirs. Then you moved to Berlin after college and got deep into Berlin techno. And I think those are respectively the most human and the most inhuman forms of music that human beings make. So how did they shape you?

holly herndon

Yeah, that’s a really good question. I mean, I feel like I’m such a product of the environments that I’ve spent a lot of time in, so I’m really interested in folk singing traditions coming from East Tennessee. Of course, growing up in a town next to where Dolly Parton is from, she always loomed large. Then I spent a lot of time in Berlin. And so of course, electronic music and techno has played a really big part of my story. And then also moving to the Bay Area, where I got really deeply interested in technology.

I feel like even though techno might sound and does kind of have a synthetic palette and does sound maybe inhuman, I feel like the rituals that happen around the music are very human and very sweaty and very embodied. So I think if you experience that culture in person, it feels less inhuman.

ezra klein

But why does that magic happen? So I was in Berlin, and I was down in the sort of big room in the bunker, I would call it, as sort of the way it felt to me. And I would say the music felt like being inside of a machine gun but in a good way. And meanwhile, as you say, what’s happening around it — I mean, it was actually the most inhuman music I’ve ever heard. And I like electronic music. But what’s happening around it is so human. I mean, all these people engaged in this most physical, sweaty, smelly ritual of dancing together. How do you understand both the meaning and the function of it? Why does music like that create that kind of transcendence?

holly herndon

I mean, this might sound strange, but music is a kind of coordination technology. So a 4/4 techno beat is maybe the most clear communication of that. It’s so easy to participate in. It’s fairly easy to make. It’s also fairly easy to dance to and understand. So I feel like as a kind of — if I want to call it a kind of protocol, it’s an easy way to communicate what to do in that scenario. So I think that that’s why people have organized around it so much.

ezra klein

When I go out and listen to the further reaches of techno, in Berlin, in New York, where I live, I’ll often find myself at some point in the night thinking every piece of sound in this music is a choice. And when that choice sounds very artificial, right, when it sounds like something so removed from somebody playing strings or somebody singing, I think this person wanted to communicate in this extraordinarily machine-like way.

And this has been happening for a long time, I mean, talk boxes and synthesizers and all of these technologies. And I’m curious, as somebody who’s made some of that music or is deeply, at least within the culture that has made it, what is appealing about that? I mean, you said it creates this very sweaty human ritual. But first, there is this transition of the person into something that does not sound like people. It sounds like music that robots might make. It sounds like music from a faraway culture.

holly herndon

Maybe there’s something about living in such a technologically mediated world that makes us want to find how we fit into that as humans. And music is such a kind of innate part of being a human. I mean, as a performer of the laptop, I was always trying to find a way to make the laptop feel really embodied, because at the time, when I started performing a lot, there was this criticism that, oh, you could be checking your email, or this doesn’t really feel like a lively performance. So I started using my voice as a kind of input stream.

[HOLLY HERNDON, “FADE”]

And the thing that I found really liberating about using my voice in that way is that I could kind of do anything to digitally manipulate my voice to make it be so much more than it is physically. But what I really enjoyed was using my voice as a kind of controller or data stream, and then it could do things that I couldn’t imagine once I put it in the laptop and was able to process it in specific ways.

So there’s something about trying to come to terms with the systems around us by working through them and working with them. Collaborating maybe helps us kind of understand where we sit in that feedback loop.

ezra klein

So in a minute, I want to play a clip of a piece of music you made. But first, I want to talk about how you made it. So tell me about Spawn.

holly herndon

Spawn, who was our A.I. baby experiment. “PROTO” was released in 2019, and Spawn came about two years before that. So at the time, it was a very different time, especially for audio. A lot of the visual models were developed earlier.

But eventually, things got better. We started playing with a project called SampleRNN and some other software. And you’ll hear still from the stems that we might play later the vocal quality, the sound quality from 2017, 2018. To me, it sounded like the really early recordings that you can find on YouTube. I think it’s like the earliest audio recording. It sounds really scratchy and super low fidelity. That’s what the audio sounded like back in the day.

And so it was this real issue of trying to get the high fidelity recordings that I was doing with my ensemble in the studio to live in the same universe as this really scratchy, low-fi audio that I was generating through Spawn.

ezra klein

Well, why don’t we play a bit of that? Because you kindly shared the stems for the song “Swim.” And maybe we should start here by playing the ensemble, the sort of chorus you brought together to sing for the album.

[VOCALIZATION]

So that’s really beautiful and really human. And now on the other side, I want to play the Spawn track on its own.

[MUSIC PLAYING]

What am I hearing when I hear that somewhat nightmarish Spawn there?

holly herndon

So Spawn was trained on the voices of the ensemble. And so back then, we couldn’t deal with polyphony, which means one — more than one note at a time. So what we had to do was break each line into an individual line, and then we would feed that line to Spawn, who would then sing it back through the voice of our ensemble. And I think we were feeding it through with either a voice synthesizer or a piano. I can’t remember. It’s been so long.

But so we basically use this idea, which is called timbre transfer. So that’s where the computer learns the logic of one sound and kind of superimposes that onto the performance of another. So that’s what we did. We had the ensemble sing a variety of phrases. We trained Spawn on their voices. And then we did a timbre transfer. We fed her the line that we wanted her to sing. And then she sang it back to us.

ezra klein

And I think hearing that, one question you could have is, well, what do you need Spawn for? Why not just have a human being sing into a talk box or use a synthesizer or Ableton? We can make people’s voices sound strange already, with Auto-Tune. What is the value of Spawn here?

holly herndon

I think overall Spawn has a unique timbral quality that I actually really love because it really is a snapshot in time. It doesn’t sound like that anymore. It sounds really clean and, yeah, really high fidelity. But at that period of time, I almost have this, like, romanticism around that, almost like a vinyl hiss or a pop for that very particular period of time in machine learning research.

But also, I felt like I really needed to be making my own models and dealing with the subject directly in order to have a really informed opinion about it. And I’m really glad that I made that decision, because it’s informed so much of the work that I do today. Just even the very basic understanding that a model’s output is so tied to the training data, the input, I don’t know that I would have come to the profundity of that had I not been training my own models. And that’s really informed all of the work that I’ve done since then. So I think sometimes you just have to deal with the technology in order to make informed work around that technology.

ezra klein

We’re going to come back to the profundity of that, because I actually think it’s really important. But I wanted to do two things before we do. One is to play a bit of the full song, “Swim,” so people can hear where this ended up.

[HOLLY HERNDON, “SWIM”]

And so then I want to play something you just released more recently using — I don’t know if you’d call this an updated Spawn, but we’re calling Holly+, which is this much more modern voice model trained on your voice that you had cover Dolly Parton’s “Jolene.”

[HOLLY HERNDON, “JOLENE”]

So obviously the unearthly quality is gone. What am I hearing? Who — what is singing?

holly herndon

So that is a voice model trained on my voice. I worked with some researchers in Barcelona in a studio called Voctro Labs at the time. And Holly+ was born. And as you can hear, it’s leaps and bounds better, more high — higher fidelity than little Spawn.

So basically, that version of Holly+, there are multiple versions. There’s a version that can be performed in real time, but this particular version is a score-reading piece of software. So I basically just write out a score with the text written out in phonemes. And then the software spits out basically pitch-perfect performance of that song. And of course, it helps to have Ryan Norris playing a beautiful human guitar accompaniment.

ezra klein

That’s a use case that I’m fascinated by, that I imagine will become more and more common in the future, which is a model trained on a person that one can sort of almost autonomously create as if it were that person. You can imagine somebody training a model on all of my podcasts. And then the model generates questions they could ask somebody, or a model generated all of my columns and you can spit out an Op-Ed.

What is your relationship with that? And do you see it as an extension of what you can do? Or do you see it as a kind of partner you can collaborate with? Or do you see it as just some version of you that makes you scale because you can’t take commissions to sing from everybody out in the public, but they can all go to Holly+ and get it to sing on their behalf? Like, what is your relationship with this nascent other you, or at least other voice of you, that now exists in the world?

holly herndon

I think I’m probably an outlier in my relationship here because my practice involved so much vocal processing. So if you listen to “Movement” —

[HOLLY HERNDON, “MOVEMENT”]

— or “Platform” —

[HOLLY HERNDON, “CHORUS”]

The albums before “PROTO,” before I started working with machine learning, I was already taking my voice and kind of mangling it beyond recognition, turning it into a machine itself. So for me, to make a model of my voice, that felt like the natural next step in an already very kind of highly mediated process with my voice. I don’t expect everyone to have that relationship. I don’t really see the Holly+ voice as something that replaces me in any way. It’s something that I have fun playing with. I can attempt to perform things that I wouldn’t normally be able to.

You know, I did a performance with Maria Arnal in Barcelona, and, I mean, that music is so difficult to perform. I could never sing that. She can do all of these amazing melismatic diva runs that I could never dream of, but my voice model could do it. And that was really fun. And it didn’t confuse me to think, OK, I can do that now. It was more just fun to hear myself do something that I know that I couldn’t do alone acoustically. So I guess for me, it’s maybe like an extension or an augmentation of my own self.

ezra klein

So what did Holly+ add to that cover of “Jolene”? I mean, you could have just sung a few tracks of harmony and added them above the melody. So what does A.I. mean to you in it specifically?

holly herndon

Well, I think that one is perhaps a little personal because growing up in East Tennessee, Dolly Parton was kind of the patron saint of that region. And the kind of music that I usually perform has very heavily processed vocals and is usually a bit more abstract than a Dolly Parton song. So it was almost like I wouldn’t afford myself that or allow myself that, but I would allow Holly+ to do it, because there was this kind of level of removal. It’s almost like Holly+ can perform things that I would be too bashful to perform myself.

ezra klein

Oh, that’s really interesting, the idea that having another version of yourself out there could give you license to try things you wouldn’t otherwise try.

holly herndon

Yeah, like “Jolene.” I mean, I love “Jolene” as a project, but it doesn’t have the same ghostliness and quality as the music on “PROTO,” which is why I didn’t release it as an album, you know. It’s just not as interesting, somehow.

ezra klein

I guess the other thing, there’s a question of meaning here that I’ve been circling in my own time playing around with A.I. I spent a bunch of time recently creating sort of A.I. friends and therapists. And, you know, trying to understand, like, the relational A.I.s that you can build now.

And on the one hand, I was amazed at technically how good a lot of them were. At the same time, I find I never end up coming back. I find it very hard to make the habit sticky or the relationship sticky.

When I sit with my friend or my partner, the fact that they are choosing to be there with me is separate from the things that they are saying. And an experience I’m having with a lot of A.I. projects is that the output is pretty good, right? Holly+ sings really well. Or the therapist friend I made on Kindroid texts in a way that if you had just shown me the text, I would not know it’s not a human being.

But the absence of there being the meaning of it that another person brings, the fact that I know it’s Holly+, like, it’s a cool project, but I’m not going to keep listening to it. The fact that I know the Kindroid can’t not show up to talk to me, that that’s a relationship I control totally. It robs the interaction of meaning in a way that makes it hard for me to keep coming back to it.

And so somebody who works a lot with the question of meaning and sees a lot of these A.I. efforts happening, how do you think about what imbues them with meaning, and in what cases they end up feeling hollow?

holly herndon

It’s really funny. We did a live performance from “PROTO,” I guess in 2019, in New York. And we had the ensemble on the stage. And afterwards someone came up to me and they said, “I really enjoyed the show, but I don’t understand what it has to do with A.I.”

And actually that was the biggest compliment that I could receive, because I wasn’t trying to project this kind of super future, you know, A.I., high-tech story.

I was trying to show all of the kind of human relationships and the human singing that goes into training these models. That’s something I was really trying to get to with that album. You know, some of the things that the computer can do, some of the coordination that it can do, is remarkable. But it can also free us up to just be more human together, to really just focus on the parts that we really want to focus on, which is just enjoying that moment of singing on stage together.

I’m also not so interested in necessarily having an A.I. therapist. That’s not what I find interesting or compelling about the space. I’m interested in exploring some of the weirdnesses in how we as a society define different things. That’s the kind of stuff that I’m interested in, not having a kind of like A.I. chat pet.

[MUSIC PLAYING]

ezra klein

I’ve heard you say that with A.I., it’s the model that’s the art, not necessarily the output of the model.

holly herndon

Yeah, that’s one thing that we’re exploring quite a bit. So one of the potentials around machine learning is that you’re not limited to just a single output. You can create a model of whether that’s my own singing voice or whether that’s my own image or likeness, and you can allow other people to explore the logic of that model and to prompt through your world.

So it’s almost kind of like inviting people into your subjectivity, or inviting someone into your video, the video game of your art. So I think it has a lot of potential to be interesting in a kind of collaborative way with your audience. One term that we’re often using is “protocol art,” basically understanding that any work that’s made is a kind of seed for infinite generations. So we’re trying to lean into that.

So, for example, if we make a sculpture, which we did in a project called “Ready Weight,” we also make it available as a package with an embedding and a LoRA and all the kind of tools that anyone would need to be able to explore that sculpture in latent space. Or, you know, when we made the model of my voice with Holly+, we made that publicly available so anyone could make work for that. So that’s the example of protocol art, where really, it becomes a collaborative experience between myself and the people who are engaging with my work.

And in a way, art’s kind of always a little bit like that. It’s a conversation between the work that you’re making and the viewer or the recipient, but that becomes a little bit more complicated and fun, I think, in an A.I. world.

ezra klein

You wrote something in 2018 that I think is worth exploring where you said that A.I. is a deceptive, over-abused term. “Collective intelligence” is more useful. Why?

holly herndon

Because I really do see it as a kind of aggregate human intelligence. It’s trained on all of us. Specifically, when you look at music, it’s trained on human bodies performing very special tasks. And I think it does humans a great disservice to try to remove that from the equation.

I think that’s why I like to draw a parallel, also, to choral music, because I see it as a kind of coordination technology in the same kind of lineage as group singing. I think it’s a part of our evolutionary story and I think it’s a great human accomplishment that should be celebrated as such.

ezra klein

I want to explore what changes when you emphasize the collectivity of these models, the fact that they are in some ways an aggregate of all of us versus the artificiality of them, right? Artificial intelligence, which really emphasizes no, there’s something that somebody has written into software over here. They’re unearthly. They’re a new kind of thing. And one thing is actually, I think, economic, that there’s this whole question about who gets compensated and who’s going to make the money off of this and what all this training data is going to end up doing economically.

And it does seem very different to me if you understand these as on some level, a societal output, something that’s built on a kind of commons as opposed to a tremendous leap and feat of technology that is the sort of individual result of software geniuses working in garages and office parks somewhere.

holly herndon

Yeah, I mean, that basically summarizes the work that I’ve been doing for the last several years. It’s kind of like shouting that from the rooftops. Because I think if you see it through that lens, then it becomes something really beautiful and something to be celebrated and also something that’s not entirely new. You know, we’ve been embarking on collective projects, the entirety of our humanity to make things that are bigger than ourselves. And so if we can find a way to make that work in the real world with the kind of future of the economy, then yeah, I think it behooves us to figure that out.

ezra klein

The not entirely new part feels important to me. The degree to which this is all a continuum feels often underplayed in conversations about A.I., about the future of work, about humans and machines. But there’s also a way in which you see the A.I. companies using this argument to say that they should be given much more free rein and much more full profits over the products of these models, because they say, look. We’re not doing anything different than any other artist or anyone ever has.

Scientists today work off of the collective body of knowledge of science before them. You know, Holly Herndon is influenced by folk music and choral music and German techno, and everybody is always absorbing what has come before them and mixing it into something new. That’s all we’re doing. We’re not doing something new. We’re not making a copyright infringement.

So how do you understand the effort to use the collectivity, right? The fact that human beings have always been in collective projects, but we do give people a lot of individual ownership and authorship over their works, versus what might be different here in the scale and the nature of what these models are doing.

holly herndon

So, OK, I think that there’s a middle ground that can work for everyone, that can allow people to experiment and have fun with this technology while also compensating people. So spawning is a neologism that I like to use to kind of describe what’s happening here. And it’s a 21st century corollary to sampling, but it’s really distinctly different. And that difference, I think, is really important. It’s different in what it can do and also how it came about.

So what it can do, we’ve kind of gone into that already. You know, you train a model on the kind of logic of one thing to be able to perform new things through that logic. So it’s distinctly different from sampling, which is really like a one-to-one reproduction of a sound created by someone else that can then be processed and treated to make something new. But with spawning, you can actually perform as someone else based on information trained about them. So that’s distinctly different.

But also the way that it comes about, with sampling, it’s this one-to-one reproduction. With spawning, it’s a little bit more of a gray area in terms of intellectual property because you’re not actually making a copy. The machine is ingesting that media, if you want to call it looking at, reading, listening to, learning from. So I kind of land in that — I like to call it the “sexy middle ground” between people who are all for open use for everything and people who want to have really strict I.P. lockdown.

And so that’s one of the reasons why spawning, then, kind of mutated even further into an organization, which is something that I co-founded with three other people, Mat Dryhurst, Patrick Hoepner and Jordan Meyer, to try to figure out this messy question of essentially data manners. How do we handle data manners around A.I. training? Because what’s happening right now isn’t working for everyone.

ezra klein

Are there experiments that you find exciting or that you’ve conducted that you found the results of them promising?

holly herndon

Yeah. I mean, I think Holly+ was a really fun experiment because people then actually used my voice and we were able to, you know, sell some works through that and generate a small profit, but enough to be able to continue to build the tools for the community. So that was a fun experiment that I think really worked. And there’s one experiment that I’m running right now that I’m really excited about. My partner, Mat Dryhurst, and I had an exhibition at the Serpentine London in October, and as part of that, we are recording choirs across the U.K. I think there are 16 in total, and they’re joining a data trust. And we’ve hired a data trustee to pilot this idea of governance where we’re trying to work out some of the messy issues around how a data trust might work. And then we’ll negotiate with that data trust directly as to how we can use their data in the exhibition and moving forward.

I think it’s a really fun experiment, and it’s also because it’s singing and it’s choral music, it’s not really sensitive health data. We can really experiment and try out different ways to make this work in a way that’s not dealing with such sensitive information. So I’m really excited to see how people engage with that and how much do people really want to deal with the kind of day-to-day governance of their data. That’s also a big question.

ezra klein

So you were saying earlier that often the models are the art, but in this case, the governance is the art.

holly herndon

You know, in this case, I think the model and the governance and the protocol around it are all the art.

ezra klein

This idea of control is interesting, though. I mean, so it came out a while ago that Facebook, or Meta, had been training its A.I. on a huge cache of pirated books. And I think my book was in there. My wife’s book was in there. Like, the books of virtually everybody I know were in there. And so, a bunch of authors sued. And I also felt some part of me, like, I wanted to be paid for my inclusion. But I didn’t want to not be included in all of these. And it reminds me a bit of social media where at a certain point, whether or not you wanted to be on social media, it was sort of important that you had something representing you there, right?

It could be not your real photo, right? You could have some control over it, but if you didn’t do it, then you had absolutely no control over what you appeared as online. And it probably wasn’t plausible you could appear as nothing online. So maybe something you didn’t want would be your top Google search result.

And here it’s going to get even weirder because there isn’t really — you can’t have your home page in the artificial intelligence model. All you are is training data. And so there’s something very strange about this. You know, if before all you were was kind of a profile, which was a very flattened version of you, now you’re training data, which is a very warped version of you. And this question of how do you have any control over that data, like if you want to participate but you want some definition over how you participate, there’s no real obvious avenue towards that.

holly herndon

There’s none at the moment, but I think that that’s coming. I think people will opt in under terms that they feel comfortable with to be able to shape the way that they appear in this new space. I don’t think it’s tenable that people have no agency over how they appear in the future of the internet.

ezra klein

That feels idealistic to me. I mean, I feel like we’ve been going through an internet for a long time where I would have said this level of data theft or use is not tenable. This level of surveillance is untenable. This level of flattening, the way we get each other to treat each other in social media, it doesn’t feel like this is going to hold.

Like, I am amazed that people are still on X. As hostile as that platform has become to many of them, it’s just so impossible to imagine leaving that they will accept something they really feel angry about. They really feel like the way it is run is hostile to them, that it is degraded. But, you know, what are you going to do? I’m amazed at how powerful the “what are you going to do” impulse is in life.

holly herndon

Well, I mean, I totally get that. But what we decided to do was to try to build a universal opt-out standard. And it’s actually gaining traction. And there’s precedent in the E.U. A.I. Act. Ideally, it would be something that would have been there from the beginning. You know, people would have been asked permission from the outset, but that’s not how things played out. So now we’re in a position where we’re building tools where people can really easily opt out the data that they don’t want to have included in these models.

We have an A.P.I. where you can install that on your website and easily have everything on your website not be included in crawling. So I do think that there are things that we can do. It requires a little bit of legislation. It requires a little bit of diplomacy. But I don’t think that we should just throw up our hands and say, OK, it’s over. They should just have everything.

You know, if we do have a situation where we’re able to get the opt-out as a kind of standard, then I think you can start to build an economy around an opt-in. Something that I’m really proud of, we just announced Source.Plus. So I’m not trying to shill here, but I think this is a really important part of this conversation where we put together a data set of all public domain data, and it’s huge. And people should be training their base models there. And then you can allow people to opt in to fine-tune their models and create an economy around that.

If you have a public domain base layer model, then you can actually create an economy around that. But I don’t think we should give up.

ezra klein

I definitely agree. I don’t think we should give up. For a lot of people, they need to make a living out of the work they’re doing.

holly herndon

Yes.

ezra klein

One thing that I find inspiring about the idea of thinking of it as a collective intelligence is it maybe points the way towards the idea that there’s modes of collective ownership, or modes of collective compensation. And at least in the space of art, when you’re thinking about this idea that you might have your voice out there for anybody to use, I think for a lot of people, that’s scary, right?

I mean, we’re very used to business models that are about nobody can use this thing of mine unless they pay me, right? We have patents, we have copyrights. What does that spark for you? What if we — what are the ways to do this in a more collective open source way that you think might work to make it possible for people to live, but also to create?

holly herndon

Well, I think first and foremost, it should not be a one-size-fits-all solution. I mean, you know, we’re talking about art. And that encompasses so many different practices that function economically in so many different ways. That’s something that was really devastating, I think, when it came to streaming. Streaming was really revolutionary and wonderful for a lot of people, but it was really devastating for a lot of other people because everything had to have the same economic logic as pop music.

And a lot of experimental music doesn’t follow that per play valuation logic. A lot of experimental music is about the idea, and you just need access to that idea once. You don’t need to listen to it on repeat. And so if the access to that idea costs a fraction of a cent, that’s going to be really difficult to pay for. You almost need more like a movie model where you pay a little bit more to gain access to that idea.

I think what’s really needed is that people have the ability to create whatever subcultures and whatever kind of economic models work for their subcultures and aren’t squeezed into a kind of sausage factory where everything has to follow the same logic.

ezra klein

So I know you and your partner are working on this book for this forthcoming exhibition that has, I think, the most triggering possible title to people in my industry, “All Media is Training Data.” What’s the argument there?

holly herndon

Yeah, so this is a book that’s a series of commissioned essays and interviews between me and Mat about our approach to A.I. and data over the past 10 years. I do realize that this is kind of triggering for a lot of people, but I think it’s something that’s worth kind of recognizing. You know, as soon as something becomes captured in media, as soon as something becomes machine legible, it has the potential to be part of a training canon.

And I think that we need to think about what we’re creating moving forward with that new reality. You know, a lot of the work that we’re doing around the exhibition is we’re creating training data deliberately. So we’re treating training data as artworks themselves. I’m writing a song book that a collective of choirs across the UK will all be singing from, and those songs were written specifically to train an A.I. So all of the songs cover all of the phonemes of the English language so you can really — the A.I. can get the full scope of the sound, of each vocalist.

So we’re kind of playing with this idea of making deliberate training data, kind of — we like to call them mind children that we’re sending to the future.

ezra klein

I want to talk about what’s triggering in it for a minute. Because I think when people hear that, they might think media in the sense of the news. And I’m actually least worried about the news, because the news is where we’re covering new things that happened that are not in the training data.

But media, if you think about it broadly, right, visual media and music and all the other things human beings create, I think when people hear all media’s training data, what they hear is — everything we do will be replaceable, right? That the A.I. is going to learn how to do it and it’s going to be able to spit it back at us and then it doesn’t need us anymore.

When we become the training data, we’re sort of training our replacement, right? Like, the sort of very grim stories that will come out of factories before they outsource somewhere, where people are training, you know, the people who are going to replace them at a lower cost. Is that how you see it? If you’re training data, does that mean you’re replaceable?

holly herndon

Art is a very complex cultural web. It’s a conversation. It’s something that’s performed. It’s a dialogue. It’s situated in time and place. We wouldn’t confuse a poster of the Mona Lisa for the Mona Lisa. Those are two different things. So I’m not worried about artists being replaced or about, you know, infinite media, meaning that artists have no role in meaning-making anymore. I think that the meaning-making becomes all the more important.

I do think we have to contend with a future where we do have infinite media, where the single image is perhaps no longer carrying the same weight as it did before. So yeah, there are some things to contend with, but I think that we won’t be replaced and I think it’ll be weird and wonderful.

[MUSIC PLAYING]

ezra klein

There are a bunch of programs that are coming out now that use A.I. to generate this sort of endless amount of pretty banal music for a purpose. So I have this one I downloaded called “Endel,” and it’s like, do you want music for focus? Do you want it to sleep? Do you want it to — And it’s fine. If I heard it on one of those playlists on Spotify, I wouldn’t think much of it.

[ENDEL, “WIND DOWN” / “EVENING ENERGY RISE”]

And I think it points towards this world where I think the view is, we’re going to know what we want. And what we’re going to want is a generic version of it. And we’re going to be able to get it in kind of vast quantities forever. But you’re an artist and you said something in an interview I saw you give about how reality always ends up weirder. It always mutates against what people are expecting of it.

And so I wonder how much you suspect or see the possibility of the sameness that A.I. makes possible, the kind of endless amount of generic content, leading to some kind of backlash where people actually get weirder in response, both weirder with these projects, but also more interested in things created by humans in the same way that a lot of artisanal food movements got launched by the rise of fast food.

I mean, how much do you think about backlash and the desire for differentiation as something that will shape cultures and software here?

holly herndon

Well, there’s a lot in there. I mean, the backlash has been huge. I think that A.I. has certainly joined the ranks of culture wars and especially on Twitter. So I think the backlash is already there. But I think we’re also in really early days. So some of the examples that you gave, I feel like they’re kind of trying to please everyone. And as we move into a situation where your specific taste profile is being catered to more, I think it will feel less mid and feel more bespoke.

One direction where some of this can go, I think a lot of people are really focused on prompting at the moment because that’s how we’re interfacing with a lot of models. But in the future, it might look more like, you know, maybe you have a kind of taste profile where the model understands your tastes and your preferences and the things that you are drawn to and just kind of automatically generates whatever media that would kind of like please you.

So the kind of production to consumption pipeline is kind of collapsed in that moment. One of the things that I always appreciated as a young person growing up was hearing things that I didn’t like and didn’t understand, and that was something I always found really difficult with algorithmic recommendation systems, is I just kept getting fed what it already knew that I liked.

But, you know, when I was just being exposed to new music as a young person, I really needed to hear things that I didn’t like to expand my palate and understanding of what’s possible in music. And so that’s one thing that I think you could just kind of have a stagnation of taste if people are constantly being catered to. So I think people will crave something different or will crave to be challenged. Some people won’t, but some people will.

ezra klein

One of the things that occurred to me while I was looking through a lot of your work was that what I enjoyed about it was that you were using the relationship with the generative system to make yourself and make the work stranger. And that felt refreshing to me because my experience using ChatGPT or Claude or anything, really, so often is that it makes me more generic.

And that there’s this way in which A.I. feels like it is this great flattening. It’ll give you a kind of lowest common denominator of almost anything that human beings have done before, and that the danger of that feels to me like it’s a push toward sameness, whereas a lot of your art feels to me like a push towards weirdness and a kind of sense that you can interact with different versions of these systems in a less sanded-down way and find something that neither a human nor a machine could create alone. Is that a reasonable read of what you’re doing? Is there something there?

holly herndon

Yeah, I think that’s largely because I use my own training data. I create training data specifically for this purpose of training models rather than using something that’s just laid out for me. I think you get a lot of mid or averaging from these really large public models, because that’s basically the purpose.

You know, it’s supposed to kind of be a catchall, but I’m not interested in the catchall. I’m interested in, you know, this weird kind of vocal expression or I’m interested in this other weird thing. And so that’s what I really want to create training data around and really focus on for whatever my model is. So I think people should just get into training their own models.

ezra klein

I want to end by going back to a song from “PROTO.” And it’s one of the stranger songs on the album. And I thought maybe we could just talk about what it’s doing and people could hear it. So why don’t we play a clip of “Godmother”?

[RHYTHMIC SQUEAKING]

What’s happening there?

holly herndon

OK, so, yeah, when I delivered that single to 4AD, I was like, here’s a single for the next album. They were like, um, OK, what do we do with this? So this is, I guess, a really early voice model trained on my voice. So if you compare that to the Jolene song, that’s basically how far we’ve come in the last five years, which I think is just remarkable. It’s — the speed is incredible.

So I trained Spawn on my voice, my singing voice, and then I fed Spawn stems from a collaborator of mine named Jlin. And so Spawn is attempting to sing Jlin’s stems through my voice. And Jlin’s music is very percussive. It’s mostly percussion sounds, so it ends up being this kind of almost like a weird beatboxing kind of thing because it’s trying to make sense of these sounds through my voice.

ezra klein

Well, here, why don’t we play a clip of the Jlin? This is one of my favorite songs from her. It’s called “The Precision of Infinity.”

[JLIN & PHILIP GLASS, “THE PRECISION OF INFINITY”]

And so, yeah, it’s not that it’s a machine, it’s just something that a human being cannot do quite on their own. I mean, there’s like a Philip Glass sample in there. It’s beautiful. But I don’t know. It’s funny when you say that Spawn feels so old because something I like about it is it feels very — compared to a lot of what’s coming out now, its strangeness feels much more modern. It feels truer to how A.I. feels to me than the much more polished things we’re currently hearing or seeing, which it’s like this thing has exploded in all of its weirdness. And all this effort is being made to make it seem normal.

And I think the reason “PROTO” sounded very current to me when I heard it for the first time this year is that, in sounding abnormal, it feels more actually of this moment, which feels very strange even as everybody keeps trying to make it seem not that strange.

holly herndon

Well, thank you. I appreciate that. I feel like at the time I was — this A.I. conversation has been going for so long. The hype was kind of already started back then. And I feel like so many things that were being marketed as A.I., it was kind of misleading what the A.I. was doing or how sophisticated things were. So at the time, a lot of people were creating A.I. scores and then having either humans perform them or having really slick digital instruments perform them.

And so it was giving this impression that everything was really slick and polished and finished, and that’s why we decided to focus on audio as a material, specifically because you could hear how kind of scratchy and weird and unpolished things were at that time. And that’s — I wanted to meet the technology where it was, and that required a whole mixing process with Marta Salogni, who’s an amazing mixing engineer in London, to try to get the human bodies and the slick studio to occupy the same space as the kind of crunchy lo-fi Spawn sounds.

But it was really important to me that I wasn’t trying to do the whole smoke and mirrors of like, this is some glossy future thing that it wasn’t, because I actually found the weirdness in there so much more beautiful.

ezra klein

As somebody who has now been for years playing around with models and working in these more sort of decentralized possibilities, I think it’s easy if you’re outside this and don’t have any particular A.I. software engineering expertise, as I don’t — as I think most of my listeners don’t — and you see, well, there’s models by OpenAI, by Google, by Facebook — it feels like no human being can do this, right? Companies getting billions of dollars.

How are you able to participate in this world of models? How much expertise do you need? How do you figure out what are the interesting projects, right? If somebody wants to understand this kind of world of homebrew A.I., so to speak, how did you start, and where do they start?

holly herndon

That’s a really good question. I mean, I think the landscape has changed so much since I started. I would say, you know, first thing, you can interact with publicly available models. And once you kind of understand how those are working, then I would just do the really boring work of reading the academic research papers that are tedious. Take your time, drink a coffee, watch the YouTube video where they presented at a conference and maybe some people asked questions and that helps to flesh it out.

This was our process. It’s been really, really kind of messy. And yeah, we didn’t have a lot of hand-holding, but I think if you’re really interested in learning more, the information is out there. You just kind of have to roll up your sleeves and get your hands dirty.

ezra klein

I think that’s a nice place to end. So always our final question, what are three books you would recommend to the audience?

holly herndon

OK, so Reza Negarestani wrote a book called “Intelligence and Spirit.” It’s a pretty dense philosophical book about intelligence and spirituality that I think is really great. On a lighter side, “Children of Time” by Adrian Tchaikovsky is a really enjoyable A.I. science fiction about intelligent, genetically-modified spiders.

ezra klein

One of my favorite books.

holly herndon

Yeah, it’s so good. So you kind of see the kind of society and technology that a super intelligent spider society would build, which I love. And then there’s a book called “Plurality” that was led by Glen Weyl and Audrey Tang and a wide community of contributors. I also contributed a small part to this book. It’s about the future of collaborative technology and democracy, and it was actually written in an open, collaborative, democratic way, which I think is really interesting. So check it out.

ezra klein

Holly Herndon, thank you very much.

holly herndon

Thanks so much. This was really fun.

[MUSIC PLAYING]

ezra klein

This episode of “The Ezra Klein Show” is produced by Annie Galvin. Fact-checking by Michelle Harris. Our senior engineer is Jeff Geld, with additional mixing by Aman Sahota. Our senior editor is Claire Gordon. The show’s production team also includes Rollin Hu, Elias Isquith and Kristin Lin. We have original music by Isaac Jones and Aman Sahota. Audience strategy by Kristina Samulewski and Shannon Busta. The executive producer of New York Times Opinion Audio is Annie-Rose Strasser, and special thanks to Sonia Herrero and Jack Hamilton.

[MUSIC PLAYING]
