AI is starting to become part of everyday pop culture. From the Harry Potter Balenciaga video (we made our own by creating a digital twin of an employee, but it will remain an internal piece of fun) to South Park writing an episode about ChatGPT with ChatGPT, generative AI is so useful and accessible that it’s seeping into just about everything.
Most recently, generative music has taken center stage: an AI-created song called Heart on My Sleeve, featuring cloned voices of Drake and The Weeknd, went viral with over 15 million listens and, according to the New York Times, “rocked the music world.” It was removed from Spotify and YouTube. (The link I posted might get taken down again.)
So, what’s up with AI-generated music? The Fake Drake song used actual music created by a human; only the voices were synthesized. As far as tunes go, there are a lot of early demos of tools out there, including OpenAI’s Jukebox, but likely the most powerful model is Google’s MusicLM. Essentially, it’s text-to-music software trained on almost 300,000 hours of music. It’s frighteningly good. You can listen to some samples of it in action here, and TechCrunch has a great review here, which also notes that Google seems to have no plans to release it anytime soon.
Most of the off-the-shelf tools available are fairly rudimentary, but one stands out as very interesting. It’s called Riffusion. Using Stable Diffusion, it creates music from IMAGES of music: it generates spectrograms (pictures of sound) and converts them into audio. How crazy is that?
Here’s a one-hour experiment we conducted to learn what would happen when “connecting” GPT-4 to an AI music tool, with the goal of creating music inspired by several brands. It’s not going to sound great, but it’s interesting. And when someone does release a tool as powerful as Google’s MusicLM, this approach could very well work.
ONE-HOUR AI EXPERIMENT
Hypothesis: AI can translate brands into distinctive musical riffs
Methodology:
- Test baseline capabilities of Riffusion and Mubert
- Use ChatGPT (GPT-4) to obtain a brand list and musical prompts:
  - Request a list of brands that lend themselves most to distinctive musical and auditory styles, experiences or sensations
  - Describe the musical and auditory landscape of each brand
  - Translate the “essence” of each into 3 musical prompts per brand, each under 8 words
- Enter the GPT-4 prompts into Riffusion and Mubert
- Share the best-of-three outputs here
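The prompt-generation steps above can be sketched in code. This is a minimal sketch, not the tooling used in the experiment (the prompts were typed into ChatGPT by hand). The function names `build_brand_request` and `parse_prompts` are hypothetical, the request wording is paraphrased from the methodology, no API call is made, and the reply is assumed to contain only the prompts, one per line.

```python
import re

MAX_WORDS = 7          # "under 8 words", per the methodology
PROMPTS_PER_BRAND = 3  # three prompts per brand


def build_brand_request(brand: str) -> str:
    """Compose text one might paste into ChatGPT (hypothetical wording)."""
    return (
        f"Describe the musical and auditory landscape of the brand {brand}. "
        f"Then translate its essence into {PROMPTS_PER_BRAND} musical prompts, "
        f"each under {MAX_WORDS + 1} words, one per line."
    )


def parse_prompts(reply: str) -> list[str]:
    """Extract and validate prompts, assuming one prompt per line."""
    prompts = []
    for line in reply.splitlines():
        # Strip list markers such as "1.", "-", or "*" from the start of the line.
        cleaned = re.sub(r"^\s*(?:\d+[.)]|[-*])\s*", "", line).strip()
        if cleaned:
            prompts.append(cleaned)
    if len(prompts) != PROMPTS_PER_BRAND:
        raise ValueError(f"expected {PROMPTS_PER_BRAND} prompts, got {len(prompts)}")
    for prompt in prompts:
        if len(prompt.split()) > MAX_WORDS:
            raise ValueError(f"prompt too long: {prompt!r}")
    return prompts
```

Validating the count and length up front means a malformed reply fails loudly before anything is pasted into Riffusion or Mubert.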
Results:
Riffusion is fun and innovative, and users have gotten very interesting outputs from it. London producer patten has used samples from it to produce interesting music, which you can hear in a Bandcamp article. To level-set expectations, though, I tested it by asking for “Super Mario Theme” and got the result below. While Riffusion is an experiment, Mubert is an AI stock music house, and its results are more in line with what we might expect from stock music, though none of the examples really breaks through as an a-ha moment. With a much larger volume of test samples, we probably would have found something very interesting.
In the first round of testing, GPT-4 generated prompts that were too esoteric for the Stable Diffusion model to translate into sounds. Words like durable, individuality, and ink mean something to us, but Riffusion is more literal. Google’s MusicLM would likely understand. For the following results, the prompt was modified to be more literal and to use only musical styles, genres and instruments.
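One way to catch overly abstract prompts before sending them to a music tool is a simple word check. The sketch below is an illustration only and was not part of the experiment: the function name `flag_abstract_terms` is hypothetical, and the term list holds just the three words mentioned above, so it would need extending by hand.

```python
import re

# Words cited above as too abstract for Riffusion; extend as needed (assumption).
ABSTRACT_TERMS = {"durable", "individuality", "ink"}


def flag_abstract_terms(prompt: str, abstract_terms=ABSTRACT_TERMS) -> list[str]:
    """Return the words in a prompt that appear in the abstract-terms set."""
    words = re.findall(r"[a-z0-9'-]+", prompt.lower())
    return [word for word in words if word in abstract_terms]
```

A made-up prompt such as "Durable ink, bold individuality" would be flagged on three words, while "Steel drums, calypso, chimes" passes with an empty list.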
Coca-Cola
Auditory and Musical Essence: Coca-Cola’s musical essence captures the brand’s timeless appeal, happiness, and refreshing taste. The auditory experience should evoke a sense of fun and optimism while keeping a globally recognizable feel.
Prompts:
Steel drums, calypso, chimes
Pop piano, whistle, handclaps – Mubert
Marimba, Afrobeat, shakers – Riffusion
Riffusion
Mubert
After trying two more brands and getting poor results, I decided to “train” GPT-4 a little more by providing example prompts found online that had gotten good results.
LEGO
Auditory and Musical Essence: LEGO’s music should capture the brand’s creativity, playfulness, and endless possibilities. The auditory experience should be fun and imaginative, reflecting the brand’s commitment to inspiring the builders of tomorrow.
Prompts:
Toy orchestra symphony (+ happy, upbeat, plastic block sounds) – Mubert
Retro video game meets playful jazz – Riffusion
8-bit adventure soundtrack
Riffusion
Mubert
Ben & Jerry's
Auditory and Musical Essence: Ben & Jerry’s music should convey the brand’s fun, socially conscious, and indulgent character. The auditory landscape should be vibrant and feel-good, reflecting the brand’s commitment to producing unique and delicious ice cream flavors.
Prompts:
Upbeat indie folk-pop
Funky ice cream truck jingle – Riffusion and Mubert
Psychedelic jam band groove
Riffusion
Mubert
GoPro
Auditory and Musical Essence: GoPro’s musical essence should capture the brand’s adventurous, action-packed, and adrenaline-fueled spirit. The auditory experience should be energetic and dynamic, reflecting the brand’s commitment to capturing and sharing life’s most exciting moments.
Prompts:
High-energy electronic rock
Epic cinematic sports anthem
Adrenaline-fueled drum and bass
Riffusion
Mubert
Realtree
Not very satisfied with the results from GPT-4 generated prompts, I decided to do it the old-fashioned way, and came up with a prompt for one of our great clients: Realtree, the hunting, fishing and outdoor lifestyle brand. Realtree began when Bill Jordan created his first hand-drawn, highly realistic camo pattern based on an oak tree. At the time, he had already founded an archery company in Georgia. Taking that for inspiration…
Prompt:
An upbeat country rock anthem played on a hunting bow
This produced a somewhat nicer result in Riffusion, though Mubert has more limited training on country music.
Marketer Takeaway: As with text-to-image and text-to-video, text-to-music will follow its own evolution in quality and usability, eventually entering the mainstream as tools become better and produce more interesting and enjoyable results. Overall, the results are interesting and warrant further exploration. Generating hundreds of samples would likely yield a few gold nuggets, and Mubert creates a very rich range of musical styles. AI tools become dramatically more powerful as you combine them, so try experiments of your own. In any science, experiments are required to discover new things, and most of them don’t pan out as we would like!
Legal note: Music generated by Mubert https://mubert.com/render
Also published on Medium.