India’s Beatoven.ai Shows the World How AI Music Generation Is Done Right

AI music generation is a tricky business. Amid copyright claims and the need to fairly compensate artists, it becomes a tough task for AI startups, such as Suno.ai or Udio AI, to gain revenue and popularity.

However, Beatoven.ai, an Indian AI music startup, has tackled this in the most ethical and responsible way possible.

One of the main reasons for this is that co-founder and CEO Mansoor Rahimat Khan is a professional sitar player himself, and comes from a family of musicians that goes back seven generations. “I was very fascinated by this area of music technology,” he said.

Khan told AIM that he started his journey at IIT Bombay and realised that even though there weren’t many opportunities in India, he wanted to combine his passion for music and technology.

Beatoven.ai is part of the JioGenNext 2024, Google I/O 2024 and AWS ML Elevate 2023 programs. Khan said the team applied to many accelerator programs because they realized they would need a lot of computing power to achieve the goal of building an AI music generator.

The company raised $1.3 million in a pre-Series A round led by Entrepreneur First and Capital 2B, bringing its total funding to $2.42 million.

After switching jobs, Khan met Siddharth Bhardwaj and built on their shared passions for music and technology, founding Beatoven.ai in 2021. “After coming back from Georgia Tech, I got involved in the startup ecosystem and started working with ToneTag, an audio tech startup funded by Amazon,” said Khan.

Everyone needs background music in their life

The co-founders discovered that the biggest market was generating soundtracks for indie game developers, agencies and production houses. “But when you look at the nitty gritty of the industry, copyright is a really scary thing. We thought that generative AI could be a solution to that.” Khan said the idea was to figure out how to give users simple prompts and generate audio.

Mansoor Rahimat Khan with Lucky Ali

The original idea was to create a simple, consumer-facing UI where users could select a genre, mood, and duration to generate a soundtrack. But that was before the LLM era and NLP wasn’t good enough for such tasks. “We started in 2021, before the LLM era, and our venture funding came from Entrepreneur First. We raised a million dollars in 2021 and quickly built our technology from scratch.”

The biggest challenge, as with any other AI company, was collecting data. “You either partnered with labels that charged huge licensing fees or you scraped [data]. That was the only other option. But if you did that, you got sued,” Khan said.

All technology

This is where Beatoven.ai takes the lead over other products in the market. Khan and his team started reaching out to small artists, and gradually to larger ones, to create partnerships and source their own data. The company had a head start because no one was talking about this field at the time. Within a year, they collected over 100,000 data samples, all of which were proprietary to them.

In the early days, Beatoven.ai didn’t use Transformers. Khan said that was one of the reasons the quality wasn’t great. Later, when Diffusion models came into the picture, the team realized that this is the way forward for AI-based music generation.

The company started using different models for different purposes, including the ChatGPT API from OpenAI. The Beatoven.ai platform also uses CLAP (Contrastive Language-Audio Pretraining), which is mainly used for generating video.

In addition, the company uses latent diffusion models such as Stability AI’s Stable Audio, VAE models, and AudioLLM for different tasks, such as generating individual instruments within the music. The company then uses an ensemble model to mix all of these individual audio tracks.

For inference, the company uses CPUs (instead of GPUs), which keeps everything fast and optimized and reduces costs.

Trained honestly

Khan admitted that the audio files generated by Suno.ai are currently of superior quality, but noted that Suno.ai also uses diffusion models, which makes it a bit slow. “The quality is significantly better than where we started, but it’s not quite there yet,” he said of Beatoven.ai’s own output. Khan added that Beatoven.ai’s generation speed is currently high because the company uses different models for different tasks.

To further expand its data, Beatoven.ai began partnering with outlets like Rolling Stone and packaged the effort as a creator fund. In January 2023, it announced a $50,000 fund for indie music as part of its Humans of Beatoven.ai program to help expand its catalog.

This gave Beatoven.ai a lot of popularity and many artists wanted to collaborate with the team. Khan said that the company wants to do more licensing deals to expand music libraries. “As far as Indian labels are concerned, they are not open to licensing deals yet,” Khan said.

Beatoven.ai’s model is certified as Fairly Trained and also certified by AI for Music as an ethically trained AI model.

In addition to music generation, Beatoven.ai is launching Augment, similar to ElevenLabs’ voice generation model. This will allow agencies to connect to Beatoven.ai’s API and train on their own data to create remixes of their own music. For the demo, Khan showed how a simple sitar melody can be turned into a hip-hop remix.

“You can just use your existing content and make new songs. That’s the idea,” he said.

Beatoven.ai is also currently testing a video-to-audio model using Google’s Gemini, where users can upload a video and the model understands the context and generates music based on it. Khan showed AIM a demo in which the model could also be controlled using textual cues, for better audio quality.

Not everyone is a musician

Khan predicts that in the near future, companies like Spotify and YouTube will open source their data and offer APIs to make the AI music industry a bit more open.

Meanwhile, speaking with AIM, Udio co-founder Andrew Sanchez said: “It allows people who are just starting out and don’t have a big professional career yet to invest the resources, time or money into making a career. It allows a whole new group of creators to come in.” This would make everyone a musician.

As for Beatoven.ai, Khan said he wants to move in a more B2B direction, as building a direct-to-consumer app doesn’t make sense. “I don’t believe everyone wants to make music,” he added, saying that not everyone in the world is learning music. That’s why the company is currently focusing only on background music without vocals.