Thursday, December 3, 2020

AI-Generated Music

We started with the original SampleRNN research code, written in Theano. It's a hierarchical LSTM network. LSTMs can be trained to generate sequences. Sequences of whatever. Could be text. Could be weather. We train it on the raw acoustic waveforms of metal albums. As it listens, it tries to guess the next audio sample, a fraction of a millisecond of sound. It plays this game millions of times over a few days. After training, we ask it to come up with its own music, similar to how a weather forecasting machine can be asked to invent centuries of seemingly plausible weather patterns.
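To make the guessing game concrete, here is a toy sketch of it, not our actual code. It swaps the LSTM for a plain table of counts (which value tends to follow which), so it only looks one step back, but the loop has the same shape: learn from next-sample pairs, then generate by feeding each guess back in. The 16 kHz sample rate, the 256 quantization levels, the sine-wave "album", and all function names are assumptions made up for illustration.

```python
import math
import random
from collections import defaultdict

Q_LEVELS = 256       # assumption: each audio sample is quantized to 8 bits
SAMPLE_RATE = 16000  # assumption: one sample lasts 1/16 of a millisecond


def quantize(sample):
    """Map a float in [-1.0, 1.0] to an integer bin in [0, Q_LEVELS - 1]."""
    clipped = max(-1.0, min(1.0, sample))
    return min(Q_LEVELS - 1, int((clipped + 1.0) / 2.0 * Q_LEVELS))


def train(bins):
    """Count how often each bin follows each other bin (stand-in for the LSTM)."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(bins, bins[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, seed, length, rng):
    """Sample one bin at a time, each conditioned on the bin before it."""
    out = [seed]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # this bin was never followed by anything in training
        choices = list(followers.keys())
        weights = list(followers.values())
        out.append(rng.choices(choices, weights=weights)[0])
    return out


# Toy training data: one second of a 440 Hz sine wave instead of a metal album.
wave = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]
bins = [quantize(s) for s in wave]

model = train(bins)
generated = generate(model, bins[0], 1000, random.Random(0))
```

The real network conditions on far more history than one previous sample, which is the point of the hierarchy, but the train-then-sample loop is the same idea.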

It hallucinates 10 hours of music this way. That's way too much. So we built another tool to explore and curate it. We find the bits we like and arrange them into an album for human consumption.

It's a challenge to train nets. There are all these hyperparameters to try. How big is the network? What's the learning rate? How many tiers of the hierarchy? Which gradient descent optimizer? How does it sample from the distribution? If you get it wrong, it sounds like white noise, silence, or barely anything. It's like brewing beer. How much yeast? How much sugar? You set the parameters early on, and you don't know if it's going to taste good until way later.
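The "how does it sample from the distribution" question is easy to show in code. One common knob is a sampling temperature, sketched below. This is a generic illustration, not necessarily what our code does: the function names and the example scores are made up, and temperature itself is an assumption about which sampling knob is in play.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; low temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temperature, rng):
    """Draw one index at random according to the tempered probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs)[0]


# Hypothetical scores for three possible next sample values.
logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, 0.1)   # nearly always picks the top score
hot = softmax_with_temperature(logits, 10.0)   # close to a uniform pick
choice = sample_index(logits, 1.0, random.Random(0))
```

With a very high temperature every value becomes about equally likely, so the output drifts toward noise. With a very low one the net keeps making its single safest guess, so the output tends to get stuck.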

We trained hundreds of nets until we found good hyperparameters, and we published the result for the world to use.
