I've spent the past 24 hours being quite astonished at Udio, which I already consider vastly superior to any AI music generator I have heard to date.

To be clear: I share this for the purposes of awareness and conversation, not endorsement.

The lyrics I gave it were random text relating to a conversation I was having with a friend this morning. Since Udio only gives you 33-second clips, I had to extend it three times (adding an instrumental intro, a second set of text, and a final, largely instrumental outro).

BTW, as of this morning, Udio is claiming that it removes artist names (here "Richard Strauss") from the prompts, but as you can clearly hear (if you know his music), they must still be using it in the underlying mechanics.

Good Morning Soo
The dogs need a walk
Can you hear them?
I hope you have a lovely day

Are you getting the groceries and the meal kits?
Yes, this day is for shopping.
How marvelous!

Udio | Good Morning Soo - Full Version by Driscollmusick



  • This is fascinating; how much, if any, musical direction or input did you provide?

    I've contended for some time that AI will eventually render all of us artistically obsolete. It has thousands upon thousands of examples of superior art to learn from, created by great masters of the past: great, yet not inimitable. It's an unpopular position to adopt publicly, but I believe even despised truths must be uttered, and eventually confronted. Killing the messenger does naught but temporarily assuage the pain of an encroaching technological world that threatens us all with the fate of drones in a colossal, collective hive of knowledge and endeavor. Eventually, the last cries of defiance will be drowned in oceans of unforeseen technological capability, dwarfing our own individual aspirations.

    One allegorical example I often turn to is the world of chess. I was an avid player at a tender age, and was reassured by experts that a computer algorithm could never compete against an accomplished human Grandmaster, because computer thought is essentially serial (brute-force calculation), whereas the human brain processes problems in parallel; i.e., no matter how great the processing power, a computer could never autonomously generate the sort of tactical trickery and creative thought of which the human mind is capable. The tipping point, and the watershed moment of my disillusionment, came in the mid-to-late '90s, when Deep Blue toppled Kasparov, the reigning World Champion (it was in the 2nd confrontation, if I recall, but don't hold me to it).

    I apologize in advance for the multitude of typos and misspellings, but apparently Ning's spellcheck AI is as useless as "tits on a boar", so to speak.


    • We will not be artistically obsolete. On the contrary, we will be the masters of AI. Have faith.

  • I'll copy my reply from VI-Control:

    Udio is indeed currently better than most AI music generators. However, it is not a threat to composers who can compose at an advanced level. I have a good idea of what will happen and I am already preparing for the near future. But for me, it is certain that any hobby composer who composes classical/new-age/film music with only chord progressions and a melody will be sidelined. The libraries that want to survive will have to use filters to reject AI music, or else they will be inundated with superficial music. That said, many libraries already survive by commercializing music without actually selling it.

  • I must echo Kris's question -- I don't understand what exactly you needed to input here yourself to give this fascinating result!

    I think there is nevertheless some difference between chess and music. I've long been interested in computer chess, where nowadays computers are not only far stronger but also often play more creative and interesting chess. That has been the case since the development of AlphaZero, which learned chess simply by playing against itself and working out what wins and what loses -- nothing else. AI surely will either replace or at the very least assist with most commercial music, for the simple reason that much of it is banal. And possibly, AI could replace great composers if trained on their output, but on this one I'm still a bit sceptical, as great works are not great for technical reasons (at least what I regard as great works, which is somewhat subjective) but because they give a unique insight into the human condition -- something which is my own aim in composing. AI cannot replicate vision because it has none, but I guess it's not impossible that it might fool you into thinking it does somewhere down the road.

  • My only input was text: "soprano aria in the style of Richard Strauss" and the specific lyrics. In a typically odd AI manner, it outputted a duet (0:33 to 1:06)! I then asked for an instrumental intro (it added 0:00 to 0:33), an extension of the original with some additional words (1:06 to 1:39) and finally an instrumental outro (1:39 to the end). Notably, the outro includes voices even though I asked for it to be instrumental and provided no text.

    If you know Richard Strauss, the ending is very reminiscent of the duet at the end of Der Rosenkavalier. Similarly, the intro has Straussian horn rips very similar to the beginning of the same opera. I do wonder whether, if you kept prompting it similarly, all of its Straussian outputs would sound like those iconic moments.

    It also struck me that the peaking harmony of "how marvelous!" is very similar to the peaks in the famous Act 2 duet in Wagner's Tristan und Isolde (not Strauss, of course, but close cousins).

    What's most remarkable to me is the confident text-setting, something that even some of the greats struggled with.

    Also, the fact that the lines of the duet are staggered blows my mind ("Can you hear them?"). Since I assume the algorithm operates by generating a sequence of discrete sound moments in time, its ability to mimic contrapuntal sung text lines is one of the few examples in AI that feels like human thinking. Even if it's not "thinking", does it even matter??

    • Yes, the ending did remind me of Rosenkavalier -- at any rate it definitely seems quite Straussian. I've never regarded Wagner as a close cousin so won't comment there. But if this software -- supposedly programmed by some of the same team that managed the genius that is AlphaZero chess -- can do what it's doing with virtually no input, then you can certainly colour me impressed. Especially as it's still in beta and has only been around for three days as far as I can see! Perhaps I can get it to write my next work for me...


    • On the other hand, it clearly hasn't the slightest idea about Janáček, even though he is obviously in their database. When I asked for a potted version of the "Cunning Little Vixen", it replied with something which sounded a bit like Handel...


    • On the other other hand, although this has little to do with the intended composer, it is something of a masterpiece of poetry if nothing else.


      Udio | The Vixen's Ballet by Lunar22
  • The technology is incredibly impressive to me, especially the way it can set almost any text to listenable, melodic content, and the instrumentals and vocals are pretty convincing too. The most unsettling part is that it's likely only going to improve and become more sophisticated from here. I am extremely curious how this software generates its sounds: is it sampled somehow, or, more likely, all modeled sound?

    I anticipate several changes as a result. I strongly believe that platforms like this one (and likely many more to come) are going to reshape the future of music.

    To start:

    Music libraries will likely become obsolete. Music directors, production companies and so forth will be able to generate their own music, tailored to whatever fashion they desire, in a matter of minutes. Assuming the level of sophistication grows (and it will), a music supervisor for a film could simply enter something like "A 3-minute electronic cue, fast, with a dramatic swell that peaks at the 1:54 mark," and the software will certainly be able to produce a satisfactory result. This will be cheaper and faster than working with a music library and will likely produce more musically interesting results, simply because the request can be so specific. It will allow last-minute editing and cutting without having to re-record any music or have a composer make last-minute adjustments. Additionally, I'm sure that in the near future, software platforms like this will be able to produce stems for the tracks the user wants to use in their final product, and will let the user tweak smaller details and mixing elements, perhaps with nothing more than a prompt such as "remove horn solo at 0:54 and replace with an oboe," or "add more low end to the mix," etc. Remember, AI is only in its infancy, and this already yields rather impressive results after just a few years of AI development.

    Further in the future, I see even stronger optimization, where AI will be able to analyze a motion picture, determine what is happening from the picture and dialogue on screen, and write its own score based on a set of parameters predetermined by a human. It will almost certainly be able to outdo most human composers, since we can assume it will have advantages that no human composer could consciously weigh at any given bar of music. (The AI could analyze the psychological, mathematical, tonal, harmonic, and textural variables for every given moment, with such foresight that it will go well beyond anything a human composer could compete with.)

    The long-term effects of such technology are hard for me to visualize or predict. I can see a world where mainstream celebrities or idolized music become mostly obsolete, with a general preference towards individuals creating their own personalized music. There might be celebrity "bands" or "artists" that don't actually exist in a physical sense, or people might just make up their own and listen to that. (Japan kind of already has this.) No other artist's music will be as relatable or preferable to users who generate their own, since each individual is hand-tailoring exactly what they want to hear. Even with Udio in its current state, this possibility is somewhat achievable, though as of now the quality won't be the best and the results will only be cherry-picking from "the best" of the given genre, so to speak. It is still in a state of 'effective emulating' rather than true originality. For now...

    Nonetheless, it almost certainly has to mean less work for composers. Maybe not now, but soon. A well-written film score takes months to write and record. If (or rather when) the AI software can produce adequate results that are comparable or superior in a matter of minutes, it will be the favorable choice.

    But at some point comes the matter of originality. Will the software at any given time be able to "think" well enough to produce something truly original, rather than analyzing aspects of music from various genres that have proven pleasing to humans, determining why, and then just replicating this? Or will it perhaps be able to find ways to be unique and individualistic while still being listenable? Music is still an art, and unlike chess, there isn't any strict way it can be "solved" mathematically, given changing human preferences and waning interest in hearing the same material over and over throughout the centuries.






    • I remain an AI skeptic.

      The technology isn't new; it has been around for decades now. The difference really is in (1) the scale of data available to train the model, thanks to the internet and social media harvesting millions of data points from users worldwide, and (2) the hype, which is making it better known, whereas people were previously unaware of these algorithms that date from decades ago.

      Contrary to the hype, there has not been that big of a breakthrough. At its core, it's just an oversized interpolation engine. Interpolation in the sense that it takes some set of existing data points (to some extent directed by the user prompts), i.e., ingested samples of past music performances, and, based on probabilistic interpolations, outputs something that's a blend of said existing data points. Note the existing data points. Don't deceive yourself; current AI does not, contrary to popular belief, create anything new, much less "analyse" the psychological effects of anything. (It's an interpolation engine; it does not even begin to tackle the question of what analysis means. It's not "thought" in any reasonable definition of the word. It's literally a glorified version of drawing lines between some points and finding their intersection.) It merely takes existing art, mixes and matches it, and presents it as if the work were its own.

      Existing art, I might add, whose true authors, in all likelihood, are not credited or recompensed in any way. Not to mention the legal implications of an AI company appropriating all of this prior art and feeding it into their model without due credit. (I'll let you work out yourself what that means for artists whose creative efforts are to thank for this.) Consider this: without Richard Strauss' existing, prior art being fed as part of the input to this model, it would have been completely incapable of doing what it does now. Without the input of countless composers, artists, performers, etc., that the creators of the model likely scraped off the internet, the model would be as interesting as a blank sheet of paper. Where are the credits and dues to be paid to these people -- the true creators of the music? In all likelihood, conveniently swept under the rug in order to keep up the charade that "oh AI is now soooo advanced!!!111%*&(#$@(#rotflmaobbq".

      Not to mention, current AI algorithms are literally unable to create anything new. The algorithms have no concept of extrapolating beyond the input data set; don't expect any eureka moments to come from this. As I've said repeatedly, it's basically just drawing lines between existing points. It's unable to draw new points that lie outside of what it's been fed. The underlying algorithms are incapable of doing this. It's unable to truly create. It only appears to create because it happens to have access to huge amounts of data (gleaned from the work of others, might I repeat), of which most of us have probably been exposed to only a small subset.

      As my dad used to tell me, with a screen that has enough pixels (the resolution is high enough) you can create a pretty convincing picture, say, of a person's face. However, all it is, is a bunch of colored pixels. Magnify it enough, and you'll see that it's merely a bunch of colored dots. It's merely an illusion (albeit a rather convincing one, if done right). It's a far, far cry from the person's actual face in physical reality. It does not even begin to model the actual structures of the physical face, or understand the underlying cell and bone structures that compose that face. It's a mere 2D flat approximation. Just like the seemingly convincing AI outputs of today, which are no more than high-resolution approximations of interpolations of prior art. There is no real comprehension of the underlying structures of the art or its essence. It's just an elaborate bootleg copy of the real thing.

      Until the algorithms underlying AI actually exhibit a real breakthrough that goes beyond mere interpolation of existing data, I will remain an AI skeptic.  The AI king has no clothes, and I'm calling BS on the ruse.
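
The "interpolation engine" picture in the reply above can be made concrete with a toy example. This is only a sketch of that poster's metaphor, assuming one-dimensional data points and plain weighted averaging; it is not a description of how Udio or any real generative model works, and the names `blend` and `random_blend` are made up for illustration.

```python
import random


def blend(points, weights):
    """Weighted average of data points (weights non-negative, not all zero)."""
    total = sum(weights)
    return sum(p * w for p, w in zip(points, weights)) / total


def random_blend(points, rng):
    """Blend the points using random positive weights (hypothetical helper)."""
    weights = [rng.random() + 1e-9 for _ in points]
    return blend(points, weights)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [2.0, 5.0, 9.0]  # stand-ins for the "existing data points"
    samples = [random_blend(data, rng) for _ in range(1000)]
    # A weighted average can never fall below the smallest input or rise
    # above the largest one (up to floating-point rounding).
    print("inputs span:", min(data), "to", max(data))
    print("blends span:", min(samples), "to", max(samples))
```

Whether real models are limited in this way is exactly what the thread is debating; the sketch only shows what pure interpolation, taken literally, would imply.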
