Hi everyone! Aside from being a casual composer and long-time lurker here, I am a PhD student at Yale University's Department of Computer Science and have been doing research in computer music. Within that field, it is unfortunately very common for musical models to be proposed but never formally tested – in fact, in some cases there is no consensus on what procedures or metrics to use. In an attempt to bridge that gap, I have been developing a participant study to test various hypotheses about algorithm performance and music perception. I am currently running a pilot version of the experiment to test the robustness of the system (lag, audio playback problems, etc.) on various platforms and internet connections. If you would like to participate, the pilot study is here:
If you decide to participate, you will be asked to listen to short clips of music and guess whether the music clip was created by a human composer or computer algorithm. You will also be asked questions about your musical background (i.e. how many years of musical training you have had). In all, the study should take no more than 20 minutes to complete.
If there are any questions, I am happy to discuss the nature of my work in general terms here, although obviously I can't give away some details about this particular experiment until all data from the final study has been collected. Also, the examples are randomized, so there is no way to do a direct comparison of answers from one run to another.
If you experience any technical problems, please let me know the conditions under which they occurred, such as what browser/device you were using at the time. Some browsers (particularly older versions) do not support the parts of HTML5 that the study uses to deliver audio. The study has also not been tested on any mobile devices, so I have no idea whether it will display/perform correctly on tablets, phones, etc.
My research more broadly has a few different purposes, really. In the dry sense, I am a computer scientist trying to solve various computational and mathematical problems: working with multidimensional data, new categories of grammars, machine learning algorithms, and so on. For the specific area of music, there are two reasons that I'm having machines write musical scores. The first is that it holds a sort of inexplicable fascination for a decent portion of the population – myself included, even though I am also happy to compose the normal way and do so regularly. Some people are drawn to poking at the musical machine and some aren't. Similarly, if you put a robot in a room, some people will immediately talk to it and try to make it emulate human behaviors, while others will have no interest in that sort of activity.
The second and really more important reason for machine-made scores has to do with establishing whether a mathematical model did what it was supposed to do. It's all very well to propose a generative model of something, but people usually only analyze music with those models. Analysis is important, of course, but if you run the same model generatively and the output is garbage... that means it didn't capture much. Unfortunately, it is incredibly easy to have a generative model that "successfully" analyzes a corpus but captures none of its structure. This problem also isn't unique to music and is a thorn in the side of other fields like computational linguistics. If I were trying to create a generative model of spoken language, I would want to look at machine-made sentences to see if I've captured the intended language features. Any ramifications of what it means when the machine gets it right are another matter (and obviously one full of furious debates). But, for music, quantitatively evaluating a model's performance in a reasonable way is a big enough can of worms by itself. That is the type of problem this study is trying to address.
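To make the "analyzes fine, generates garbage" problem concrete, here is a toy illustration (my own sketch, not anything from the actual study): a unigram, or "bag of notes," model fit to a simple scale. Because it models only note frequencies and ignores order, it assigns exactly the same likelihood to the scale and to a scrambled version of it, and its generated output is a structureless jumble of the right pitches.

```python
import random
from collections import Counter

# Toy corpus: an ascending scale repeated, so note ORDER is the structure.
corpus = [60, 62, 64, 65, 67, 69, 71, 72] * 4

# Unigram model: note frequencies only; sequential structure is never modeled.
counts = Counter(corpus)
total = sum(counts.values())
probs = {note: c / total for note, c in counts.items()}

def likelihood(melody):
    """'Analysis': the probability the unigram model assigns to a melody."""
    p = 1.0
    for note in melody:
        p *= probs.get(note, 0.0)
    return p

def generate(length, seed=0):
    """'Generation': sample notes independently from the fitted frequencies."""
    rng = random.Random(seed)
    notes, weights = zip(*probs.items())
    return rng.choices(notes, weights=weights, k=length)

scale = [60, 62, 64, 65, 67, 69, 71, 72]
scrambled = [72, 60, 71, 62, 69, 64, 67, 65]

# The model "analyzes" both equally well -- it cannot tell them apart...
print(likelihood(scale) == likelihood(scrambled))  # True
# ...and generation reveals the missing structure: an unordered jumble of pitches.
print(generate(8))
```

Running the model generatively exposes in seconds what the analysis-side likelihood score completely hides: the model captured which notes occur, but nothing about how they follow one another.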
Most of the work to date in measuring music perception deals with very small features like pitch perception (looking for perfect pitch, etc.), whether a chord is dissonant, and so on, while the models being proposed range over a much grander scale. Because a single score can be realized in many qualitatively different performances, the performances must be normalized across cases if there is any hope of testing the more basic features of a score. One way to do that, which is the approach taken by this study, is to have everything performed by a computer to ensure uniformity, even though it yields a sub-par performance. With normalized computer performances, the human examples then serve as a sanity check and a baseline measure against which further comparisons can be made.
As already stated, the details of exactly what I'm testing for and what would constitute a successful or failed study are something I can't give out in public until the final study runs, so all I can really say is that some musical models are being tested in the way described above. The target audience of the study is actually much broader than just musicians, and the full study will initially be run on a fairly random sampling. Both musicians and non-musicians need to be tested to determine whether musical experience is a factor in how people respond, since that is another thing that has been insufficiently examined in the field. However, it is possible that there may be few skilled musicians in the pool for the full study, which is one of the reasons I am seeking pilot study participation from musicians. If the pilot study shows big differences (I've also asked people I know who are not musicians to take it), that will affect what is done next if the full study doesn't pull in very many musicians.
As for atonal or traditional music... if you just want to hear a smattering of style examples but don't want to do the full study, you'll have to go through a couple of screens and click the volume test button, but after that there is a screen with 4 buttons giving examples of the styles. No data is recorded until you go past that to the actual experimental trials.
Hi Donya, I for one applaud your experimental enterprise. Regardless of any earlier studies and 'prior art' in the field, you will either confirm what may already be established or discover something new. Neither is failure.
Theories evolve until they resonate with the truth. The true nature of anything is displayed in itself.
The understanding of the nature of musical tones and the architecture of frequencies has unlimited diversity.
Have you tried reverse engineering any of the great works of the past, analysing Mozart, Bach or LVB? lol
Bob makes some good points about 'who cares' - so just have fun with it. ps- I'm curious too... harmonics/frequency and human consciousness is, to quote Mr. Spock, fascinating. RS
In some respects, yes, although not with any conclusions yet since it's ongoing work; I was only able to start crunching data for something similar to that within the last few weeks.
roger stancill said:
There are good reasons for testing the general public, musically educated or not, although I can't elaborate on what the exact reasons are. In case the planning/design-related doubt is stemming from my field (computer science), I should perhaps add that people in both music theory and psychology have been involved in constructing and vetting the experiment.
michael diemer said:
Which areas/terms are sticking points? That will help me to better explain.
Bob Porter said:
I tried to take this, expecting to hear a variety of samples in different styles and instrumentation, but every piece sounded the same to me, and I did not complete it because it doesn't seem like a reasonable test to me.
this is a puzzle to me also- how often should I feed my mouse, lol :-)
michael diemer said:
Donya, are you, simply put, trying to see if software can write what we could discern as intelligible music? Or, as with chess, compete with humans by programmed format? Are you attempting to find math sequences and 'formulas' underlying generally favorable tone patterns that would be indistinguishable from human inventions? I understand you said 'developing' - i.e. trial and error - and then test what's produced on human ears. Correct?
not sure at this point in time whether it's the NSA or NASA but something is
screwin' w/ my computator. By the way, would someone please pass the ketchup....
you humans taste like &#+%!
michael diemer said:
I once vatched Bill Clinton play zhe zachzephone in Stalingrad. He had zhe tightest algorhythm I have ever gehoert since the marchink band of Oestereich durink zhe Great Weltskrieg.