Figure Recognition
Separating musical expression from harmony
Recorded or imported static MIDI notes need to be translated into dynamic Phrases in order to be useful for Music Prototyping. To accomplish this, musical expression is separated from harmony and the resulting phrase is annotated with hints, so that a subsequent reversal of this process can be as accurate as possible. Every step in this process is ambiguous, because the information it requires is not contained in the MIDI data and has to be guessed from context.
- Estimate keys, chords and scales underlying a MIDI take.
- Estimate the exact positions of chord changes.
- Remove inaccuracies without sacrificing expression.
- Identify separate voices.
- Guess a useful grouping of melodic fragments, chords, bass lines.
- Add hints where a phrase might encounter difficulties when it is rendered against different harmony.
- Clean up and optimize the result.
For the average recorded take, there are millions, sometimes billions, of possible solutions to figure recognition. Only a few hundred of them are musically plausible. These need to be assessed and sorted out. To accomplish this, Synfire draws on an extensive knowledge base supported by artificial intelligence algorithms.
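To give a feel for why candidates have to be assessed and compared, here is a minimal Python sketch of the first step only (estimating a chord). It is not Synfire's algorithm. It assumes a deliberately naive scoring rule (the fraction of played notes that are chord tones) and a hypothetical template table with just major and minor triads; the names `CHORD_TEMPLATES` and `rank_chords` are made up for this illustration.

```python
from collections import Counter

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Hypothetical templates: semitone intervals above the root (assumption).
CHORD_TEMPLATES = {"maj": (0, 4, 7), "min": (0, 3, 7)}


def rank_chords(midi_pitches):
    """Return (score, chord name) pairs, best candidates first."""
    if not midi_pitches:
        return []
    # Count how often each pitch class (0-11) occurs in the take.
    weights = Counter(p % 12 for p in midi_pitches)
    total = sum(weights.values())
    candidates = []
    for root in range(12):
        for quality, intervals in CHORD_TEMPLATES.items():
            chord_pcs = {(root + i) % 12 for i in intervals}
            inside = sum(w for pc, w in weights.items() if pc in chord_pcs)
            # Naive score: share of notes explained as chord tones.
            candidates.append((inside / total, NOTE_NAMES[root] + quality))
    # Highest score first; ties are ordered by name to stay deterministic.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return candidates


if __name__ == "__main__":
    # Two notes, C and E: both C major and A minor explain them fully.
    for score, name in rank_chords([60, 64])[:3]:
        print(f"{name}: {score:.2f}")
```

Even this toy scorer cannot decide between C major and A minor for the two notes C and E. Resolving such ties requires context (surrounding notes, key, bass line), which is where the guessing described above comes in.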
Limitations
Due to the ambiguity and guessing involved, figure recognition cannot be exactly reversible. That is, a Figure rendered against its own estimated Harmony will most likely produce MIDI notes that are slightly different from the original.
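The round trip can be illustrated with a small Python sketch. It assumes a much simpler representation than a real Figure: each note is stored only as an index into the "ladder" of pitches that belong to a scale. The scale tables and the functions `to_steps` and `render` are hypothetical names introduced for this example, not part of Synfire.

```python
# Pitch classes of two scales (assumption: harmony reduced to one scale).
C_MAJOR = (0, 2, 4, 5, 7, 9, 11)
C_MINOR = (0, 2, 3, 5, 7, 8, 10)


def scale_pitches(scale):
    """All MIDI pitches (0-127) that belong to the given scale."""
    return [p for p in range(128) if p % 12 in scale]


def to_steps(pitches, scale):
    """Replace each pitch by the index of the nearest scale pitch.

    Ties between two equally near scale pitches go to the lower one.
    """
    ladder = scale_pitches(scale)
    return [
        min(range(len(ladder)), key=lambda i: (abs(ladder[i] - p), ladder[i]))
        for p in pitches
    ]


def render(steps, scale):
    """Turn scale-step indices back into MIDI pitches."""
    ladder = scale_pitches(scale)
    return [ladder[i] for i in steps]


if __name__ == "__main__":
    original = [60, 61, 64, 67]  # C, C#, E, G
    steps = to_steps(original, C_MAJOR)
    print(render(steps, C_MAJOR))  # same harmony
    print(render(steps, C_MINOR))  # different harmony
```

Under these assumptions, the chromatic C# (61) is not part of C major and is stored as the step for C, so rendering against the same scale returns 60 instead of 61. Rendering the same steps against C minor turns the E into an E flat, which is the intended benefit of the separation. The price is that details not explained by the estimated harmony do not survive the round trip.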
This is no fault of the algorithm, but a general limitation when music is approached with mathematics. Music is not an exact science, but an artifact of human culture. There is only so much software can do to formally represent and process all aspects of music.
Figure recognition is still good enough to be useful more than 80% of the time. It is a fantastic tool for collecting reusable phrases from recorded performances. After all, Music Prototyping is about making new original music, not faithfully recreating existing compositions.