Improved score-performance matching using both structural and temporal information from MIDI recordings (20/11/07)
In order to study score-based music performance, one has to determine the corresponding score note for every performance note, a process called score-performance matching. Since a typical performance may contain thousands of notes, researchers have developed algorithms that automate this procedure. Such algorithms are called matchers. Automated matching is a complex problem due to the use of expressive timing by performers and the presence of notes that are unspecified in the score, such as performance errors and ornaments. Automated matchers typically use performance data extracted from MIDI recordings. In the last two decades, several scholars, such as Puckette & Lippe (1992), Large (1993), and Heijink (1996), have developed such matchers. For the most part, these algorithms use structural information, such as pitch and chronological succession, but do not use timing information. As a result, most matchers cannot deal satisfactorily with ornamented performances or performances that exhibit extreme variations in tempo. In an attempt to solve these issues, the author developed a matcher that relies both on structural information and on a temporal representation of the performance, which is obtained by sequentially tracking local tempo changes on a note-by-note basis and mapping performance events to the corresponding score events. This allows the matcher to generate an accurate match even for heavily ornamented performances. Furthermore, this matcher can identify and categorize all common types of errors and ornaments. Most existing algorithms are designed to find a solution that maximizes the number of matched performance notes, regardless of the perceptual relevance of such an approach. In order to increase the music-theoretical and perceptual validity of its output, the proposed matcher instead favours solutions that preserve the structural and temporal coherence of the individual voices. 
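The combination of pitch information with note-by-note local tempo tracking described above can be illustrated with a minimal sketch. This is not the author's implementation: the function name, the tempo-smoothing scheme, and the parameters (`init_tempo`, `window`, `alpha`) are all hypothetical simplifications, and real matchers must additionally handle chords, voices, and richer error categories.

```python
# Illustrative sketch (hypothetical, not the author's algorithm):
# match performance notes to score notes by predicting each score
# note's performance time from a running local tempo estimate,
# then searching nearby performance notes for a pitch match.

def match_notes(score, performance, init_tempo=1.0, window=0.35, alpha=0.5):
    """score: list of (onset_in_beats, pitch); performance: list of
    (onset_in_seconds, pitch), both sorted by onset. Returns a list of
    (score_onset, performance_onset, pitch) triples."""
    tempo = init_tempo            # local tempo estimate, seconds per beat
    last_s_beat, last_p_time = 0.0, 0.0
    matches = []
    p_idx = 0                     # performance notes before this are consumed
    for s_onset, s_pitch in score:
        # Predict where this score note should fall in performance time.
        predicted = last_p_time + (s_onset - last_s_beat) * tempo
        # Look for the closest unconsumed performance note of the same
        # pitch within the temporal tolerance window.
        best = None
        for j in range(p_idx, len(performance)):
            p_time, p_pitch = performance[j]
            if p_time > predicted + window:
                break             # too late: stop scanning
            if p_pitch == s_pitch and abs(p_time - predicted) <= window:
                if best is None or (abs(p_time - predicted)
                                    < abs(performance[best][0] - predicted)):
                    best = j
        if best is not None:
            p_time = performance[best][0]
            matches.append((s_onset, p_time, s_pitch))
            # Update the local tempo from the realized inter-onset interval,
            # smoothing to damp expressive-timing noise.
            if s_onset > last_s_beat:
                new_tempo = (p_time - last_p_time) / (s_onset - last_s_beat)
                tempo = (1 - alpha) * tempo + alpha * new_tempo
            last_s_beat, last_p_time = s_onset, p_time
            p_idx = best + 1
        # Score notes with no candidate (omissions) are simply skipped here;
        # skipped performance notes (ornaments, errors) remain unmatched.
    return matches
```

Because the expected position of each score note moves with the running tempo estimate, extra performance notes such as ornaments fall outside the pitch-and-time criterion and are left unmatched rather than derailing the alignment, which is the benefit of temporal information that the abstract describes.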
A comparison with human-made score-performance matches produced by the author (a music theorist) for a corpus of 80 MIDI recordings of organ performances, used as ground-truth data for this purpose, shows near-perfect agreement between the matcher's solutions and the human matches. Finally, in contrast to existing matchers, which focus on piano performance, this matcher is designed to accommodate multi-channel MIDI recordings of performances on keyboard instruments with multiple manuals, such as the organ or harpsichord. It could thus potentially be used to study recordings of ensemble performances on MIDI instruments, making it a valuable tool for music performance research as well as a significant improvement over previous algorithms.