October 2015 | code, music

My undergraduate research project compared harmonic similarity between songs and visualized the results using The Beatles’ songs. I presented it at the International Society for Music Information Retrieval (ISMIR) conference.

Full abstract

We show that traditional music information retrieval tasks, given well-chosen parameters, perform similarly on computationally extracted chord annotations and on ground-truth annotations. Using a collection of Billboard songs with provided ground-truth chord labels, we apply established chord identification algorithms to produce a corresponding dataset of extracted chord labels. We implement methods to compare the chord progressions of two songs on the basis of their optimal local alignment scores. We define a set of chord progression comparison parameters (chord distance metrics, gap costs, and normalization measures) and run a black-box global optimization algorithm that stochastically searches for the parameter set maximizing the rank correlation for two harmonic retrieval tasks across the ground-truth and extracted chord Billboard datasets. The first task evaluates chord progression similarity between all pairwise combinations of songs, separately ranks results for ground-truth and extracted chord labels, and returns a rank correlation coefficient. The second task queries the set of songs with fabricated chord progressions, ranks each query’s results across ground-truth and extracted chord labels, and returns rank correlations. The end results suggest that practical retrieval systems can be constructed to work effectively without the guidance of human ground-truthing.
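To make the local alignment idea concrete, here is a minimal Python sketch of scoring two chord progressions with a Smith-Waterman-style local alignment. It is an illustration only, not the project's code. The chord label format (`root:quality`), the `chord_score` values, the gap cost, and the normalization by the shorter progression are all assumptions picked for the example. In the project, parameters like these were what the optimizer searched over.

```python
MATCH = 2.0      # assumed score for identical chords
PARTIAL = 1.0    # assumed score for same root, different quality
MISMATCH = -1.0  # assumed penalty for unrelated chords
GAP = -1.0       # assumed cost of skipping a chord in either song


def chord_score(a, b):
    """Hypothetical chord distance metric on 'root:quality' labels."""
    if a == b:
        return MATCH
    if a.split(":")[0] == b.split(":")[0]:
        return PARTIAL
    return MISMATCH


def local_alignment_score(seq_a, seq_b, gap=GAP, score=chord_score):
    """Return the best local alignment score between two chord sequences."""
    prev = [0.0] * (len(seq_b) + 1)  # previous row of the DP table
    best = 0.0
    for a in seq_a:
        curr = [0.0]
        for j, b in enumerate(seq_b, start=1):
            cell = max(
                0.0,                      # start a fresh alignment here
                prev[j - 1] + score(a, b),  # align chord a with chord b
                prev[j] + gap,            # skip a chord in seq_a
                curr[j - 1] + gap,        # skip a chord in seq_b
            )
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def normalized_similarity(seq_a, seq_b):
    """Scale the score by the best possible score of the shorter sequence."""
    shortest = min(len(seq_a), len(seq_b))
    if shortest == 0:
        return 0.0
    return local_alignment_score(seq_a, seq_b) / (MATCH * shortest)


if __name__ == "__main__":
    song_a = ["C:maj", "G:maj", "A:min", "F:maj"]
    song_b = ["D:maj", "C:maj", "G:maj", "A:min", "F:maj", "E:min"]
    print(normalized_similarity(song_a, song_b))
```

Because the alignment is local, a shared four-chord loop scores highly even when the surrounding chords differ, which is the behavior you want when comparing songs that reuse a progression in only one section.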

A piano keyboard distorted in the shape of a circle with keys highlighted red at even intervals and note and interval labels in the middle.
A radial graph of red and blue curved lines connecting song titles.
A three-dimensional histogram with orange columns protruding high at opposite ends of a square plane. The x- and y-axes are labeled 'Extracted chord ranking' and 'Ground-truth ranking.' The z-axis is labeled 'Occurrence Frequency.'
A series of stacked colored boxes containing numbers. Lines connect subsequent stacks of boxes. The diagram shows how a sequence of numbers can be sorted, then sorted and ranked, and finally ranked.
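The ranking step and the rank correlation mentioned in the abstract can be sketched in a few lines of Python. The abstract does not say which rank correlation coefficient was used, so this example assumes Spearman's rho, with tied values sharing their average rank. The function names and the similarity scores are made up for illustration.

```python
def rank(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    rx, ry = rank(xs), rank(ys)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Made-up similarity scores for the same four song pairs.
    ground_truth_scores = [0.9, 0.4, 0.7, 0.1]
    extracted_scores = [0.8, 0.5, 0.6, 0.2]
    print(rank(ground_truth_scores))
    print(spearman(ground_truth_scores, extracted_scores))
```

A coefficient near 1 means the extracted chord labels order the song pairs almost the same way the ground-truth labels do, even if the raw similarity scores differ.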