In this post we move a step up from the Intervals Table and we describe what we call the Interval Matrix.

The Intervals Matrix has been described in our London Information Retrieval meetup, in October 2019 (here the slides): it is an intermediate data structure used in the Cover Song Detection approach we are investigating in.

We already described in a previous post what an interval is and its central role in our approach; however, it’s worth to briefly repeat: it allows to think in terms of “relativeness” without caring about absolute frequencies or pitches.

Brief recap: what is an Interval?

The music theory defines an interval as the difference in pitch between two sounds.

An interval can be horizontal (or linear or melodic), if it represents the distance between two successively sounding tones, and vertical or harmonic if it pertains to simultaneously sounding tones, such as in a chord.

Horizontal and Vertical Intervals

In Western music, an interval represent the distance between two notes of a diatonic scale. The smallest of these intervals is a semitone.

The diatonic scale is a musical scale that has

  • seven pitches per octave
  • five double semitones (a.k.a. tones) intervals
  • two semitones intervals separated by two of three tones intervals

The C-Major scale is a perfect example of a diatonic scale (1/2 = semitone, 1 = Tone):

The Diatonic Scale

Brief recap: The Intervals Table

The Intervals Table is a matrix which defines the possible intervals between each note of the chromatic scale:

CC#DD#EFF#GG#AA#B
C01234567891011
C#11012345678910
D10110123456789
D#91011012345678
E89101101234567
F78910110123456
F#67891011012345
G56789101101234
G#45678910110123
A34567891011012
A#23456789101101
B12345678910110

Although in Music Theory each interval is denoted using a name (e.g. 0 semitones => unison) the table above uses the number of semitones as a measure for indicating the distance.

Using the Intervals Table above we can say that:

  • the distance between A and D# is 6 semitones
  • the distance between B and B is 0 semitones (unison)

The Intervals Matrix

Before introducing the central concept of this post, it’s definitely better to start from the input, the chroma features / chroma matrix, and then see the manipulation needed for getting an intervals matrix.

Chroma Features

The chroma features is a representation, along an interval composed by subsequent instants {t0 – tn}, where for each instant the audio signal is decomposed using the energy associated to 12 classes representing the 12 distinct semitones (or chroma) of the musical octave used in Western music notation. Here’s a sample matrix:

where you can see 12 * t (t=6 in this case, t6 is partially hidden) matrix where each vector represents the twelve pitches strengths for a given instant t.

The following is another example: the bass riff of “Zombie”, by Cranberries. On the upper staff you can see the symbolic notation of the riff, and then below there’s the corresponding chromagram; the coloured boxes highlight the compounding parts of the riff across the two different representations.

From Chroma Matrix to Intervals Matrix

For each Chroma Vector we keep the top k most intense pitch energies and we replace them with their corresponding ordinal position:

We will end up having a matrix of integers the with the same columns of the original matrix but only k rows. Each vector contains a ranked list of the k stronger pitch classes, sorted in descending order. Note the underlying assumption is that in order to represent and summarise a chroma vector, the stronger a class energy is, the better.

Almost done: the Interval Matrix is obtained applying the Intervals Table as a function between cells (ranks) having the same index and belonging to subsequent chroma vectors.

In this way we compute a distance between adjacent vectors (x[n] – x[n-1]) and those measures compose an output matrix that has:

  • the same number of rows of the input matrix
  • m-1 columns: where m is the number of the columns of the input matrix

The implementation

As usual, let’s first define the behaviour we want to achieve:

The Intervals Matrix

  • should use the IntervalsTable in order to compute the distance vectors
  • should be a zero/null matrix if there’s no distance between the input vectors
  • should throw an exception in case the input k parameter is greater than 12
  • should throw an exception in case the input consists of an empty chroma matrix
  • should filter out all vectors having a cardinality smaller than the input k parameter
  • given in input an m * n chroma matrix should have a dimension equals to (m – 1) * n
  • should return an interval vector, whose size is k, for each subsequent pair of chroma vectors in the input matrix

We are going to use the same BDD approach for expressing those requirements as code (full code is available here):

class IntervalsMatrixSpecs extends FlatSpec {

  "The Intervals Matrix" should "be a zero/null matrix if there's no distance between chroma vectors" in {
    val table = new IntervalsTable

    ...
  }

  it should "throw an exception in case the input k parameter is greater than 12" in {
    val table = new IntervalsTable
    val inputMatrix = chromaMatrix(31)

    assertThrows[IllegalArgumentException] {
      IntervalsMatrix(inputMatrix, 13, table)
    }

    assertThrows[IllegalArgumentException] {
      IntervalsMatrix(inputMatrix, 15, table)
    }
  }

  ...
}

And finally, the IntervalsMatrix is also available as a gist. As you can read, it is a basic wrapper around a matrix of integers. The constructor takes a chroma matrix and creates the internal data structure which captures the ranked intervals.

Great! Our test suite is growing:

to be continued in the next episode…

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.