In this post we move a step up from the **Intervals Table** and we describe what we call the **Interval Matrix**.

The **Intervals Matrix** has been described in our London Information Retrieval meetup, in October 2019 (here the slides): it is an intermediate data structure used in the **Cover Song Detection** approach we are investigating in.

We already described in a previous post what an **interval** is and its central role in our approach; however, it’s worth to briefly repeat: it allows to think in terms of **“relativeness”** without caring about **absolute frequencies** or pitches.

## Brief recap: what is an Interval?

The music theory defines an interval as the **difference** in pitch between **two sounds**.

An interval can be **horizontal** (or **linear** or **melodic**), if it represents the distance between two **successively** sounding tones, and **vertical** or **harmonic** if it pertains to **simultaneously** sounding tones, such as in a chord.

In **Western music**, an interval represent the **distance** between **two notes** of a **diatonic scale**. The smallest of these intervals is a **semitone**.

The **diatonic scale** is a **musical scale** that has

**seven**pitches per octave**five**double semitones (a.k.a.**tones**) intervals**two**semitones intervals separated by**two**of**three**tones intervals

The **C-Major** scale is a perfect example of a diatonic scale (1/2 = semitone, 1 = Tone):

## Brief recap: The Intervals Table

The **Intervals Table** is a **matrix** which defines the **possible intervals** between each note of the **chromatic scale**:

C | C# | D | D# | E | F | F# | G | G# | A | A# | B | |

C | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |

C# | 11 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |

D | 10 | 11 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |

D# | 9 | 10 | 11 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |

E | 8 | 9 | 10 | 11 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |

F | 7 | 8 | 9 | 10 | 11 | 0 | 1 | 2 | 3 | 4 | 5 | 6 |

F# | 6 | 7 | 8 | 9 | 10 | 11 | 0 | 1 | 2 | 3 | 4 | 5 |

G | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 0 | 1 | 2 | 3 | 4 |

G# | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 0 | 1 | 2 | 3 |

A | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 0 | 1 | 2 |

A# | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 0 | 1 |

B | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 0 |

Although in **Music Theory** each interval is denoted using a name (e.g. 0 semitones => unison) the table above uses the **number of semitones** as a measure for indicating the distance.

Using the **Intervals Table** above we can say that:

- the distance between A and D# is 6 semitones
- the distance between B and B is 0 semitones (unison)

## The Intervals Matrix

Before introducing the central concept of this post, it’s definitely better to start from the input, the **chroma features / chroma matrix**, and then see the manipulation needed for getting an **intervals matrix**.

### Chroma Features

The **chroma features** is a **representation**, along an interval composed by subsequent instants {t0 – tn}, where for each instant the audio signal is decomposed using the **energy** associated to **12 classes** representing the **12 distinct semitones** (or **chroma**) of the musical octave used in Western music notation. Here’s a sample matrix:

where you can see **12 * t** (t=6 in this case, t6 is partially hidden) matrix where each **vector represents** the **twelve pitches strengths** for a **given instant t**.

The following is another example: the bass riff of “Zombie”, by Cranberries. On the upper staff you can see the symbolic notation of the riff, and then below there’s the corresponding chromagram; the coloured boxes highlight the compounding parts of the riff across the two different representations.

### From Chroma Matrix to Intervals Matrix

For **each Chroma Vector** we keep the **top k **most intense pitch energies and we replace them with their corresponding **ordinal position**:

We will end up having a **matrix of integers** the with the **same columns** of the original matrix but only **k** rows. Each vector contains a ranked list of the k stronger pitch classes, sorted in descending order. Note the **underlying assumption** is that in order to **represent** and **summarise** a chroma vector, the **stronger** a class energy is, the **better**.

Almost done: the **Interval Matrix** is obtained applying the **Intervals Table** as a function between cells (**ranks**) having the **same index** and belonging to **subsequent chroma vectors**.

In this way we compute a **distance between adjacent vectors** (x[n] – x[n-1]) and those measures compose an output matrix that has:

**the same number of rows**of the input matrix**m-1 columns**: where**m**is the number of the columns of the input matrix

## The implementation

As usual, let’s first define the behaviour we want to achieve:

The **Intervals Matrix**

- should use the
**IntervalsTable**in order to**compute the distance vectors** - should be a
**zero/null matrix**if there’s**no distance**between the input vectors - should throw
**an exception**in case the input**k parameter**is**greater than 12** - should throw
**an exception**in case the input consists of an**empty chroma matrix** - should
**filter****out**all vectors having a cardinality**smaller****than**the input**k parameter** - given in input an
**m * n****chroma matrix**should have a dimension equals to**(m – 1) * n** - should return an interval vector, whose size is k, for each subsequent pair of chroma vectors in the input matrix

We are going to use the same BDD approach for expressing those requirements as code (full code is available here):

```
class IntervalsMatrixSpecs extends FlatSpec {
"The Intervals Matrix" should "be a zero/null matrix if there's no distance between chroma vectors" in {
val table = new IntervalsTable
...
}
it should "throw an exception in case the input k parameter is greater than 12" in {
val table = new IntervalsTable
val inputMatrix = chromaMatrix(31)
assertThrows[IllegalArgumentException] {
IntervalsMatrix(inputMatrix, 13, table)
}
assertThrows[IllegalArgumentException] {
IntervalsMatrix(inputMatrix, 15, table)
}
}
...
}
```

And finally, the **IntervalsMatrix** is also available as a gist. As you can read, it is a basic wrapper around a matrix of integers. The constructor takes a chroma matrix and creates the internal data structure which captures the ranked intervals.

Great! Our test suite is growing:

*to be continued in the next episode…*

Andrea Gazzarini is a curious software engineer, mainly focused on the Java language and Search technologies.

With more than 15 years of experience in various software engineering areas, his adventure with the search domain began in 2010, when he met Apache Solr and later Elasticsearch.