4. Clip spectral analysis

A Clip is used to work with a soundfile within the context of maelzel.core. It can be converted to and from a maelzel.snd.audiosample.Sample and can be subjected to different analysis strategies, both in the time domain (rms, autocorrelation, silence detection, onsets, etc.) and in the frequency domain (fundamental analysis and transcription, spectral analysis, etc.).

One of the most common use cases is to determine the most prominent spectral content of a sound at a given time. When analyzing a sound, particularly an inharmonic one (a bell, a multiphonic, a low piano note), it can be interesting to analyze its overtones. Extracting the overtones of a voice can give important information about its formant structure.

This can be done with the method chordAt, which analyzes a fragment of the clip and extracts the most prominent frequencies and their corresponding amplitudes.
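As a rough illustration of what "most prominent frequencies and their amplitudes" means, the sketch below runs a naive DFT over a short synthetic fragment and keeps the loudest spectral peaks. It is not how chordAt is implemented; the function names `dft_magnitudes` and `prominent_peaks` and the `minamp` floor are assumptions made for this example only.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Amplitude estimate for bins 0..N/2 using a naive O(N^2) DFT."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        # Scaling so that a sinusoid of amplitude A centered on a bin reads A
        mags.append(2 * abs(acc) / n)
    return mags

def prominent_peaks(samples, sr, maxcount=8, minamp=1e-6):
    """Return up to maxcount (freq, amp) pairs for the loudest local maxima."""
    mags = dft_magnitudes(samples)
    binwidth = sr / len(samples)
    peaks = [(k * binwidth, mags[k])
             for k in range(1, len(mags) - 1)
             if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]
             and mags[k] >= minamp]
    loudest = sorted(peaks, key=lambda p: p[1], reverse=True)[:maxcount]
    return sorted(loudest)  # ordered by frequency, like the notes of a chord

# Synthetic fragment: two sinusoids that fall exactly on DFT bins
sr, n = 8000, 256
fragment = [0.5 * math.sin(2 * math.pi * 250 * i / sr)
            + 0.25 * math.sin(2 * math.pi * 1000 * i / sr)
            for i in range(n)]
peaks = prominent_peaks(fragment, sr, maxcount=2)
```

A real analysis would use a windowed FFT and interpolate between bins; the sketch only shows the peak-picking idea.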

[2]:
from maelzel.core import *
from pitchtools import *
from maelzel.snd.audiosample import Sample
from maelzel.snd import amplitudesensitivity
import numpy as np
import os

The pitch attribute is chosen arbitrarily and is only used for notation; it has no effect on playback.

[3]:
cl = Clip(os.path.abspath("snd/colours-german-male.flac"), pitch="4E")
cl
[3]:
Clip(source=/home/em/dev/python/maelzel/docs/notebooks/snd/colours-german-male.flac, numChannels=1, sr=44100, dur=6047665397573105/562949953421312, sourcedursecs=10.743secs)

A Clip can be converted to a Sample.

[4]:
cl.asSample()
[4]:
Sample(duration=10.7, sr=44100, numchannels=1)

When displaying chords with many notes it is important to customize some settings. In particular, displaying cents deviations for every note in a chord can be visually distracting. To make rendering faster we also disable enharmonic respelling.

[5]:
cfg = CoreConfig()
cfg['show.respellPitches'] = False
cfg['show.centsDeviationAsTextAnnotation'] = False
cfg['chordAdjustGain'] = False
cfg['show.voiceMaxStaves'] = 3
cfg.activate()

4.1. Chord sequence based on overtones

The soundfile is analyzed 16 times per second (see dt). Only components louder than -55 dB are taken into consideration. The number of components is further limited by the frequency range (minfreq, maxfreq). From those components only the 8 loudest are selected and converted to a Chord. If no chord is returned for a given time, a Rest of the same duration is used instead.
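The selection steps described above can be sketched on a plain list of (frequency, amplitude) pairs. This is only a model of the filtering logic under stated assumptions: `select_partials` and `amp2db` are hypothetical helpers, and the actual behaviour of chordAt may differ in detail.

```python
import math

def amp2db(amp):
    """Convert a linear amplitude to decibels."""
    return 20 * math.log10(amp)

def select_partials(partials, mindb=-55.0, minfreq=40.0,
                    maxfreq=12000.0, maxcount=8):
    """Keep partials above mindb and inside the frequency range,
    then return the maxcount loudest, ordered by frequency."""
    candidates = [(f, a) for f, a in partials
                  if a > 0 and amp2db(a) >= mindb and minfreq <= f <= maxfreq]
    loudest = sorted(candidates, key=lambda p: p[1], reverse=True)[:maxcount]
    return sorted(loudest)

# Made-up analysis frame: (frequency in Hz, linear amplitude)
frame = [(30.0, 0.5), (110.0, 0.2), (220.0, 0.001), (330.0, 0.05), (15000.0, 0.3)]
selected = select_partials(frame, maxcount=8)
# An empty result corresponds to the case where the notebook inserts a Rest
silent = select_partials([(220.0, 0.0001)])
```

Here the 30 Hz and 15000 Hz partials fall outside the frequency range and the 220 Hz partial (amplitude 0.001, i.e. -60 dB) is below the threshold, so only two partials survive.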

[6]:
%timeit cl.asSample().partialTrackingAnalysis(50)
560 ms ± 57.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
[7]:
dt = 1/16
times = np.arange(0, cl.durSecs(), dt)
items = [cl.chordAt(t, mindb=-55, dur=dt, maxcount=8, ampfactor=10, maxfreq=m2f(126), minfreq=40) or Rest(dt) for t in times]
chain = Chain(items)
chain.show()

../_images/notebooks_clip-chords_11_0.png
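The analysis grid built with np.arange can be checked by hand. The sketch below reproduces it in plain Python for the source duration reported above (10.743 s); `analysis_times` is a hypothetical helper, not part of maelzel.

```python
import math

def analysis_times(duration, dt):
    """Start times 0, dt, 2*dt, ... strictly below duration
    (the same grid np.arange(0, duration, dt) would produce)."""
    count = math.ceil(duration / dt)
    return [i * dt for i in range(count)]

duration = 10.743  # seconds, as reported for the clip

times16 = analysis_times(duration, 1 / 16)
total16 = len(times16) * (1 / 16)  # each frame becomes an event of length dt

times8 = analysis_times(duration, 1 / 8)
total8 = len(times8) * (1 / 8)
```

With dt = 1/16 this gives 172 frames covering 10.75 time units, slightly longer than the source because the last frame extends past the end of the file. With dt = 1/8 there are 86 frames covering the same 10.75.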
[8]:
from maelzel.core import _plot
[9]:
v = chain.asVoice()
[10]:
_plot.plotVoices([v])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[10], line 1
----> 1 _plot.plotVoices([v])

File ~/dev/python/maelzel/maelzel/core/_plot.py:258, in plotVoices(voices, axes, realtime, colors, eventHeadAlpha, eventLineAlpha, eventStartAlpha, eventStartLuminosityFactor, eventHeight, eventLineHeight, maxwidth, accidentalColor, accidentalScale, linkedEventDesaturation, accidentalSize, accidentalShift, eventStartLineWidth, accidentalFixedScale, staffLineWidth, drawHeadForTiedEvents, ledgerLineColor, barlines, barlineColor, barlineWidth, barlineAcrossAllStaffs, scorestruct, timeSignatures, chordLink, setLimits)
    251         # chordHead = Rectangle((x0, minpos - eventHeight / 2),
    252         #                       width=min(float(event.dur), maxwidth / 4),
    253         #                       height=maxpos - minpos + eventHeight,
    254         #                       color=_eventcolor, edgecolor=None, linewidth=0)
    255         # axes.add_patch(chordHead)
    257 for pitch, target in zip(pitches, targets):
--> 258     npitch = pt.notated_pitch(pitch)
    259     cleffpos = _verticalPosToCleffPos(npitch.vertical_position)
    261     yoffsetFactor = npitch.diatonic_alteration * 0.5

File ~/.virtualenvs/maelzel/lib/python3.11/site-packages/pitchtools/__init__.py:1781, in notated_pitch(pitch, semitone_divisions)
   1779 if isinstance(pitch, (int, float)):
   1780     pitch = _roundres(pitch, 1/semitone_divisions)
-> 1781 return _notated_pitch_notename(pitch)

File ~/.virtualenvs/maelzel/lib/python3.11/site-packages/pitchtools/__init__.py:1786, in _notated_pitch_notename(notename)
   1784 @_cache
   1785 def _notated_pitch_notename(notename: str) -> NotatedPitch:
-> 1786     parts = split_notename(notename)
   1787     diatonic_index = 'CDEFGABC'.index(parts.diatonic_name)
   1788     chromatic_note = parts.diatonic_name + parts.alteration

File ~/.virtualenvs/maelzel/lib/python3.11/site-packages/pitchtools/__init__.py:1260, in split_notename(notename, default_octave)
   1258     cents = _parse_centstr(deviation)
   1259     if cents is None:
-> 1260         raise ValueError(f"Could not parse cents '{deviation}' while parsing note '{notename}'")
   1261 else:
   1262     # 4C#-10
   1263     octave = int(notename[0])

ValueError: Could not parse cents 'est' while parsing note 'Rest'
../_images/notebooks_clip-chords_14_1.png

Synthesizing the chords with a sine tone results in a quite understandable, if ‘lo-fi’, rendition.

[8]:
chain.rec(gain=0.2, instr='sin', fade=(0.05, 0.05), sustain=0.05, position=0.5)
[8]:
OfflineRenderer(outfile="/home/em/.local/share/maelzel/recordings/rec-2023-11-28T13:17:10.644.wav", 2 channels, 10.35 secs, 44100 Hz)

The same sequence, rendered with a piano as the instrument.

[41]:
chain.rec(gain=0.2, instr='piano', fade=(0.01, 0.1), sustain=0.2)
[41]:
OfflineRenderer(outfile="/home/em/.local/share/maelzel/recordings/rec-2023-03-21T11:17:00.291.wav", 2 channels, 10.51 secs, 44100 Hz)

A clearer result can be achieved by applying an inverse A-curve amplitude compensation. This makes the sound less saturated and more distinct.

[73]:
acurve = amplitudesensitivity.AmpcompA()
items2 = [item.copy() for item in items]
for item in items2:
    if isinstance(item, Chord):
        for n in item.notes:
            n.amp *= 1 - acurve.level(n.freq)
            n.amp *= 2

chainA = Chain(items2)
chainA.rec(gain=0.2, instr='piano', fade=(0.01, 0.1), sustain=0.2)
[73]:
OfflineRenderer(outfile="/home/em/.local/share/maelzel/recordings/rec-2023-03-21T11:34:14.942.wav", 2 channels, 10.51 secs, 44100 Hz)
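To give an idea of what an A-curve looks like, the sketch below evaluates the standard A-weighting formula and derives a gain that boosts the frequencies the curve attenuates. It is not the AmpcompA implementation used above: `a_weighting_db` follows the usual textbook formula, while `inverse_a_gain` is a hypothetical mapping chosen only for illustration.

```python
import math

def a_weighting_db(freq):
    """Standard A-weighting in dB (approximately 0 dB at 1 kHz)."""
    f2 = freq * freq
    num = (12194.0 ** 2) * f2 * f2
    den = ((f2 + 20.6 ** 2)
           * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
           * (f2 + 12194.0 ** 2))
    return 20 * math.log10(num / den) + 2.0

def inverse_a_gain(freq):
    """Hypothetical compensation: boost by the amount A-weighting attenuates."""
    return 10 ** (-a_weighting_db(freq) / 20)

# Low partials get a larger gain than partials near 1 kHz
gains = {freq: inverse_a_gain(freq) for freq in (100.0, 440.0, 1000.0)}
```

The curve is close to 0 dB at 1 kHz and about -19 dB at 100 Hz, so an inverse compensation raises low partials relative to those in the mid range.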

To validate the analysis we can play the generated chords alongside the original soundfile.

[74]:
with render() as r:
    chainA.play(gain=0.5, instr='piano', fade=(0.01, 0.1), sustain=0.2, position=0.75)
    cl.play(position=0.25, gain=0.5, delay=0.05)
r
[74]:
OfflineRenderer(outfile="/home/em/.local/share/maelzel/recordings/rec-2023-03-21T11:34:31.070.wav", 2 channels, 10.81 secs, 44100 Hz)

4.1.1. Chromatic version

It is possible to make a version quantized to the nearest semitone. When listening to the quantized version, notice how the missing glissandi of the voice make the result sound much further from the original.

[75]:
chain2 = chain.quantizePitch(step=1)
chain2
[75]:
Chain([Rest:0.062♩, Rest:0.062♩, Rest:0.062♩, Rest:0.062♩, Rest:0.062♩, Rest:0.062♩, ‹2A 4Bb 5Db 5F 5G 0.0625♩›, ‹2Ab 4Bb 5F 0.0625♩›, ‹2Bb 3Bb 4F 4Bb 5D 5F 5G 5Bb 0.0625♩›, ‹2Bb 3Bb 4F 4Bb 5D 5F 5Ab 5Bb 0.0625♩›, …], dur=10.75)

[76]:
chain2.rec(gain=0.2, instr='piano', fade=(0.01, 0.1), sustain=0.2)
[76]:
OfflineRenderer(outfile="/home/em/.local/share/maelzel/recordings/rec-2023-03-21T11:34:51.088.wav", 2 channels, 10.51 secs, 44100 Hz)
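Pitch quantization itself is simple arithmetic on midinote values. The sketch below shows the idea with stand-in conversion functions. The names mirror the `f2m` and `m2f` helpers from pitchtools, but these are local re-implementations with an assumed reference of A4 = 440 Hz (the library's configured reference may differ), and `quantize_midi` is a hypothetical helper rather than the code behind quantizePitch.

```python
import math

def f2m(freq, a4=440.0):
    """Frequency in Hz to (fractional) midinote. a4 is an assumed reference."""
    return 69 + 12 * math.log2(freq / a4)

def m2f(midinote, a4=440.0):
    """Midinote to frequency in Hz."""
    return a4 * 2 ** ((midinote - 69) / 12)

def quantize_midi(midinote, step=1.0):
    """Snap a midinote to the nearest multiple of step
    (1 = semitones, 0.5 = quarter tones)."""
    return round(midinote / step) * step

partial = 452.0                       # Hz, somewhat above A4
exact = f2m(partial)                  # fractional midinote
semitone = quantize_midi(exact, 1.0)  # nearest semitone
quarter = quantize_midi(exact, 0.5)   # nearest quarter tone
```

A 452 Hz partial lies about 47 cents above A4: with step=1 it snaps down to midinote 69, with step=0.5 it snaps up to 69.5. The discarded deviation is what removes the glissandi of the voice.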

It is also possible to modify the time resolution to produce other kinds of pixelation. In this case, reducing the analysis to 8 times per second makes the rendition hardly recognizable.

[77]:
dt = 1/8
times = np.arange(0, cl.durSecs(), dt)
chords = [cl.chordAt(t, mindb=-55, dur=dt, maxcount=8, ampfactor=10, maxfreq=m2f(126), minfreq=40) or Rest(dt) for t in times]
chain3 = Chain(chords)
chain3 = chain3.quantizePitch(step=0.5)
chain3.show()
../_images/notebooks_clip-chords_27_0.png

Just for the sake of variation, we can try rendering with an accordion soundfont.

[85]:
chain3.rec(gain=0.2, instr='accordion', fade=(0.01, 0.3), sustain=0.3)
[85]:
OfflineRenderer(outfile="/home/em/.local/share/maelzel/recordings/rec-2023-03-21T11:37:05.643.wav", 2 channels, 10.68 secs, 44100 Hz)
