4. Clip spectral analysis
A Clip is used to work with a soundfile within the context of maelzel.core. It can be converted to and from a maelzel.snd.audiosample.Sample and can be subjected to different analysis strategies, both in the time domain (RMS, autocorrelation, silence detection, onsets, etc.) and the frequency domain (fundamental analysis and transcription, spectral analysis, etc.).
One of the most common use cases is to determine the most prominent spectral contents of a sound at a given time. When analyzing a sound, particularly an inharmonic one (a bell, a multiphonic, a low piano note), it can be useful to analyze its overtones. Extracting the overtones of a voice can give important information about its formant structure.
This can be performed with the method chordAt, which analyzes a fragment of the clip and extracts the most prominent frequencies and their corresponding amplitudes
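To make the idea of "most prominent frequencies" concrete, here is a minimal sketch that uses only NumPy. It is not how chordAt is implemented: the function `prominent_peaks` and the synthetic test signal are hypothetical, and simple FFT peak picking stands in for whatever analysis maelzel performs internally.

```python
import numpy as np

def prominent_peaks(samples, sr, maxcount=4):
    """Return (freq, amp) pairs for the strongest local maxima of the spectrum.

    Illustrative helper only, not part of maelzel.
    """
    window = np.hanning(len(samples))
    spectrum = np.abs(np.fft.rfft(samples * window))
    # Normalize so that a full-scale sine yields an amplitude close to 1
    spectrum *= 2 / window.sum()
    freqs = np.fft.rfftfreq(len(samples), 1 / sr)
    # A bin is a peak if it is larger than both of its neighbours
    inner = spectrum[1:-1]
    ispeak = (inner > spectrum[:-2]) & (inner > spectrum[2:])
    idx = np.nonzero(ispeak)[0] + 1
    # Keep the loudest peaks, then sort them by frequency
    loudest = idx[np.argsort(spectrum[idx])[::-1][:maxcount]]
    return sorted((float(freqs[i]), float(spectrum[i])) for i in loudest)

sr = 44100
t = np.arange(4096) / sr
# Synthetic inharmonic test tone: three partials with decreasing amplitude
signal = (0.5 * np.sin(2 * np.pi * 220 * t)
          + 0.3 * np.sin(2 * np.pi * 587 * t)
          + 0.1 * np.sin(2 * np.pi * 1310 * t))
peaks = prominent_peaks(signal, sr, maxcount=3)
```

With a 4096-sample fragment at 44100 Hz the bin spacing is about 10.8 Hz, so the detected frequencies only approximate the true partials. A longer fragment or peak interpolation would refine them.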
[2]:
from maelzel.core import *
from pitchtools import *
from maelzel.snd.audiosample import Sample
from maelzel.snd import amplitudesensitivity as ampcomp
import numpy as np
import os
The pitch attribute is chosen arbitrarily and is only used for notation; it has no implications for playback
[3]:
cl = Clip(os.path.abspath("snd/colours-german-male.flac"), pitch="4E")
cl
[3]:
Clip(source=/home/em/dev/python/maelzel/docs/notebooks/snd/colours-german-male.flac, numChannels=1, sr=44100, dur=6047665397573105/562949953421312, sourcedursecs=10.743secs)
A Clip can be converted to a Sample
[4]:
cl.asSample()
[4]:
[Sample display: 10.7, sr=44100, numchannels=1]
When displaying such big chords it is important to customize some settings. In particular, displaying cents deviations for all notes in a chord can be visually distracting. To make rendering faster we also disable enharmonic respelling
[5]:
cfg = CoreConfig()
cfg['show.respellPitches'] = False
cfg['show.centsDeviationAsTextAnnotation'] = False
cfg['chordAdjustGain'] = False
cfg['show.voiceMaxStaves'] = 3
cfg.activate()
4.1. Chord sequence based on overtones
The soundfile is analyzed 16 times per second (see dt). Only components louder than -55 dB are taken into consideration. The number of components is further limited by the frequency range. From those components only the 8 loudest are selected and converted to a Chord
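The selection described above (loudness threshold, frequency range, then the loudest few) can be sketched in plain Python. This is an illustration under assumptions, not the code behind chordAt: `amp_to_db`, `midi_to_freq` and `select_components` are hypothetical helpers, and the A4 = 440 Hz reference is an assumption that may differ from the reference used by pitchtools.

```python
import math

def amp_to_db(amp):
    """Convert a linear amplitude to decibels (hypothetical helper)."""
    return 20 * math.log10(max(amp, 1e-12))

def midi_to_freq(midinote, a4=440.0):
    """Convert a MIDI note number to Hz, assuming A4 = 440 Hz."""
    return a4 * 2 ** ((midinote - 69) / 12)

def select_components(partials, mindb=-55, minfreq=40,
                      maxfreq=midi_to_freq(126), maxcount=8):
    """Filter (freq, amp) pairs the way the paragraph above describes."""
    # 1. Drop components that are too quiet or outside the frequency range
    audible = [(f, a) for f, a in partials
               if amp_to_db(a) >= mindb and minfreq <= f <= maxfreq]
    # 2. Keep only the loudest `maxcount` components
    loudest = sorted(audible, key=lambda p: p[1], reverse=True)[:maxcount]
    # 3. Return them ordered by frequency, like the notes of a chord
    return sorted(loudest)

partials = [(30, 0.5), (110, 0.2), (220, 0.1), (330, 0.001), (440, 0.05)]
selected = select_components(partials, maxcount=2)
```

Here the 30 Hz component is rejected by the frequency range and the 330 Hz component (about -60 dB) by the loudness threshold, so only the two loudest of the remaining three survive.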
[6]:
%timeit cl.asSample().partialTrackingAnalysis(50)
560 ms ± 57.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
[7]:
dt = 1/16
times = np.arange(0, cl.durSecs(), dt)
items = [cl.chordAt(t, mindb=-55, dur=dt, maxcount=8, ampfactor=10, maxfreq=m2f(126), minfreq=40) or Rest(dt) for t in times]
chain = Chain(items)
chain.show()
[8]:
from maelzel.core import _plot
[9]:
v = chain.asVoice()
[10]:
_plot.plotVoices([v])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[10], line 1
----> 1 _plot.plotVoices([v])
File ~/dev/python/maelzel/maelzel/core/_plot.py:258, in plotVoices(voices, axes, realtime, colors, eventHeadAlpha, eventLineAlpha, eventStartAlpha, eventStartLuminosityFactor, eventHeight, eventLineHeight, maxwidth, accidentalColor, accidentalScale, linkedEventDesaturation, accidentalSize, accidentalShift, eventStartLineWidth, accidentalFixedScale, staffLineWidth, drawHeadForTiedEvents, ledgerLineColor, barlines, barlineColor, barlineWidth, barlineAcrossAllStaffs, scorestruct, timeSignatures, chordLink, setLimits)
251 # chordHead = Rectangle((x0, minpos - eventHeight / 2),
252 # width=min(float(event.dur), maxwidth / 4),
253 # height=maxpos - minpos + eventHeight,
254 # color=_eventcolor, edgecolor=None, linewidth=0)
255 # axes.add_patch(chordHead)
257 for pitch, target in zip(pitches, targets):
--> 258 npitch = pt.notated_pitch(pitch)
259 cleffpos = _verticalPosToCleffPos(npitch.vertical_position)
261 yoffsetFactor = npitch.diatonic_alteration * 0.5
File ~/.virtualenvs/maelzel/lib/python3.11/site-packages/pitchtools/__init__.py:1781, in notated_pitch(pitch, semitone_divisions)
1779 if isinstance(pitch, (int, float)):
1780 pitch = _roundres(pitch, 1/semitone_divisions)
-> 1781 return _notated_pitch_notename(pitch)
File ~/.virtualenvs/maelzel/lib/python3.11/site-packages/pitchtools/__init__.py:1786, in _notated_pitch_notename(notename)
1784 @_cache
1785 def _notated_pitch_notename(notename: str) -> NotatedPitch:
-> 1786 parts = split_notename(notename)
1787 diatonic_index = 'CDEFGABC'.index(parts.diatonic_name)
1788 chromatic_note = parts.diatonic_name + parts.alteration
File ~/.virtualenvs/maelzel/lib/python3.11/site-packages/pitchtools/__init__.py:1260, in split_notename(notename, default_octave)
1258 cents = _parse_centstr(deviation)
1259 if cents is None:
-> 1260 raise ValueError(f"Could not parse cents '{deviation}' while parsing note '{notename}'")
1261 else:
1262 # 4C#-10
1263 octave = int(notename[0])
ValueError: Could not parse cents 'est' while parsing note 'Rest'
Synthesizing the chords with a sine tone results in a quite understandable, if “lo-fi”, rendition
[8]:
chain.rec(gain=0.2, instr='sin', fade=(0.05, 0.05), sustain=0.05, position=0.5)
[8]:
"/home/em/.local/share/maelzel/recordings/rec-2023-11-28T13:17:10.644.wav", 2 channels, 10.35 secs, 44100 Hz
The same sequence, but rendered with a piano as instrument
[41]:
chain.rec(gain=0.2, instr='piano', fade=(0.01, 0.1), sustain=0.2)
[41]:
"/home/em/.local/share/maelzel/recordings/rec-2023-03-21T11:17:00.291.wav", 2 channels, 10.51 secs, 44100 Hz
A clearer result can be achieved by applying an inverse A-curve amplitude compensation. This makes the sound less saturated and more distinct
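For readers unfamiliar with the A-curve, the sketch below computes the standard analytic A-weighting gain and one possible inverse compensation derived from it. It is only an illustration: `a_weighting_db` and `inverse_a_gain` are hypothetical helpers, the 24 dB cap is an arbitrary assumption, and none of this claims to reproduce what AmpcompA computes.

```python
import math

def a_weighting_db(freq):
    """A-weighting gain in dB for a frequency in Hz (standard analytic formula)."""
    f2 = freq * freq
    ra = (12194.0 ** 2 * f2 * f2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.0 dB offset normalizes the curve to roughly 0 dB at 1 kHz
    return 20 * math.log10(ra) + 2.0

def inverse_a_gain(freq, maxboost_db=24.0):
    """Linear gain that undoes the A-weighting attenuation.

    The boost is capped at maxboost_db (an arbitrary choice) so that very
    low frequencies are not amplified without bound.
    """
    boost = min(-a_weighting_db(freq), maxboost_db)
    return 10 ** (boost / 20)

gain_100 = inverse_a_gain(100)
gain_1000 = inverse_a_gain(1000)
```

The curve attenuates a 100 Hz component by roughly 19 dB relative to 1 kHz, so an inverse compensation raises low partials relative to the mid range, where the ear is most sensitive.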
[73]:
acurve = ampcomp.AmpcompA()
items2 = [item.copy() for item in items]
for item in items2:
    if isinstance(item, Chord):
        for n in item.notes:
            n.amp *= 1 - acurve.level(n.freq)
            n.amp *= 2
chainA = Chain(items2)
chainA.rec(gain=0.2, instr='piano', fade=(0.01, 0.1), sustain=0.2)
[73]:
"/home/em/.local/share/maelzel/recordings/rec-2023-03-21T11:34:14.942.wav", 2 channels, 10.51 secs, 44100 Hz
To validate the analysis we can play the generated chords alongside the original soundfile
[74]:
with render() as r:
    chainA.play(gain=0.5, instr='piano', fade=(0.01, 0.1), sustain=0.2, position=0.75)
    cl.play(position=0.25, gain=0.5, delay=0.05)
r
[74]:
"/home/em/.local/share/maelzel/recordings/rec-2023-03-21T11:34:31.070.wav", 2 channels, 10.81 secs, 44100 Hz
4.1.1. Chromatic version
It is possible to make a version quantized to the nearest semitone. Notice when listening to the quantized version how the missing glissandi in the voice render the result much further away from the original
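As an illustration of what quantizing to a pitch grid means for a single frequency, here is a minimal sketch. `quantize_freq` is a hypothetical helper, not the implementation of quantizePitch, and the A4 = 440 Hz reference is an assumption. The frequency is converted to a fractional MIDI note, rounded to the nearest multiple of `step` semitones and converted back.

```python
import math

def quantize_freq(freq, step=1.0, a4=440.0):
    """Snap a frequency in Hz to the nearest multiple of `step` semitones."""
    midi = 69 + 12 * math.log2(freq / a4)   # fractional MIDI note
    qmidi = round(midi / step) * step        # nearest point on the pitch grid
    return a4 * 2 ** ((qmidi - 69) / 12)

semitone = quantize_freq(445.0)                # snaps to A4
quartertone = quantize_freq(453.0, step=0.5)   # snaps to A4 plus a quarter tone
```

With step=1 a slow glissando collapses into a staircase of semitones, which is why the quantized version loses much of the speech contour. A step of 0.5 yields a quarter-tone grid.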
[75]:
chain2 = chain.quantizePitch(step=1)
chain2
[75]:
Chain([Rest:0.062♩, Rest:0.062♩, Rest:0.062♩, Rest:0.062♩, Rest:0.062♩, Rest:0.062♩, ‹2A 4Bb 5Db 5F 5G 0.0625♩›, ‹2Ab 4Bb 5F 0.0625♩›, ‹2Bb 3Bb 4F 4Bb 5D 5F 5G 5Bb 0.0625♩›, ‹2Bb 3Bb 4F 4Bb 5D 5F 5Ab 5Bb 0.0625♩›, …], dur=10.75)
[76]:
chain2.rec(gain=0.2, instr='piano', fade=(0.01, 0.1), sustain=0.2)
[76]:
"/home/em/.local/share/maelzel/recordings/rec-2023-03-21T11:34:51.088.wav", 2 channels, 10.51 secs, 44100 Hz
It is also possible to modify the time resolution to produce other kinds of pixelation. In this case, reducing the analysis to 8 times per second makes the rendition hardly recognizable
[77]:
dt = 1/8
times = np.arange(0, cl.durSecs(), dt)
chords = [cl.chordAt(t, mindb=-55, dur=dt, maxcount=8, ampfactor=10, maxfreq=m2f(126), minfreq=40) or Rest(dt) for t in times]
chain3 = Chain(chords)
chain3 = chain3.quantizePitch(step=0.5)
chain3.show()
Just for the sake of variation we can try rendering using an accordion soundfont
[85]:
chain3.rec(gain=0.2, instr='accordion', fade=(0.01, 0.3), sustain=0.3)
[85]:
"/home/em/.local/share/maelzel/recordings/rec-2023-03-21T11:37:05.643.wav", 2 channels, 10.68 secs, 44100 Hz