Prototype of a Vocoder in Sonic Pi

Someone requested this on GitHub, so I'm dusting this off and posting it as more of a progress report in case anyone is interested (the issue is here: https://github.com/samaaron/sonic-pi/issues/1930).

I’ve also written this up as a gist here with code samples, including the code for the implementation in SuperCollider: https://gist.github.com/xavriley/0907002649d6b6b2ac2bcbe739d96761

Prototype of a vocoder in Sonic Pi

Demo here: https://www.dropbox.com/s/qiktze3ml7bz5iq/autotune_the_shipping_forecast.wav?dl=0
Original voice input here: https://soundcloud.com/jb_uk/neil-nunes-bbc-radio-4-and

This is a demo of a simple FX synth using the Vocoder.ar UGen from SuperCollider. It's a fairly primitive vocoder implementation built from a bank of bandpass filters which are "tuned" to various frequencies.
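
To make the "bank of bandpass filters" idea concrete, here's a minimal channel-vocoder sketch in SuperCollider. This is not the Vocoder.ar source, and the band centres and Q are arbitrary; it just shows the principle: follow the amplitude of each band of the voice and use those envelopes to shape the same bands of a carrier.

// Minimal channel-vocoder sketch (illustration only, not Vocoder.ar):
// band-limit the voice, follow each band's amplitude, and apply those
// envelopes to the same bands of a synthetic carrier.
(
SynthDef(\vocoder_sketch, { |out = 0|
	var voice   = SoundIn.ar(0);        // modulator: live voice input
	var carrier = Saw.ar(110);          // carrier that ends up "speaking"
	var bands   = (200, 400 .. 4000);   // arbitrary band centre frequencies
	var sig = bands.collect({ |f|
		var env = Amplitude.kr(BPF.ar(voice, f, 0.1));
		BPF.ar(carrier, f, 0.1) * env
	}).sum;
	Out.ar(out, sig ! 2);
}).add;
)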

The results are mixed: it's difficult to get a decent reproduction of the words with this approach.

To get T-Pain/Imogen Heap style auto-tune, it would be necessary to write a different algorithm, probably using a technique called PSOLA (pitch-synchronous overlap-add), which is described at the following links:

Part of the problem is that the algorithm is probably under copyright, which makes distributing an open-source version more difficult.


Update on this: I've found a PSOLA-based pitch shifter implemented as a SuperCollider quark (plugin) here: https://github.com/dyfer/PitchShiftPA. It sounds way better than the Vocoder.ar synth. Just need time to implement it properly…
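
For anyone who wants to try the quark directly in SuperCollider before it's wired into Sonic Pi, something along these lines should work. The argument order (in, freq, pitchRatio, ...) is taken from the quark's README as I remember it, so check the PitchShiftPA help file if it complains:

// Rough sketch of driving PitchShiftPA from its own pitch tracker.
(
SynthDef(\psola_test, { |out = 0, pitch_ratio = 1.5|
	var in = SoundIn.ar(0);
	var freq = Pitch.kr(in)[0];                    // track the input's fundamental
	var shifted = PitchShiftPA.ar(in, freq, pitch_ratio);
	Out.ar(out, shifted ! 2);
}).add;
)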


I don’t suppose you’ve had time to look into this?

Status update:

//PitchShiftPA is based on formant preserving pitch-synchronous overlap-add re-synthesis, as developed by Keith Lent
//based on real-time implementation by Juan Pampin, combined with non-real-time implementation by Joseph Anderson
//This synthdef is based on the pseudo-UGen by Marcin Pączkowski, using GrainBuf and a circular buffer at https://github.com/dyfer/PitchShiftPA

(
SynthDef('sonic-pi-fx_vocoder', {|
		pitch_ratio = 1, formant_ratio = 1,
		min_freq = 10, max_formant_ratio = 10, grains_period = 2,
		out_bus=0, in_bus=0, time_dispersion|

		var in, localbuf, grainDur, wavePeriod, trigger, freqPhase, maxdelaytime, grainFreq, bufSize, delayWritePhase, grainPos, snd, freq;
		var absolutelyMinValue = 0.01; // used to ensure positive values before reciprocating
		var numChannels = 1;

		// multichannel expansion
		[pitch_ratio, formant_ratio].do({ arg item;
			item.isKindOf(Collection).if({ numChannels = max(numChannels, item.size) });
		});

		in = In.ar(in_bus,1).asArray.wrapExtend(numChannels);
		freq = Pitch.kr(in)[0]; // Pitch.kr returns [freq, hasFreq]; take the tracked frequency
		//freq = freq.asArray.wrapExtend(numChannels);
		pitch_ratio = pitch_ratio.asArray.wrapExtend(numChannels);

		min_freq = min_freq.max(absolutelyMinValue);
		maxdelaytime = min_freq.reciprocal;

		freq = freq.max(min_freq);

		wavePeriod = freq.reciprocal;
		grainDur = grains_period * wavePeriod;
		grainFreq = freq * pitch_ratio;

		if(formant_ratio.notNil, { //regular version

			formant_ratio = formant_ratio.asArray.wrapExtend(numChannels);

			max_formant_ratio = max_formant_ratio.max(absolutelyMinValue);
			formant_ratio = formant_ratio.clip(max_formant_ratio.reciprocal, max_formant_ratio);

			bufSize = ((SampleRate.ir * maxdelaytime * max_formant_ratio) + (SampleRate.ir * ControlDur.ir)).roundUp; //extra padding for maximum delay time
			freqPhase = LFSaw.ar(freq, 1).range(0, wavePeriod) + ((formant_ratio.max(1) - 1) * grainDur);//phasor offset for formant shift up - in seconds; positive here since phasor is subtracted from the delayWritePhase

		}, { //slightly lighter version, without formant manipulation

			formant_ratio = 1 ! numChannels;

			bufSize = ((SampleRate.ir * maxdelaytime) + (SampleRate.ir * ControlDur.ir)).roundUp; //extra padding for maximum delay time
			freqPhase = LFSaw.ar(freq, 1).range(0, wavePeriod);
		});

		localbuf = numChannels.collect({LocalBuf(bufSize, 1).clear});
		delayWritePhase = numChannels.collect({|ch| BufWr.ar(in[ch], localbuf[ch], Phasor.ar(0, 1, 0, BufFrames.kr(localbuf[ch])))});
		grainPos = (delayWritePhase / BufFrames.kr(localbuf)) - (freqPhase / BufDur.kr(localbuf)); //scaled to 0-1 for use in GrainBuf
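		// grainPos trails the write head by the phasor offset (up to one wave period,
		// plus the formant offset), so each grain starts pitch-synchronously just
		// behind the most recently written audio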
		if(time_dispersion.isNil, {
			trigger = Impulse.ar(grainFreq);
		}, {
			trigger = Impulse.ar(grainFreq + (LFNoise0.kr(grainFreq) * time_dispersion));
		});
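		// grains of grains_period wave periods are retriggered at freq * pitch_ratio,
		// which is what shifts the perceived pitch; formant_ratio is applied below as
		// the GrainBuf playback rate, so formants move independently of the pitch shift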
		snd = numChannels.collect({|ch| GrainBuf.ar(1, trigger[ch], grainDur[ch], localbuf[ch], formant_ratio[ch], grainPos[ch])});

		Out.ar(out_bus, snd.dup)
	}
).writeDefFile("/Users/xavierriley/Downloads/Sonic Pi.app/Contents/Resources/etc/synthdefs/compiled/")
)
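
To audition the def in plain sclang before pointing writeDefFile at Sonic Pi's synthdef folder, you can swap writeDefFile for .add and route some audio through a private bus (the bus and source here are just placeholders):

// Quick test in sclang: feed the mic into a private bus and put the FX after it.
(
~bus = Bus.audio(s, 1);
{ Out.ar(~bus, SoundIn.ar(0)) }.play;
Synth.tail(s, 'sonic-pi-fx_vocoder',
	[\in_bus, ~bus, \out_bus, 0, \pitch_ratio, 1.5, \formant_ratio, 0.5]);
)
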
# in synthinfo.rb
    class FXVocoder < FXInfo
      def name
        "Vocoder"
      end

      def introduced
        Version.new(3,2,0)
      end

      def synth_name
        "fx_vocoder"
      end

      def doc
        ""
      end

      def arg_defaults
        super.merge({
          :pitch => 440,
          :pitch_ratio => 1.0,
          :formant_ratio => 1.0,
          :min_freq => 10,
          :max_formant_ratio => 10,
          :grains_period => 2.0,
        })
      end
    end
...
        :fx_vocoder => FXVocoder.new,
# Sonic Pi code

# harmonises a vocal sample as a major chord

load_synthdefs
sn = "~/Downloads/acappella.wav"

sample sn

in_thread do
  with_fx :vocoder, pitch_ratio: 1.5, formant_ratio: 0.5 do
    sample sn
    sleep sample_duration(sn)
  end
end

in_thread do
  with_fx :vocoder, pitch_ratio: 1.25, formant_ratio: 0.5 do
    sample sn
    sleep sample_duration(sn)
  end
end
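
For reference, the ratios above are just-intonation intervals: 1.25 is a major third (5/4) and 1.5 a perfect fifth (3/2), so together with the unshifted sample you get a major chord. The equal-tempered equivalents are easy to check in sclang:

4.midiratio;   // -> 1.2599..., equal-tempered major third (4 semitones)
7.midiratio;   // -> 1.4983..., equal-tempered perfect fifth (7 semitones)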

The API could do with tweaking to make it more intuitive, but the noises are there.

Internally it pitch-tracks the input sound and then pitch-shifts relative to the tracked pitch. That's why this example file has those moments of "distortion": it isn't able to track a clear pitch at those points. With a good input source the effect should be super smooth.
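
Pitch.kr actually returns a [freq, hasFreq] pair and the synthdef above only uses the frequency, so one possible way to tame those glitches (just a sketch, not something that's in the FX yet) would be to hold the last confident estimate whenever tracking fails:

// Sketch: hold the last confident pitch estimate while tracking fails.
(
SynthDef(\pitch_hold_sketch, { |out = 0|
	var in = SoundIn.ar(0);
	var freq, hasFreq;
	# freq, hasFreq = Pitch.kr(in);            // Pitch.kr -> [freq, hasFreq]
	freq = Gate.kr(freq, hasFreq);             // hold previous value when hasFreq == 0
	Out.ar(out, SinOsc.ar(freq, 0, 0.1) ! 2);  // audition the tracked pitch
}).add;
)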

This also opens the door to a T-Pain style autotune, but I need to do a bit more work to get there.


Nice work @xavierriley! Maybe I should get back to working on those Synths and FX that I started a while ago too :thinking:

Just tried this out. Very cool. I added it on my Mac and tried it with direct voice input:

with_fx :autotuner, mix: 0.7 do |c|
  set :ct, c
  live_audio :min, stereo: true, amp: 4

  # now start setting target pitch to get robot voice behaviour
  live_loop :robot do
    control get(:ct), target_pitch: scale(:a2, :minor_pentatonic, num_octaves: 2).choose
    sleep 0.4
  end
end

This has landed in 3.2 beta! :tada:
