Prototype of a Vocoder in Sonic Pi

Someone requested this on GitHub, so I’m dusting this off and posting it as more of a progress report in case anyone is interested (the issue was here).

I’ve also written this up as a gist here with code samples, including the code for the implementation in SuperCollider:

Prototype of vocoder on Sonic Pi

Demo here:
Original voice input here:

This is a demo of a simple effects synth using the UGen from SuperCollider. This is a fairly primitive vocoder implementation made of a bunch of bandpass filters which are “tuned” to various frequencies.
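As a rough illustration of what “tuned to various frequencies” can mean, here is a minimal Ruby sketch that picks logarithmically spaced centre frequencies for a bank of bandpass filters. The helper name `band_centres`, the band count and the frequency range are all assumptions for illustration; the UGen itself may well space its bands differently.

```ruby
# Hypothetical helper (not part of Sonic Pi or SuperCollider):
# centre frequencies for a bank of bandpass filters, spaced evenly
# on a logarithmic (musical) scale between low_hz and high_hz.
def band_centres(num_bands, low_hz, high_hz)
  raise ArgumentError, "need at least 2 bands" if num_bands < 2
  # constant ratio between neighbouring bands
  ratio = (high_hz.to_f / low_hz) ** (1.0 / (num_bands - 1))
  (0...num_bands).map { |i| low_hz * ratio**i }
end

centres = band_centres(4, 100, 800)
puts centres.map { |f| f.round(1) }.inspect
```

With only a handful of wide bands like this, a lot of the detail that makes speech intelligible gets smeared, which may be part of why the words are hard to make out.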

The results are mixed: it’s difficult to get a decent reproduction of the words from this.

To get T-Pain/Imogen Heap style auto-tune, it would probably be necessary to write a different algorithm using a technique called PSOLA (pitch-synchronous overlap-add), which is described at the following links:

Part of the problem is that the algorithm is probably under copyright, which makes distributing an open source version more difficult.


Update on this - I’ve found a PSOLA-based pitch shifter implemented as a SuperCollider quark (plugin) here. It sounds way better than the synth. Just need time to implement it properly…


I don’t suppose you’ve had time to look into this?

Status update:

//PitchShiftPA is based on formant preserving pitch-synchronous overlap-add re-synthesis, as developed by Keith Lent
//based on real-time implementation by Juan Pampin, combined with non-real-time implementation by Joseph Anderson
//This synthdef is based on the pseudo-UGen by Marcin Pączkowski, using GrainBuf and a circular buffer at

SynthDef('sonic-pi-fx_vocoder', {|
		pitch_ratio = 1, formant_ratio = 1,
		min_freq = 10, max_formant_ratio = 10, grains_period = 2,
		out_bus=0, in_bus=0, time_dispersion|

		var in, localbuf, grainDur, wavePeriod, trigger, freqPhase, maxdelaytime, grainFreq, bufSize, delayWritePhase, grainPos, snd, freq;
		var absolutelyMinValue = 0.01; // used to ensure positive values before reciprocating
		var numChannels = 1;

		//multichannel expansion
		[pitch_ratio, formant_ratio].do({ arg item;
			item.isKindOf(Collection).if({ numChannels = max(numChannels, item.size) });
		});

		in = In.ar(in_bus,1).asArray.wrapExtend(numChannels);
		freq = Pitch.kr(in)[0]; //tracked pitch of the input
		//freq = freq.asArray.wrapExtend(numChannels);
		pitch_ratio = pitch_ratio.asArray.wrapExtend(numChannels);

		min_freq = min_freq.max(absolutelyMinValue);
		maxdelaytime = min_freq.reciprocal;

		freq = freq.max(min_freq);

		wavePeriod = freq.reciprocal;
		grainDur = grains_period * wavePeriod;
		grainFreq = freq * pitch_ratio;

		if(formant_ratio.notNil, { //regular version

			formant_ratio = formant_ratio.asArray.wrapExtend(numChannels);

			max_formant_ratio = max_formant_ratio.max(absolutelyMinValue);
			formant_ratio = formant_ratio.clip(max_formant_ratio.reciprocal, max_formant_ratio);

			bufSize = ((SampleRate.ir * maxdelaytime * max_formant_ratio) + (SampleRate.ir * ControlDur.ir)).roundUp; //extra padding for maximum delay time
			freqPhase = LFSaw.ar(freq, 1).range(0, wavePeriod) + ((formant_ratio.max(1) - 1) * grainDur); //phasor offset for formant shift up - in seconds; positive here since phasor is subtracted from the delayWritePhase

		}, { //slightly lighter version, without formant manipulation

			formant_ratio = 1 ! numChannels;

			bufSize = ((SampleRate.ir * maxdelaytime) + (SampleRate.ir * ControlDur.ir)).roundUp; //extra padding for maximum delay time
			freqPhase = LFSaw.ar(freq, 1).range(0, wavePeriod);
		});

		localbuf = numChannels.collect({LocalBuf(bufSize, 1).clear});
		delayWritePhase = numChannels.collect({|ch| BufWr.ar(in[ch], localbuf[ch], Phasor.ar(0, 1, 0, BufFrames.kr(localbuf[ch])))});
		grainPos = (delayWritePhase / BufFrames.kr(localbuf)) - (freqPhase / BufDur.kr(localbuf)); //scaled to 0-1 for use in GrainBuf
		if(time_dispersion.isNil, {
			trigger = Impulse.ar(grainFreq);
		}, {
			trigger = Impulse.ar(grainFreq + (LFNoise0.kr(grainFreq) * time_dispersion));
		});
		snd = numChannels.collect({|ch| GrainBuf.ar(1, trigger[ch], grainDur[ch], localbuf[ch], formant_ratio[ch], grainPos[ch])});
		Out.ar(out_bus, snd.dup);
});

# in synthinfo.rb
    class FXVocoder < FXInfo
      def name
        "Vocoder"
      end

      def introduced
        Version.new(3,2,0)
      end

      def synth_name
        "fx_vocoder"
      end

      def doc
        ""
      end

      def arg_defaults
        {
          :pitch => 440,
          :pitch_ratio => 1.0,
          :formant_ratio => 1.0,
          :min_freq => 10,
          :max_formant_ratio => 10,
          :grains_period => 2.0,
        }
      end
    end

        # ...
        :fx_vocoder => FXVocoder.new,

# Sonic Pi code

# harmonises a vocal sample as a major chord

sn = "~/Downloads/acappella.wav"

sample sn

in_thread do
  with_fx :vocoder, pitch_ratio: 1.5, formant_ratio: 0.5 do
    sample sn
    sleep sample_duration(sn)
  end
end

in_thread do
  with_fx :vocoder, pitch_ratio: 1.25, formant_ratio: 0.5 do
    sample sn
    sleep sample_duration(sn)
  end
end

The API could do with tweaking to make it more intuitive but the noises are there.
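For reference, the ratios used above (1.25 and 1.5) are the just-intonation major third (5/4) and perfect fifth (3/2). If you would rather think in semitones, here is a small sketch of the conversion to equal-tempered ratios. `ratio_for_semitones` is a hypothetical helper for illustration, not something Sonic Pi provides under that name.

```ruby
# Hypothetical helper: equal-tempered frequency ratio for an interval
# given in semitones (12 semitones = one octave = ratio 2.0).
def ratio_for_semitones(semitones)
  2.0 ** (semitones / 12.0)
end

major_third   = ratio_for_semitones(4) # close to the 1.25 used above
perfect_fifth = ratio_for_semitones(7) # close to the 1.5 used above
puts major_third.round(4), perfect_fifth.round(4)
```

The results could then be passed as `pitch_ratio:` in place of the hard-coded values.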

Internally it pitch tracks the input sound, and then pitch shifts against the tracked pitch. That’s why this example file has those moments of “distortion” as it’s not able to track a clear pitch at those points. With a good input source the effect should be super smooth.
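To make that a bit more concrete, the arithmetic the SynthDef does with the tracked pitch can be sketched in plain Ruby. This only mirrors the `wavePeriod`, `grainDur` and `grainFreq` lines; the function name and the hash it returns are made up for illustration, and no audio is processed.

```ruby
# Sketch of the per-grain arithmetic from the SynthDef, in plain Ruby.
# tracked_freq is whatever the pitch tracker reports for the input, in Hz.
def grain_params(tracked_freq, pitch_ratio: 1.0, grains_period: 2.0, min_freq: 10.0)
  floor = [min_freq, 0.01].max          # keep the floor positive before reciprocating
  freq = [tracked_freq, floor].max      # clamp the tracked pitch to that floor
  wave_period = 1.0 / freq              # length of one input cycle, in seconds
  {
    grain_dur: grains_period * wave_period, # grain length, in seconds
    grain_freq: freq * pitch_ratio          # rate at which grains are triggered, in Hz
  }
end

p grain_params(220.0, pitch_ratio: 1.5)
```

Both values depend directly on the tracked frequency, so when the tracker jumps around on an unclear pitch, the grain length and trigger rate presumably jump with it.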

This also opens the door to a T-Pain style autotune but I need to do a bit more work to get that.


Nice work @xavierriley! Maybe I should get back to working on those Synths and FX that I started a while ago too :thinking:

Just tried this out. Very cool. Added it on my Mac and tried it with direct voice input:

with_fx :autotuner, mix: 0.7 do |c|
  set :ct, c
  live_audio :min, stereo: true, amp: 4
  # now start setting target pitch to get robot voice behaviour
  live_loop :robot do
    control get(:ct), target_pitch: scale(:a2, :minor_pentatonic, num_octaves: 2).choose
    sleep 0.4
  end
end

This has landed in 3.2 beta! :tada: