Sonic Pi's evolution in 40 minutes - Ever listened to a Git history?

Thinking about which kind of data might be suitable to render in an audible way, I came up with the idea to “audioalize” (like visualize) the Git histories of software projects.

The script below does exactly that. For example, you can listen to the history of Sonic Pi, rendered into a 40-minute piano piece. For those not familiar with using Git, I recorded the first few minutes (starting at approx. commit 1000):

Each commit produces one note. Playback time is proportional to the commit's date and time, with each day of history represented by one second. The note pitch depends on the number of files changed in the commit, while amp depends on the logarithm of the number of lines changed.
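To make the mapping a bit more concrete, here is a minimal sketch in plain Ruby (no Sonic Pi needed). The helper names `playback_time`, `commit_amp` and `note_index` are made up for illustration and do not appear in the script; the formulas follow the script's defaults, and clamping the note index to the highest note is an assumption taken from the later version of the script.

```ruby
# Hypothetical helpers illustrating the commit-to-sound mapping.
REAL_SECONDS_PER_DAY = 60 * 60 * 24

# Playback time (in seconds) for a commit made `elapsed` real seconds
# after time zero (midnight before the first commit).
def playback_time(elapsed, seconds_per_day: 1)
  elapsed * seconds_per_day / Float(REAL_SECONDS_PER_DAY)
end

# Volume from the number of changed lines (insertions + deletions).
# The +1 keeps the logarithm defined for commits without line changes.
def commit_amp(insertions, deletions)
  0.25 * Math.log10(insertions + deletions + 1)
end

# Index into the list of notes, clamped to the last note (assumption).
def note_index(files_changed, note_count)
  [files_changed, note_count - 1].min
end

puts playback_time(36 * 60 * 60) # a commit 1.5 days after time zero
puts commit_amp(90, 9)           # 99 changed lines
puts note_index(3, 11)           # 3 changed files, 11 available notes
```

Because of the logarithm, a commit touching thousands of lines is only a few times louder than one touching a dozen, which keeps huge commits from drowning out everything else.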

You may try changing note_selection to 1, which assigns a fixed note to each contributor instead of choosing the note by the number of changed files.

There are many more possibilities for which data associated with each commit to use to create the sounds, and how. Your ideas on that are highly appreciated!

Sonic Pi’s history is not the most exciting to listen to (due to many single-file commits), but I think it is a good fit for this forum post, and some passages are IMO still quite interesting. You may try any other Git repository by cloning it and setting path to the local repo’s folder. Note that the script requires Git to be installed on your machine!

When playing (with) the code, be aware that the pre-processing is fairly slow! On my laptop, processing the 9000 commits of Sonic Pi takes around 30 seconds before any sound is played, so you need to be a bit patient.

############ SETTINGS ################################################################################

# Path of the git repository
path = 'path/to/local/repo'

# Notes to play
notes = scale(:f3, :minor_pentatonic, num_octaves: 2)

# How to select notes
# 0 ... randomly
# 1 ... fixed note per author
# 2 ... by number of changed files
note_selection = 2

# Playback speed
seconds_per_day = 1
# play the tick sound every x days
days_per_tick = 1
# should ticks (e.g. per day) be audible?
play_ticks = false

# Maximum number of (most recent) commits.
# Use this on repositories with a long history, as retrieving and processing data is fairly slow.
# Use -1 to load the entire history.
# For the Sonic Pi repo, we just skip the first few hundred commits until the project gained momentum.
max_commits = 8000

set_sched_ahead_time! 2

######################################################################################################

# gets the history by calling `git log`, and processes it into an array of commits, with:
# [time, author id, #files changed, #insertions, #deletions]
# path: path to repo
# spd: seconds per day
# n: maximum number of (most recent) commits
define :get_history do |path, spd: 1, n: -1|
  # get the git log in a parsable format
  out = %x|git -C #{path} log --all --pretty=format:"%ad;%an" --date=iso --shortstat -n #{n}|
  # modify into a CSV-like format
  lines = out.gsub(/\n /, ';').split(/\n/)
  out = nil
  
  authors = {}
  table = []
  # iterate over entries and parse into an array of arrays
  for line in lines
    parts = line.split(';')
    if parts.length > 0
      # parse commit date to timestamp
      timestamp = Time.parse(parts[0]).to_i
      author = parts[1]
      author_id = 0
      if authors.key?(author)
        author_id = authors[author]
      else
        author_id = authors.length
        authors[author] = author_id
      end
      
      files = 0
      ins = 0
      del = 0
      # add counts of changed files, insertions and deletions (#lines)
      if parts.length > 2
        pp = parts[2].split(',')
        for p in pp
          if p.include? 'file'
            files = p.split(' ')[0].to_i
          elsif p.include? 'insertion'
            ins = p.split(' ')[0].to_i
          elsif p.include? 'deletion'
            del = p.split(' ')[0].to_i
          end
        end
      end
      # append "table" row
      table.append([timestamp, author_id, files, ins, del])
    end
  end
  lines = nil
  
  # time scale to convert history time into playback time
  time_scale = spd / Float(60 * 60 * 24)
  # sort by time
  table = table.sort_by { |row| row[0] }
  # adjust and scale time
  if table.length > 0
    t0 = table[0][0]
    start = Time.at(t0)
    # we want time 0 to be midnight before the first commit, not the actual time of the first commit
    midnight = Time.new(start.year, start.month, start.day, 0, 0, 0, start.utc_offset)
    t0 = midnight.to_i
    for row in table
      row[0] = (row[0] - t0) * time_scale
    end
  end
  
  table
end

# Get the data and measure the time it takes, to delay the loops in order to avoid timing errors
start = Time.now
table = get_history(path, spd: seconds_per_day, n: max_commits)
total_time = table[-1][0]
duration = Time.now - start

live_loop :tick, delay: duration + 0.5 do
  if play_ticks then sample :elec_plip, rate: 2, amp: 0.2 end
  sleep seconds_per_day * days_per_tick
end

idx = 0
time = 0

live_loop :commits, sync: :tick do
  puts "#{time} / #{total_time}"
  
  with_fx :reverb, room: 1 do |fx|
    with_synth :piano do
      while true
        row = table[idx]
        t = row[0]
        # schedule commits one beat ahead, otherwise break
        if t > time + 1 then break end
        
        # use the scaled logarithm of the number of changed lines to modulate the volume
        changes = row[3] + row[4]
        amp = 0.25 * Math.log10(changes + 1)
        
        # select the note to play, based on parameter note_selection
        note =
        if note_selection == 0
          notes.choose
        elsif note_selection == 1
          notes[row[1]]
        else
          notes[row[2]]
        end
        
        # play the note at the desired time in the future
        time_warp t - time do
          play note, amp: amp
        end
        
        # increment row/commit index, stop if no more rows
        idx += 1
        if idx >= table.length
          stop
        end
        
      end
    end
  end
  sleep 1
  time += 1
end

What a wonderful idea! Looks like they’ll be calling you up to make the soundtrack for the Sonic Pi documentary :wink:

As for how it sounds. If steady progress through perseverance were a vibe, I think this would be it. I like it quite a bit actually :+1:

My critique? It sounds like a solo project, maybe because it’s all on the piano. Perhaps if you gave each author a different instrument, it would feel more like teamwork!

Keep it up!


Thanks, @d0lfyn! I was also thinking about different instruments/synths/effects per author. However, I had not realized that this might convey the collaborative nature of Git/software projects much better. I should definitely give it a try as an additional dimension!


Although I live in my own little closed space of ‘music’, it constantly surprises me how interesting other people’s perceptions of ‘music’ can be.

I’ll happily admit, my focus here in the SP forums is on basic code anyone can understand, but I really do appreciate it
when @d0lfyn and @Nechoj and yourself take me out of my ‘safe zone’.

Respect,

Eli…


As a first trial, I simply used a few different synths for different authors. However, I think this does not sound better than the previous version. You can try it by setting synth_per_author to true.

Further, I now use pan for authors. In the code below, authors are sorted by their contributions (number of commits). The author with the most commits is played with pan: 0. With decreasing contributions, authors are shifted more and more to the left or right, with a randomly selected direction, but consistent for each author. I.e. if an author’s sound appeared to the left once, it will do so for all commits of that author. It is not a particularly sensational effect, but makes it sound a bit more interesting IMO.
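For anyone curious how quickly the pan spreads out, here is a small sketch of the formula from the script in plain Ruby. The names `pan_magnitude` and `pan_for` are invented for this illustration, and Ruby's `Random` stands in for Sonic Pi's `choose`, so treat it as an approximation of what the script does rather than the script itself.

```ruby
# Pan magnitude for an author at a given rank (0 = most commits).
# Same expression as in the script: (1 - 1 / (rank + 1)) ** 10
def pan_magnitude(rank)
  (1.0 - 1.0 / (rank + 1)) ** 10
end

# Full pan value: magnitude times a randomly chosen side.
# In the script the side is picked once per author and then reused.
# `rng` is a hypothetical parameter, only there to make runs repeatable.
def pan_for(rank, rng: Random.new)
  side = rng.rand(2).zero? ? -1 : 1
  pan_magnitude(rank) * side
end

# Print the magnitude for the ten most active authors.
(0..9).each do |rank|
  puts format('rank %d -> |pan| = %.3f', rank, pan_magnitude(rank))
end
```

The exponent of 10 keeps the most active authors close to the centre: the second most active author ends up at only 0.5 ** 10 (about 0.001), while the tenth is already at about 0.35, and authors far down the list approach the hard left or right.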

############ SETTINGS ################################################################################

# Path of the git repository
path = 'path/to/local/repo'

# Notes to play
notes = scale(:f3, :minor_pentatonic, num_octaves: 2)

# should different synths be associated with different authors?
synth_per_author = false

# synth used if synth_per_author = false
default_synth = :piano
# synths to use for different authors
synths = (ring :piano, :pluck, :saw, :dsaw, :tri, :dtri, :pulse, :dpulse, :fm, :pretty_bell, :beep)

# How to select notes
# 0 ... randomly
# 1 ... fixed note per author
# 2 ... by number of changed files
note_selection = 2

# Playback speed
seconds_per_day = 1
# play the tick sound every x days
days_per_tick = 1
# should ticks (e.g. per day) be audible?
play_ticks = false

# Maximum number of (most recent) commits.
# Use this on repositories with a long history, as retrieving and processing data is fairly slow.
# Use -1 to load the entire history.
# For the Sonic Pi repo, we just skip the first few hundred commits until the project gained momentum.
max_commits = 8000

set_sched_ahead_time! 2

######################################################################################################

# gets the history by calling `git log`, and processes it into an array of commits, with:
# [time, author id, #files changed, #insertions, #deletions]
# path: path to repo
# spd: seconds per day
# n: maximum number of (most recent) commits
define :get_history do |path, spd: 1, n: -1|
  # get the git log in a parsable format
  out = %x|git -C #{path} log --all --pretty=format:"%ad;%an" --date=iso --shortstat -n #{n}|
  # modify into a CSV-like format
  lines = out.gsub(/\n /, ';').split(/\n/)
  out = nil
  
  authors = {}
  table = []
  # iterate over entries and parse into an array of arrays
  for line in lines
    parts = line.split(';')
    if parts.length > 0
      # parse commit date to timestamp
      timestamp = Time.parse(parts[0]).to_i
      
      files = 0
      ins = 0
      del = 0
      # add counts of changed files, insertions and deletions (#lines)
      if parts.length > 2
        pp = parts[2].split(',')
        for p in pp
          if p.include? 'file'
            files = p.split(' ')[0].to_i
          elsif p.include? 'insertion'
            ins = p.split(' ')[0].to_i
          elsif p.include? 'deletion'
            del = p.split(' ')[0].to_i
          end
        end
      end
      
      author = parts[1]
      author_id = 0
      if authors.key?(author)
        a = authors[author]
        a[:commits] += 1
        a[:files] += files
        a[:changes] += ins + del
        author_id = a[:id]
      else
        author_id = authors.length
        authors[author] = {id: author_id, commits: 1, files: files, changes: ins + del}
      end
      
      # append "table" row
      table.append({time: timestamp, author: author_id, files: files, ins: ins, del: del})
    end
  end
  lines = nil
  
  # change author IDs to index in descending number of commits order
  # calculate pan based on author index
  idx = 0
  indices = {}
  max_index = authors.length
  for a in authors.to_a.sort_by { |a| -a[1][:commits] }
    indices[a[1][:id]] = [idx, ((1.0 - 1.0 / (idx + 1)) ** 10) * [-1, 1].choose]
    idx += 1
  end
  
  # time scale to convert history time into playback time
  time_scale = spd / Float(60 * 60 * 24)
  # sort by time
  table = table.sort_by { |row| row[:time] }
  # adjust and scale time
  if table.length > 0
    t0 = table[0][:time]
    start = Time.at(t0)
    # we want time 0 to be midnight before the first commit, not the actual time of the first commit
    midnight = Time.new(start.year, start.month, start.day, 0, 0, 0, start.utc_offset)
    t0 = midnight.to_i
    for row in table
      row[:time] = (row[:time] - t0) * time_scale
      ind = indices[row[:author]]
      row[:author] = ind[0]
      row[:pan] = ind[1]
    end
  end
  
  table
end

# Get the data and measure the time it takes, to delay the loops in order to avoid timing errors
start = Time.now
table = get_history(path, spd: seconds_per_day, n: max_commits)
total_time = table[-1][:time]
duration = Time.now - start

live_loop :tick, delay: duration + 0.5 do
  if play_ticks then sample :elec_plip, rate: 2, amp: 0.2 end
  sleep seconds_per_day * days_per_tick
end

idx = 0
time = 0

live_loop :commits, sync: :tick do
  puts "#{time} / #{total_time}"
  
  with_fx :reverb, room: 1 do |fx|
    while true
      row = table[idx]
      t = row[:time]
      # schedule commits one beat ahead, otherwise break
      if t > time + 1 then break end
      
      # use the scaled logarithm of the number of changed lines to modulate the volume
      changes = row[:ins] + row[:del]
      amp = 0.25 * Math.log10(changes + 1)
      
      # select the note to play, based on parameter note_selection
      note =
      if note_selection == 0
        notes.choose
      elsif note_selection == 1
        notes[row[:author]]
      else
        notes[[row[:files], notes.length-1].min]
      end
      
      # get the synth to use (by author, or default)
      synth = synth_per_author ? synths[row[:author]] : default_synth
      
      # play the note at the desired time in the future
      time_warp t - time do
        with_synth synth do
          if synth_per_author
            play note, amp: amp, pan: row[:pan], sustain: 0, release: 0.5
          else
            play note, amp: amp, pan: row[:pan]
          end
        end
      end
      
      # increment row/commit index, stop if no more rows
      idx += 1
      if idx >= table.length
        stop
      end
      
    end
  end
  sleep 1
  time += 1
end

Ah, I was thinking that maybe the multiple instruments would be incoherent. So that experiment didn’t go so well :sweat: I like what you came up with instead!

Thanks for joining me on my musical adventures @Eli! :blush: I thought I’d found my voice in algorithms, but today I broke out the DAW and tried crafting something by hand again. I feel like exploring hybrid compositions that are only partly algorithmic. My objectives are to reduce busywork (if such a thing exists) and to make better ideas. Hopefully you’ll find the results to your liking!

I saw in the other thread that you aren’t feeling well this evening. Please take care :worried: Dosage changes are the worst.

Respect!
d0lfyn

@Eli thanks, and great that you appreciate it although it is not your “style” of SP use.

I think the code is not really hard to understand. You can basically ignore the function :get_history, as it only parses the log output created by Git, which is the source of the sonified data. It looks quite complicated due to the format that needs to be parsed, but all you need to know is that it creates a list of changes made to Sonic Pi’s code base. Each entry has a date, an ID for the author, and some statistics about changes made to the code.
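In case it helps, this is roughly what the parsing boils down to for a single commit, as a sketch in plain Ruby. The sample line is made up (author name and numbers are not from the real history), and `parse_entry` is a hypothetical helper that only mirrors the logic inside :get_history.

```ruby
require 'time'

# One log entry after the script has joined the header line and the
# --shortstat line with a semicolon. The values are invented.
line = '2021-03-14 10:30:00 +0100;Jane Doe;3 files changed, 10 insertions(+), 2 deletions(-)'

# Turn one entry into a hash with the timestamp, author name and change counts.
def parse_entry(line)
  date, author, stats = line.split(';')
  counts = { files: 0, ins: 0, del: 0 }
  # Commits without a stat line (e.g. merges) simply keep all counts at 0.
  (stats || '').split(',').each do |part|
    n = part.split(' ')[0].to_i
    if part.include?('file')
      counts[:files] = n
    elsif part.include?('insertion')
      counts[:ins] = n
    elsif part.include?('deletion')
      counts[:del] = n
    end
  end
  { time: Time.parse(date).to_i, author: author }.merge(counts)
end

entry = parse_entry(line)
puts entry
```

The real function additionally replaces the author name with a numeric ID and rescales the timestamps to playback time.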

Almost everything of real interest is in the live loop :commits. In my second code posting, I changed data rows from arrays with integer indexing to dictionaries/‘hashes’ which allow for retrieving data columns/variables by name. I hope that makes the code more comprehensible.
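To illustrate the difference, here is the same (made-up) commit in both layouts, as a tiny plain Ruby sketch. The variable names are only for this example.

```ruby
# Layout of the first script: positions have to be remembered.
# [time, author id, files changed, insertions, deletions]
row_array = [2.5, 0, 3, 10, 2]

# Layout of the second script: columns are retrieved by name.
row_hash = { time: 2.5, author: 0, files: 3, ins: 10, del: 2 }

# Number of changed lines, computed both ways.
changes_from_array = row_array[3] + row_array[4]
changes_from_hash  = row_hash[:ins] + row_hash[:del]

puts changes_from_array == changes_from_hash
```

Both give the same number, but with the hash you can see what is being added without looking up which column is which.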

I embedded the track in the original post instead of only linking to it, now that this is possible. I also replaced the recording with the new version, which uses a different pan for different authors.
