Req: a function that indexes all Markov matrices

People from the other thread discussing Markov chains look much more adept than me at producing algorithms, so I thought I’d try asking for this because I think it’s beyond me.

I can create Markov chains by hand, then tweak them for what sounds best to me with Sonic Pi’s randomization features. It would be much more practical to be able to call up such matrices via an index rather than limit myself to trying the small collection I can tweak by hand. This can be done if I restrict all elements of an n x n Markov matrix to a 1-decimal place number.

For example, if n = 5, then we could concatenate in row order each single digit to the right of each decimal point to produce an integer of 25 places that uniquely represents that matrix. The 5x5 Markov matrices compose a small fraction of those (10 ** 25) 5x5 matrices, but I can’t think of a way to index them separately. In other words, the subset of 5 x 5 matrices with 1-decimal place elements, and whose row elements add up to 1.0 can be indexed, and then I could try out melodies in my Markov Player simply by passing an index without having to construct any matrix by hand. Can anyone index this limited type of Markov matrix?

Here is a first idea:

define :get_mat do |idx, dim|
  mat = Hash.new
  row = 0
  row_sum = 0
  idx.length.times do |i|
    
    col = i%dim + 1
    if col == 1
      # new row
      row += 1
      row_sum = 0
    end
        
    if idx[i] == "a"
      p = 1.0
    else
      p = idx[i].to_f/10
    end
    
    mat[[row, col]] = p
    row_sum += p
    
    if col == 4
      # add last element to get row_sum = 1
      mat[[row, col + 1]] = 1.0 - row_sum
    end
  end
  # return mat
  mat
end

idx1 = "135023221112233000a0"
mat = get_mat(idx1, 4)

puts mat[[1,1]]
puts mat[[1,2]]
puts mat[[1,3]]
puts mat[[1,4]]
puts mat[[1,5]]
puts mat[[2,1]]
puts mat[[2,2]]
puts mat[[2,3]]
puts mat[[2,4]]
puts mat[[2,5]]

The index for a matrix is a string with digits. For each row of 5 there are 4 digits, the last one is computed. For a 5x5 matrix you will need 20 digits. The constraint is that the sum of the 4 digits must be less or equal 10.

The function builds the matrix and returns it. The values inside the matrix are the given digits/10. The parameter dim is the dimension of the matrix - 1.

The example builds such a Markovian matrix from an index-string and puts the first 2 rows.

ADDITION: If you need single entries with probabilities equal 1.0, then I would include the letter a in the string and set the probability to 1.0 when a occurs. I have updated the example code accordingly.

1 Like

Just another comment in that context: Ruby is not the ideal language to do complicated math or even machine learning. If I would do a project to analyze midi/audio tracks of real musicians and to learn a Morkov process from their notes, I would do that in Python and its very powerful machine learning libraries (numpy, Tensorflow). And then simply create a SPI ruby file as an output of that program. This output file would contain only the stuff that’s needed to run the synths.

This is fantastic, thank you so much. I was intuiting that it could be done in a straightforward algorithm for someone clever at doing math puzzles, which I never was.

Yes, second order Markov chains are much more musically useful. Those melodies arising from first order chains are never very good melodies at all, but I think I have a limited use for them, plus I can use your algorithm on other variables, like clock division, which works better.

I did come to understand quickly, though, that constructing even one second order chain by hand for more than a few notes is not something you ever want to do. In that case, you want to get the second order chain by learning from MIDI input. I did find a Java program to do that here, and it’s a matter of converting its output to an SPI rb file.

You’re welcome :slightly_smiling_face:

Here is a second version where the index contains relative probabilties. It is only the relative strength of the numbers in a row that define the probs. Furthermore, I have added a period for better readability. This can be added to the first version as well, if you like it.


define :get_mat do |idx, dim|
  mat = Hash.new
  row = 1
  col = 0
  row_sum = 0
  idx.length.times do |i|
    
    if idx[i] == "."
      next
    end
    
    col += 1
    p = idx[i].to_f
    mat[[row, col]] = p
    row_sum += p
    
    if col == dim
      # end of row reached, make row_sum = 1
      dim.times do |c|
        mat[[row, c+1]] = mat[[row, c+1]] / row_sum
      end
      # new row / col indices
      row += 1
      col = 0
      row_sum = 0
    end
  end
  # return mat
  mat
end


idx1 = "10000.01001.02121.32130.11111"
mat = get_mat(idx1, 5)

5.times do |r|
  r +=1
  puts "row #{r}"
  puts mat[[r,1]]
  puts mat[[r,2]]
  puts mat[[r,3]]
  puts mat[[r,4]]
  puts mat[[r,5]]
end
1 Like

Interesting piece of work, thank you for the hint!

Sorry, I think I misunderstood your creation here. I only just now had time to experiment with it, and it appears you have to determine the probabilities by hand, code that into the idx1 parameter, and then mat() takes that and builds a matrix out of it. But determining the chain of probabilities in advance is exactly the work I want to save. What I meant by an index for (maybe I should say “indexing of”) the Markov matrices was a map from the counting numbers onto the set of Markov matrices for a given dimension. In other words,

mat(1, 4) would give the first 4 x 4 Markov matrix of the indexing.
.
.
.
mat(TotalNumberOf4x4MarkovMatrices, 4) would give the last 4 x 4 Markov matrix of the indexing.

I could just plug in indices and get a new Markov matrix without knowing in advance (barring a superhuman act of memorization) which index maps to which matrix.

Well, that you can do with the first version: Take a number with up to 20 digits and it will give you a Markov matrix that is uniquely defined by that number.

Here is how to build the string-index from the number n=99 given (with correction to avoid row_sums > 1 and now the parameter dim is 5)


define :get_mat do |idx, dim|
  mat = Hash.new
  nd = dim - 1  # number of digits in idx defining 1 row
  row = 0
  row_sum = 0
  
  idx.length.times do |i|
    
    col = i%nd + 1
    if col == 1
      # new row
      row += 1
      row_sum = 0.0
    end
    
    p = idx[i].to_f/10
    mat[[row, col]] = p
    row_sum += p
    
    if col == 4
      # add last element to get row_sum = 1
      if row_sum <= 1
        mat[[row, col + 1]] = 1.0 - row_sum
      else
        mat[[row, col + 1]] = 0.0
        dim.times do |c|
          mat[[row, c+1]] = mat[[row, c+1]] / row_sum
        end
      end
    end
  end
  # return mat
  mat
end


n = 99
idx2 = "%020d" % n
puts idx2
mat = get_mat(idx2, 5)

5.times do |r|
  r +=1
  puts "row #{r}"
  puts mat[[r,1]]
  puts mat[[r,2]]
  puts mat[[r,3]]
  puts mat[[r,4]]
  puts mat[[r,5]]
end

Example with random number:

n = rrand_i(0, 99999999999999999999)
idx3 = "%020d" % n
puts idx3
mat = get_mat(idx3, 5)
1 Like

What’s the meaning of the variable n in this context?

And what exactly is "%020d"? If that’s a 5-character string, then using other 5-character strings there doesn’t result in a full matrix. For example “00001” and “12345”.

When I build the string to pass to the function for dim=5, I have to ensure by hand that it has exactly 20 places, including showing all leading zeros. If I change dimensions to, say 7, then I have to ensure every parameter sent to the function has exactly 42 places, correct? Am I right that longer melodies will require entering by hand and tracking by eye strings of hundreds of characters?

This means: if the number n has less than 20 digits, then fill up with leading 0s to achieve exactly 20. So, if you choose a number n of less than 20 digits, then the string will have exactly 20.

If you want a 7x7 matrix, you would choose 7x7-7=42 as the length of the index string, and then you would choose %042d as the string formatting tag.

n is the number that represents your matrix. n = 0, 1, 2 … 99999999999999999999. Same n gives same matrix.

Are you describing "%020d" here, or are you describing "%020d" % n here?

You need both: an integer n and the line

idx = "%020d" % n

This line converts any integer n into a string idx of length 20 that can be used by get_mat. Of course, you can also write down an index manually:

idx = "01102201010302010104"

But I understood you wanted an integer n that represents the matrix?

1 Like

Cool, thanks. It was that "%020d" % n that was confusing the hell out of me. Pardon my Ruby inexperience. It looked like you’re taking a string modulo an integer, and made no sense to me at all.

Well, that’s really confusing. % is modulo between two integers, but formating operation between string and integer :upside_down_face:

1 Like

I used to program extensively, but not in forever, so being at the For Dummies level with Ruby syntax has been messing me up pretty good. You’d be amazed how I can misconstrue things, or fit something unfamiliar into a familiar paradigm. I won’t even tell you how many hours got wasted discovering that variable names can’t begin with a capital letter.

But thanks again for the amazingly concise algorithm that does exactly what I asked for and the patience to explain it: I provide an index (n) and the dimension, and it spits back a Markov matrix based on a human-writable, memorable (if I choose) number.

A note of interest: there are more integers n than Markov matrices of one decimal place, so your mapping must somehow be many-to-one. I wonder if there’s a pattern to the overlaps.

Yes it is. Those indices would give the same matrix:

"33333333333333333333"
"44444444444444444444"
...
"99999999999999999999"

The reason is that at the end the algorithm has to ensure that the row_sum is 1. Higher numbers at the 4 positions for a row might then lead to the same row after norming the row_sum to 1. Patterns like 1234 and 2468 will as well give the same row.

Btw, if you want to have single entries to be 1.0, you probably want to alter this line:

p = idx[i].to_f/10

change it to

p = idx[i].to_f/9

Then, the digit 9 will give the 1.0 probability at any position where a 9 is. The consequence is, that you no longer have 11 steps of probs: 0.0, 0.1, …, 0.9, 1.0 but 10: 0.0, 1/9, 2/9, …, 8/9, 1.0.

1 Like

Out of interest, can you estimate for a given dimension of these 1-decimal point element matrices, what proportion of all possible matrices are Markov matrices?

There are infinitely many matrices of a given dimension because there are infinitely many numbers. If you limit the number of possible values in a Markov matrix to e.g. 10, then there are only finitely many Markov matrices. Then, the fraction is zero. :sunglasses:

I should have clarified:

In these 1-decimal element matrices, there are only 10 possible values per slot, namely {.0, .1, …, .9}. That means there are (10 ** n) of these matrices for dimension = n. In fact, we can index these (10 ** n) matrices (bijective map of the counting numbers onto the matrices) by dropping the decimal points, then concatenating the n rows of n integers from bottom to top. What I’m curious about is the fraction of these (10 ** n) matrices that have rows summing to 1.

Ah, I see…
If dimension is d and values are restricted to 10 different one, then there are 10*d*d different matrices. To determine the number of different matrices with row sum exactly 10, we will need to count how many 4-tuples of 4 numbers there are with sum less or equal to 10 (because the 5th number will be calculated from these 4 to make up a sum of 10).

This problem is complicated. First, we would need to count how many sets of 4 numbers there are with sum <= 10, then sets with 3 different numbers, then with 2 and then with 1. Then, calculate the number of different permutations of 4-tuples made up of the numbers from these sets. And then, from all those different 4-tuples of numbers with sum <= 10 choose 5 to build a matrix. And then sum everything up.

Quite some work to do …

1 Like