How to diagnose midi timing problems?

Hi @BN1701 ,

just to let you know, MIDI timing variations in the order of 6ms is definitely expected on Windows due to the timing granularity of the BEAM VM - the virtual machine used by our MIDI/OSC scheduler which is written in Erlang. These variations are much less on Linux and macOS. I’m hoping this will be improved in future versions of Erlang running on Windows. I’ve at least made the BEAM VM team aware of the benefits of improving this (they have already said that at least the timestamp granularity will be improved in the next version on Windows).

Also, the audio and MIDI schedulers are completely independent (although both should be running off the system clock and therefore drift together with the system). The MIDI scheduler is Erlang based (as described above) and the audio scheduler is part of the SuperCollider system.

2 Likes

Still turning over stones …

I read through the Erlang scheduling code and see that ultimately it calls into RtMidi to do actual midi IO. I see that RtMidi can be built either with __LINUX_ALSA__ or __UNIX_JACK__, but I can’t find where in the SPi build tree either these is defined. I’ll assume for now that it’s using ALSA because in QjackCtl I see lots of connections set up on the ALSA tab but none on the MIDI tab. (This may or may not be relevant to the problem, but I’m mostly just curious to know.)

However, I have noticed in QjackCtl that when SPi is running that there’s a continuous stream of ALSA connection graph change messages, two or three per second. I’ve looked for ways of getting more info about these, for instance turning on the Verbose messages in QjackCtl Settings, or searching Google for debug/log level for Jack (or SPi) but so far I’m drawing a blank again. Any ideas, anyone?

I’ve seen those, and don’t know what they mean either. Just saying that I’ve got them, and I don’t have the timing problems you have, so maybe a red herring. Or it may flag something important with the developers.

1 Like

How interesting!

@robin.newman - have you observed this too?

Yes. It’s definitely there. I’ve checked it wasn’t in 3.2.2 I think it is as a result of the changes with the new midi setup.

More about the ALSA connection graph change messages …

I looked at the QjackCtl source and saw that these messages were coming from the ALSA announce port. You can monitor these directly (and with a little more information) with aseqdump -p 0:1. There’s a continuous stream of these messages:

0:1   Client start               client 164
0:1   Client exit                client 164

Running cat /proc/asound/seq/clients shows that (in my case) the last registered client is Client 163 : "RtMidi Input Client" [User]. I can’t find a way of finding out what Client 164 might be as it doesn’t get as far as being logged in asound/seq/clients, but given that these messages stop when SPi shuts down, I’d guess it’s part of the SPi setup process when it creates a bunch of RtMidi ports mapped each of the physical midi ports on the system. (Where in the codebase does this happen, by the way? I’ve been looking, but so far haven’t tracked it down.)

However, I’m also thinking that is problem is not related to the timing wobbles that I still haven’t tracked down, as so far the two other people who reported the graph change messages have not seen any timing problems.

BN

1 Like

The rt-midi code stuff is contained in the sp_midi library which is built in the Sonic Pi source tree in the external folder. It also works closely with the erlang code in the server/erlang folder. I think your analysis is correct as the code is continuously monitoring for the addition or removal of midi devices. You will need to understand some erlang and c++ to follow it.

I’ve had many years of C++, though I’ve not used it in earnest for maybe three years now. I’m new to Erlang but I worked through a tutorial this week - because of this - and I can at least follow what’s written now, though I doubt I could write something from scratch, yet :slight_smile:

1 Like

Great. Have fun investigating.

Right … Another few days hacking and what have I learned? I’ve found out how to build a debug build of libsp_midi independently of SPi. (I may have a decade or so of C++ experience behind me but complex build systems, especially on Linux, have always been beyond my comprehension.) And I’ve learned how to run the SPi server in the Erlang debugger without even running SPi at all.

Anyway …

It seems that the timing glitches I’m experiecing happen when pi_server_midi:loop() processes an update_midi_ports event. This happens every five seconds (though not every event produces a glitch).

At this point the handler calls into libsp_midi.so to retrieve a list of inputs and outputs and their respective state info. In my setup with the iConnectMidi4+ there are sixteen inputs and sixteen outputs. If I add the MidiSport 8x8, there are eight more pairs. (Unfortunately I don’t have a class compliant midi interface with just one input and one output!) If I prevent the call to retrieve the new state info but leave the five-second timer call active, the timing glitches pretty much go away.

And … if I unplug all midi interfaces and just run @robin.newman’s test code with the ALSA loopback device alone, no more timing glitches either.

(By the way, I’m defining a glitch now as a variation greater than +/- 0.01 seconds.)

But wait, there’s more …

The ALSA annoucements about connection graph changes are associated with the SPi server polling the midi ports’ status. And there are many more announcements than the one every five seconds from pi_server_midi:loop(). The additional announcements are actually triggered by libsp_midi itself, which creates an instance of HotPlugThread class, which has its own polling thread checking for changes in the available port list two times a second.

Turning off HotPlugThread as well as the poll from pi_server_midi completely eliminates the ALSA announcements and also gives the best timing experience I’ve heard so far. (I left a midi loop playing for eighteen hours, checking in now and again. It sounded perfect every time.)

I still don’t know why we get the stream of announcements when running on an RPi and not, for instance, on my Ubuntu Studio PC, but although they are associated with the events that lead to timing glitches, I don’t think they’re part of the cause and I’m probably going to stop worrying about them for now.

As for the timing glitches themselves, we’ve already seen that we get pretty good timing even wth HotPlugThread firing every half second. This could be because it does its thing purely in C++ land, but may also be because it doesn’t do any processing beyond comparing list sizes having retrieved the latest sets of port info. When pi_server_midi.erl compares current and new lists of state info, it uses =:=, exactly equal to, which I suspect just take a significantly larger amount of time.

(Edit: I just tried commenting out successive lines in the update_midi_ports handler and the =:= isn’t the problem. I have to explicitly remove calls to sp_midi:midi_ins() and sp_midi:midi_outs() before I see or hear any reduction in the timing glitches.)

So why do we need both polling mechanisms? And if we do need both mechanisms, could someone with more experience of thinking in Erlang suggest a way to make that list comparison code more efficient?

BN

3 Likes

This is an excellent piece of research and I’ll make sure that it’s considered by the development team. It’s in all our interests to try and improve this. EDIT it is already being looked at :slight_smile:

1 Like

Brilliant investigative work @BN1701 - cheers, seems that’s been very helpful :grinning_face_with_smiling_eyes:

1 Like

@BN1701, great analysis, thank you for that. I wanted to let you know that I have started to work on a solution to this issue. Will get back to it again after work today.

1 Like