Chord Shape as Input

February 23rd, 2018
ideas, music, tech
When I'm playing mandolin, sometimes it would be nice to have a bass player. Nothing fancy: something that hits every downbeat and follows the same chords I'm playing. What's interesting is that I'm already providing the information this needs:

  • The shape of my fingers on the mandolin fretboard indicates what chord I'm about to play.
  • I tap my foot on downbeats.

That is, I'd like to infer the bass note from hand position, and then get the timing and intensity from foot tapping.
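A rough sketch of how those two inputs might combine, assuming something upstream reports the chord root as a note name and the foot sensor reports a tap intensity from 1 to 127. The `BassFollower` class and its method names are made up for illustration. The only MIDI facts relied on are that 0x90 is the note-on status byte for channel 1 and that C2 is note 36 when middle C is note 60.

```python
from typing import Optional, Tuple

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def bass_midi_note(name: str, octave_base: int = 36) -> int:
    """Map a note name to a MIDI note number in the octave starting at C2 (36)."""
    return octave_base + NOTE_NAMES.index(name)


class BassFollower:
    """Hypothetical glue: remembers the latest chord root, plays it on each tap."""

    def __init__(self) -> None:
        self.current_root: Optional[str] = None

    def on_chord_detected(self, root: str) -> None:
        # Called whenever the hand-shape detector settles on a chord.
        self.current_root = root

    def on_foot_tap(self, velocity: int) -> Optional[Tuple[int, int, int]]:
        # Called on each downbeat; returns a MIDI note-on message, or None
        # if no chord has been seen yet.
        if self.current_root is None:
            return None
        velocity = max(1, min(127, velocity))  # clamp to the valid MIDI range
        return (0x90, bass_midi_note(self.current_root), velocity)
```

The point of the split is that the slow part (working out the chord) happens ahead of time, and the tap only has to look up a value that is already there.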

I already know how to do the foot portion: I regularly play with something that converts foot tapping to MIDI. So the tricky part is figuring out which note to play.

There's a lot of research on figuring out chords from audio, but that's not useful to me here. Consider switching from one chord to the next: the earliest you could hear the new chord is on the downbeat, but by that point it's too late to make the bass come out on the right note. Plus, often you want the bass on the downbeat and the mandolin on the upbeat.

What I think could work better is using a camera. The chord shapes I play are very straightforward, and I switch shapes far enough in advance that something could interpret the shape and infer the appropriate bass note to play before I hit the foot trigger.
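Because the shape is in place before the tap, there would be time to look at several frames rather than trusting just one. Here is one possible way to do that, as a sketch only: a majority vote over the most recent per-frame guesses. The class name, window size, and agreement threshold are all assumptions and would need tuning against a real camera's frame rate.

```python
from collections import Counter, deque


class StablePrediction:
    """Hypothetical smoother: only switch chords once enough recent frames agree."""

    def __init__(self, window: int = 9, min_agree: int = 6) -> None:
        self.recent = deque(maxlen=window)  # most recent per-frame labels
        self.min_agree = min_agree
        self.stable = None  # last label that reached the agreement threshold

    def update(self, label):
        """Record one per-frame label and return the current stable label."""
        self.recent.append(label)
        top, count = Counter(self.recent).most_common(1)[0]
        if count >= self.min_agree:
            self.stable = top
        return self.stable
```

A single misread frame mid-chord then has no effect, at the cost of a few frames of delay when the chord really does change.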

This might be the sort of thing that deep learning has made a lot easier? Here's how I think training this could work:

  • Set up a camera pointing at the mandolin fretboard.
  • Record a video where you only play variations of the same chord. Say, lots of different things that should all trigger the bass note 'G'. Move around as you do this, trying to get a range of fretboard angles.
  • Record another video where you do the same thing with a different chord.
  • Pull out the frames, label each one with the video it came from, and train on that.
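The labeling step could look something like this, assuming the frames from each video have already been extracted as `.jpg` files into a directory named for that video's bass note (for example `frames/G/` and `frames/C/`). That layout and the `label_frames` function are assumptions for illustration, and the training itself is left out.

```python
import random
from pathlib import Path


def label_frames(frames_root, val_fraction=0.2, seed=0):
    """Label each frame with its source directory and split train/validation.

    Assumes frames_root contains one subdirectory per video, named for the
    bass note that video should trigger.
    """
    root = Path(frames_root)
    examples = []
    for video_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for frame in sorted(video_dir.glob("*.jpg")):
            examples.append((frame, video_dir.name))  # (image path, label)

    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(examples)
    n_val = int(len(examples) * val_fraction)
    return examples[n_val:], examples[:n_val]
```

One caveat with a random split like this: neighboring video frames are nearly identical, so the validation score will look better than it should. Holding out a separately recorded video per chord would be a stricter check.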

Alternatively, you could train two networks: one to identify the hand shape, and another to identify the fret at which the lowest string is being played.
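With that split, turning the two outputs into a bass note is just arithmetic. A minimal sketch, assuming the shape network outputs one of a few labels that say which chord tone sits on the lowest string. The label names below are invented for illustration, not real mandolin shape names. The one fixed fact is that the mandolin's lowest string is tuned to G.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

LOW_STRING_OPEN = 7  # pitch class of G, the mandolin's lowest string

# Hypothetical shape labels, mapped to the number of semitones from the note
# on the lowest string up to the chord root (mod 12).
SHAPE_ROOT_OFFSET = {
    "root_on_low_string": 0,
    "third_on_low_string": 8,  # a major third is 4 above the root, so root = third + 8
    "fifth_on_low_string": 5,  # a fifth is 7 above the root, so root = fifth + 5
}


def bass_note(shape: str, fret: int) -> str:
    """Combine the shape label and the lowest-string fret into a root note name."""
    low_string_note = (LOW_STRING_OPEN + fret) % 12
    return NOTE_NAMES[(low_string_note + SHAPE_ROOT_OFFSET[shape]) % 12]
```

For example, `bass_note("fifth_on_low_string", 7)` finds D on the lowest string at fret 7, treats it as the fifth of the chord, and gives "G".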

People who know more ML than me: is this actually relatively straightforward with entry-level tools these days? And accurate enough that it won't play wrong notes live in an unfamiliar environment?
