
  • Whistle-based Synthesis

    October 14th, 2019
    music, whistling
    I'm reasonably happy with my Bass Whistle, which lets me whistle and have it come out as a decent-sounding bass. I've been using it when playing mandolin in a duo, and it fits well there. When I'm playing piano or playing with a piano player, however, there's already bass, so something that falls in a different place in the overall sound would be better. That could be melody, though I can't whistle fast enough for everything, so it's probably something simpler: harmonies, riffs, horn lines.

    When I take my current software, optimized for bass, and tell it to synthesize notes a few octaves up, it sounds terrible:

    • Raw whistled input:
      (mp3)
    • Bass version (needs headphones or good speakers):
      (mp3)
    • Treble version:
      (mp3)

    I'm using simple additive synthesis with the first four harmonics, which means adding together four sine waves. My guess is that higher notes need more complexity to sound good. Playing around with distortion and with fading the harmonics at different rates, I can get something that sounds a bit more interesting:

    • Adding distortion:
      (mp3)
    • Adding fade:
      (mp3)
    • Adding both:
      (mp3)
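
    For reference, additive synthesis with four harmonics can be sketched as below. This is a minimal illustration, not the Bass Whistle code: the function name `additive_tone`, the harmonic amplitudes, and the per-harmonic `decay_rates` (a stand-in for "fading the harmonics at different rates") are all assumptions, since the post doesn't give the actual values.

```python
import math

def additive_tone(frequency, duration, sample_rate=44100,
                  amplitudes=(1.0, 0.5, 0.25, 0.125),
                  decay_rates=(0.0, 0.0, 0.0, 0.0)):
    """Sum sine waves at 1x, 2x, 3x and 4x the given frequency.

    amplitudes and decay_rates are illustrative guesses; each harmonic
    fades as exp(-decay_rate * t), with t in seconds.
    """
    n_samples = round(duration * sample_rate)
    total = sum(amplitudes)  # normalizer keeps output within [-1, 1]
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        value = sum(
            a * math.exp(-d * t)
            * math.sin(2 * math.pi * (k + 1) * frequency * t)
            for k, (a, d) in enumerate(zip(amplitudes, decay_rates))
        )
        samples.append(value / total)
    return samples

# A plain tone, and one where the higher harmonics fade faster.
plain = additive_tone(440.0, 0.01)
faded = additive_tone(440.0, 0.01, decay_rates=(0.0, 5.0, 10.0, 20.0))
```

    Passing nonzero decay rates makes the upper harmonics die away sooner than the fundamental, which is one simple way to make the tone change over the life of a note.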

    I'm still not very happy with it, though. It sounds artificial and silly. There are good synthesizers, the product of decades of work on turning "play this note at this time" into good-sounding audio, so perhaps I could use my pitch detection to drive a standard synthesizer?

    I made some stand-alone open source software that pipes the pitch detection through to MIDI. This was kind of tricky: MIDI doesn't have a way to say "play this frequency". Instead you just have "play this note" and "bend the note by this much". How to interpret pitch bend is up to the synthesizer, but generally the range is ±2 half steps. So we need some math:

    in: wavelength
    in: sample_rate
    in: current_note
    
    # Convert from "this wave is 23.2 samples long"
    # to "the frequency is 1896.6 Hz".
    frequency = sample_rate / wavelength
    
    # MIDI is equal tempered, with each octave divided
    # into twelve logarithmically equal pieces.  Take
    # A440 as a reference point, so represent our
    # 1896.6 Hz as "25.29 half steps above A440":
    distance_from_a440 = 12 * log2(frequency / 440)
    
    # A440 is A4, or MIDI note 69, so this is 94.29.
    fractional_note = 69 + distance_from_a440
    
    # MIDI uses a note + bend encoding.  Stay on the
    # same note if possible to avoid spurious attacks.
    if (current_note and
        current_note - 2 < fractional_note
                         < current_note + 2)
      integer_note = current_note
    else
      integer_note = round(fractional_note)
    
    # To compute the pitch bend, we first find the
    # fractional part of the note, in this case 0.29:
    fractional_bend = fractional_note - integer_note
    
    # The bend will always be between -2 and +2, a
    # whole tone up or down.  MIDI uses 14 bits to
    # represent that range, so -2 is 0, no bend is the
    # midpoint 2^13 (8192), and +2 is 2^14 - 1 (16383):
    integer_bend = round((1 + fractional_bend / 2)
                          * 8192)
    integer_bend = min(integer_bend, 16383)
    
    # The bend is 14 bits which gets split into two 7-bit
    # values.  We can do this with masking and shifting.
    bend_least_significant = integer_bend & 0b1111111
    
    bend_most_significant =
        (integer_bend & 0b11111110000000) >> 7
    
    out: integer_note
    out: bend_least_significant
    out: bend_most_significant
    

    Initially I screwed this up, and thought pitch bend was conventionally ±1 semitone, and didn't end up catching the bug until I wrote up this post.
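
    The conversion above can also be written as runnable Python. This is a sketch under assumptions, not the code from the stand-alone software: the names `wavelength_to_midi` and `pitch_bend_message` and the `bend_range` parameter are hypothetical, and the bend is clamped to the 14-bit range so that a pitch right at the edge of the range still produces a valid value. The status byte `0xE0` for pitch bend comes from the MIDI 1.0 message format.

```python
import math

BEND_CENTER = 8192   # 14-bit midpoint, meaning "no bend"
BEND_MAX = 16383     # largest value that fits in 14 bits

def wavelength_to_midi(wavelength, sample_rate, current_note=None,
                       bend_range=2.0):
    """Return (note, bend_lsb, bend_msb) for a wavelength in samples.

    bend_range is the synth's pitch bend range in half steps; 2 is the
    conventional default, but the synth decides.
    """
    frequency = sample_rate / wavelength
    fractional_note = 69 + 12 * math.log2(frequency / 440)

    # Stay on the sounding note while the pitch is within bend range,
    # to avoid retriggering the attack.
    if (current_note is not None and
            abs(fractional_note - current_note) < bend_range):
        note = current_note
    else:
        note = round(fractional_note)

    offset = (fractional_note - note) / bend_range  # roughly -1 .. +1
    bend = round(BEND_CENTER * (1 + offset))
    bend = max(0, min(BEND_MAX, bend))

    # Split the 14-bit bend into two 7-bit data bytes.
    return note, bend & 0x7F, (bend >> 7) & 0x7F

def pitch_bend_message(bend_lsb, bend_msb, channel=0):
    """Raw three-byte MIDI pitch bend message."""
    return bytes([0xE0 | channel, bend_lsb, bend_msb])

# A 100-sample wave at 44,000 samples per second is exactly A440.
note, lsb, msb = wavelength_to_midi(100, 44000)
message = pitch_bend_message(lsb, msb)
```

    Taking `bend_range` as a parameter keeps the ±2 assumption in one place, which matters because a wrong guess about the synth's range scales every bend by the wrong amount.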

    I have this working reasonably well, except that when I bend more than a whole tone I get spurious attacks. Say I slide from A to C: the slide from A to Bb to B can all be done with pitch bend, but once I go above B the system needs to turn off the bent A and start a new note. I would love to suppress the attack for that new note, but I don't know of any way to communicate that in MIDI. I don't know what people with existing continuous-pitch electronic instruments do.
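
    To make the retrigger concrete, here is a sketch of the raw messages involved. The helper names `retrigger_messages` and `bend_range_messages` are hypothetical. The second helper builds the standard MIDI Registered Parameter 0 (pitch bend sensitivity) sequence, which asks a synth to use a wider bend range so that fewer slides need a new note. Whether a particular synth honors it is not something the post establishes, so treat that part as an assumption to test rather than a fix.

```python
def retrigger_messages(old_note, new_note, velocity=100, channel=0):
    """Note off for the bent note, then note on for its replacement.

    The note on is what produces the audible attack.
    """
    return [
        bytes([0x80 | channel, old_note, 0]),         # note off
        bytes([0x90 | channel, new_note, velocity]),  # note on
    ]

def bend_range_messages(semitones, channel=0):
    """Control changes selecting RPN 0 and setting the bend range.

    Assumes the receiving synth implements RPN 0 (pitch bend
    sensitivity); not every synth does.
    """
    status = 0xB0 | channel  # control change
    return [
        bytes([status, 101, 0]),        # RPN MSB = 0
        bytes([status, 100, 0]),        # RPN LSB = 0
        bytes([status, 6, semitones]),  # data entry MSB: half steps
        bytes([status, 38, 0]),         # data entry LSB: cents
    ]

# Sliding from A4 (69) up past B4 forces a new note, here C5 (72).
slide = retrigger_messages(69, 72)

# Asking for a +/-12 half step range instead of the usual +/-2.
setup = bend_range_messages(12)
```

    With a wider range the bend math has to use the same number of half steps, and each of the 16,384 bend steps covers a proportionally larger pitch interval.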

    A second problem I've run into is that what sounds like a steady whistled pitch actually has a lot of tiny variations. Consider this input: (mp3)

    This sounds reasonably steady to me, but it isn't really. Here are some successive zero crossings:

    wavelength (samples)   frequency (Hz)   MIDI note
    39.02                  1130.07          49.3300
    39.26                  1123.41          49.2277
    38.66                  1140.68          49.4918
    39.25                  1123.62          49.2309
    38.90                  1133.71          49.3857
    38.85                  1135.21          49.4087
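
    The size of that wobble can be checked with a few lines of Python. This sketch assumes a 44,100 Hz sample rate, which the post doesn't state, and uses the note formula from the pseudocode earlier. With that formula these frequencies land near MIDI note 85; the note column in the table is 36 lower, which would be consistent with a three-octave transposition, though the post doesn't say so.

```python
import math

SAMPLE_RATE = 44100  # assumed; not stated in the post

# Wavelengths in samples, copied from the table above.
wavelengths = [39.02, 39.26, 38.66, 39.25, 38.90, 38.85]

def midi_note(frequency):
    """Fractional MIDI note number, with A440 as note 69."""
    return 69 + 12 * math.log2(frequency / 440)

notes = [midi_note(SAMPLE_RATE / w) for w in wavelengths]

# Distance between the sharpest and flattest reading,
# in cents (hundredths of a half step).
spread_cents = 100 * (max(notes) - min(notes))
print(f"spread: {spread_cents:.1f} cents")
```

    The spread comes out at roughly a quarter of a half step, all within six consecutive cycles of a note that sounds steady to the ear.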

    My synth doesn't mind, and just passes the variability through to the listener, where it's not a problem. I track where in the wave we are, and slowly adjust the rate we move through the wave to match the desired frequency:

    • Bass output:
      (mp3)
    • Pure sine treble output:
      (mp3)
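
    Tracking the position in the wave and easing the rate toward the target can be sketched as a phase accumulator. This is one possible reading of that description, not the actual synth: the class name `SlewedOscillator` and the `slew` constant are assumptions.

```python
import math

class SlewedOscillator:
    """Sine oscillator whose frequency glides toward a target.

    slew is the fraction of the remaining frequency gap closed on
    each sample; the value here is an illustrative guess.
    """

    def __init__(self, sample_rate=44100, slew=0.001):
        self.sample_rate = sample_rate
        self.slew = slew
        self.phase = 0.0       # position in the wave, in cycles [0, 1)
        self.frequency = None  # current frequency in Hz, unset at first

    def next_sample(self, target_frequency):
        if self.frequency is None:
            # First call: start directly on the target.
            self.frequency = target_frequency
        else:
            # Move part of the way toward the target.
            gap = target_frequency - self.frequency
            self.frequency += self.slew * gap

        sample = math.sin(2 * math.pi * self.phase)
        # Advance through the wave at the current rate. Phase is
        # never reset, so a frequency change can't cause a click.
        self.phase = (self.phase + self.frequency / self.sample_rate) % 1.0
        return sample

osc = SlewedOscillator()
first = osc.next_sample(440.0)
glide = [osc.next_sample(880.0) for _ in range(10000)]
```

    Because the phase is continuous and the frequency only ever moves a little per sample, cycle-to-cycle jitter in the detected pitch gets smoothed without adding much delay.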

    When I pass that variability into some regular synths, however, I get a wavery output even when I don't cross note boundaries. I think this may be another artifact of using a synth that isn't designed for continuous pitch input. Or possibly the problem is that real MIDI pitch wheels don't suddenly jump from +23% to +49% of a half step within about 40 samples (under a millisecond), so synths haven't needed to handle it.

    I can reduce this on my end by averaging the most recent pitches to smooth out the variability, but then it stops feeling so responsive: quick slides don't work, and if I start a note slightly sour it takes longer to fix. I think the answer is probably "find a better synth", but I'm not sure how to figure out which one to use.
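
    The trade-off between smoothness and responsiveness shows up even in a very small moving average. This is a sketch with a hypothetical `PitchSmoother` class and an arbitrary window size. It averages fractional note numbers rather than frequencies, which is an assumption about where the smoothing would go.

```python
from collections import deque

class PitchSmoother:
    """Moving average over the most recent pitch estimates."""

    def __init__(self, window=4):
        # deque with maxlen drops the oldest value automatically.
        self.recent = deque(maxlen=window)

    def update(self, fractional_note):
        self.recent.append(fractional_note)
        return sum(self.recent) / len(self.recent)

smoother = PitchSmoother(window=4)

# Hold note 60 steadily, then jump up a whole tone to 62.
held = [smoother.update(60.0) for _ in range(4)]
after_jump = [smoother.update(62.0) for _ in range(4)]
print(after_jump)
```

    The output only reaches 62 once the whole window has filled with the new pitch, so a longer window hides more jitter but makes slides and corrections lag by the same amount.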

    Still, I like this a lot, and I think there's something here. If you have a Mac and want to play with this, the code is on GitHub.



