• Posts
  • RSS
  • ◂◂RSS
  • Contact

  • Evaluating Opus

    August 8th, 2020
    audio, singing, tech
    Several months ago, I proposed a bucket brigade approach to singing with friends over the Internet. Recently, Glenn and I have been working on an implementation (prototype source code). We can't use WebRTC (I think) because we need fine-grained control over latency, so we're doing everything manually. Which raises the question of compression.

    CD quality audio, which your browser is happy to give you, represents a second of audio with 44,100 ("44.1kHz") 16bit samples per channel. It's possible to send this raw over the internet, but that's a lot of bandwidth: for a single channel, 88,200 bytes per second in each direction, or 1.4 Mb/s round trip. That's not completely nuts, but many people on Wi-Fi are in environments that won't be able to handle that consistently.
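
    As a quick check of that arithmetic, here is a small Python sketch. The function name `pcm_bitrate` is made up for illustration, and it assumes a single (mono) channel, which is what the 88,200 bytes per second figure implies.

```python
def pcm_bitrate(sample_rate_hz, bits_per_sample, channels=1):
    """Raw (uncompressed) PCM bit rate in bits per second, one direction."""
    return sample_rate_hz * bits_per_sample * channels


one_way_bits = pcm_bitrate(44_100, 16)       # assumed mono
bytes_per_second = one_way_bits // 8         # one direction
round_trip_mbps = 2 * one_way_bits / 1e6     # up plus down, decimal megabits

print(bytes_per_second)   # 88200
print(round_trip_mbps)    # 1.4112
```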

    One way to do better is to just throw away data. Instead of taking 44,100 samples every second, just take a quarter of that (11,025). Instead of using sixteen bits per sample, just send the eight most important ones. This is a factor of eight smaller, at about 176 kb/s round trip, but compare:

    (CD Quality: 44.1kHz, 16bit)

    (Reduced: 11kHz, 8bit, undithered)

    This doesn't sound great: there are some weird audio artifacts that come from rounding in a predictable way. Instead, people would normally dither it:

    (Reduced: 11kHz, 8bit, dithered)

    This sounds much more realistic, but it adds a lot of noise. Either way, in addition to sounding awkward, this approach also uses more bandwidth than would be ideal. Can we do better?
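
    To make the two reduced versions concrete, here is a minimal Python sketch of this kind of reduction. It is not the code used to produce the clips above: the function names are hypothetical, it assumes signed 16-bit input samples, it decimates by simply keeping every fourth sample (a real resampler would low-pass filter first), and it assumes triangular (TPDF) dither of one 8-bit step, which is one common choice.

```python
import random


def decimate(samples, keep_every=4):
    """Keep every `keep_every`-th sample (no anti-aliasing filter)."""
    return samples[::keep_every]


def quantize_8bit(samples, dither=False, rng=None):
    """Reduce signed 16-bit samples to signed 8-bit values.

    With dither=True, triangular noise spanning +/- one 8-bit step is
    added before rounding, so the rounding error stops tracking the signal.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed, only to make the sketch repeatable
    step = 256                  # one 8-bit step, measured in 16-bit units
    out = []
    for s in samples:
        noise = (rng.random() - rng.random()) * step if dither else 0.0
        q = round((s + noise) / step)
        out.append(max(-128, min(127, q)))  # clamp to the 8-bit range
    return out


# A quiet, constant input a quarter of a step above zero.
quiet = [64] * 20_000
plain = quantize_8bit(quiet)
dithered = quantize_8bit(quiet, dither=True)

print(set(plain))                     # {0}: the quiet signal vanishes
print(sum(dithered) / len(dithered))  # approximately 0.25
```

    Undithered, the quiet input rounds to silence. Dithered, its average level survives, at the cost of the added noise you can hear in the clip.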

    I wrote to Chris Jacoby, and he suggested I look into the Opus codec. I tried it out, and it's pretty great:

    (Opus: 64 kb/s)

    (Opus: 32 kb/s)

    (Opus: 24 kb/s)

    (Opus: 16 kb/s)

    All of these are smaller than the reduced version above, and all of them except the 16 kb/s one sound substantially better.

    In our use case, however, we're not talking about sending one large recording up to the server. Instead, we'd be sending batches of samples off perhaps every 200 ms. Many compression systems do better if you give them a lot to work with; how efficient is Opus if we give it such short windows?

    One way to test is to break the input file up into 200 ms files, encode each one with Opus, and then measure the total size. The default Opus file format includes what I measure as ~850 bytes of header, however, and since we control both the client and the server we don't have to send any header. So, excluding headers, I count for my test file:

    Format                         Size
    CD Quality                     706 KB
    11kHz, 8bit                    88 KB
    Opus, 32 kb/s                  44 KB
    Opus, 32 kb/s, 200ms chunks    51 KB
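
    The chunked measurement can be sketched in Python as below. This is only an outline under stated assumptions: `split_chunks` and `payload_bytes` are hypothetical helpers, the 850-byte header is the rough figure measured above, and the actual encoding step (running each chunk through an Opus encoder) is left out.

```python
def split_chunks(samples, sample_rate_hz=44_100, chunk_ms=200):
    """Split a list of samples into consecutive chunks of `chunk_ms` each."""
    per_chunk = sample_rate_hz * chunk_ms // 1000
    return [samples[i:i + per_chunk] for i in range(0, len(samples), per_chunk)]


def payload_bytes(encoded_file_sizes, header_bytes=850):
    """Total size of the encoded chunks once each file's header is dropped."""
    return sum(size - header_bytes for size in encoded_file_sizes)


# 8.2 seconds of (placeholder) audio at 44.1 kHz.
samples = [0] * (8_200 * 44_100 // 1000)
chunks = split_chunks(samples)

print(len(chunks[0]))  # 8820 samples per 200 ms chunk
print(len(chunks))     # 41 chunks

# Relative sizes from the table above.
print(round(51 / 44, 2))  # 1.16: chunking costs about 16% more
print(round(51 / 88, 2))  # 0.58: still well under the 11kHz, 8bit version
```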

    I was also worried that maybe splitting the file up into chunks would sound bad, either because they wouldn't join together well or because it would force Opus to use a lower quality encoding. But it sounds pretty good to me:

    (200ms chunks)

    Another question is efficiency: how fast is Opus? Here's a very rough test, on my 2017 MacBook Pro:

    # Encode the test file 1000 times
    $ time for i in {1..1000}
        do opusenc --quiet \
           --music --bitrate 32 \
           row-row-row.wav output.opus32
        done
    # Decode it 1000 times
    $ time for i in {1..1000}
        do opusdec --quiet \
          row-row-row.opus32 output.wav
        done
    

    The test file is 8.2 seconds of audio. Encoding 1000 copies took 68s, decoding took 43s, for a total of 111s. This is 74x realtime, and is the kind of embarrassingly parallel computation where you can just add more cores to support more users.
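
    The realtime figure follows directly from those timings. A small Python check, with `realtime_factor` as a made-up helper name:

```python
def realtime_factor(audio_seconds, runs, wall_seconds):
    """Seconds of audio processed per second of wall-clock time."""
    return audio_seconds * runs / wall_seconds


encode_s, decode_s = 68, 43
factor = realtime_factor(8.2, 1000, encode_s + decode_s)
print(round(factor))  # 74
```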

    On the server it looks like we can use the Python bindings, and on the client we can use an Emscripten WebAssembly port.

    Overall, this seems like a solid chunk of work, but also a good improvement.



