August 4th, 2014
"Researchers at MIT, Microsoft, and Adobe have developed an algorithm that can reconstruct an audio signal by analyzing minute vibrations of objects depicted in video. In one set of experiments, they were able to recover intelligible speech from the vibrations of a potato-chip bag photographed from 15 feet away through soundproof glass."

Now, this was at 2,000 to 6,000 frames per second, which is much faster than most cameras, but they were also able to get some data "even from video recorded at a standard 60 frames per second":
"even from video recorded at a standard 60 frames per second. While this audio reconstruction wasn't as faithful as it was with the high-speed camera, it may still be good enough to identify the gender of a speaker in a room; the number of speakers; and even, given accurate enough information about the acoustic properties of speakers' voices, their identities."

They briefly mention the Nyquist limit, implying that to reconstruct audio from video you need a frame rate at least twice the highest sound frequency you want to recover. This would be true if we had only one pixel of data, but we have a whole 2D image. Compare this to looking at waves in a pool. With two timed samples at a single point we can tell almost nothing about the wave pattern, but if we take a photograph of the whole pool we can understand a lot about the waves.
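To make the pool analogy concrete, here is a toy sketch in Python. It is not the researchers' algorithm; it only illustrates why many pixels in one frame can carry information that one pixel sampled over time cannot. Everything in it is an assumption made for illustration: a single 100 Hz travelling wave, a known wave speed of 2 m/s, and a hypothetical one-dimensional "camera" with 200 pixels per metre.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate):
    """Return the frequency of the strongest non-DC DFT bin.

    The result is in cycles per unit of whatever `sample_rate` counts
    (per second for time samples, per metre for spatial samples).
    """
    n = len(samples)
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            s * cmath.exp(-2j * math.pi * k * i / n)
            for i, s in enumerate(samples)
        )
        if abs(coeff) > best_power:
            best_bin, best_power = k, abs(coeff)
    return best_bin * sample_rate / n


# Assumed toy wave: all three values are made up for this illustration.
TRUE_FREQ_HZ = 100.0
WAVE_SPEED = 2.0                        # metres per second, assumed known
WAVELENGTH = WAVE_SPEED / TRUE_FREQ_HZ  # metres


def wave(x, t):
    """Height of the travelling wave at position x (m) and time t (s)."""
    return math.sin(2 * math.pi * (x / WAVELENGTH - TRUE_FREQ_HZ * t))


# One pixel watched for one second at 60 fps: 100 Hz is above the
# 30 Hz Nyquist limit, so it shows up at the wrong (aliased) frequency.
fps = 60
one_pixel = [wave(0.0, i / fps) for i in range(fps)]
aliased = dominant_frequency(one_pixel, fps)

# One frame covering a metre of the "pool": the spatial pattern gives the
# wavelength, and with the assumed wave speed that gives the frequency.
pixels_per_metre = 200
snapshot = [wave(i / pixels_per_metre, 0.0) for i in range(pixels_per_metre)]
cycles_per_metre = dominant_frequency(snapshot, pixels_per_metre)
recovered = cycles_per_metre * WAVE_SPEED

print(f"one pixel over time: {aliased:.1f} Hz")
print(f"one frame over space: {recovered:.1f} Hz")
```

The single pixel reports a 20 Hz tone that was never there, while the single frame recovers the 100 Hz wave. The catch is that the snapshot only works because the wave speed is assumed known and the wave is a clean sinusoid; a real vibrating object gives no such guarantees.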
Which then makes me wonder: in many places audio recording must be known to everyone involved but video recording can be secret. Does the technical possibility of audio reconstruction change this? Does this make it illegal to record video of someone without their consent, because a video camera is now just another kind of microphone? Will people in Massachusetts have to post signs about their surveillance cameras? Or does the illegal act occur only when someone later converts the video into audio?