{"items": [{"author": "Al", "source_link": "https://www.facebook.com/jefftk/posts/625526308142?comment_id=625528004742", "anchor": "fb-625528004742", "service": "fb", "text": "Back in the '80s I worked for a company that used gzip and CCITT to produce a hardware compression of 10:1. Amazing that s/w compression is faster and more efficient ...", "timestamp": "1377375199"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/625526308142?comment_id=625530923892", "anchor": "fb-625530923892", "service": "fb", "text": "@Al: gzip isn't usually anywhere near 1032x though; that's the theoretical maximum.", "timestamp": "1377377595"}, {"author": "Jan-Willem", "source_link": "https://plus.google.com/100580955183019057735", "anchor": "gp-1377383927454", "service": "gp", "text": "You've seen\nhttp://research.swtch.com/zip", "timestamp": 1377383927}, {"author": "Jan-Willem", "source_link": "https://plus.google.com/100580955183019057735", "anchor": "gp-1377392626538", "service": "gp", "text": "Whoops, that got truncated. I had a colleague who had a \"zip bomb\" file on his site for a long time, but I think that bug got fixed. The reason you're seeing less than the theoretical maximum, I think, is that gzip does some framing on the file, which has a few bytes of overhead. This allows the compressor to do things like decide not to encode chunks of the file if it contains a mix of already-compressed and compressible content. It's why you can sometimes usefully gzip metadata-heavy image files even though gzipping an ordinary image file generally makes it bigger.", "timestamp": 1377392626}, {"author": "Eric", "source_link": "https://plus.google.com/113202109784097860410", "anchor": "gp-1377452188279", "service": "gp", "text": "Have you looked at SDCH? I haven't been able to find the details of how it works, so I have looked at xdelta, which it is loosely based on.\nWith a 1k dictionary of zeros, a 1GB zero file compresses down to 4243 for a ratio of 203862X. The other data points I used suggest that it reaches a limit near 240,000X, but I don't have the disk space to test fully.", "timestamp": 1377452188}, {"author": "Daniel", "source_link": "https://www.facebook.com/jefftk/posts/625526308142?comment_id=625695873332", "anchor": "fb-625695873332", "service": "fb", "text": "Hmmm, I would have attributed the point about using short codes for common outcomes and long codes for rare outcomes to Shannon; it seems like the basic insight of information theory. Huffman's idea was a particular algorithm for how to do the assignment of bit strings to outcomes. In other words, Shannon figured out that the encoding method needed to have L(x) = -log P(x) to get optimal codelengths, and Huffman developed an actual algorithm for building an encoding function out of a probability table that had this property.", "timestamp": "1377489695"}]}