The International Obfuscated C Code Contest

1990/jaw - Best entropy-reducer

uncompress and atob stand-ins

Authors:

To build:

    make all

Bugs and (Mis)features:

The current status of this entry is:

STATUS: known bug - please help us fix

For more detailed information see 1990/jaw in bugs.html.

Try:

To test the official C entry, one might try:

    echo "Quartz glyph jocks vend, fix, BMW." | compress | ./btoa | ./jaw

which should apply the identity transformation to a minimal holoalphabetic sentence.
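
A "minimal holoalphabetic sentence" is a pangram that uses each of the 26 letters exactly once. The following C sketch is not part of the entry; the function name `is_minimal_pangram` is made up for illustration, and ASCII letters are assumed. It checks that property for a given string:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if every letter a-z occurs exactly once
 * in s (case-insensitive, non-letters ignored), otherwise 0.
 * Assumes an ASCII-compatible character set. */
int is_minimal_pangram(const char *s)
{
    int seen[26] = {0};
    int i;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        int lower = tolower((unsigned char)*s);
        if (lower >= 'a' && lower <= 'z')
            seen[lower - 'a']++;      /* count each letter */
    }
    for (i = 0; i < 26; i++)
        if (seen[i] != 1)             /* missing or repeated letter */
            return 0;
    return 1;
}
```

Calling `is_minimal_pangram("Quartz glyph jocks vend, fix, BMW.")` returns 1, since the sentence contains 26 letters, one of each.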

Also try:

    ./try.sh

Judges’ remarks:

The program, in its base form, implements two useful utilities:

Included with this entry is a shell script (with comments edited down to reduce it to 1530 bytes) which implements the complete shark utility. The script, shark.sh, contains a jaw.c embedded within it!

The sender must have compress and btoa. To send, try:

    ./shark.sh jaw.* README.md > receive

The resulting file, receive, unpacks the input files even if the receiver lacks uncompress and atob:

    mkdir -p test
    cd test
    sh ../receive
    cmp ../jaw.c jaw.c
    cmp ../README.md README.md

NOTE: one limitation is that the script must be in the parent directory of the directory it is run from.

Authors’ remarks:

ABSTRACT

Minimal, Universal File Bundling (or, Functional Obfuscation in a Self-Decoding Unix Shell Archive)

“Use an algorithm, go to jail.”

– anon., circa 1988

HISTORICAL NOTE: We started working on this just before the Morris worm era.

Myriad formats have been proposed for network-mailable data. A major difficulty undermining the popularity of most file/message bundlers is that the sender assumes prior installation of the computational dual of such bundling software by the receiver. Command shell archives alleviate this problem somewhat, but still require standardization for the function of data compression and mail-transparency encoding. On Unix, these coding format quandaries are overcome by planting a novel Trojan Horse in the archive header to perform negotiation-less decoding.

Specifically, we outline the development of an extraordinarily compact portable bundler/extractor to (dis)assemble data-compressed, binary-to-ASCII-converted, length-split, and checksummed directory structures using standard Unix tools. Miniature versions of counterparts to a Lempel-Ziv coder (compress or squeeze) and an efficient bit packetizer (btoa) are compiled on-the-fly at mail destination sites where they may not already exist. These are written in purposefully obfuscated-C to accompany similarly-shrunk shell command glue. This resulting shell archiver is dubbed shark.
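
For readers unfamiliar with btoa, its core idea is to pack each group of 4 bytes into 5 printable characters by writing the 32-bit value in base 85. A minimal sketch of that one step follows. It is not taken from jaw.c or from the real btoa source, the function names are hypothetical, and it leaves out everything else a real encoder needs (framing lines, checksums, a final short group, and any shorthand for special groups).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one 4-byte group as 5 base-85 digits,
 * most significant first, each offset from '!' (ASCII 33).
 * out receives 5 characters plus a terminating NUL. */
void encode85_group(const unsigned char in[4], char out[6])
{
    uint32_t word;
    int i;

    assert(in != NULL && out != NULL);
    word = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    for (i = 4; i >= 0; i--) {
        out[i] = (char)('!' + word % 85);
        word /= 85;
    }
    out[5] = '\0';
}

/* Hypothetical helper: decode 5 base-85 digits back into 4 bytes.
 * Returns 0 on success, -1 on an invalid digit or a value above 2^32 - 1. */
int decode85_group(const char in[5], unsigned char out[4])
{
    uint64_t word = 0;   /* 64 bits so that 85^5 - 1 cannot overflow */
    int i;

    assert(in != NULL && out != NULL);
    for (i = 0; i < 5; i++) {
        if (in[i] < '!' || in[i] > 'u')   /* 'u' is '!' + 84 */
            return -1;
        word = word * 85 + (uint64_t)(in[i] - '!');
    }
    if (word > 0xFFFFFFFFu)
        return -1;
    out[0] = (unsigned char)(word >> 24);
    out[1] = (unsigned char)((word >> 16) & 0xFF);
    out[2] = (unsigned char)((word >> 8) & 0xFF);
    out[3] = (unsigned char)(word & 0xFF);
    return 0;
}
```

With this sketch, the four bytes `Man ` encode to `9jqo^`, and decoding those five characters yields the original bytes. The output is 25% larger than the input, compared with 33% for a 3-bytes-to-4-characters scheme.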

shark procedure overhead consumes as few as three dozen shell commands (or ~1100 bytes), commensurate with the size of many Internet mail headers; it amortizes favorably with message size. shark is portable across Unix variants, while the underlying technique is inherently generalizable to other encoding schemes.

In the function-theoretic sense of minimal Chaitin/Kolmogorov complexity, and within a modified Shannon model of communication, the ‘shark’ effort aims to construct a “shortest program” for source decoding in the Turing-universal Unix environment.

Oh, the shark has pretty teeth, dear–
And he shows them pearly white
Just a jackknife has Macheath, dear–
And he keeps it out of sight.

– Bertolt Brecht, Threepenny Opera

Portability

We have ported this program to a wide variety of systems. Among these are:

Obfuscation

We (the authors) feel this program is obfuscated for the following reasons:

  1. This is one of the few programs you’ll see WHOSE VERY UTILITY DEPENDS ON ITS OBFUSCATION!

  2. The contest entry may be used to send its wonderful self to anyone in the Unix world! Virus writers need not apply…

  3. The basic idea is twisted enough to be patentable, but is, out of the kindness of our hearts (as well as to maintain eligibility for the large IOCCC prize fund), dedicated to the public domain. Claude Shannon, meet Alan Turing.

  4. Meta-obfuscation is via obfuscated description (see ABSTRACT).

  5. “Literary” allusion. Production code contains a reference to self-reference, preserved at amazing cost for sheer perversity.

  6. Many, many micro obfuscations below, honed over three years’ time, in shell as well as C. Ask about the ‘tar’ pit escape, the argv[0] flip, Paul’s &4294967295 portability hack, the “void where prohibited by flaw” fix, the scanf() space-saver, shift shenanigans, signal madness, exit()ing stage left, and source-to-source transformations galore.
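
The list above does not explain what the `&4294967295` hack does inside jaw.c. One common reason to write such a mask (an assumption here, not a description of the entry) is to force 32-bit behaviour on machines where `long` is wider than 32 bits, since 4294967295 is 2^32 - 1. The hypothetical helper below rotates a 32-bit value left by one bit and gives the same answer for any width of `unsigned long`:

```c
#include <assert.h>

/* Hypothetical helper, not from jaw.c: rotate the low 32 bits of v left
 * by one position, regardless of how wide unsigned long is. */
unsigned long rotl32_by_1(unsigned long v)
{
    unsigned long top, result;

    v &= 4294967295UL;                 /* discard any bits above bit 31 */
    top = v >> 31;                     /* the bit that wraps around */
    result = ((v << 1) | top) & 4294967295UL;
    assert(result <= 4294967295UL);    /* result always fits in 32 bits */
    return result;
}
```

Without the final mask, a 64-bit `unsigned long` would keep the shifted-out bit in position 32, and the result would differ from that on a 32-bit machine.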

For extra credit:

Construct sharkmail, to auto-split sharkives into mailable segments and mail them. Here’s a simple one, which could be extended to enable auto-reassembly with one shell cmd at the far end.

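    #!/bin/sh
    # usage: sharkmail recipient file ...
    m=$1; shift
    # bundle the files, then split the sharkive into 800-line pieces
    shark "$@" | split -l 800 - /tmp/shark$$
    # count the pieces, stripping the leading blanks some wc versions emit
    n=`ls /tmp/shark$$* | wc -l | sed 's/ *//'`
    p=0
    for f in /tmp/shark$$*
    do
         p=`expr $p + 1`
         mail -s "bundle ($p of $n) from '`whoami`'" "$m" < "$f"
    done
    rm -f /tmp/shark$$*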

Shark history:

To which we add:

Inventory for 1990/jaw

Primary files

Secondary files

