IOCCC image by Matt Zucker

The International Obfuscated C Code Contest

2001/herrmann1 - Best abuse of the C preprocessor

A Turing machine using preprocessor

Author:

To build:

    make

Bugs and (Mis)features:

The current status of this entry is:

STATUS: known bug - please help us fix

For more detailed information see 2001/herrmann1 in bugs.html.

To use:

    ./herrmann1.sh 'prg=file'

NOTE: this entry seems to rely on older versions of gcc for the compilation stage and as such if you wish to see that part like it was designed you will need an older compiler perhaps gcc 2.95.

Try:

    ./try.sh

Judges’ remarks:

There are some clear hints in the source code as to the purpose of this program. Perhaps the only way to figure out what is going on is to look at the differences between the runs of the C preprocessor! Computer Science students should recognize what is going on.

Author’s remarks:

Many cool things have been done with the preprocessor, e.g. calculation of prime numbers or the Tower of Hanoi. However, up to now, no one proved that the preprocessor is Turing complete, and I wondered why. Well, the answer is simply that the preprocessor is not Turing complete, at least not if the program is preprocessed only once. This is true even if the program is allowed to include itself. (The reason being that for a given program, the preprocessor has only a finite number of states, plus a stack consisting of the places which the file has been included from. This is only a push-down automaton.)

Well, this entry is a Turing machine implemented in the preprocessor; so of course, it has to be preprocessed several times. Here are the main features:

How to use it

For the impatient

The info files herrmann1.times2 and herrmann1.gcd are sample programs for the Turing machine. For example, type

    ./herrmann1.sh 'prg=herrmann1.times2'

and watch the animated Turing machine multiply 8 by two (in unary representation). Here’s a screen shot during compilation:

    /*
    *
    * State: st2.
    *
    * _ _ _ _ _ * _ _ _ _ _ _ _ _ _ _ _ _ _
    * ... O O I I I I I O I I I I I O O ...
    * ~ ~ ~ ~ ~ * ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
    *
    *
    */

(As on real videos, you’ll see a bit of “white noise” at the beginning and at the end.;-) After compilation has terminated, type ./herrmann1 to see the final result. The output will be

Final tape:

                      *
    ..OOIIIIIIIIIIIIIIIIOOOO..
                      *

(Note that the last frame of the animation is not the final result.)

In the previous example, the Turing machine started on the default tape of the program herrmann1.times2. To provide your own tape, type (for example)

    ./herrmann1 prg=herrmann1.times2 tape="O O O I I I O O O"

(The Os and Is are letters, not digits.) You might prefer to provide only a finite portion of the tape, like I did above. In this case, the remainder of the tape is filled with Os. ;-)

If you just type:

    ./herrmann1

and run the program, you’ll get a usage message (and the return code will be 2).

Note: To get a really nice animation, you’ll need a terminal which is able to scroll fast enough. xterm is not very good. The KDE terminal konsole or the text console under linux are much better. Simply try some out.

Writing your own programs

Now, let’s assume you want to write a program which adds 1 to a number in unary representation. (That is, a row of I’s is expected, and one more I is appended to the row.) Let’s call it plus1.turing. Here’s what it could look like:

    /* Search beginning of number. */
    #define start_O O, right, start
    /* Number found. Move one step to the left. */
    #define start_I I, left, st2
    /* Write a I and stop */
    #define st2_O   I, left, stop

Some explanations: - The tape alphabet is always O and I. - The states can be anything starting with “st”. “start” and “stop” are the start and stop states, respectively. - To define a state, use two lines: one for each letter on the tape. You may omit one of the lines if the Turing machine will never enter the state reading that letter. (In the example, the line defining st2_I has been left out.) - The order of the lines is irrelevant. - You may use C-style comments. (Would you have guessed it?)

Now, the program can be used with

    ./herrmann1.sh 'prg=plus1.turing' 'tape="I I I"'

But what happens if no tape is provided? Up to now, the Turing machine would start on an empty tape (that is, a tape filled with Os). For many programs, this doesn’t make much sense (as is the case for plus1.turing). So you might want to provide a default tape by adding a line like

    #define tape O O O O O I I I I

to your program.

Debugging

When writing long programs, the most common errors are to forget to define a state or to misspell one. Let’s assume you forgot the line

    #define start_I I, left, st2

Remove it and type

    ./herrmann1 'prg=plus1.turing'

The output will be

Undefined state start.

Tape:

            *
    ..OOOOOOOOIIIIOO..
            *

and the return code will be 1. As you can see, the Turing machine moved over the Os, but when it arrived at the I (still in state start), the error occurred.

Note that other kinds of errors can cause compilation to fail. If this happens, the build script will write the error message of gcc into a file called error and then exit.

What the build script does

In principle, the build script simply runs the preprocessor over and over again, until nothing changes anymore; then, the program is compiled. After each preprocessing step, the script outputs the beginning of the program; this is where the animation takes place (except in the first and last few steps).

However, there’s a technical problem to overcome: Some compilers (including gcc) insert a lot of white space during the preprocessing. This is bad for two reasons. First, the animated picture won’t look right. And second, after some time, the file will grow too big for your hard disk. (The amount of white space grows exponentially.) So after each preprocessing, the first thing the build script does is remove all the unnecessary white space using sed.

(gcc would even insert #line directives if I didn’t suppress them with the -P option.)

Some more details

Some technical details / obfuscations

    #define a 0&0
    #if !a

Of course, it is a bit more subtle, and even useful.

Compatibility

    -e "/^#line/ d"

to the sed parameters. Or, if your #line directives look like

    # 15 "filename"

add

    -e "/^# [0-9]/ d"

instead.

Inventory for 2001/herrmann1

Primary files

Secondary files


Jump to: top