
The International Obfuscated C Code Contest

1984/mullender - Grand Prize

Hardcoded VAX and PDP-11 code to move :-) across the screen

Authors:

To build:

    make alt

NOTE: the original code only works on a VAX-11 or PDP-11, which is why we encourage you to use the alt version instead. See Original code below for the original version.

Bugs and (Mis)features:

The current status of this entry is:

STATUS: INABIAF - please DO NOT fix

For more detailed information see 1984/mullender in bugs.html.

To use:

    ./mullender.alt [microseconds]

Hit ctrl-c/intr to exit the program.

The default is 190000 microseconds, which is approximately how long the original entry slept. The value can be no lower than 5000, as anything lower doesn’t work very well. This feature lets you experiment with different delays between writes. It can be useful if your CPU is too fast :-)

The author stated that the original version also had a delay; the difference is that it required you to hit enter before it printed another line, whereas the alternate code starts over once it times out or when a key is hit.

Note that the microseconds value comes from the command line and is converted with atoi(), which does NOT check for overflow!
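Because atoi() cannot report overflow, a stricter parse can be sketched with strtol(). This is only an illustration and not the code in the alt version: the helper name parse_usec and the choice to fall back to the default on bad input (and to clamp low values) are assumptions. The 190000 default and the 5000 minimum are the values given above.

```c
#include <errno.h>
#include <stdlib.h>

#define DEFAULT_USEC 190000L  /* default delay quoted in this README */
#define MIN_USEC     5000L    /* lowest delay this README says works well */

/*
 * Hypothetical helper: convert a command-line string to a delay in
 * microseconds. Unlike atoi(), strtol() lets us detect overflow and
 * trailing garbage; on any such problem we return the default.
 */
static long parse_usec(const char *arg)
{
    char *end = NULL;
    long usec;

    if (arg == NULL || *arg == '\0')
        return DEFAULT_USEC;

    errno = 0;
    usec = strtol(arg, &end, 10);
    if (errno == ERANGE || *end != '\0')
        return DEFAULT_USEC;    /* overflow or not a plain number */
    if (usec < MIN_USEC)
        return MIN_USEC;        /* clamp to the documented floor */
    return usec;
}
```

A caller would pass argv[1] when argc > 1 and NULL otherwise.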

BTW: is there such a thing as too fast a CPU? :-) Actually yes, for certain code, which is probably not as uncommon as you think :-).

Try:

    ./mullender.alt

    ./mullender.alt 5000        # wait for 5000 microseconds and see what happens

    ./mullender.alt 20000       # wait for 20000 microseconds and see what happens

    ./mullender.alt 100000      # wait for 100000 microseconds and see what happens

What happens if you hit enter after it reaches the end of the line? Why? What happens after some time of waiting?

Original code:

This original code will only execute correctly if your machine is a VAX-11 or PDP-11. From 1985 on, machine-dependent code was discouraged.

Original build:

    make all

Original use:

    ./mullender

Judges’ remarks:

Without question, this C program is one of the most obfuscated C programs that has ever been received! Like all great contest entries, it resulted in a change of rules for the following year. To prevent a flood of similar programs, the rules were changed to request non-machine-specific code.

This program was selected for the 1987 t-shirt collection.

The C startup routine, via crt0.o, transfers control to a location named main. In this case, main just happens to be in the data area. The array of shorts, which has been further obfuscated by use of different data types, just happens to form a meaningful set of PDP-11 and VAX instructions.

The first word is a PDP-11 branch instruction that branches to the rest of the PDP code.

On the VAX, main is called with the calls instruction, which uses the first word of the subroutine as a mask of registers to be saved. So on the VAX the first word can be anything. The real VAX code starts with the second word.

This small program makes direct calls to the write() Unix system call to produce a message on the screen. Can you guess what is printed? We knew you couldn’t! :-)

What happens if you hit enter after it writes a line of output?

BTW: to this day, 2023, this remains one of my (Landon Curt Noll’s) all time favorite entries!

gentab.c

In 2023 remarks were discovered from Sjoerd Mullender, one of the authors, and so was the program that the authors used to generate the array he referred to. Because a.out.h, which gentab.c uses, is not available on all systems (like macOS), and more importantly because we wanted it to be as close to the original as possible, we used a copy of https://raw.githubusercontent.com/dspinellis/unix-history-repo/Research-Release/usr/include/a.out.h from the fabulous Unix History Repo.

gentab build:

gentab.c can be built like:

    make gentab

gentab use:

    ./gentab file

gentab try:

    ./gentab gentab > g.c

NOTE: it is highly unlikely that you will be able to run the output of gentab but it should at least compile.

Author’s remarks:

Notes from the judges:

These remarks, found at https://lainsystems.com/posts/exploring-mullender-dot-c/, were provided by Sjoerd Mullender years later. We thank the author of the article for the quote! For a more detailed analysis, taken from the book Obfuscated C and Other Mysteries, see below. We hope that this is okay with the author of the book; considering that the analysis consists entirely of the authors’ comments, we don’t think this will be a problem. Unfortunately the excerpt was a PDF and did not copy and paste well. We had to go back and forth to type it in, so it’s possible we introduced a typo, though we also fixed some typos found in the extract. As for the other comments:

Remarks from the author:

I have never known a lot about the VAX assembly, so we used the C compiler to create the VAX code. We didn’t write it ourselves from scratch as we did with the PDP code. This is the reason why the VAX code is more complex, including the extra data after the PDP code.

Robbert and I were students at the VU (Free University in Amsterdam) at the time (mathematics with CS as our major since there was no CS curriculum when we started). We had an assignment to create a pair of programs for the computer networks course. The programs were supposed to send data reliably from one program to the other over an unreliable channel. This channel was simulated with a pair of pipes.

We decided for fun to create an obfuscated set of programs, only for the PDP, to do this, but circumventing the channel (i.e. cheating, hence the needed obfuscation). Our programs worked and we handed them in.

Of course, the teacher had a good laugh and then rejected our submission. (We knew him well, so we could get away with this.)

Then the IOCCC came along. I don’t remember how we heard about it, but at the time there was a world-wide messaging network Usenet where we read a bunch of newsgroups. I’m sure it was announced there and we saw it.

Since we had just recently created these obfuscated programs we decided we could use the same technique for an obfuscated C program. We upped the ante a bit by making it “portable”.

To add to the obfuscation, we used different formats for the integers in the array: some in decimal, some in octal, some in hexadecimal, and when the value would fit, some as an ASCII character.
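As a small illustration of that trick (not taken from the entry), the same value can be spelled four ways in C and the compiler stores the identical word each time. The array name same_word and the value 112 are chosen here for the example only, and 'p' == 112 assumes an ASCII character set.

```c
#include <stddef.h>

/* Four spellings of one value, in the styles the authors mixed:
 * decimal, octal, hexadecimal and a character constant.
 * 'p' == 112 assumes an ASCII execution character set. */
static const short same_word[] = { 112, 0160, 0x70, 'p' };

/* Returns 1 when every element of same_word holds the same value. */
static int spellings_agree(void)
{
    size_t i;

    for (i = 1; i < sizeof same_word / sizeof same_word[0]; i++)
        if (same_word[i] != same_word[0])
            return 0;
    return 1;
}
```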

The rest is history.

Since this was the first contest, we hadn’t seen any old entries, nor had any of the other contestants. Of course we knew about #define and tricks you could do with that, but we didn’t need that for this program. In fact, we made it as “standard” as possible. At the time there was this program called cb or C beautifier which would re-indent your program to make the layout look better. Our program is idempotent under cb.

More detailed analysis

When this program is compiled, the compiler places the array somewhere in memory, just like it places any compiled code somewhere in memory. Usually, the C startup code, crt0.o, calls a routine named main(). The loader fills in the address in the startup code, but at least on the old systems where this program ran, it doesn’t know that the main() in this program isn’t code but data!

When the program is run, the C startup code transfers control to the location main. The contents of the array just happen to be machine instructions for both a PDP-11 and a VAX.

On the VAX, the routine main() is called with the calls instruction. This instruction uses the first (2-byte) word of the called routine as a mask of registers that are to be saved on the stack. In other words, on the VAX the first word can be anything. On the PDP, the first word is a branch instruction that branches over the VAX code. The PDP and VAX codes are thus completely separate.

The PDP and VAX codes implement the same algorithm:

    for (;;) {
        write(1, "  :-)\b\b\b\b", 9);
        delay();
    }

The result is that the symbols :-) move across the screen. delay is implemented differently on the PDP, where we used a nonexistent system call (sys 55), and on the VAX where we used a delay loop.
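To see why repeating that one write() makes the smiley creep to the right, the effect on a single screen line can be simulated in portable C. This is a sketch, not code from the entry: the 32-column line, the helper names and the simplified terminal model (printable bytes overwrite and advance, backspace moves left) are assumptions.

```c
#include <string.h>

#define COLS 32

/* Apply one write of the 9-byte string to a one-line "screen".
 * Printable bytes overwrite the cell under the cursor and advance it;
 * '\b' moves the cursor one cell left. Returns the new cursor column. */
static int put_smiley(char line[COLS + 1], int cursor)
{
    static const char msg[] = "  :-)\b\b\b\b";
    size_t i;

    for (i = 0; i < sizeof msg - 1; i++) {
        if (msg[i] == '\b') {
            if (cursor > 0)
                cursor--;
        } else if (cursor < COLS) {
            line[cursor++] = msg[i];
        }
    }
    return cursor;
}

/* Run `steps` writes on a blank line and return the column of ':'. */
static int smiley_column_after(int steps)
{
    char line[COLS + 1];
    const char *colon;
    int cursor = 0, i;

    memset(line, ' ', COLS);
    line[COLS] = '\0';
    for (i = 0; i < steps; i++)
        cursor = put_smiley(line, cursor);
    colon = strchr(line, ':');
    return colon ? (int)(colon - line) : -1;
}
```

Each write advances the cursor five cells and backs up four, so the net movement is one column, and the two leading spaces erase what the previous smiley left behind.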

My co-author, Robbert, and I had earlier written a similar program for an assignment on the PDP-11. Along came the first Obfuscated C Code Contest, and we decided that we should write a program like this, but make it run on two different architectures.

We didn’t think long about what the program should do, so it does something very simple.

We started with writing the PDP code in assembly. We both knew PDP-11 assembly so that was no problem. The assembly code we came up with is as follows:

    pdp:
        mov pc,r4
        tst -(r4)
        sub $9, r4
        mov r4,0f
        mov $1, r0
        sys 4; 0:0; 9
        mov $1000, r2
    1:
        sys 55
        sob r2,1b
        br pdp

This is not the code we originally wrote, but it is the code that we ultimately used in the program. The string to be printed is shared by the VAX and the PDP code and is located between the two sections.

First, the program figures out the address of the string. The program counter is saved in a scratch register. Since the program counter points at the second instruction at that moment, the second instruction subtracts 2 from the scratch register. Then we subtract the length of the string and store the result in the location with label 0. This has to do with the calling sequence of system calls on the PDP. Following the sys instruction is the system call number (4 for write()), the address of the buffer (pointed to by label 0), and the length of the buffer (9). The file descriptor is in register r0. The rest of the code implements a delay loop. In each iteration, a nonexistent system call (55) slows things down.
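The address arithmetic of the first three instructions can be mirrored in C. This is a sketch under the assumption, taken from the listings, that the 9-byte string sits directly before the pdp label; the function name and the use of plain 16-bit integers as addresses are illustrative only.

```c
#include <stdint.h>

enum { WORD = 2, STR_LEN = 9 };

/*
 * Given the address of the `pdp` label (the first instruction), return
 * the address that the code stores at label 0 as the write() buffer.
 */
static uint16_t buffer_address(uint16_t pdp_label)
{
    uint16_t r4;

    r4 = (uint16_t)(pdp_label + WORD); /* mov pc,r4: pc already points at the 2nd instruction */
    r4 = (uint16_t)(r4 - WORD);        /* tst -(r4): pre-decrement by one word, back to `pdp` */
    r4 = (uint16_t)(r4 - STR_LEN);     /* sub $9,r4: step back over the 9-byte string */
    return r4;
}
```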

We assembled this program and extracted the machine code from the resulting object file. We used this code in the VAX part. Since neither of us was fluent in VAX assembly, we wrote the VAX code in C and massaged the compiler output. The VAX assembly program that we came up with is as follows:

    vax: .word 0400 + (pdp - vax) / 2 - 1
    1:
        pushl $9
        pushal str
        pushl $1
        calls $3, write
        cvtwl $32767, r2

    2:
        decl r2
        jneq 2b
        jbr 1b

    write: .word 0
        chmk $4
        ret

    str: .ascii "  :-)\b\b\b\b"

    pdp: .word 4548, 3044, 58820, 9, 4407, 6, 5568, 1, 35076, 0, 9, 5570, 512, 35117, 32386, 496

The first word (after the label vax) is the PDP branch instruction. PDP branch instructions are octal 400 + the distance divided by 2. The string that both the PDP and VAX programs use is after the str label, and the PDP code is after the pdp label.
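The branch encoding can be checked with a few lines of C. This is a sketch, with pdp11_br as a hypothetical helper name and with `from` and `to` given as word indexes within one listing. The offset is counted from the word after the branch, which accounts for the "- 1" in the .word expression. Applied to the last word of the pdp listing (word 15 branching back to word 0), it yields 496, which is octal 0760.

```c
#include <stdint.h>

/*
 * Encode an unconditional PDP-11 branch (BR).
 * `from` is the word index of the branch itself, `to` the word index
 * of its target. The low 8 bits hold the signed word offset relative
 * to the word following the branch; 0400 (octal) is the BR opcode.
 */
static uint16_t pdp11_br(int from, int to)
{
    int offset = to - (from + 1);

    return (uint16_t)(0400u | ((unsigned)offset & 0377u));
}
```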

On the VAX, the program pushes 9 (the length of the string), the address of the string, and 1 (the file descriptor) onto the stack, and calls write(2). Since we didn’t know the exact calling sequence for system calls, we just copied the source for the write(2) system call stub into our program. After write(2) finishes, the program executes a delay loop, after which it jumps back to the start of the program.

We assembled this program, and extracted the machine code from the object file. After this we only had to convert the machine code to ASCII and write a little bit of C to glue everything together. We wanted to use different formats for each constant in the resulting array, and we wanted to choose the format randomly. So we wrote a program to choose an appropriate format at random. The program we wrote for that follows. This program actually also extracted the machine code from the object file.

NOTE from judges: see gentab.c for a copy of this file that can be compiled in modern systems.

    #include <stdio.h>
    #include <a.out.h>

    main(argc, argv) char **argv;
    {
        register FILE *fp;
        register short pos = 0, c, n;
        register char *fmt;
        if (argc != 2) {
            fprintf (stderr, "Usage: %s file\n", argv[0]);
            exit (1);
        }

        if ((fp = fopen(argv[1], "r")) == NULL) {
            fprintf(stderr, "%s: can't open %s\n", argv[0], argv[1]);
            exit(2);
        }

        fseek (fp, (long) sizeof (struct exec), 0);
        printf("/* portable between VAX and PDP11 */\n\n");
        printf ("short main[] = {\n");
        for (;;) {
            if (pos == 0)
                printf("\t");

            c=getc(fp) & 0377;
            if (feof(fp)) break;
            n = getc(fp) << 8|c;

            switch (rand() % 5) {
                case 0:
                case 1:
                    fmt = "%d"; break;
                case 2:
                    fmt = "%u"; break;
                case 3:
                    fmt = "0%o"; break;
                case 4:
                    fmt = "0x%x"; break;
            }

            if (32 <= n && n < 127 && (rand() % 4)) fmt = "'%c'";
            printf(n < 8 ? "%d" : fmt, n);
            printf(",");
            if (pos++ == 8) {
                printf("\n");
                pos = 0;
            }
            else printf(" ");
        }

        printf("};\n");
    }

As can be seen, there is a slight preference for decimal, and also a character format is sometimes used, but only if the data is a printable ASCII character.
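The two core steps of that generator, assembling a 16-bit word from two bytes (low byte first) and picking a notation, can be sketched deterministically. This is not gentab.c: the function names are hypothetical, `choice` stands in for rand() % 5 and `as_char` for the rand() % 4 test, and the word is treated as a 16-bit signed short as an assumption about the original machines.

```c
#include <stdio.h>

/* Combine two bytes, low byte first, into one 16-bit word,
 * as the pair of getc() calls in the listing does. */
static unsigned word_from_bytes(unsigned char lo, unsigned char hi)
{
    return ((unsigned)hi << 8) | lo;
}

/* Format one word using the listing's rules, with the random
 * decisions passed in so that the result is reproducible. */
static void format_word(char *out, size_t size, unsigned word,
                        int choice, int as_char)
{
    /* Assumption: interpret the word as a 16-bit signed short. */
    int n = word >= 32768u ? (int)word - 65536 : (int)word;

    if (n < 8)                                   /* small or negative: always decimal */
        snprintf(out, size, "%d", n);
    else if (n >= 32 && n < 127 && as_char)      /* printable ASCII */
        snprintf(out, size, "'%c'", n);
    else if (choice == 3)
        snprintf(out, size, "0%o", (unsigned)n);
    else if (choice == 4)
        snprintf(out, size, "0x%x", (unsigned)n);
    else                                         /* choices 0, 1 and 2 */
        snprintf(out, size, "%d", n);
}
```

Under the signed-short assumption, the `n < 8` test also catches every word with the top bit set, so a value such as 58820 from the pdp listing always comes out in decimal, as -6716.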

When we ran this program, we were almost completely satisfied with the result. The only problem we had was that the program had chosen an octal representation for the first word. Since everybody knows what a PDP-11 branch instruction looks like (everyone knows that the traditional magic word for an executable, 0407, is a PDP-11 branch), we changed that to decimal. After checking the size of the resulting program we saw that it was one byte too long. The limit was 512 bytes, and our program was 513 bytes. So we changed the word “and” in the comment to “&&”.


