Author:
- Name: Jens Schweikhardt
Location: DE - Federal Republic of Germany (Germany)
To build:
make all
There is an alternate version of this entry that works with macOS. See the Alternate code section below. We recommend that you at least look at that code, even if you don’t have a Mac.
To use:
./schweikh1
Alternate code:
As noted above, this entry does not work as it stands on macOS. The file macos.html explains what had to change and describes how the fixed version works. Those details apply to the original entry (which no longer works) as well as to the fixed version, although they are presented in the context of the macOS adjustments. Unless you wish to figure it out yourself, we recommend reading it even if you don’t have a Mac, as it has some interesting details about the entry.
Alternate build:
make alt
Alternate use:
Use schweikh1.alt as you would schweikh1 above.
Alternate try:
Bonus exercise: modify the code to provide a different compiler.
Judges’ remarks:
What does it do? It seems to print a list of system headers, perhaps with words after them. Curiously, if you look at the list of words defined in a given standard header, they are never printed directly after that header’s name. Aha! It’s a conformance test for the standard headers.
This code is a wonder; it’s a wonder that it compiles. I wonder whether or not it should. The innovative realization that you can use special characters, such as '\n', in #file directives alone merits some recognition.
I’ve included most of the author’s remarks; they’re fairly thorough. Do not read them if you want to figure this out yourself. “Amendment One” refers to “NA1”, the add-on to C89 which added some fairly crufty internationalization support.
Historical note:
Some non-gcc compilers that were not fully ANSI standard did not compile this entry correctly. Using cc by default was not helpful most of the time on this entry, because the program had a hardcoded gcc invocation anyway. Anyone who uses egcs and has no plain gcc will need to frob the source anyway and can be expected to do the right thing with ${CC}. This limitation was removed in 2023, but in the past one should have used gcc.
Author’s remarks:
Important! This program, if it compiles at all, is mis-compiled by many compilers due to compiler bugs. It could be the “least likely to compile and execute” in conjunction with preprocessor abuse. When run, it is likely to uncover bugs in your system headers.
The program is run without arguments. If your compiler is buggy, the most likely result is no output and/or an exit status of 1.
If Amendment One headers are missing you will see something like:
<iso646.h>:
gcc: /usr/include/iso646.h: No such file or directory
gcc: No input files specified
<wctype.h>:
gcc: /usr/include/wctype.h: No such file or directory
gcc: No input files specified
<wchar.h>:
gcc: /usr/include/wchar.h: No such file or directory
gcc: No input files specified
Bug your vendor for an upgrade.
The format of the info file is straightforward and described in the file itself. Just load it in your favorite editor.
Requirements
gcc (the GNU C compiler) must be available at runtime. It is assumed that your C implementation keeps headers as files in the /usr/include directory. The info file must be readable and reside in the current working directory. The current working directory must be writable in order to create a temporary file (which is removed upon program termination). In case you don’t have gcc at runtime, not all is lost if your compiler or preprocessor can produce a list of defined macros in the format output by gcc -dM, i.e. lines of the form #define MACRO value. Edit the source at line 55 in this case.
Background
ISO C has a very strict idea of the visibility of identifiers. All possible macros are explicitly enumerated. A compliant implementation cannot define any other macros, because otherwise strictly conforming programs could fail to compile due to syntax errors. Suppose that <stdio.h> defines PIPE_BUF; then the conforming program
#include <assert.h>
#include <stdio.h> /* <- or where the bogus macro is defined */
#include <string.h>
#define STR(x) #x
#define XSTR(x) STR(x)
int main (void)
{
    int PIPE_BUF = 0;
    assert (strcmp ("PIPE_BUF", XSTR (PIPE_BUF)) == 0);
    return 0;
}
is expected to compile and meet the assertion. If it does not, your compiler compiles a language other than ISO C.
NOTICE to those who wish for a greater challenge:
If you want a greater challenge, don’t read any further: just try to understand the program via the source.
If you get stuck, come back and read below for additional hints and information.
Why I think my program is obfuscated
Not only is the program’s purpose a thorough standard conformance test, the source itself is a beast and has uncovered bugs in many C compilers.
- The token sequence after #include can consist of pp-tokens, which “are processed just as in normal text.” [ISO 6.8.2] The result must resemble one of the canonical forms, "foo.h" or <bar.h>. This lets us write things like
#define HEADER "foo.h"
#include HEADER
I could not find a compiler that rejects this. However, if we torture the preprocessor a little more by using token pasting, at least one compiler falls over.
#define H(x) <st##x##.h>
#include H(dio) /* expands to <stdio.h> */
is rejected by tcc version 4.1.2, the TenDRA compiler:
"t.c", line 3: Error:
[ISO 6.8.3.3]: Invalid result for '##' operator in macro 'H'.
To be honest, I don’t know whether the rules for pp-tokens and token pasting forbid what I do (in which case tcc is correct in rejecting it). In this case, all other compilers I tried are buggy, or the Standard itself :-)
- The #line directive sets a line number and, optionally, a file name, i.e. it modifies __LINE__ and __FILE__. Strange things happen if the file name contains a NUL byte, which the Standard allows: in #line num "file" the "file" is an s-char-sequence per ISO 6.8.4. ISO 6.1.4 defines string literals as s-char-sequences, which may include octal escape sequences. Let’s look at
#include <stdio.h>
#line 42 "foo\0bar"
int main (void) {
    printf ("%s %d\n", __FILE__, (int)sizeof __FILE__);
    return 0;
}
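This is supposed to output 'foo 8': sizeof counts the three characters of foo, the embedded NUL, the three characters of bar and the terminating NUL. Here’s what happens in real life:
foo 4        gcc 2.7.2.1, lcc
foo\0bar 9   tcc 4.1.2, Sunsoft cc (both turn "\0" into "\\0", ugh!)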
(I have written a bug report for gcc. Newer versions and egcs are ok.)
- The %: and %:%: digraphs test for conformance to Amendment One.
- Ask your local guru if C allows the same case label to appear more than once in the same switch statement. Ask him/her to think real hard. The answer will be “no”. The true guru will cite a constraint in ISO 6.6.4.2. Then make fun of the guru’s answer by waving
case __LINE__:
under his/her nose. Easy money from a bet! Make sure your guru has no chance to use Standardese weasel words as an escape: can I have
case <some_token>:
case <some_token>:
in the same switch? (Note that you need the [invisible] newline. You can generously allow the additional constraint that no #undefs or #defines are allowed between the two cases.)
- Lots of integer constants and string literals come into the source via __LINE__ and __FILE__, which are redefined at various places. Sometimes the __FILE__ contains \0 bytes; different strings or characters are then accessed with offsets, say 10+__FILE__ or __LINE__+__FILE__ or 2[__FILE__]. One integral constant is in octal. This makes the source and header input depend on ASCII.
- A few old obfuscations, like one-character identifiers, not too many macros, a goto O, and “needless” assignments to satisfy lint with its “function value ignored” warnings. My lint has nothing to complain about.
- There’s more whitespace in the Standard than just space, tab and newline. In particular, there are vertical tab and form-feed, which can be used in certain places (outside of preprocessing directives, i.e. they are not allowed from just after the # up to the final newline; see 6.8 for details). On many implementations these appear as ^K and ^L. My source starts with an extremely uncommon 32-bit word consisting of ^L, ^K, % and :. I have grepped old winning entries for ^K; none of them contains one, so this is something new after 14 years of IOCCC. Be sure to have enough paper in your printer when you make a hard copy of the source…
- While I’m at it, the rules state that only space, tab and newline are ignored for the count (plus ‘{’, ‘}’, ‘;’ followed by whitespace). The mkentry.c program, however, uses isspace(3), which returns nonzero for \v and \f and other characters as well. I could have used a lot more ^K and ^L, probably undetected by your counter, but decided to err on the side of safety. I use a perl script to compute the character count according to the rules. The advantages of an independent clean room approach…
#!/usr/bin/perl -w
$/ = "\0";
$_ = <>;
s/[;{}][ \t\n]/\n/g;
s/[ \t\n]+//g;
print length ($_),"\n";
To use, try:
perl ./charcount.pl schweikh1.c
- I have tried, as suggested in the guidelines, to let the code look like ordinary C code. Apart from a few long lines I think I left the indentation as I would in real life. You should however not try to use indent on the source. The code is extremely fragile because of the myriad __LINE__ macros; indenting is a sure way to break the program. Don’t even think of maintaining that beast; I’ve had my share of core dumps during development :-) A test suite was used after every minor change to find out if the program still does what it should.
- From the goals: “To stress C compilers with unusual code.” That describes exactly my modest attempts…
Inventory for 1998/schweikh1
Primary files
- schweikh1.c - entry source code
- Makefile - entry Makefile
- schweikh1.alt.c - alternate source code
- schweikh1.orig.c - original source code
- charcount.pl - character counter in perl
- macos.html - explains how this entry was fixed to work with macOS
- info - file for C standards conformance test
Secondary files
- 1998_schweikh1.tar.bz2 - download entry tarball
- README.md - markdown source for this web page
- .entry.json - entry summary and manifest in JSON
- .gitignore - list of files that should not be committed under git
- macos.md - markdown source for macos.html
- .path - directory path from top level directory
- index.html - this web page