@fay59
Last active August 7, 2025 21:19
Quirks of C

Here's a list of mildly interesting things about the C language that I learned over time.

  1. Combined type and variable/field declaration, inside a struct scope: https://godbolt.org/g/Rh94Go
  2. Compound literals are lvalues: https://godbolt.org/g/Zup5ZB
  3. Switch cases anywhere: https://godbolt.org/g/fSeL18 (also see: Duff's Device)
  4. Flexible array members: https://godbolt.org/g/HCjfzX
  5. {0} as a universal initializer: https://godbolt.org/g/MPKkXv
  6. Function typedefs: https://godbolt.org/g/5ctrLv
  7. Array pointers: https://godbolt.org/g/N85dvv
  8. Modifiers to array sizes in parameter definitions: https://godbolt.org/z/SKS38s
  9. Flat initializer lists: https://godbolt.org/g/RmwnoG
  10. What’s an lvalue, anyway: https://godbolt.org/g/5echfM
  11. Void globals: https://godbolt.org/z/k2sBJs
  12. Alignment implications of bitfields: https://godbolt.org/z/KmB4CB

Special mentions:

  1. The power of UB: https://godbolt.org/g/H6mBFT. This happens because:
    1. LLVM sees that side_effects has only two possible values: NULL (the initial value) or this_is_not_directly_called_by_main (if bar is called)
    2. LLVM sees that side_effects is called, and it is UB to call a null pointer
    3. UB is impossible, so LLVM assumes that bar will have executed by the time main runs rather than face the consequences
    4. Under this assumption, side_effects is always this_is_not_directly_called_by_main.
  2. A macro that tells you if an expression is an integer constant, if you can't use __builtin_constant_p: https://godbolt.org/g/a41gmx (from Martin Uecker, on the Linux kernel ML)
  3. You can make some pretty weird stuff in C, but for a real disaster, you need C++. Labels inside expression statements in really weird places: https://godbolt.org/g/k9wDRf.
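The godbolt link for special mention 1 isn't reproduced here, so the following is only a reconstruction of the shape of program that the four steps describe. The names side_effects, bar and this_is_not_directly_called_by_main come from the explanation above; callback_t, calls and call_side_effects are hypothetical. call_side_effects stands in for main so that the sketch can be exercised in a well-defined way (by calling bar first).

```c
#include <stddef.h>

typedef void (*callback_t)(void);   /* hypothetical name */

static int calls;                   /* hypothetical counter, makes the call observable */

/* Internal linkage: the optimizer can see every store to this pointer.
 * Only two values are ever possible: NULL, or the function below. */
static callback_t side_effects = NULL;

static void this_is_not_directly_called_by_main(void)
{
    calls++;
}

/* The only place that assigns side_effects. */
void bar(void)
{
    side_effects = this_is_not_directly_called_by_main;
}

/* Stand-in for main in the gist. If bar() never ran, this calls a null
 * pointer, which is UB, so an optimizer may assume that case cannot
 * happen and emit a direct call to this_is_not_directly_called_by_main. */
int call_side_effects(void)
{
    side_effects();
    return calls;
}
```

Calling bar() and then call_side_effects() is well defined. Calling call_side_effects() alone is the UB case from the list, and what happens then depends on the compiler and optimization level.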

(I have a bunch of mildly interesting things about C++ too, but so does literally everyone who’s used the language for more than an hour, so it’s not as interesting.)

@Muffindrake

Muffindrake commented Sep 10, 2018

Please avoid posting C code that hasn't gone through thorough review, had all warnings fixed (including those from -Wall -Wextra -Wpedantic), and been compiled against a specific C standard (-std=c11). Even the pedantic warnings are there for good reasons. Many really speak for themselves.

Cherish the warnings that you are actually getting. You will not hear about the more subtle traps (such as a null pointer not being required to be an all-zero bit pattern, which is why memset(a, 0, sz) is not strictly guaranteed to produce null pointers), and the compiler isn't required to warn you about other cases of UB either.


2:

hello2.c:8:17: warning: ISO C forbids empty initializer braces [-Wpedantic]
     (struct foo){};
                 ^
hello2.c:11:18: warning: ISO C forbids empty initializer braces [-Wpedantic]
     ((struct foo){}).bar = 4;
                  ^
hello2.c:12:18: warning: ISO C forbids empty initializer braces [-Wpedantic]
     &(struct foo){};
                  ^

Empty initializer braces may be part of C++, but they're not allowed in C according to the standard (as of C11; they were only added in C23).

The more interesting use of compound literals is that you can pass them to functions, either by value or by pointer, without first declaring a named variable in automatic storage. That is admittedly not that useful or unique. More importantly, they let you reset a struct to 0 while correctly setting the pointers it contains to null, which memset will not strictly do.

struct t {
        int b;
        char *ptr;
};

int
main(int argc, char **argv)
{
        struct t data = { .b = argc, .ptr = argv[0] };
        data = (struct t) {0};
        /* memset to 0 is not required to set pointers to null; this assignment is */
}
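
The other use mentioned above, passing a compound literal straight into a function by value or by pointer, might look like the sketch below. struct point, sum_by_value, scale_in_place and demo_compound_literals are hypothetical names chosen for illustration.

```c
struct point {
    int x;
    int y;
};

/* Takes the struct by value. */
static int sum_by_value(struct point p)
{
    return p.x + p.y;
}

/* Takes a pointer and modifies the pointed-to struct. */
static void scale_in_place(struct point *p, int k)
{
    p->x *= k;
    p->y *= k;
}

int demo_compound_literals(void)
{
    /* By value: no named temporary is needed. */
    int a = sum_by_value((struct point){ .x = 1, .y = 2 });

    /* By pointer: the literal is an lvalue, so its address can be taken.
     * Inside a function it lives until the enclosing block ends. */
    struct point *p = &(struct point){ .x = 3, .y = 4 };
    scale_in_place(p, 2);

    return a + p->x + p->y;   /* 3 + 6 + 8 = 17 */
}
```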

4:

hello4.c:9:14: warning: initialization of a flexible array member [-Wpedantic]
     .elems = {32, 31, 30}
hello4.c:9:14: note: (near initialization for ‘f.elems’)
hello4.c:17:13: warning: invalid use of structure with flexible array member [-Wpedantic]
 struct flex g[2];

5:

hello5.c:1:13: warning: ISO C forbids zero-size array ‘empty_array_t’ [-Wpedantic]
 typedef int empty_array_t[0];
             ^~~~~~~~~~~~~
hello5.c:2:9: warning: struct has no members [-Wpedantic]
 typedef struct {} empty_struct_t;
         ^~~~~~
hello5.c:5:1: warning: ‘ext_vector_type’ attribute directive ignored [-Wattributes]
 typedef float vector_t __attribute__((ext_vector_type(4)));
 ^~~~~~~
hello5.c:8:20: warning: ISO C forbids empty initializer braces [-Wpedantic]
 empty_array_t ea = {};
                    ^
hello5.c:9:21: warning: ISO C forbids empty initializer braces [-Wpedantic]
 empty_struct_t es = {};
                     ^
hello5.c:10:13: warning: ISO C forbids empty initializer braces [-Wpedantic]
 array_t a = {};
             ^
hello5.c:11:14: warning: ISO C forbids empty initializer braces [-Wpedantic]
 struct_t s = {};
              ^
hello5.c:12:14: warning: ISO C forbids empty initializer braces [-Wpedantic]
 vector_t v = {};

hello5.c:12:14: error: empty scalar initializer
hello5.c:12:14: note: (near initialization for ‘v’)
hello5.c:13:11: warning: ISO C forbids empty initializer braces [-Wpedantic]
 void* p = {}; // <-- error
           ^
hello5.c:13:11: error: empty scalar initializer
hello5.c:13:11: note: (near initialization for ‘p’)
hello5.c:14:9: warning: ISO C forbids empty initializer braces [-Wpedantic]
 int i = {}; // <-- error
         ^
hello5.c:14:9: error: empty scalar initializer
hello5.c:14:9: note: (near initialization for ‘i’)
hello5.c:17:22: warning: excess elements in array initializer
 empty_array_t eaa = {0};
                      ^
hello5.c:17:22: note: (near initialization for ‘eaa’)
hello5.c:18:23: warning: excess elements in struct initializer
 empty_struct_t ess = {0};
                       ^
hello5.c:18:23: note: (near initialization for ‘ess’)

This is so wrong that gcc gives you a warning and an error for the same thing. Compiler extensions are strictly not part of the C language.


9:

hello9.c:16:34: warning: missing braces around initializer [-Wmissing-braces]
 struct lots_of_inits flat_init = {
                                  ^
     1, 2, 3, 4, 5, 6, 7
     {{  } {   }}{
 };
 }

While the inner braces are not strictly required, gcc still prints a warning, even showing you where the correct braces go, because it's so easy to introduce bugs otherwise.
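
As a rough illustration of what the warning is about, here is a flat initializer next to its fully braced equivalent. The definition of lots_of_inits from the paste isn't shown in this thread, so struct inner and struct outer below are hypothetical stand-ins that happen to hold seven ints. The flat form is expected to trigger -Wmissing-braces under -Wall.

```c
struct inner {
    int a;
    int b[2];
};

struct outer {
    struct inner first;
    int mid;
    struct inner second;
};

/* Returns 1 if the flat and the fully braced initializer produce the
 * same member values, 0 otherwise. */
int flat_equals_braced(void)
{
    /* Brace elision: values are assigned to members in declaration order. */
    struct outer flat = { 1, 2, 3, 4, 5, 6, 7 };

    /* The same object with every aggregate explicitly braced. */
    struct outer braced = { { 1, { 2, 3 } }, 4, { 5, { 6, 7 } } };

    return flat.first.a == braced.first.a
        && flat.first.b[0] == braced.first.b[0]
        && flat.first.b[1] == braced.first.b[1]
        && flat.mid == braced.mid
        && flat.second.a == braced.second.a
        && flat.second.b[0] == braced.second.b[0]
        && flat.second.b[1] == braced.second.b[1];
}
```

The braced form makes it obvious which value lands in which member, which is the bug-prevention argument behind the warning.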


12:
Nearly everything about bitfields is horrifyingly implementation-dependent, so results will vary from compiler to compiler. As such, your paste is completely pointless and devoid of useful information.

To force a bitfield to be aligned "as you would expect", which is "overlaid over the basic integer type", one would use:

struct foo {
    char a;
    long:0;
    long b: 16;
    long:0;
    char c;
};

which is then laid out the same way as this struct (which is 24 bytes in size when long is 8 bytes with 8-byte alignment):

struct foo {
    char a;
    long b;
    char c;
};

(this is obviously still very implementation-dependent)
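
Since the layout is implementation-dependent, the only reliable way to know what you get is to ask your own compiler. A minimal sketch, assuming the two structs above are renamed foo_bits and foo_plain (hypothetical names) so they can coexist in one file; print_layout is also hypothetical. No expected output is given because it differs between ABIs.

```c
#include <stddef.h>
#include <stdio.h>

/* The bitfield version, with zero-width fields on both sides of b. */
struct foo_bits {
    char a;
    long : 0;
    long b : 16;
    long : 0;
    char c;
};

/* The plain version it is claimed to match. */
struct foo_plain {
    char a;
    long b;
    char c;
};

/* Prints the sizes and the offset of c for both structs.
 * offsetof cannot be applied to the bitfield b itself. */
void print_layout(void)
{
    printf("foo_bits:  size %zu, offset of c %zu\n",
           sizeof(struct foo_bits), offsetof(struct foo_bits, c));
    printf("foo_plain: size %zu, offset of c %zu\n",
           sizeof(struct foo_plain), offsetof(struct foo_plain, c));
}
```

If both lines print the same numbers on your target, the two layouts agree there; that says nothing about any other compiler or ABI.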


"special mentions 1":

This is not "interesting UB", this is just UB which is to be avoided at all times. Never ever write code this way.

@AbigailBuccaneer

"Special mentions 3" fails to compile with -Werror=pedantic too, as the braced statement expression is a GNU extension.

@samliddicott

It would be very nice if the compiler would be a help in avoiding undefined behaviour instead of effectively writing a different program behind your back.

@JohnDoneth

@samliddicott I agree. Have you heard of Rust? One of my favorite features is how the compiler won't let you shoot yourself in the foot, even with threading.

@eighthjouster

eighthjouster commented Sep 10, 2018

@samliddicott, I see your point in the sense that the compiler should probably emit a message indicating "whoa, this is undefined behavior!" if it's as clear as this example.

Having said that, eh, the C standard states that under undefined behavior, all bets are off. And as such, the compiler can do whatever. And that's exactly what happened here.

@jason-s

jason-s commented Sep 10, 2018

the compiler should probably emit a message indicating "whoa, this is undefined behavior!"

That's not possible in general. Undefined behavior is often undefined at runtime and cannot be determined to be undefined using static analysis.

@BatmanAoD

Having said that, eh, the C standard states that under undefined behavior, all bets are off.

That's precisely the problem, along with the fact that the C standard so freely declares so many parts of the language to be undefined.

Even with good tooling, determining which parts of a C program may cause undefined behavior is extremely nontrivial.

@Noxitu

Noxitu commented Sep 11, 2018

@BatmanAoD

That's precisely the problem, along with the fact that the C standard so freely declares so many parts of the language to be undefined.

This is not a problem. Undefined behavior has a simple reason: performance.

What should happen when you read outside of allocated memory? Should the compiler always check bounds?

How should integers overflow? Should it be standardized? Should it be defined as "whatever the CPU does"?

The last one is really interesting, since it can be extended to the question: does this code operate on a contiguous chunk of memory?

int *array = ???; for(int index = start; index != end; ++index) { array[index]; }

Answering "yes" to the previous question allows for really nice optimizations, at the price of invoking undefined behavior if overflow happens.
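
A compilable version of that loop, as a sketch: sum_range is a hypothetical function, and summing the elements is only there so the loop body does something observable. Because signed overflow of index is undefined, the compiler may assume index walks from start up to end without wrapping, so the accesses cover one contiguous range.

```c
/* Sums array[start] .. array[end - 1].
 * The caller must ensure start <= end and that the range is valid;
 * otherwise ++index would eventually overflow, which is UB. */
long sum_range(const int *array, int start, int end)
{
    long total = 0;
    for (int index = start; index != end; ++index) {
        total += array[index];
    }
    return total;
}
```

With wrapping semantics the compiler would have to allow for index running past INT_MAX and coming back around, which gets in the way of things like vectorizing the loop.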

@andermoran

andermoran commented Nov 6, 2019

1 ? ((void*)((x) * 0l)) : (int*)1
Can someone explain the ternary operator in special mention #2? It seems like it would always choose the first operand, ((void*)((x) * 0l)), since 1 evaluates to true. This is confusing.

@fay59
Author

fay59 commented Nov 8, 2019

@andermoran, the condition isn't important: the magic comes from C's loose interpretation of what constitutes a constant. When x is a constant (like 4), (void*)(4 * 0l) is understood by the C compiler to be the same as (void*)0, which is a null pointer constant. The type of 1 ? NULL : (int*)1 is inferred to be int* because of the special nature of NULL. When x is not a constant (like y), (void*)(y * 0l) is interpreted as a regular integer-to-pointer cast to void*, and in that case the type of the whole conditional expression becomes void*.
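
To make the two cases concrete, here is a sketch that inspects the type of the conditional expression with C11 _Generic. This is a variant for illustration, not necessarily the macro behind the godbolt link; IS_INT_CONSTANT, check_literal and check_variable are hypothetical names.

```c
/* Evaluates to 1 if x is an integer constant expression, 0 otherwise.
 * If x is constant, (void *)((x) * 0l) is a null pointer constant, so the
 * conditional takes the type of the other operand: int *.
 * If x is not constant, it is an ordinary void *, and the conditional
 * has type void *. Nothing here is evaluated at run time. */
#define IS_INT_CONSTANT(x) \
    _Generic((1 ? (void *)((x) * 0l) : (int *)1), int *: 1, void *: 0)

/* A literal is an integer constant expression. */
int check_literal(void)
{
    return IS_INT_CONSTANT(4);
}

/* A function parameter is not. */
int check_variable(int y)
{
    return IS_INT_CONSTANT(y);
}
```

Here check_literal() returns 1 and check_variable(y) returns 0 for any y, because the selection depends only on the type of the expression and not on which branch the condition picks.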

@moon-chilled

so, an lvalue is a value that:

  • can have its address taken...
    • unless it is a bitfield (still an lvalue)
    • unless it is a function (not an lvalue)

Also, you can't take the address of something that's declared 'register'.

@ztane

ztane commented Nov 20, 2022

As the C standard says, "an lvalue is an expression (with an object type other than void) that potentially designates an object". That's it. Why potentially? Because *p is an lvalue expression but if the value of p does not point to an object, then *p does not designate an object (its use has undefined behaviour).
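
A small sketch of the cases raised in this thread: lvalues whose address cannot be taken, and *p as an lvalue that designates an object only when p points to one. struct flags and lvalue_demo are hypothetical names, and the lines that would not compile are left as comments.

```c
struct flags {
    unsigned ready : 1;
    int count;
};

int lvalue_demo(void)
{
    struct flags f = {0};

    /* A bit-field is an lvalue: it can be assigned to... */
    f.ready = 1;
    /* ...but its address cannot be taken:
     * unsigned *bad = &f.ready;   (constraint violation) */

    /* Same for a register variable: assignable, not addressable. */
    register int r = 5;
    r += 1;
    /* int *bad2 = &r;             (constraint violation) */

    /* *q is an lvalue; here q points to an object, so it designates one. */
    int *q = &f.count;
    *q = 10;

    return f.ready + r + f.count;   /* 1 + 6 + 10 = 17 */
}
```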

@Jake-Jensen

Y'all are far too worried about what the spec says and not the literal value of the topic. This is stuff you can do, not should or will.

@casual-engineer

casual-engineer commented Jul 24, 2023

We can also create a main function of type void and forsake the ugly looking return 0 at the end of the code :)
