@rygorous
Last active October 22, 2024 05:58
32-bit byte swap
Version a:
uint32_t byteswap32(uint32_t x)
{
    uint32_t y = (x >> 24) & 0xff;
    y |= (x >> 8) & 0xff00;
    y |= (x << 8) & 0xff0000;
    y |= (x << 24) & 0xff000000u;
    return y;
}
(or any reordering thereof, or substitute |= with +=)
Version b:
uint32_t byteswap32(uint32_t x)
{
    uint32_t y = (x >> 24) & 0xff;
    y |= ((x >> 16) & 0xff) << 8;
    y |= ((x >> 8) & 0xff) << 16;
    y |= (x & 0xff) << 24;
    return y;
}
(or any reordering thereof, or substitute |= with +=)
Version c:
uint32_t byteswap32(uint32_t x)
{
    // Needs <cstring> for memcpy and <utility> for std::swap.
    // static_assert that sizeof(uint32_t) == 4 if you want.
    uint8_t bytes[4];
    uint32_t y;
    memcpy(bytes, &x, 4);
    std::swap(bytes[0], bytes[3]);
    std::swap(bytes[1], bytes[2]);
    memcpy(&y, bytes, 4);
    return y;
}
(again, up to reordering. Or use type punning through a union. Or use two buffers and copy
with reordering instead of swapping in-place.)
Version d:
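uint32_t byteswap32(uint32_t x)
{
    // The cast keeps the shift in uint32_t even if byteswap16 returns a
    // uint16_t, which would otherwise be promoted to int before shifting.
    return ((uint32_t)byteswap16(x & 0xffff) << 16) | byteswap16(x >> 16);
}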
(up to reordering. With multiple possible implementations for byteswap16.)
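Version d leaves byteswap16 unspecified. As a minimal sketch of what "multiple possible implementations" can look like, here are two of them. The names byteswap16_shift and byteswap16_mask and the uint16_t-in, uint16_t-out signature are assumptions for illustration, not something the code above prescribes:

```cpp
#include <cstdint>

// Hypothetical variant 1: shift both halves and let the narrowing cast
// drop the bits that moved past bit 15.
constexpr uint16_t byteswap16_shift(uint16_t x)
{
    return static_cast<uint16_t>((x << 8) | (x >> 8));
}

// Hypothetical variant 2: isolate each byte with a mask first, then move
// it into its new position.
constexpr uint16_t byteswap16_mask(uint16_t x)
{
    return static_cast<uint16_t>(((x & 0x00ffu) << 8) | ((x & 0xff00u) >> 8));
}
```

Both compute the same result, which is exactly the problem: a pattern matcher has to recognize either spelling (and the reorderings of each) inside every spelling of byteswap32.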
Version e:
uint32_t byteswap32(uint32_t x)
{
    // This looks strange but happens to map directly to 3 PowerPC instructions
    // (rlwinm, rlwimi, rlwimi) that form the standard byte reverse sequence on
    // that target.
    uint32_t y = (x << 24) | (x >> 8); // rlwinm
    y = (y & ~0x00ff0000u) | ((x << 8) & 0x00ff0000u); // rlwimi
    y = (y & ~0x000000ffu) | ((x >> 24) & 0x000000ffu); // rlwimi
    return y;
}
I have seen all these basic variants (and many of the noted variations) in
production code. That's why "just pattern matching during instruction selection"
doesn't work: there is no canonical way this is always written. If you want to
handle this, you can either:
a) perform sufficient analysis to detect any of these patterns, or
b) introduce a canonical way to write it, make that fast, and recommend people use it.
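Option b) could look roughly like the following: one helper that everybody calls, with the fast path hidden inside it. This is only a sketch. The name canonical_byteswap32 is made up here, and the preprocessor check assumes that GCC-compatible compilers (GCC, Clang) provide __builtin_bswap32; everything else falls back to the Version a shape.

```cpp
#include <cstdint>

// Hypothetical "one blessed spelling" for a 32-bit byte swap.
constexpr uint32_t canonical_byteswap32(uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    // GCC and Clang expose the byte swap directly as a builtin.
    return __builtin_bswap32(x);
#else
    // Portable fallback, same shape as Version a.
    return ((x >> 24) & 0x000000ffu) |
           ((x >>  8) & 0x0000ff00u) |
           ((x <<  8) & 0x00ff0000u) |
           ((x << 24) & 0xff000000u);
#endif
}
```

With this, callers never write the shift-and-mask sequence themselves, so the compiler does not have to recognize it. C++23 standardizes the same idea as std::byteswap in <bit>.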
Now, I've argued elsewhere that exposing byte swaps directly is kind of unfortunate
in the first place, since byte swaps mostly get used when loading data with a known
endianness on a target architecture that has a different endianness. The preferable
way to handle that is to state the data's endianness directly, not to have logic that figures
out whether to swap or not. But the same concern applies to other constructs such as,
say, loading a little-endian value by doing
bytes[0] | (bytes[1] << 8) | (bytes[2] << 16) | (bytes[3] << 24)
It's great when you can get people to agree to always write it exactly that way, but there are
many variants floating around, and many such cases to catch. You can make pure
pattern matching work if you make clear from the outset that there is one blessed way
to do, say, an unaligned little-endian load (such as the code sequence above), and ensure that
all compilers handle that correctly. But with C/C++ that ship has sailed; there are many
variants in common use, and different compilers disagree on what the right thing to
pattern-match is, if they implement it at all!
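For reference, here is a sketch of that little-endian load wrapped in a function that states the data's endianness in its name, so there is no "swap or not" logic at the call site. The name load_le32 and the const unsigned char* parameter are assumptions for this example. It also widens each byte before shifting, which the one-line version above leaves implicit:

```cpp
#include <cstdint>

// Hypothetical helper: read a 32-bit little-endian value from 4 bytes.
// Works the same on little-endian and big-endian hosts, and does not
// require the pointer to be aligned.
constexpr uint32_t load_le32(const unsigned char* bytes)
{
    // Widen each byte to uint32_t before shifting, so that the << 24 is
    // done on an unsigned 32-bit value rather than on a promoted int.
    return  static_cast<uint32_t>(bytes[0])        |
           (static_cast<uint32_t>(bytes[1]) <<  8) |
           (static_cast<uint32_t>(bytes[2]) << 16) |
           (static_cast<uint32_t>(bytes[3]) << 24);
}
```

Whether a given compiler turns this into a single load (plus a byte swap on big-endian targets) is precisely the part that is not guaranteed.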
Again, this is a lot simpler if there's a known construct that people are supposed to
use, and making that an official part of the language is pretty much the only way you
get both the compilers and the users to actually handle it well.
@hailinzeng

hailinzeng commented Oct 22, 2024

line 8:

  y |= (x << 24) & 0xff00000u;

should be

  y |= (x << 24) & 0xff000000u;

@rygorous
Author

Fixed.
