Skip to content

Instantly share code, notes, and snippets.

@gphg
Last active November 14, 2025 14:00
Show Gist options
  • Select an option

  • Save gphg/4e850891b619b55971353bdb35fba87c to your computer and use it in GitHub Desktop.

Select an option

Save gphg/4e850891b619b55971353bdb35fba87c to your computer and use it in GitHub Desktop.
Sort by cached reference address.
-- cache definition
local cached_table_identifier = {}
-- weak table. THIS IS IMPORTANT!
-- This allow the cache string to be fred by GC as table no longer referenced.
-- Without this, you have to manually dereference the string OR memory leaking!
setmetatable(cached_table_identifier, { __mode = "k" }) -- the key (table) is weak
-- A counter just to see cache hits/misses
local cache_misses = 0
-- get the identifier (string) of passed table
local getID = function(t)
local id = cached_table_identifier[t]
if id then
-- Cache hit! Return the stored string address
return id
end
-- Cache miss:
cache_misses = cache_misses + 1
-- 1. Use string.format("%p", t)
-- This gets the pointer address and *ignores* __tostring
id = string.format("%p", t)
-- 2. Store the string itself, DO NOT convert to a number
cached_table_identifier[t] = id
return id
end
-- table sorting example:
local myItems = {
{ name = "Task A", priority = 2 },
{ name = "Task B", priority = 1 }, -- First item with priority 1
{ name = "Task C", priority = 2 },
{ name = "Task D", priority = 1 } -- Second item with priority 1
}
print("Sorting...")
table.sort(myItems, function(a, b)
-- 1. Check the primary key (priority)
if a.priority ~= b.priority then
return a.priority < b.priority
else
-- 2. Compare the unique address strings
-- This technique is useful if you need to establish a consistent,
-- albeit arbitrary, order among a collection of distinct table objects
-- based on their identity rather than their contents.
return getID(a) < getID(b)
end
end)
print("Sort complete. Cache misses:", cache_misses)
-- Print results
for i, item in ipairs(myItems) do
print(i, item.name, item.priority)
end
@gphg
Copy link
Author

gphg commented Nov 14, 2025

Cached Address

  • How: Use a weak table and string.format("%p", t) on demand.
  • Sort: return getID(a) < getID(b)
  • Cost: Medium. The first time the sort sees a table, it has to run string.format, which is much slower than reading a number. Every subsequent comparison for that same table in the sort is a fast cache lookup. The cache_misses in the code above will be low (at most, one per item in myItems).
  • Result: Not stable. It sorts by memory address, which is random but deterministic (it will sort the same way every time, but that order has no relation to insertion order).
  • Upside: Works on any list of tables, even if you can't modify them.

While in Lua 5.1, the table.sort is unstable (it get fixed in 5.2), LuaJIT table.sort is stable.

-- In LuaJIT, this sort is STABLE
table.sort(myItems, function(a, b)
  return a.priority < b.priority
end)

-- The output is GUARANTEED to be:
-- 1. { name="Task B", priority=1 }
-- 2. { name="Task D", priority=1 } (Original order of 1s is kept)
-- 3. { name="Task A", priority=2 }
-- 4. { name="Task C", priority=2 } (Original order of 2s is kept)

It eliminated additional memory allocation, it is faster, and obviously stable.

Performance gain by "cached" it immediately after creation

While it consume more memory for string storage, it eliminates the cache misses and string.format call during sorting parse. In conclusion: this does not stop JIT.

-- Typical table definition under function factory
local function newEntry(name)
  local entry = {
    name = name,
    priority = 5
  }
  getID(entry) -- Caches the unique string ID immediately.
  -- the rest of initialization
  -- ...

  return entry -- finally return the reference
end

Stable, faster, and JIT friendly, basically.

Why not tonumber?

While number comparation is faster than string comparation: casting the table identifier (hex) string into number is dangerous!!! In Lua 5.1 numbers are typically 64-bit floating-point doubles. A 64-bit memory address (like 0xF000000000000001) is a 64-bit integer. When you convert a very large integer to a double, it can lose precision. Two different large addresses might get rounded to the same number, making your sort unstable and incorrect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment