Last active
October 12, 2025 13:05
Revisions
mikesmullin revised this gist
Jul 4, 2025 . 1 changed file with 1 addition and 1 deletion.
@@ -1109,7 +1109,7 @@ References:
# Appendix: Miscellaneous Tools & References
- Another booklet like this one (written recently, published in Markdown on GitHub)
  https://github.com/0xAX/asm
- The x86 Instruction Structure
  https://www.codeproject.com/articles/662301/x-instruction-encoding-revealed-bit-twiddling-fo
mikesmullin revised this gist
Jul 4, 2025 . 1 changed file with 2 additions and 0 deletions.
@@ -1109,6 +1109,8 @@ References:
# Appendix: Miscellaneous Tools & References
- Another booklet like this one (written recently, published in Markdown on GitHub)
  https://github.com/0xAX/asm
- The x86 Instruction Structure
  https://www.codeproject.com/articles/662301/x-instruction-encoding-revealed-bit-twiddling-fo
- X86-64 Instruction Encoding
mikesmullin revised this gist
Mar 10, 2025 . 1 changed file with 2 additions and 0 deletions.
@@ -939,6 +939,8 @@ References:
  http://cs.lmu.edu/~ray/notes/nasmtutorial/
- SASM: Simple crossplatform IDE for NASM, MASM, GAS, FASM assembly languages
  https://dman95.github.io/SASM/english.html
- FASM (Flat Assembler) by Tomasz Grysztar has a minimalist approach and recent cult following
  https://github.com/tgrysztar/fasm
---
mikesmullin revised this gist
Nov 29, 2024 . 1 changed file with 1 addition and 1 deletion.
@@ -151,7 +151,7 @@ which people continue to discover through reverse engineering.
- **Primary Opcodes:** In the first release of x86, we had only `1`-byte opcodes.
- **Secondary Opcodes:** Future opcodes made room by prefixing the escape byte `0x0f`. These are `2`-byte opcodes.
- **Opcode Extension:** If the instruction does not require a second operand, then the `3`-bit `MODRM.reg` field is considered an extension of the opcode. Since it can only
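The opcode-extension bullet in the hunk above can be illustrated with a small sketch. It assumes the standard ModRM layout (`mod` in bits 7-6, `reg` in bits 5-3, `rm` in bits 2-0) and uses opcode `0xF7` as an example group opcode; the function names and the `GROUP_F7` table are made up for illustration and list only the documented `/0` and `/2`..`/7` entries.

```python
# Sketch: how MODRM.reg acts as an opcode extension for a "group" opcode.
# GROUP_F7, split_modrm and f7_mnemonic are hypothetical names for this example.
GROUP_F7 = {0: "TEST", 2: "NOT", 3: "NEG", 4: "MUL", 5: "IMUL", 6: "DIV", 7: "IDIV"}


def split_modrm(byte: int) -> tuple[int, int, int]:
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("ModRM must fit in one byte")
    mod = (byte >> 6) & 0b11    # addressing mode
    reg = (byte >> 3) & 0b111   # register, or opcode extension
    rm = byte & 0b111           # register or memory operand
    return mod, reg, rm


def f7_mnemonic(modrm: int) -> str:
    """Pick the operation that opcode 0xF7 performs for a given ModRM byte."""
    _, reg, _ = split_modrm(modrm)
    return GROUP_F7.get(reg, "(not listed)")


if __name__ == "__main__":
    # The byte sequence F7 D8 is `neg eax`: mod=0b11, reg=0b011 (/3), rm=0b000.
    print(split_modrm(0xD8), f7_mnemonic(0xD8))
```

Because the `reg` field is only three bits wide, a single opcode byte can fan out to at most eight operations this way.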
mikesmullin revised this gist
Nov 27, 2018 . 1 changed file with 1 addition and 1 deletion.
@@ -1129,7 +1129,7 @@ References:
  http://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/
- How Rust encodes exceptions and interrupts
  https://os.phil-opp.com/handling-exceptions/
- NeHe's famous OpenGL game dev tutorials incl. examples in Windows MASM
  http://nehe.gamedev.net/tutorial/creating_an_opengl_window_(win32)/13001/
- How Debuggers Work w/ Breakpoints
  https://eli.thegreenplace.net/2011/01/27/how-debuggers-work-part-2-breakpoints
mikesmullin revised this gist
Nov 27, 2018 . 1 changed file with 0 additions and 26 deletions.
@@ -1,31 +1,5 @@
# Mike's x86-64 Assembly (ASM) Notes

## Assembling Binary Machine Code

### Operating `Modes`:
mikesmullin revised this gist
Nov 26, 2018 . 1 changed file with 2 additions and 0 deletions.
@@ -985,6 +985,8 @@ References:
  http://wwwcdf.pd.infn.it/localdoc/gdbint.pdf
- Windbg Commands
  https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/commands
- x64dbg is perhaps the best Windows debugger today
  https://x64dbg.com/
- Hex-Rays Interactive Disassembler (IDA); most professional, but expensive
  https://www.hex-rays.com/products/ida/index.shtml
- Binary Ninja; less featureful but cheap, modern interactive disassembler
mikesmullin revised this gist
Nov 25, 2018 . 1 changed file with 9 additions and 9 deletions.
@@ -334,16 +334,16 @@ when referring to these registers, which describes both a) operand width, and b)
While there are several places you may reference a register, including `MODRM.reg`, `MODRM.rm`, `SIB.index`, `SIB.base`, and `PO.reg`, you'll find they all use the same `3` or `4`-bit mapping convention, as follows:

|Register<br>Reference|(`3`-bit / `4th`-bit=`0b1`)<br>Low `8`-bits³|<br>High `8`-bits¹ ³|<br>Low `16`-bits|<br>Low `32`-bits⁴|<br>Full `64`-bit Register
-|-|-|-|-|-
`0b000`|`AL`/`R8B` | |`AX`/`R8W` |`EAX`/`R8D` |`RAX`/`R8`
`0b001`|`CL`/`R9B` | |`CX`/`R9W` |`ECX`/`R9D` |`RCX`/`R9`
`0b010`|`DL`/`R10B` | |`DX`/`R10W`|`EDX`/`R10D`|`RDX`/`R10`
`0b011`|`BL`/`R11B` | |`BX`/`R11W`|`EBX`/`R11D`|`RBX`/`R11`
`0b100`|`SPL`²/`R12B`|`AH`|`SP`/`R12W`|`ESP`/`R12D`|`RSP`/`R12`
`0b101`|`BPL`²/`R13B`|`CH`|`BP`/`R13W`|`EBP`/`R13D`|`RBP`/`R13`
`0b110`|`SIL`²/`R14B`|`DH`|`SI`/`R14W`|`ESI`/`R14D`|`RSI`/`R14`
`0b111`|`DIL`²/`R15B`|`BH`|`DI`/`R15W`|`EDI`/`R15D`|`RDI`/`R15`

**NOTES:**
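As a rough sketch of the mapping in the table above: the `3`-bit field selects one of eight registers, and the optional fourth bit (assumed here to come from a `REX` prefix bit) selects `R8`-`R15` instead. Only the `32`- and `64`-bit columns are modeled, and `register_name` and its parameters are illustrative names, not part of the notes.

```python
# Sketch of the 3/4-bit register mapping for the 32- and 64-bit columns only.
# register_name, field3 and ext_bit are hypothetical names for this example.
LEGACY_64 = ("RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI")
LEGACY_32 = ("EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI")


def register_name(field3: int, ext_bit: int = 0, width: int = 64) -> str:
    """Map a 3-bit register field plus an optional 4th bit to a register name."""
    if not 0 <= field3 <= 0b111 or ext_bit not in (0, 1):
        raise ValueError("field3 must be 3 bits and ext_bit a single bit")
    if width not in (32, 64):
        raise ValueError("only the 32- and 64-bit columns are modeled")
    index = (ext_bit << 3) | field3  # 0..15
    if index < 8:
        return (LEGACY_64 if width == 64 else LEGACY_32)[index]
    # Extended registers are numbered; the 32-bit view takes a D suffix.
    return f"R{index}" + ("" if width == 64 else "D")


if __name__ == "__main__":
    print(register_name(0b100))                 # RSP
    print(register_name(0b100, ext_bit=1))      # R12
    print(register_name(0b001, 1, width=32))    # R9D
```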
mikesmullin revised this gist
Nov 22, 2018 . 1 changed file with 2 additions and 0 deletions.
@@ -989,6 +989,8 @@ References:
  https://www.hex-rays.com/products/ida/index.shtml
- Binary Ninja; less featureful but cheap, modern interactive disassembler
  https://binary.ninja/
- Reversed non-standard opcode mappings which may confuse normal disassemblers
  https://github.com/XlogicX/irasm
---
mikesmullin revised this gist
Nov 22, 2018 . 1 changed file with 29 additions and 5 deletions.
@@ -1,6 +1,30 @@
# Mike's x86-64 Assembly (ASM) Notes

Most CS graduates learned the bare minimum assembly to pass an exam. I am doing it after 15+ years web programming industry experience because I have always been self-taught, and it took a long time for it to become relevant to me, but now I harbor several motivations:

- **Information Security**:
  - **Reverse Engineering**: Blue team malware analysis
  - **Penetration Testing**: Red team tooling and AV evasion
- Writing my own [secure] [server and/or personal] **Operating System**, or contributing to a cool one that already enjoys broad support (ie. Linux kernel, hardening)
- **Anonymity and Anti-Surveillance**: Understanding how undocumented features of commodity CPU vendors' hardware lead to some of the most egregious vulnerabilities and exploits in history (ie. Spectre, Meltdown)
- **Systems programming** (Windows assembly, Linux driver contribution, debugging)
- **Game Development and Hacking** (WebAssembly, porting old games, cross-platform compatibility, DRM, shipping assets in tight binary packages, cheats and trainers)
- **Productivity**: One day I might like to develop my own set of build tools including an assembler, linker, custom high-level language, and IDE (or again, contribute to a cool one that already exists like Golang or Rust)

I feel these notes are compiled with a little more love and attention than the average graduate paper or recycled academic book from the 1990s, but a lot of this information hasn't changed since then, so I make good use of references, as well.
When doing my own research, it was sad how much of it was poorly formatted (systems engineers are rarely also web developers) and suffering major link rot (many prideful assembly warriors proudly coded their own web servers and then got auto-pwned circa 2016 by vulnerability scanners and remote-code execution, and just went offline), so if that happens here just be sure to check archive.org for them. I tried to provide summaries as a mirror of the most important stuff, and then link to remaining works which are better or more thorough and exhaustive.

## Assembling Binary Machine Code
@@ -646,7 +670,7 @@ a long array of floats to/from a block of memory in one operation.
  The integer part is encoded as a simple unsigned int.
  However, the fractional part is encoded as a base2 binary fraction, which commonly results in a [continued fraction](https://en.wikipedia.org/wiki/Continued_fraction) pattern, which gets truncated--and can lead to infamous FPU rounding errors if not handled carefully.
  ex: `3.1f` = `0b11` + `0b000 1100 1100 1100 1100 110...` _(the pattern would repeat infinitely if not truncated)_
  This is stored little-endian so any zero-fill happens on the right side.
@@ -988,7 +1012,7 @@ Here is some useful trivia about that:
  match the exact release version and compiler used.
- Confusingly, there is no trivial way to tell static and dynamic `.lib` files apart, except that [dynamic] import libraries for DLLs will be much smaller than the matching static library would be.
- `.lib` files may only be used at compile time to build statically linked binaries.
- `.dll` files are intended to only be used at runtime as dynamically linked binaries.
  - Technically `.dll` files contain enough information that a reverse engineer could
@@ -1005,7 +1029,7 @@ Here is some useful trivia about that:
  code which uses the `.dll`, or via fuzz testing.
- The version of `gcc` toolchain GNU linker (`ld`) ported to Windows can statically link using `.dll` inputs directly, which means it is able to implicitly synthesize the normally required but missing `.lib` stubs automagically!
- **Decorated names** or **mangled names** are a symbol naming convention used in the COFF files. They are a series of ASCII prefixes and suffixes which guarantee that each function is named uniquely when merged into the same flat COFF table format.
@@ -1030,7 +1054,7 @@ References:
  https://docs.microsoft.com/en-us/windows/desktop/Debug/pe-format
- MSDN Article from 2002 going into tremendous depth on history and intentions
  http://www.delphibasics.info/home/delphibasicsarticles/anin-depthlookintothewin32portableexecutablefileformat-part1
  http://www.delphibasics.info/home/delphibasicsarticles/anin-depthlookintothewin32portableexecutablefileformat-part2
- MSDN Article from Mar 2002 detailing steps the Windows Loader takes with PE binaries
  https://www.cnblogs.com/binsys/articles/2711010.html
- Peering Inside the PE: A Tour of the Win32 Portable Executable File Format
mikesmullin revised this gist
Nov 22, 2018 . 1 changed file with 88 additions and 1 deletion.
@@ -522,7 +522,10 @@ References:
---

# Appendix: x86 Extensions

As new models of the x86 family are released, the instruction set is extended with new features. Here we provide a chronologically ordered summary of what was added, when, and why.

## History of the FPU
@@ -617,6 +620,84 @@ References:
---

# Floating Point Numbers (IEEE-754)

Floats come in various sizes. When serialized for compact transmission over the network, a clever dev may try to encode them as a string, or a tuple of 1-byte integers (integer and mantissa, optionally an exponent). But when you need the processor to do really quick, especially bulk, binary floating point math, the following is the standard form used everywhere.

SIMD instructions operate almost exclusively on `ST0-7`, `MMX0-7`, and more recently the `XMM0-15` registers. When utilizing the high-precision `80`/`128`-bit values, you may need to perform multiple `MOV` and `PUSH` operations to fill the entire register, since the other registers and `immediate` operands are much smaller. As an optimization, some instructions accept a memory pointer operand to read/write a long array of floats to/from a block of memory in one operation.

### Data Structure:

- **`1`-bit Sign** (`0`=positive)
- **`8`-bit base2 Exponent add `+127` bias** ([why not signed two's complement?](https://stackoverflow.com/a/2835476))
  Take the whole number integer part, convert to binary, remove any [insignificant] 0 prefixes, count digits, minus one, that's the binary exponent
  convert that binary exponent (say, 8 digits) to binary and add +127
- **`23`-bit Mantissa a.k.a. Significand**
  This is the combination of the integer and fractional parts concatenated.
  The integer part is encoded as a simple unsigned int.
  However, the fractional part is encoded as a base2 binary fraction, which commonly results in a [continued fraction](https://en.wikipedia.org/wiki/Continued_fraction) pattern, which gets truncated--and can lead to infamous FPU rounding errors if not handled carefully.
  ex: `3.1f` = `0b11` + `0b000 1100 1100 1100 1100 110...` _(the pattern would repeat infinitely if not truncated)_
  This is stored little-endian so any zero-fill happens on the right side.

## Let's manually encode `1.0f`!

- **sign:** `0b0` = a positive number
- **mantissa:** `0b1` (the implicit leading bit, which is not stored) + `0b0` zero-extended, so the stored `23` bits are all zero
  _(It is easier to calculate in this order because the mantissa value informs the exponent value.)_
- **exponent:** `0d0` + `0d127` = `0d127` = `0b01111111`

```
IEEE-754 32-bit (single precision) Floating Point (x86; little-endian)

offset   0 1         9                          32
single  [0 0111 1111 0000 0000 0000 0000 0000 000] = 0x3f800000 = 1.0f
         | |       | |                          |
sign     1 |       | |                          |
exponent   |<--8-->| |                          |
mantissa             |<-----------23----------->|
```

The structure is the same for `64`-bit (`double` precision) floats except the exponent has `11` bits, and a bias of `+1023`.

The exponent field has four magic values which have reserved special meanings:

Exponent | Mantissa | Meaning
-|-|-
`0b0` | `0b0` | zero (`0d0`)
`0b0` | non-zero | denormalized
all `0b1`'s | `0b0` | `Infinity`
all `0b1`'s | non-zero | `NaN` ¹

**NOTES**:

1. You can hide data inside the mantissa of `NaN` structures. Some compilers use this to specify more precise reason codes (ie. if `NaN` resulted from failed computation.)

References:
- How to encode a float by hand
  https://www.youtube.com/watch?v=8afbTaA-gOQ
- University lecture explaining the math
  https://www.youtube.com/watch?v=03fhijH6e2w
- University lecture performing addition by hand
  https://www.youtube.com/watch?v=KiWz-mGFqHI
- Interactive hosted calculator
  https://babbage.cs.qc.cuny.edu/ieee-754.old/decimal.html
- Explaining floating point rounding errors
  https://www.youtube.com/watch?v=PZRI1IfStY0

---

# Appendix: Stack vs. Heap

The stack is a data structure in memory the processor can understand and maintain,
@@ -726,6 +807,8 @@ References:
  https://manybutfinite.com/post/anatomy-of-a-program-in-memory/
- Stack Frame Layout on x64
  https://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64
- Microsoft __fastcall 64 ABI calling convention
  https://msdn.microsoft.com/en-us/library/ms235286.aspx
---
@@ -1044,10 +1127,14 @@ References:
  http://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/
- How Rust encodes exceptions and interrupts
  https://os.phil-opp.com/handling-exceptions/
- NeHe's famous OpenGL game dev tutorials incl. examples in Windows MASM
  http://nehe.gamedev.net/tutorial/creating_an_opengl_window_(win32)/13001/
- How Debuggers Work w/ Breakpoints
  https://eli.thegreenplace.net/2011/01/27/how-debuggers-work-part-2-breakpoints
- Ralf Brown's BIOS Interrupt List
  http://www.ctyme.com/rbrown.htm
- Agner Fog's books and blog, renowned for advanced assembly information
  https://www.agner.org/optimize/
- Intel® 64 and IA-32 Architectures Software Developer Manuals
  https://software.intel.com/en-us/articles/intel-sdm
- AMD64 Architecture Programmer's Manual
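The manual encoding of `1.0f` in the floating point section of this revision can be cross-checked with a few lines of Python. This is only a sketch using the standard `struct` module; `float32_fields` and `float32_hex` are names invented for the example. Note that the stored `23`-bit field leaves out the implicit leading `1` of a normalized value, so `1.0f` stores an all-zero mantissa.

```python
import struct


def float32_bits(value: float) -> int:
    """Return the 32-bit IEEE-754 single-precision pattern of value as an int."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    return bits


def float32_fields(value: float) -> tuple[int, int, int]:
    """Split a single-precision float into (sign, biased exponent, stored mantissa)."""
    bits = float32_bits(value)
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by +127
    mantissa = bits & 0x7FFFFF       # 23 bits, implicit leading 1 not stored
    return sign, exponent, mantissa


def float32_hex(value: float) -> str:
    """Format the bit pattern the way the notes write it, e.g. 0x3f800000."""
    return f"0x{float32_bits(value):08x}"


if __name__ == "__main__":
    print(float32_hex(1.0), float32_fields(1.0))
    print(float32_hex(3.1), float32_fields(3.1))
    print(float32_fields(float("inf")))
```

The same split also makes the special values from the table visible: infinity has an all-ones exponent with a zero mantissa, while `NaN` has an all-ones exponent with a non-zero mantissa.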
mikesmullin revised this gist
Nov 18, 2018 . 1 changed file with 4 additions and 0 deletions.
@@ -724,6 +724,8 @@ References:
  https://en.wikipedia.org/wiki/C_dynamic_memory_allocation
- Anatomy of a Program in Memory
  https://manybutfinite.com/post/anatomy-of-a-program-in-memory/
- Stack Frame Layout on x64
  https://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64
---
@@ -846,6 +848,8 @@ References:
  https://www.pcjs.org/pubs/pc/software/tools/microsoft/masm/5.00/
- Third-party community support forums (anecdotal information and references)
  http://www.masm32.com/board/
- Art of Assembly (contains summary of MASM syntax)
  http://www.oopweb.com/Assembly/Documents/ArtOfAssembly/Volume/Chapter_8/CH08-1.html#top
- Steve Gibson's MASM enthusiast page
  https://www.grc.com/smgassembly.htm
- Netwide Assembler (NASM) Documentation
mikesmullin revised this gist
Nov 17, 2018 . 1 changed file with 3 additions and 1 deletion.
@@ -974,10 +974,12 @@ References:
  https://en.wikipedia.org/wiki/Address_space_layout_randomization
- In 2017 "ASLR⊕Cache" attack demonstrated defeating ASLR from a web browser using JavaScript
  https://www.vusec.net/projects/anc/
- NTSTATUS values (Windows %errorlevel% codes)
  https://msdn.microsoft.com/en-us/library/cc704588.aspx
- Official intro and reference for Windows-based graphical user interfaces
  https://docs.microsoft.com/en-us/windows/desktop/winmsg/windowing
- Windows System Error Codes
  https://docs.microsoft.com/en-us/windows/desktop/Debug/system-error-codes
---
mikesmullin revised this gist
Nov 17, 2018 . No changes.
mikesmullin revised this gist
Nov 17, 2018 . 1 changed file with 0 additions and 2 deletions.
@@ -813,8 +813,6 @@ Calculated in some of the following ways:
  the segment_selector refers to the GDT which refers to a protected memory page
  the offset is the address relative to that.

References:
- Using Short/Relative vs. Far Jumps
  https://thestarman.pcministry.com/asm/2bytejumps.htm
mikesmullin revised this gist
Nov 17, 2018 . 1 changed file with 2 additions and 0 deletions.
@@ -978,6 +978,8 @@ References:
  https://www.vusec.net/projects/anc/
- NTSTATUS values (Windows `%errorlevel%` codes)
  https://msdn.microsoft.com/en-us/library/cc704588.aspx
- Official intro and reference for Windows-based graphical user interfaces
  https://docs.microsoft.com/en-us/windows/desktop/winmsg/windowing
---
mikesmullin revised this gist
Nov 17, 2018 . 1 changed file with 3 additions and 1 deletion.
@@ -104,7 +104,7 @@ References:
  https://stackoverflow.com/questions/21165678/why-64-bit-mode-long-mode-doesnt-use-segment-registers
- How much memory can a 64-bit machine address? (physically, logically, and theoretically)
  https://superuser.com/questions/168114/how-much-memory-can-a-64bit-machine-address-at-a-time
- Open Security Training: Intermediate Intel x86: Architecture, Assembly, and Applications
  https://www.youtube.com/playlist?list=PL8F8D45D6C1FFD177

##### REX Prefix Byte Data Structure (8 bits)
@@ -854,6 +854,8 @@ References:
  https://www.nasm.us/doc/
- NASM Tutorial
  http://cs.lmu.edu/~ray/notes/nasmtutorial/
- SASM: Simple crossplatform IDE for NASM, MASM, GAS, FASM assembly languages
  https://dman95.github.io/SASM/english.html
---
mikesmullin revised this gist
Nov 16, 2018 . 1 changed file with 3 additions and 1 deletion.
@@ -873,7 +873,7 @@ References:
- GDB Internals
  http://wwwcdf.pd.infn.it/localdoc/gdbint.pdf
- Windbg Commands
  https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/commands
- Hex-Rays Interactive Disassembler (IDA); most professional, but expensive
  https://www.hex-rays.com/products/ida/index.shtml
- Binary Ninja; less featureful but cheap, modern interactive disassembler
@@ -974,6 +974,8 @@ References:
  https://en.wikipedia.org/wiki/Address_space_layout_randomization
- In 2017 "ASLR⊕Cache" attack demonstrated defeating ASLR from a web browser using JavaScript
  https://www.vusec.net/projects/anc/
- NTSTATUS values (Windows `%errorlevel%` codes)
  https://msdn.microsoft.com/en-us/library/cc704588.aspx
---
mikesmullin revised this gist
Nov 15, 2018 . 1 changed file with 25 additions and 3 deletions.
@@ -885,7 +885,8 @@ References:
Windows executables (`*.exe`, `*.dll`) use **Portable Executable** (`PE`) format, which is a wrapper around the **Common Object File Format** (`COFF`), which is used by binary linker files (`*.obj`, `*.lib`). Technically Windows 64-bit uses a version internally called PE32+.

A linker (ie. `link.exe`, `cl.exe`, `ld`, etc.) is basically designed to parse one or more `COFF` files, and wrap them into a single executable with a `PE` header.
@@ -895,9 +896,12 @@ Here is some useful trivia about that:
- Microsoft COFF is an extended version of the original by AT&T.
- `.obj` and `.lib` files contain a simple table data structure mapping unique ASCII string symbol names to code or address offsets in another file.
- `.lib` may include compiled object code (static), but most of the time (e.g., in Visual Studio) they are just header stubs (dynamic) with pointers to address offsets in a `.dll` which must match the exact release version and compiler used.
- Confusingly, there is no trivial way to tell static and dynamic `.lib` files apart, except that [dynamic] import libraries for DLLs will be much smaller than the matching static library would be.
- `.lib` files may only be used at compile time to build statically linked binaries.
- `.dll` files are intended to only be used at runtime as dynamically linked binaries.
  - Technically `.dll` files contain enough information that a reverse engineer could
@@ -912,6 +916,9 @@ Here is some useful trivia about that:
  when, and what effect they have on the `.dll` functions.
  Though a determined hacker could successfully guess them by looking at example code which uses the `.dll`, or via fuzz testing.
- The version of `gcc` toolchain GNU linker (`ld`) ported to Windows can statically link using `.dll` inputs directly, which means it is able to implicitly synthesize the normally required but missing `.lib` stubs automagically!
- **Decorated names** or **mangled names** are a symbol naming convention used in the COFF files. They are a series of ASCII prefixes and suffixes which guarantee that each function is named uniquely when merged into the same flat COFF table format.
@@ -934,6 +941,11 @@ Here is some useful trivia about that:

References:
- PE Format
  https://docs.microsoft.com/en-us/windows/desktop/Debug/pe-format
- MSDN Article from 2002 going into tremendous depth on history and intentions
  http://www.delphibasics.info/home/delphibasicsarticles/anin-depthlookintothewin32portableexecutablefileformat-part1
  http://www.delphibasics.info/home/delphibasicsarticles/anin-depthlookintothewin32portableexecutablefileformat-part2
- MSDN Article from Mar 2002 detailing steps the Windows Loader takes with PE binaries
  https://www.cnblogs.com/binsys/articles/2711010.html
- Peering Inside the PE: A Tour of the Win32 Portable Executable File Format
  https://msdn.microsoft.com/en-us/library/ms809762.aspx
- Portable Executable
@@ -952,6 +964,16 @@ References:
  https://docs.microsoft.com/en-us/cpp/build/reference/decorated-names?view=vs-2017
- Linking Explicitly
  https://msdn.microsoft.com/en-us/library/784bt7z7.aspx
- Creating the smallest possible PE executable
  https://web.archive.org/web/20101024125357/http://www.phreedom.org:80/solar/code/tinype/
- DLL search order
  https://docs.microsoft.com/en-us/windows/desktop/dlls/dynamic-link-library-search-order
- CppCon 2017: James McNellis “Everything You Ever Wanted to Know about DLLs”
  https://www.youtube.com/watch?v=JPQWQfDhICA
- Address Space Layout Randomization (ASLR)
  https://en.wikipedia.org/wiki/Address_space_layout_randomization
- In 2017 "ASLR⊕Cache" attack demonstrated defeating ASLR from a web browser using JavaScript
  https://www.vusec.net/projects/anc/
---
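To make the PE-wraps-COFF remark and the PE32+ note in this revision a little more concrete, here is a minimal sketch that walks from the DOS stub to the optional header and reads its magic number. It assumes the layout in the PE Format reference linked above (the `e_lfanew` offset at `0x3C`, a `PE\0\0` signature, a `20`-byte COFF file header, then the optional header whose magic is `0x10b` for PE32 and `0x20b` for PE32+). `pe_kind` and `synthetic_image` are hypothetical helpers, and the synthetic image is just enough bytes for the demonstration, not a loadable executable.

```python
import struct

PE32_MAGIC = 0x10B       # optional-header magic for 32-bit images
PE32_PLUS_MAGIC = 0x20B  # optional-header magic for 64-bit (PE32+) images
COFF_HEADER_SIZE = 20    # COFF file header that follows the PE signature


def pe_kind(image: bytes) -> str:
    """Classify a PE image as 'PE32' or 'PE32+' from its optional-header magic."""
    if image[:2] != b"MZ":
        raise ValueError("missing DOS 'MZ' signature")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing 'PE' signature")
    (magic,) = struct.unpack_from("<H", image, pe_offset + 4 + COFF_HEADER_SIZE)
    if magic == PE32_MAGIC:
        return "PE32"
    if magic == PE32_PLUS_MAGIC:
        return "PE32+"
    raise ValueError(f"unexpected optional-header magic 0x{magic:x}")


def synthetic_image(magic: int, pe_offset: int = 0x80) -> bytes:
    """Build a fake, non-runnable header blob just to exercise pe_kind."""
    image = bytearray(pe_offset + 4 + COFF_HEADER_SIZE + 2)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, pe_offset)
    image[pe_offset:pe_offset + 4] = b"PE\0\0"
    struct.pack_into("<H", image, pe_offset + 4 + COFF_HEADER_SIZE, magic)
    return bytes(image)


if __name__ == "__main__":
    print(pe_kind(synthetic_image(PE32_PLUS_MAGIC)))
```

On a real file the same function could be fed the first few hundred bytes read in binary mode.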
mikesmullin revised this gist
Nov 13, 2018 . 1 changed file with 4 additions and 0 deletions.
@@ -948,6 +948,10 @@ References:
  http://wjradburn.com/software/
- Difference between .lib and .dll
  http://www.differencebetween.net/technology/difference-between-lib-and-dll/
- Decorated/Mangled Names
  https://docs.microsoft.com/en-us/cpp/build/reference/decorated-names?view=vs-2017
- Linking Explicitly
  https://msdn.microsoft.com/en-us/library/784bt7z7.aspx
---
mikesmullin revised this gist
Nov 13, 2018 . 1 changed file with 3 additions and 1 deletion.
@@ -1000,6 +1000,8 @@ References:
  https://www.youtube.com/watch?v=oaVwzYN6BP4
- Visual x86, x64, and ARM Emulator
  https://www.codeproject.com/Articles/478527/X86-ARM-Emulator
- Build an 8-bit computer from scratch
  https://eater.net/8bit/parts
- C to Linux x86-64 Assembly (ASM) examples
  https://gist.github.com/mikesmullin/6330894
- Linux x86_64 Syscall Table
@@ -1013,4 +1015,4 @@ References:
- Intel® 64 and IA-32 Architectures Software Developer Manuals
  https://software.intel.com/en-us/articles/intel-sdm
- AMD64 Architecture Programmer's Manual
  https://www.amd.com/system/files/TechDocs/24594.pdf
mikesmullin revised this gist
Nov 13, 2018 . 1 changed file with 1 addition and 1 deletion.
@@ -906,7 +906,7 @@ Here is some useful trivia about that:
  constants passed as function arguments. These are typically shared in the form of a C header (`*.h`) file, as part of an SDK (e.g., [windows sdk](https://docs.microsoft.com/en-us/windows/desktop/winmsg/wm-destroy), [opengl sdk](https://www.khronos.org/registry/OpenGL/api/GLES2/gl2.h)), if the developer wants you to have them.
  The other thing you may not have is the documentation about what inputs are valid, when, and what effect they have on the `.dll` functions.
mikesmullin revised this gist
Nov 13, 2018 . 1 changed file with 1 addition and 1 deletion.
@@ -906,7 +906,7 @@ Here is some useful trivia about that:
  constants passed as function arguments. These are typically shared in the form of a C header (`*.h`) file, as part of an SDK (e.g., [windows sdk](https://docs.microsoft.com/en-us/windows/desktop/winmsg/wm-destroy), [opengl sdk](view-source:https://www.khronos.org/registry/OpenGL/api/GLES2/gl2.h)), if the developer wants you to have them.
  The other thing you may not have is the documentation about what inputs are valid, when, and what effect they have on the `.dll` functions.
mikesmullin revised this gist
Nov 13, 2018 . 1 changed file with 1 addition and 1 deletion.

@@ -938,7 +938,7 @@ References:
https://msdn.microsoft.com/en-us/library/ms809762.aspx
- Portable Executable https://en.wikipedia.org/wiki/Portable_Executable
- Official Microsoft PE/COFF Technical Specification for Rev 6., 1999 https://courses.cs.washington.edu/courses/cse378/03wi/lectures/LinkerFiles/coff.pdf
- Handy Quick-Reference Posters https://github.com/corkami/pics/blob/master/binary/README.md#executables
mikesmullin revised this gist
Nov 13, 2018 . 1 changed file with 2 additions and 2 deletions.

@@ -913,8 +913,8 @@ Here is some useful trivia about that:
Though a determined hacker could successfully guess them by looking at example code which uses the `.dll`, or via fuzz testing.
- **Decorated names** or **mangled names** are a symbol naming convention used in the COFF files. They are a series of ASCII prefix and suffixes which guarantee that each function is named uniquely when merged into the same flat COFF table format. The additional data mangled into the name includes:
  - The **function name**.
  - The **class name** that the function is a member of, if it is a member function.
mikesmullin revised this gist
Nov 13, 2018 . 1 changed file with 18 additions and 24 deletions.

@@ -65,6 +65,10 @@ References:
https://www.cs.virginia.edu/~evans/cs216/guides/x86.html
- Why is Displacement limited to 32-bits? https://stackoverflow.com/questions/31853189/x86-64-assembly-why-displacement-not-64-bits
- Opcode Reference (Complex) http://ref.x86asm.net/
- Opcode Reference (Simple) http://www.felixcloutier.com/x86/

#### The Prefix

@@ -120,7 +124,8 @@ Trivia:
References:
- Nice illustration of REX bits being prepended https://paul.bone.id.au/2018/09/26/more-x86-addressing/
- Good explanation of encoding the RAX prefix for Long mode 64-bit registers https://www.systutorials.com/72643/beginners-guide-x86-64-instruction-encoding/

### The Operation Code (Opcode)

@@ -338,7 +343,9 @@ References:
http://lomont.org/Math/Papers/2009/Introduction%20to%20x64%20Assembly.pdf
- x86 Oddities, Ange Albertini (Reverse Engineer), 2017 https://github.com/corkami/docs/blob/master/x86/x86.md
- Good table of Registers https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/x64-architecture https://www.tortall.net/projects/yasm/manual/html/arch-x86-registers.html

### The `Memory` Address Operand

@@ -436,6 +443,8 @@ References:
https://stackoverflow.com/a/33328318
- Addressing modes https://en.wikipedia.org/wiki/Addressing_mode#Simple_addressing_modes_for_data
- Memory Translation and Segmentation https://manybutfinite.com/post/memory-translation-and-segmentation/

---

@@ -713,6 +722,8 @@ References:
https://stackoverflow.com/questions/21021223/how-does-the-gcc-determine-stack-size-the-function-based-on-c-will-use
- C dynamic memory allocation https://en.wikipedia.org/wiki/C_dynamic_memory_allocation
- Anatomy of a Program in Memory https://manybutfinite.com/post/anatomy-of-a-program-in-memory/

---

@@ -841,6 +852,8 @@ References:
https://www.grc.com/smgassembly.htm
- Netwide Assembler (NASM) Documentation https://www.nasm.us/doc/
- NASM Tutorial http://cs.lmu.edu/~ray/notes/nasmtutorial/

---

@@ -971,35 +984,18 @@ References:

---

# Appendix: Miscellanous Tools & References

- The x86 Instruction Structure https://www.codeproject.com/articles/662301/x-instruction-encoding-revealed-bit-twiddling-fo
- X86-64 Instruction Encoding https://wiki.osdev.org/X86-64_Instruction_Encoding
- CPU Rings Privilege and Protection https://manybutfinite.com/post/cpu-rings-privilege-and-protection/
- Interesting overview from Haskell to Machine code http://www.stephendiehl.com/posts/monads_machine_code.html
- Intel: Introduction to x64 Assembly (an official guide) https://software.intel.com/en-us/articles/introduction-to-x64-assembly/
- Punching Cards (for FORTRAN programming) https://www.youtube.com/watch?v=oaVwzYN6BP4
- Visual x86, x64, and ARM Emulator

@@ -1008,15 +1004,13 @@ References:
https://gist.github.com/mikesmullin/6330894
- Linux x86_64 Syscall Table http://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/
- How Rust encodes exceptions and interrupts https://os.phil-opp.com/handling-exceptions/
- How Debuggers Work w/ Breakpoints https://eli.thegreenplace.net/2011/01/27/how-debuggers-work-part-2-breakpoints
- Ralf Brown's BIOS Interrupt List http://www.ctyme.com/rbrown.htm
- Intel® 64 and IA-32 Architectures Software Developer Manuals https://software.intel.com/en-us/articles/intel-sdm
- AMD64 Architecture Programmer's Manual https://www.amd.com/system/files/TechDocs/24594.pdf
mikesmullin revised this gist
Nov 13, 2018 . 1 changed file with 3 additions and 5 deletions.

@@ -900,11 +900,9 @@ Here is some useful trivia about that:
Though a determined hacker could successfully guess them by looking at example code which uses the `.dll`, or via fuzz testing.
- **Decorated names** or **mangled names** are a symbol naming convention used in the `.obj` files. They are a series of ASCII prefix and suffixes which guarantee that each function is named uniquely when merged into the same flat `.obj` table format. The additional data mangled into the name includes:
  - The **function name**.
  - The **class name** that the function is a member of, if it is a member function. This may include the class that encloses the class that contains the function, and so on.
mikesmullin revised this gist
Nov 13, 2018 . 1 changed file with 1 addition and 1 deletion.

@@ -900,7 +900,7 @@ Here is some useful trivia about that:
Though a determined hacker could successfully guess them by looking at example code which uses the `.dll`, or via fuzz testing.
- **Decorated names** or **mangled names** are a symbol naming convention used in the `.obj` files. They are a series of ASCII prefix and suffixes which communicate additional detail about the function in order to reduce the likelihood that a function by the same name but different overloaded signature will not be clobbered when merged into the same flat `.obj` table format. The additional data
mikesmullin revised this gist
Nov 13, 2018 . 1 changed file with 4 additions and 0 deletions.

@@ -861,6 +861,10 @@ References:
http://wwwcdf.pd.infn.it/localdoc/gdbint.pdf
- Windbg Commands http://windbg.info/doc/1-common-cmds.html
- Hex-Rays Interactive Disassembler (IDA); most professional, but expensive https://www.hex-rays.com/products/ida/index.shtml
- Binary Ninja; less featureful but cheap, modern interactive disassembler https://binary.ninja/

---
mikesmullin revised this gist
Nov 13, 2018 . 1 changed file with 3 additions and 3 deletions.

@@ -882,15 +882,15 @@ Here is some useful trivia about that:
they are just headers with pointers to address offsets in a `.dll` which must match the exact release version and compiler used.
- `.lib` files may only be used at compile time to build statically linked binaries.
- `.dll` files are intended to only be used at runtime to as dynamically linked binaries.
- Technically `.dll` files contain enough information that a reverse engineer could statically link them without a `.lib`, if they wanted to.
- If you only have a `.dll`, you may be missing the compile-time constants passed as function arguments. These are typically shared in the form of a C header (`*.h`) file, as part of an SDK (e.g, [windows sdk](https://docs.microsoft.com/en-us/windows/desktop/winmsg/wm-destroy) , [opengl sdk](https://www.khronos.org/registry/OpenGL/api/GLES2/gl2.h)), if the developer wants you to have them. The other thing you may not have is the documentation about what inputs are valid, when, and what effect they have on the `.dll` functions. Though a determined hacker could successfully guess them by looking at example