
@mikesmullin
Last active October 12, 2025 13:05

Revisions

  1. mikesmullin revised this gist Jul 4, 2025. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion x86-assembly-notes.md
    Original file line number Diff line number Diff line change
    @@ -1109,7 +1109,7 @@ References:

    # Appendix: Miscellaneous Tools & References

    - Another booklet like this one (written recently, published in Markdown on Github)
    - Another booklet like this one (written recently, published in Markdown on Github)
    https://github.com/0xAX/asm
    - The x86 Instruction Structure
    https://www.codeproject.com/articles/662301/x-instruction-encoding-revealed-bit-twiddling-fo
  2. mikesmullin revised this gist Jul 4, 2025. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions x86-assembly-notes.md
    @@ -1109,6 +1109,8 @@ References:

    # Appendix: Miscellaneous Tools & References

    - Another booklet like this one (written recently, published in Markdown on Github)
    https://github.com/0xAX/asm
    - The x86 Instruction Structure
    https://www.codeproject.com/articles/662301/x-instruction-encoding-revealed-bit-twiddling-fo
    - X86-64 Instruction Encoding
  3. mikesmullin revised this gist Mar 10, 2025. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions x86-assembly-notes.md
    @@ -939,6 +939,8 @@ References:
    http://cs.lmu.edu/~ray/notes/nasmtutorial/
    - SASM: Simple crossplatform IDE for NASM, MASM, GAS, FASM assembly languages
    https://dman95.github.io/SASM/english.html
    - FASM (Flat Assembler) by Tomasz Grysztar has a minimalist approach and recent cult following
    https://github.com/tgrysztar/fasm

    ---

  4. mikesmullin revised this gist Nov 29, 2024. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion x86-assembly-notes.md
    @@ -151,7 +151,7 @@ which people continue to discover through reverse engineering.
    - **Primary Opcodes:**
    In the first release of x86, we had only `1`-byte opcodes.
    - **Secondary Opcodes:**
    Future opcodes made room by prefixing the escape byte `0xf0`. These are `2`-byte opcodes.
    Future opcodes made room by prefixing the escape byte `0x0f`. These are `2`-byte opcodes.
    - **Opcode Extension:**
    If the instruction does not require a second operand, then the `3`-bit
    `MODRM.reg` field is considered an extension of the opcode. Since it can only
  5. mikesmullin revised this gist Nov 27, 2018. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion x86-assembly-notes.md
    @@ -1129,7 +1129,7 @@ References:
    http://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/
    - How Rust encodes exceptions and interrupts
    https://os.phil-opp.com/handling-exceptions/
    - NeHe's famous OpenGL game dev tutorials incl. examples in Windows MASM
    - NeHe's famous OpenGL game dev tutorials incl. examples in Windows MASM
    http://nehe.gamedev.net/tutorial/creating_an_opengl_window_(win32)/13001/
    - How Debuggers Work w/ Breakpoints
    https://eli.thegreenplace.net/2011/01/27/how-debuggers-work-part-2-breakpoints
  6. mikesmullin revised this gist Nov 27, 2018. 1 changed file with 0 additions and 26 deletions.
    26 changes: 0 additions & 26 deletions x86-assembly-notes.md
    @@ -1,31 +1,5 @@
    # Mike's x86-64 Assembly (ASM) Notes

    Most CS graduates learned the bare minimum assembly to pass an exam.
    I am doing it after 15+ years of web programming industry experience because I have always been self-taught, and it took a long time for it to become relevant to me, but now I harbor several motivations:

    - **Information Security**:
    - **Reverse Engineering**: Blue team malware analysis
    - **Penetration Testing**: Red team tooling and AV evasion
    - Writing my own [secure] [server and/or personal] **Operating System**,
    or contributing to a cool one that already enjoys broad support (e.g., Linux kernel, hardening)
    - **Anonymity and Anti-Surveillance**: Understanding how undocumented features of commodity CPU vendors' hardware lead to
    some of the most egregious vulnerabilities and exploits in history (e.g., Spectre, Meltdown)
    - **Systems programming** (Windows assembly, Linux driver contribution, debugging)
    - **Game Development and Hacking** (WebAssembly, porting old games, cross-platform compatibility, DRM, shipping assets in tight binary packages, cheats and trainers)
    - **Productivity**: One day I might like to develop my own set of build tools including
    an assembler, linker, custom high-level language, and IDE (or again, contribute to a cool one that already exists like Golang or Rust)

    I feel these notes are compiled with a little more love and attention than the average
    graduate paper or recycled academic book from the 1990s, but a lot of this information
    hasn't changed since then, so I make good use of references, as well. When doing my
    own research, it was sad how much of it was poorly formatted (systems engineers are
    rarely also web developers) and suffered from major link rot (many prideful assembly
    warriors proudly coded their own web servers and then got auto-pwned circa 2016
    by vulnerability scanners and remote-code execution, and just went offline), so
    if that happens here just be sure to check archive.org for them. I tried to
    provide summaries as a mirror of the most important stuff, and then link to
    remaining works which are better or more thorough and exhaustive.

    ## Assembling Binary Machine Code

    ### Operating `Modes`:
  7. mikesmullin revised this gist Nov 26, 2018. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions x86-assembly-notes.md
    @@ -985,6 +985,8 @@ References:
    http://wwwcdf.pd.infn.it/localdoc/gdbint.pdf
    - Windbg Commands
    https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/commands
    - x64dbg is perhaps the best Windows debugger today
    https://x64dbg.com/
    - Hex-Rays Interactive Disassembler (IDA); most professional, but expensive
    https://www.hex-rays.com/products/ida/index.shtml
    - Binary Ninja; less featureful but cheap, modern interactive disassembler
  8. mikesmullin revised this gist Nov 25, 2018. 1 changed file with 9 additions and 9 deletions.
    18 changes: 9 additions & 9 deletions x86-assembly-notes.md
    @@ -334,16 +334,16 @@ when referring to these registers, which describes both a) operand width, and b)
    While there are several places you may reference a register, including `MODRM.reg`, `MODRM.rm`, `SIB.index`, `SIB.base`,
    and `PO.reg`, you'll find they all use the same `3` or `4`-bit mapping convention, as follows:

    |Register<br>Reference|<br>Low `8`-bits³|(`3`-bit / `4th`-bit=`0b1`)<br>High `8`-bits¹ ³|<br>Low `16`-bits|<br>Low `32`-bits⁴|<br>Full `64`-bit Register
    |Register<br>Reference|(`3`-bit / `4th`-bit=`0b1`)<br>Low `8`-bits³|<br>High `8`-bits¹ ³|<br>Low `16`-bits|<br>Low `32`-bits⁴|<br>Full `64`-bit Register
    -|-|-|-|-|-
    `0b000`| |`AL`/`R8B` |`AX`/`R8W` |`EAX`/`R8D` |`RAX`/`R8`
    `0b001`| |`CL`/`R9B` |`CX`/`R9W` |`ECX`/`R9D` |`RCX`/`R9`
    `0b010`| |`DL`/`R10B` |`DX`/`R10W`|`EDX`/`R10D`|`RDX`/`R10`
    `0b011`| |`BL`/`R11B` |`BX`/`R11W`|`EBX`/`R11D`|`RBX`/`R11`
    `0b100`|`AH`|`SPL`²/`R12B`|`SP`/`R12W`|`ESP`/`R12D`|`RSP`/`R12`
    `0b101`|`CH`|`BPL`²/`R13B`|`BP`/`R13W`|`EBP`/`R13D`|`RBP`/`R13`
    `0b110`|`DH`|`SIL`²/`R14B`|`SI`/`R14W`|`ESI`/`R14D`|`RSI`/`R14`
    `0b111`|`BH`|`DIL`²/`R15B`|`DI`/`R15W`|`EDI`/`R15D`|`RDI`/`R15`
    `0b000`|`AL`/`R8B` | |`AX`/`R8W` |`EAX`/`R8D` |`RAX`/`R8`
    `0b001`|`CL`/`R9B` | |`CX`/`R9W` |`ECX`/`R9D` |`RCX`/`R9`
    `0b010`|`DL`/`R10B` | |`DX`/`R10W`|`EDX`/`R10D`|`RDX`/`R10`
    `0b011`|`BL`/`R11B` | |`BX`/`R11W`|`EBX`/`R11D`|`RBX`/`R11`
    `0b100`|`SPL`²/`R12B`|`AH`|`SP`/`R12W`|`ESP`/`R12D`|`RSP`/`R12`
    `0b101`|`BPL`²/`R13B`|`CH`|`BP`/`R13W`|`EBP`/`R13D`|`RBP`/`R13`
    `0b110`|`SIL`²/`R14B`|`DH`|`SI`/`R14W`|`ESI`/`R14D`|`RSI`/`R14`
    `0b111`|`DIL`²/`R15B`|`BH`|`DI`/`R15W`|`EDI`/`R15D`|`RDI`/`R15`
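
A minimal Python sketch of the mapping in the table above, for readers who prefer to see it as a lookup. The function name `register_name` and its parameters (`code`, `width`, `ext`, `rex_present`) are my own illustration and not part of any assembler. It assumes the `4th` bit is supplied by a REX prefix, and that the high `8`-bit names (`AH`, `CH`, `DH`, `BH`) only apply when no REX prefix is present.

```python
# Hypothetical helper: names and parameters are illustrative only.
LOW8 = ["AL", "CL", "DL", "BL", "SPL", "BPL", "SIL", "DIL"]
HIGH8 = {0b100: "AH", 0b101: "CH", 0b110: "DH", 0b111: "BH"}
BASE16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
NEW_SUFFIX = {8: "B", 16: "W", 32: "D", 64: ""}


def register_name(code, width, ext=0, rex_present=False):
    """Map a 3-bit register code (plus optional 4th bit) to a register name."""
    if not 0 <= code <= 0b111:
        raise ValueError("code must fit in 3 bits")
    if width not in NEW_SUFFIX:
        raise ValueError("width must be 8, 16, 32, or 64")
    if ext:
        # 4th bit set: one of the newer registers R8..R15.
        return f"R{8 + code}{NEW_SUFFIX[width]}"
    if width == 8:
        # Without a REX prefix, codes 4..7 select the legacy high bytes.
        if code in HIGH8 and not rex_present:
            return HIGH8[code]
        return LOW8[code]
    if width == 16:
        return BASE16[code]
    prefix = "E" if width == 32 else "R"
    return prefix + BASE16[code]


print(register_name(0b000, 64))         # RAX
print(register_name(0b100, 8))          # AH
print(register_name(0b111, 32, ext=1))  # R15D
```

The same function works for any of the fields listed above (`MODRM.reg`, `MODRM.rm`, `SIB.index`, `SIB.base`), since they share the one numbering convention.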

    **NOTES:**

  9. mikesmullin revised this gist Nov 22, 2018. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions x86-assembly-notes.md
    @@ -989,6 +989,8 @@ References:
    https://www.hex-rays.com/products/ida/index.shtml
    - Binary Ninja; less featureful but cheap, modern interactive disassembler
    https://binary.ninja/
    - Reversed non-standard opcode mappings which may confuse normal disassemblers
    https://github.com/XlogicX/irasm

    ---

  10. mikesmullin revised this gist Nov 22, 2018. 1 changed file with 29 additions and 5 deletions.
    34 changes: 29 additions & 5 deletions x86-assembly-notes.md
    @@ -1,6 +1,30 @@
    # Mike's x86-64 Assembly (ASM) Notes

    > One day I might like to make my own version of ASM or C language, similar to Golang or Rust and/or NASM.
    Most CS graduates learned the bare minimum assembly to pass an exam.
    I am doing it after 15+ years of web programming industry experience because I have always been self-taught, and it took a long time for it to become relevant to me, but now I harbor several motivations:

    - **Information Security**:
    - **Reverse Engineering**: Blue team malware analysis
    - **Penetration Testing**: Red team tooling and AV evasion
    - Writing my own [secure] [server and/or personal] **Operating System**,
    or contributing to a cool one that already enjoys broad support (e.g., Linux kernel, hardening)
    - **Anonymity and Anti-Surveillance**: Understanding how undocumented features of commodity CPU vendors' hardware lead to
    some of the most egregious vulnerabilities and exploits in history (e.g., Spectre, Meltdown)
    - **Systems programming** (Windows assembly, Linux driver contribution, debugging)
    - **Game Development and Hacking** (WebAssembly, porting old games, cross-platform compatibility, DRM, shipping assets in tight binary packages, cheats and trainers)
    - **Productivity**: One day I might like to develop my own set of build tools including
    an assembler, linker, custom high-level language, and IDE (or again, contribute to a cool one that already exists like Golang or Rust)

    I feel these notes are compiled with a little more love and attention than the average
    graduate paper or recycled academic book from the 1990s, but a lot of this information
    hasn't changed since then, so I make good use of references, as well. When doing my
    own research, it was sad how much of it was poorly formatted (systems engineers are
    rarely also web developers) and suffered from major link rot (many prideful assembly
    warriors proudly coded their own web servers and then got auto-pwned circa 2016
    by vulnerability scanners and remote-code execution, and just went offline), so
    if that happens here just be sure to check archive.org for them. I tried to
    provide summaries as a mirror of the most important stuff, and then link to
    remaining works which are better or more thorough and exhaustive.

    ## Assembling Binary Machine Code

    @@ -646,7 +670,7 @@ a long array of floats to/from a block of memory in one operation.
    The integer part is encoded as a simple unsigned int.
    However, the fractional part is encoded as a base2 binary fraction,
    which commonly results in a repeating (recurring) bit pattern,
    which gets truncated--and can lead to infamous FPU rounding errors if not handled carefully.
    which gets truncated--and can lead to infamous FPU rounding errors if not handled carefully.
    ex: `3.1f` = `0b11` + `0b000 1100 1100 1100 1100 110...` _(the pattern would repeat infinitely if not truncated)_
    The bits are left-aligned within the field, so any zero-fill happens on the right side.

    @@ -988,7 +1012,7 @@ Here is some useful trivia about that:
    match the exact release version and compiler used.
    - Confusingly, there is no trivial way to tell static and dynamic `.lib` files apart,
    except that [dynamic] import libraries for DLLs will be much smaller than the
    matching static library would be.
    matching static library would be.
    - `.lib` files may only be used at compile time to build statically linked binaries.
    - `.dll` files are intended to only be used at runtime as dynamically linked binaries.
    - Technically `.dll` files contain enough information that a reverse engineer could
    @@ -1005,7 +1029,7 @@ Here is some useful trivia about that:
    code which uses the `.dll`, or via fuzz testing.
    - The version of `gcc` toolchain GNU linker (`ld`) ported to Windows can statically
    link using `.dll` inputs directly, which means it is able to implicitly synthesize
    the normally required but missing `.lib` stubs automagically!
    the normally required but missing `.lib` stubs automagically!
    - **Decorated names** or **mangled names** are a symbol naming convention used in the
    COFF files. They are a series of ASCII prefixes and suffixes which guarantee that
    each function is named uniquely when merged into the same flat COFF table format.
    @@ -1030,7 +1054,7 @@ References:
    https://docs.microsoft.com/en-us/windows/desktop/Debug/pe-format
    - MSDN Article from 2002 going into tremendous depth on history and intentions
    http://www.delphibasics.info/home/delphibasicsarticles/anin-depthlookintothewin32portableexecutablefileformat-part1
    http://www.delphibasics.info/home/delphibasicsarticles/anin-depthlookintothewin32portableexecutablefileformat-part2
    http://www.delphibasics.info/home/delphibasicsarticles/anin-depthlookintothewin32portableexecutablefileformat-part2
    - MSDN Article from Mar 2002 detailing steps the Windows Loader takes with PE binaries
    https://www.cnblogs.com/binsys/articles/2711010.html
    - Peering Inside the PE: A Tour of the Win32 Portable Executable File Format
  11. mikesmullin revised this gist Nov 22, 2018. 1 changed file with 88 additions and 1 deletion.
    89 changes: 88 additions & 1 deletion x86-assembly-notes.md
    @@ -522,7 +522,10 @@ References:

    ---

    # Appendix: Extensions
    # Appendix: x86 Extensions

    As new models of the x86 family are released, the instruction set is extended with new features.
    Here we provide a chronologically ordered summary of what was added, when, and why.

    ## History of the FPU

    @@ -617,6 +620,84 @@ References:

    ---

    # Floating Point Numbers (IEEE-754)

    Floats come in various sizes. When serialized for compact transmission over the
    network, a clever dev may try to encode them as a string, or a tuple of 1-byte
    integers (integer and mantissa, optionally an exponent). But when you need the
    processor to do really quick, especially bulk, binary floating point math, the
    following is the standard form used everywhere.

    Floating point and SIMD instructions operate almost exclusively on the `ST0-7` (x87 FPU), `MM0-7` (MMX), and more recently
    the `XMM0-15` (SSE) registers. When utilizing the wide `80`/`128`-bit values,
    you may need to perform multiple `MOV` and `PUSH` operations to fill the entire
    register, since the other registers and `immediate` operands are much smaller.
    As an optimization, some instructions accept a memory pointer operand to read/write
    a packed array of floats to/from a block of memory in one operation.

    ### Data Structure:

    - **`1`-bit Sign** (`0`=positive)
    - **`8`-bit base2 Exponent with a `+127` bias** ([why not signed two's complement?](https://stackoverflow.com/a/2835476))
    Take the whole number integer part, convert it to binary, remove any [insignificant] leading zeros, count the digits, and subtract one; that is the binary exponent.
    Then add `+127` to that exponent and store the sum as an `8`-bit unsigned integer.
    (If the integer part is zero, the exponent is negative: count how many places the first `1` bit of the fraction sits to the right of the binary point.)
    - **`23`-bit Mantissa a.k.a. Significand**
    This is the combination of the integer and fractional parts concatenated, minus the leading `1` bit, which is implied and not stored for normalized numbers.
    The integer part is encoded as a simple unsigned int.
    However, the fractional part is encoded as a base2 binary fraction,
    which commonly results in a repeating (recurring) bit pattern,
    which gets truncated--and can lead to infamous FPU rounding errors if not handled carefully.
    ex: `3.1f` = `0b11` + `0b000 1100 1100 1100 1100 110...` _(the pattern would repeat infinitely if not truncated)_
    The bits are left-aligned within the field, so any zero-fill happens on the right side.
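
The steps above can be sketched in Python as a by-hand encoder. This is only an illustration under a few assumptions: the names `encode_single_by_hand` and `hardware_bits` are mine, the sketch handles positive normal values only, and it truncates the fraction as described above, whereas real hardware rounds to nearest by default (so the last bit can differ for some inputs). It also relies on the fact that the leading `1` of a normalized value is implied rather than stored.

```python
import struct
from fractions import Fraction


def encode_single_by_hand(literal: str) -> int:
    """Encode a positive decimal literal as 32-bit IEEE-754 bits, truncating."""
    x = Fraction(literal)  # exact arithmetic, no floating point involved
    if x <= 0:
        raise ValueError("this sketch only handles positive values")
    sign = 0
    exponent = 0
    # Normalize to the form 1.fraction * 2**exponent.
    while x >= 2:
        x /= 2
        exponent += 1
    while x < 1:
        x *= 2
        exponent -= 1
    # Drop the implied leading 1, keep the first 23 fraction bits.
    mantissa = int((x - 1) * (1 << 23))
    biased = exponent + 127  # apply the +127 bias
    return (sign << 31) | (biased << 23) | mantissa


def hardware_bits(value: float) -> int:
    """Bits produced by the platform's own float conversion, for comparison."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


print(hex(encode_single_by_hand("3.1")))  # 0x40466666
print(hex(hardware_bits(3.1)))            # 0x40466666
```

Using `Fraction` keeps the repeating binary pattern exact until the moment it is cut down to `23` bits, which is where the rounding error comes from.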

    ## Let's manually encode `1.0f`!

    - **sign:** `0b0` = a positive number
    - **mantissa:** the value is `0b1.0`; the leading `1` is implied and not stored, so all `23` stored bits are `0b0`
    _(It is easier to calculate in this order because the mantissa value informs the exponent value.)_
    - **exponent:** `0d0` + `0d127` = `0d127` = `0b01111111`

    ```
    IEEE-754 32-bit (single precision) Floating Point (x86; little-endian)
    offset  0 1         9                         31
    single [0 0111 1111 0000 0000 0000 0000 0000 000] = 0x3f800000 = 1.0f
            | |       | |                          |
    sign    1 |       | |                          |
    exponent  |<--8-->| |                          |
    mantissa            |<-----------23----------->|
    ```

    The structure is the same for `64`-bit (`double` precision) floats
    except the exponent has `11` bits, and a bias of `+1023`.
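
To double-check a hand encoding, the fields can be pulled back out of a real float. A small sketch, assuming Python's standard `struct` module; the function name `split_fields` is hypothetical. The `52`-bit mantissa width for doubles is not stated above, it is simply what remains of `64` bits after `1` sign bit and `11` exponent bits.

```python
import struct


def split_fields(value: float, double: bool = False):
    """Return (sign, biased_exponent, mantissa) of a float as integers."""
    if double:
        bits = struct.unpack(">Q", struct.pack(">d", value))[0]
        exp_bits, man_bits = 11, 52
    else:
        bits = struct.unpack(">I", struct.pack(">f", value))[0]
        exp_bits, man_bits = 8, 23
    sign = bits >> (exp_bits + man_bits)
    exponent = (bits >> man_bits) & ((1 << exp_bits) - 1)
    mantissa = bits & ((1 << man_bits) - 1)
    return sign, exponent, mantissa


print(split_fields(1.0))               # (0, 127, 0)
print(split_fields(1.0, double=True))  # (0, 1023, 0)
```

Subtracting the bias (`127` or `1023`) from the middle value gives back the true binary exponent.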

    The exponent field has four magic values which, combined with the mantissa, have reserved special meanings:

    Exponent | Mantissa | Meaning
    -|-|-
    `0b0` | `0b0` | zero (`0d0`)
    `0b0` | non-zero | denormalized
    all `0b1`'s | `0b0` | `Infinity`
    all `0b1`'s | non-zero | `NaN` ¹

    **NOTES**:
    1. You can hide data inside the mantissa of `NaN` structures.
    Some compilers use this to specify more precise reason codes (e.g., which failed computation produced the `NaN`).
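
The table above reads naturally as a small decision procedure. Here is a hedged Python sketch for the `32`-bit case; `classify_single` and `bits_of` are hypothetical names, and anything that is not one of the four reserved combinations is simply reported as `normal`.

```python
import struct


def classify_single(bits: int) -> str:
    """Classify a 32-bit IEEE-754 pattern using its exponent and mantissa."""
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent == 0:
        return "zero" if mantissa == 0 else "denormalized"
    if exponent == 0xFF:  # all exponent bits set
        return "infinity" if mantissa == 0 else "nan"
    return "normal"


def bits_of(value: float) -> int:
    """Raw 32-bit pattern of a value converted to single precision."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


print(classify_single(bits_of(float("inf"))))  # infinity
print(classify_single(0x7FC00123))             # nan
```

Note that the sign bit is ignored, so negative zero and negative infinity land in the same buckets as their positive counterparts. The low bits of the second example (`0x123`) are the kind of payload the note above refers to.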

    References:
    - How to encode a float by hand
    https://www.youtube.com/watch?v=8afbTaA-gOQ
    - University lecture explaining the math
    https://www.youtube.com/watch?v=03fhijH6e2w
    - University lecture performing addition by hand
    https://www.youtube.com/watch?v=KiWz-mGFqHI
    - Interactive hosted calculator
    https://babbage.cs.qc.cuny.edu/ieee-754.old/decimal.html
    - Explaining floating point rounding errors
    https://www.youtube.com/watch?v=PZRI1IfStY0

    ---

    # Appendix: Stack vs. Heap

    The stack is a data structure in memory the processor can understand and maintain,
    @@ -726,6 +807,8 @@ References:
    https://manybutfinite.com/post/anatomy-of-a-program-in-memory/
    - Stack Frame Layout on x64
    https://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64
    - Microsoft `__fastcall` x64 ABI calling convention
    https://msdn.microsoft.com/en-us/library/ms235286.aspx

    ---

    @@ -1044,10 +1127,14 @@ References:
    http://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/
    - How Rust encodes exceptions and interrupts
    https://os.phil-opp.com/handling-exceptions/
    - NeHe's famous OpenGL game dev tutorials incl. examples in Windows MASM
    http://nehe.gamedev.net/tutorial/creating_an_opengl_window_(win32)/13001/
    - How Debuggers Work w/ Breakpoints
    https://eli.thegreenplace.net/2011/01/27/how-debuggers-work-part-2-breakpoints
    - Ralf Brown's BIOS Interrupt List
    http://www.ctyme.com/rbrown.htm
    - Agner Fog's books and blog, renowned for advanced assembly information
    https://www.agner.org/optimize/
    - Intel® 64 and IA-32 Architectures Software Developer Manuals
    https://software.intel.com/en-us/articles/intel-sdm
    - AMD64 Architecture Programmer's Manual
  12. mikesmullin revised this gist Nov 18, 2018. 1 changed file with 4 additions and 0 deletions.
    4 changes: 4 additions & 0 deletions x86-assembly-notes.md
    @@ -724,6 +724,8 @@ References:
    https://en.wikipedia.org/wiki/C_dynamic_memory_allocation
    - Anatomy of a Program in Memory
    https://manybutfinite.com/post/anatomy-of-a-program-in-memory/
    - Stack Frame Layout on x64
    https://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64

    ---

    @@ -846,6 +848,8 @@ References:
    https://www.pcjs.org/pubs/pc/software/tools/microsoft/masm/5.00/
    - Third-party community support forums (anecdotal information and references)
    http://www.masm32.com/board/
    - Art of Assembly (contains summary of MASM syntax)
    http://www.oopweb.com/Assembly/Documents/ArtOfAssembly/Volume/Chapter_8/CH08-1.html#top
    - Steve Gibson's MASM enthusiast page
    https://www.grc.com/smgassembly.htm
    - Netwide Assembler (NASM) Documentation
  13. mikesmullin revised this gist Nov 17, 2018. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion x86-assembly-notes.md
    @@ -974,10 +974,12 @@ References:
    https://en.wikipedia.org/wiki/Address_space_layout_randomization
    - In 2017 "ASLR⊕Cache" attack demonstrated defeating ASLR from a web browser using JavaScript
    https://www.vusec.net/projects/anc/
    - NTSTATUS values (Windows `%errorlevel%` codes)
    - NTSTATUS values (Windows %errorlevel% codes)
    https://msdn.microsoft.com/en-us/library/cc704588.aspx
    - Official intro and reference for Windows-based graphical user interfaces
    https://docs.microsoft.com/en-us/windows/desktop/winmsg/windowing
    - Windows System Error Codes
    https://docs.microsoft.com/en-us/windows/desktop/Debug/system-error-codes

    ---

  14. mikesmullin revised this gist Nov 17, 2018. No changes.
  15. mikesmullin revised this gist Nov 17, 2018. 1 changed file with 0 additions and 2 deletions.
    2 changes: 0 additions & 2 deletions x86-assembly-notes.md
    @@ -813,8 +813,6 @@ Calculated in some of the following ways:
    the segment_selector refers to the GDT which refers to a protected memory page
    the offset is the address relative to that.

    **NOTE:** pointers are only used by JMP and CALL instructions.

    References:
    - Using Short/Relative vs. Far Jumps
    https://thestarman.pcministry.com/asm/2bytejumps.htm
  16. mikesmullin revised this gist Nov 17, 2018. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions x86-assembly-notes.md
    @@ -978,6 +978,8 @@ References:
    https://www.vusec.net/projects/anc/
    - NTSTATUS values (Windows `%errorlevel%` codes)
    https://msdn.microsoft.com/en-us/library/cc704588.aspx
    - Official intro and reference for Windows-based graphical user interfaces
    https://docs.microsoft.com/en-us/windows/desktop/winmsg/windowing

    ---

  17. mikesmullin revised this gist Nov 17, 2018. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion x86-assembly-notes.md
    @@ -104,7 +104,7 @@ References:
    https://stackoverflow.com/questions/21165678/why-64-bit-mode-long-mode-doesnt-use-segment-registers
    - How much memory can a 64-bit machine address? (physically, logically, and theoretically)
    https://superuser.com/questions/168114/how-much-memory-can-a-64bit-machine-address-at-a-time
    - Open Security Training: Intermedia Intel x86: Architecture, Assembly, and Applications
    - Open Security Training: Intermediate Intel x86: Architecture, Assembly, and Applications
    https://www.youtube.com/playlist?list=PL8F8D45D6C1FFD177

    ##### REX Prefix Byte Data Structure (8 bits)
    @@ -854,6 +854,8 @@ References:
    https://www.nasm.us/doc/
    - NASM Tutorial
    http://cs.lmu.edu/~ray/notes/nasmtutorial/
    - SASM: Simple crossplatform IDE for NASM, MASM, GAS, FASM assembly languages
    https://dman95.github.io/SASM/english.html

    ---

  18. mikesmullin revised this gist Nov 16, 2018. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion x86-assembly-notes.md
    @@ -873,7 +873,7 @@ References:
    - GDB Internals
    http://wwwcdf.pd.infn.it/localdoc/gdbint.pdf
    - Windbg Commands
    http://windbg.info/doc/1-common-cmds.html
    https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/commands
    - Hex-Rays Interactive Disassembler (IDA); most professional, but expensive
    https://www.hex-rays.com/products/ida/index.shtml
    - Binary Ninja; less featureful but cheap, modern interactive disassembler
    @@ -974,6 +974,8 @@ References:
    https://en.wikipedia.org/wiki/Address_space_layout_randomization
    - In 2017 "ASLR⊕Cache" attack demonstrated defeating ASLR from a web browser using JavaScript
    https://www.vusec.net/projects/anc/
    - NTSTATUS values (Windows `%errorlevel%` codes)
    https://msdn.microsoft.com/en-us/library/cc704588.aspx

    ---

  19. mikesmullin revised this gist Nov 15, 2018. 1 changed file with 25 additions and 3 deletions.
    28 changes: 25 additions & 3 deletions x86-assembly-notes.md
    @@ -885,7 +885,8 @@ References:

    Windows executables (`*.exe`, `*.dll`) use **Portable Executable** (`PE`) format,
    which is a wrapper around the **Common Object File Format** (`COFF`), which is
    used by binary linker files (`*.obj`, `*.lib`).
    used by binary linker files (`*.obj`, `*.lib`). Technically Windows 64-bit uses
    a version internally called PE32+.

    A linker (e.g., `link.exe`, `cl.exe`, `ld`) is basically designed to parse one
    or more `COFF` files, and wrap them into a single executable with a `PE` header.
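
As a rough illustration of what "a `PE` header wrapped around `COFF`" looks like on disk, here is a Python sketch that reads just enough of an image to tell `PE32` from `PE32+`. The names `describe_pe` and `make_stub` are hypothetical, and the stub is a synthetic, non-runnable header built only for the demo. The offsets used (`0x3C` for the pointer to the PE signature, the `PE\0\0` signature, the `20`-byte COFF file header, and the `0x10b`/`0x20b` optional header magic) are my reading of the Microsoft PE format documentation, so treat this as a sketch and not a full parser.

```python
import struct

MACHINES = {0x014C: "x86", 0x8664: "x64"}
FORMATS = {0x010B: "PE32", 0x020B: "PE32+"}


def describe_pe(data: bytes) -> dict:
    """Read the few header fields that identify a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ (DOS) header")
    # The DOS header stores the file offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the signature.
    (machine, sections, _time, _symtab, _nsyms,
     opt_size, characteristics) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    # The optional header, if present, starts with the PE32 / PE32+ magic.
    magic = None
    if opt_size >= 2:
        (magic,) = struct.unpack_from("<H", data, pe_offset + 24)
    return {
        "machine": MACHINES.get(machine, hex(machine)),
        "sections": sections,
        "format": FORMATS.get(magic, "unknown"),
        "is_dll": bool(characteristics & 0x2000),  # IMAGE_FILE_DLL flag
    }


def make_stub() -> bytes:
    """Build a synthetic header for demonstration (not a loadable binary)."""
    data = bytearray(0x5A)
    data[0:2] = b"MZ"
    struct.pack_into("<I", data, 0x3C, 0x40)
    data[0x40:0x44] = b"PE\0\0"
    struct.pack_into("<HHIIIHH", data, 0x44, 0x8664, 0, 0, 0, 0, 2, 0x2000)
    struct.pack_into("<H", data, 0x58, 0x020B)
    return bytes(data)


print(describe_pe(make_stub()))
```

To try it on a real file, read the bytes of any `.exe` or `.dll` in binary mode and pass them to `describe_pe` in place of the stub.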
    @@ -895,9 +896,12 @@ Here is some useful trivia about that:
    - Microsoft COFF is an extended version of the original by AT&T.
    - `.obj` and `.lib` files contain a simple table data structure mapping
    unique ASCII string symbol names to code or address offsets in another file.
    - `.lib` may include compiled object code, but most of the time (e.g., in Visual Studio)
    they are just headers with pointers to address offsets in a `.dll` which must
    - `.lib` may include compiled object code (static), but most of the time (e.g., in Visual Studio)
    they are just header stubs (dynamic) with pointers to address offsets in a `.dll` which must
    match the exact release version and compiler used.
    - Confusingly, there is no trivial way to tell static and dynamic `.lib` files apart,
    except that [dynamic] import libraries for DLLs will be much smaller than the
    matching static library would be.
    - `.lib` files may only be used at compile time to build statically linked binaries.
    - `.dll` files are intended to only be used at runtime as dynamically linked binaries.
    - Technically `.dll` files contain enough information that a reverse engineer could
    @@ -912,6 +916,9 @@ Here is some useful trivia about that:
    when, and what effect they have on the `.dll` functions.
    Though a determined hacker could successfully guess them by looking at example
    code which uses the `.dll`, or via fuzz testing.
    - The version of `gcc` toolchain GNU linker (`ld`) ported to Windows can statically
    link using `.dll` inputs directly, which means it is able to implicitly synthesize
    the normally required but missing `.lib` stubs automagically!
    - **Decorated names** or **mangled names** are a symbol naming convention used in the
    COFF files. They are a series of ASCII prefixes and suffixes which guarantee that
    each function is named uniquely when merged into the same flat COFF table format.
    @@ -934,6 +941,11 @@ Here is some useful trivia about that:
    References:
    - PE Format
    https://docs.microsoft.com/en-us/windows/desktop/Debug/pe-format
    - MSDN Article from 2002 going into tremendous depth on history and intentions
    http://www.delphibasics.info/home/delphibasicsarticles/anin-depthlookintothewin32portableexecutablefileformat-part1
    http://www.delphibasics.info/home/delphibasicsarticles/anin-depthlookintothewin32portableexecutablefileformat-part2
    - MSDN Article from Mar 2002 detailing steps the Windows Loader takes with PE binaries
    https://www.cnblogs.com/binsys/articles/2711010.html
    - Peering Inside the PE: A Tour of the Win32 Portable Executable File Format
    https://msdn.microsoft.com/en-us/library/ms809762.aspx
    - Portable Executable
    @@ -952,6 +964,16 @@ References:
    https://docs.microsoft.com/en-us/cpp/build/reference/decorated-names?view=vs-2017
    - Linking Explicitly
    https://msdn.microsoft.com/en-us/library/784bt7z7.aspx
    - Creating the smallest possible PE executable
    https://web.archive.org/web/20101024125357/http://www.phreedom.org:80/solar/code/tinype/
    - DLL search order
    https://docs.microsoft.com/en-us/windows/desktop/dlls/dynamic-link-library-search-order
    - CppCon 2017: James McNellis “Everything You Ever Wanted to Know about DLLs”
    https://www.youtube.com/watch?v=JPQWQfDhICA
    - Address Space Layout Randomization (ASLR)
    https://en.wikipedia.org/wiki/Address_space_layout_randomization
    - In 2017 "ASLR⊕Cache" attack demonstrated defeating ASLR from a web browser using JavaScript
    https://www.vusec.net/projects/anc/

    ---

  20. mikesmullin revised this gist Nov 13, 2018. 1 changed file with 4 additions and 0 deletions.
    4 changes: 4 additions & 0 deletions x86-assembly-notes.md
    @@ -948,6 +948,10 @@ References:
    http://wjradburn.com/software/
    - Difference between .lib and .dll
    http://www.differencebetween.net/technology/difference-between-lib-and-dll/
    - Decorated/Mangled Names
    https://docs.microsoft.com/en-us/cpp/build/reference/decorated-names?view=vs-2017
    - Linking Explicitly
    https://msdn.microsoft.com/en-us/library/784bt7z7.aspx

    ---

  21. mikesmullin revised this gist Nov 13, 2018. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion x86-assembly-notes.md
    @@ -1000,6 +1000,8 @@ References:
    https://www.youtube.com/watch?v=oaVwzYN6BP4
    - Visual x86, x64, and ARM Emulator
    https://www.codeproject.com/Articles/478527/X86-ARM-Emulator
    - Build an 8-bit computer from scratch
    https://eater.net/8bit/parts
    - C to Linux x86-64 Assembly (ASM) examples
    https://gist.github.com/mikesmullin/6330894
    - Linux x86_64 Syscall Table
    @@ -1013,4 +1015,4 @@ References:
    - Intel® 64 and IA-32 Architectures Software Developer Manuals
    https://software.intel.com/en-us/articles/intel-sdm
    - AMD64 Architecture Programmer's Manual
    https://www.amd.com/system/files/TechDocs/24594.pdf
    https://www.amd.com/system/files/TechDocs/24594.pdf
  22. mikesmullin revised this gist Nov 13, 2018. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion x86-assembly-notes.md
    @@ -906,7 +906,7 @@ Here is some useful trivia about that:
    constants passed as function arguments. These are typically shared in the form of
    a C header (`*.h`) file, as part of an SDK (e.g,
    [windows sdk](https://docs.microsoft.com/en-us/windows/desktop/winmsg/wm-destroy)
    , [opengl sdk](view-source:https://www.khronos.org/registry/OpenGL/api/GLES2/gl2.h)),
    , [opengl sdk](https://www.khronos.org/registry/OpenGL/api/GLES2/gl2.h)),
    if the developer wants you to have them.
    The other thing you may not have is the documentation about what inputs are valid,
    when, and what effect they have on the `.dll` functions.
  23. mikesmullin revised this gist Nov 13, 2018. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion x86-assembly-notes.md
    @@ -906,7 +906,7 @@ Here is some useful trivia about that:
    constants passed as function arguments. These are typically shared in the form of
    a C header (`*.h`) file, as part of an SDK (e.g,
    [windows sdk](https://docs.microsoft.com/en-us/windows/desktop/winmsg/wm-destroy)
    , [opengl sdk](https://www.khronos.org/registry/OpenGL/api/GLES2/gl2.h)),
    , [opengl sdk](view-source:https://www.khronos.org/registry/OpenGL/api/GLES2/gl2.h)),
    if the developer wants you to have them.
    The other thing you may not have is the documentation about what inputs are valid,
    when, and what effect they have on the `.dll` functions.
  24. mikesmullin revised this gist Nov 13, 2018. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion x86-assembly-notes.md
    @@ -938,7 +938,7 @@ References:
    https://msdn.microsoft.com/en-us/library/ms809762.aspx
    - Portable Executable
    https://en.wikipedia.org/wiki/Portable_Executable
    - Official Microsoft PE/COFF Technical Specification for Rev 6., 1999
    - Official Microsoft PE/COFF Technical Specification for Rev 6., 1999
    https://courses.cs.washington.edu/courses/cse378/03wi/lectures/LinkerFiles/coff.pdf
    - Handy Quick-Reference Posters
    https://github.com/corkami/pics/blob/master/binary/README.md#executables
  25. mikesmullin revised this gist Nov 13, 2018. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions x86-assembly-notes.md
    @@ -913,8 +913,8 @@ Here is some useful trivia about that:
    Though a determined hacker could successfully guess them by looking at example
    code which uses the `.dll`, or via fuzz testing.
    - **Decorated names** or **mangled names** are a symbol naming convention used in the
    `.obj` files. They are a series of ASCII prefix and suffixes which guarantee that
    each function is named uniquely when merged into the same flat `.obj` table format.
    COFF files. They are a series of ASCII prefix and suffixes which guarantee that
    each function is named uniquely when merged into the same flat COFF table format.
    The additional data mangled into the name includes:
    - The **function name**.
    - The **class name** that the function is a member of, if it is a member function.
  26. mikesmullin revised this gist Nov 13, 2018. 1 changed file with 18 additions and 24 deletions.
    42 changes: 18 additions & 24 deletions x86-assembly-notes.md
    @@ -65,6 +65,10 @@ References:
    https://www.cs.virginia.edu/~evans/cs216/guides/x86.html
    - Why is Displacement limited to 32-bits?
    https://stackoverflow.com/questions/31853189/x86-64-assembly-why-displacement-not-64-bits
    - Opcode Reference (Complex)
    http://ref.x86asm.net/
    - Opcode Reference (Simple)
    http://www.felixcloutier.com/x86/

    #### The Prefix

    @@ -120,7 +124,8 @@ Trivia:
    References:
    - Nice illustration of REX bits being prepended
    https://paul.bone.id.au/2018/09/26/more-x86-addressing/

    - Good explanation of encoding the REX prefix for Long mode 64-bit registers
    https://www.systutorials.com/72643/beginners-guide-x86-64-instruction-encoding/

    ### The Operation Code (Opcode)

    @@ -338,7 +343,9 @@ References:
    http://lomont.org/Math/Papers/2009/Introduction%20to%20x64%20Assembly.pdf
    - x86 Oddities, Ange Albertini (Reverse Engineer), 2017
    https://github.com/corkami/docs/blob/master/x86/x86.md

    - Good table of Registers
    https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/x64-architecture
    https://www.tortall.net/projects/yasm/manual/html/arch-x86-registers.html

    ### The `Memory` Address Operand

    @@ -436,6 +443,8 @@ References:
    https://stackoverflow.com/a/33328318
    - Addressing modes
    https://en.wikipedia.org/wiki/Addressing_mode#Simple_addressing_modes_for_data
    - Memory Translation and Segmentation
    https://manybutfinite.com/post/memory-translation-and-segmentation/

    ---

    @@ -713,6 +722,8 @@ References:
    https://stackoverflow.com/questions/21021223/how-does-the-gcc-determine-stack-size-the-function-based-on-c-will-use
    - C dynamic memory allocation
    https://en.wikipedia.org/wiki/C_dynamic_memory_allocation
    - Anatomy of a Program in Memory
    https://manybutfinite.com/post/anatomy-of-a-program-in-memory/

    ---

    @@ -841,6 +852,8 @@ References:
    https://www.grc.com/smgassembly.htm
    - Netwide Assembler (NASM) Documentation
    https://www.nasm.us/doc/
    - NASM Tutorial
    http://cs.lmu.edu/~ray/notes/nasmtutorial/

    ---

    @@ -971,35 +984,18 @@ References:

    ---

    # Appendix: General References
    # Appendix: Miscellanous Tools & References

    - Manually assembling mnemonic opcodes to binary machine code
    https://en.wikibooks.org/wiki/X86_Assembly/Machine_Language_Conversion#8086_instruction_format_(16_bit)
    - The x86 Instruction Structure
    https://www.codeproject.com/articles/662301/x-instruction-encoding-revealed-bit-twiddling-fo
    - X86-64 Instruction Encoding
    https://wiki.osdev.org/X86-64_Instruction_Encoding
    - CPU Rings Privilege and Protection
    https://manybutfinite.com/post/cpu-rings-privilege-and-protection/
    - Memory Translation and Segmentation
    https://manybutfinite.com/post/memory-translation-and-segmentation/
    - Stack vs. Heap: RAM Memory Layout
    https://imgur.com/gallery/DflKz1C
    - Opcode Reference (Complex)
    http://ref.x86asm.net/
    - Opcode Reference (Simple)
    http://www.felixcloutier.com/x86/
    - Good table of Registers
    https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/x64-architecture
    https://www.tortall.net/projects/yasm/manual/html/arch-x86-registers.html
    - Interesting overview from Haskell to Machine code
    http://www.stephendiehl.com/posts/monads_machine_code.html
    - Good explanation of encoding the REX prefix for Long mode 64-bit registers
    https://www.systutorials.com/72643/beginners-guide-x86-64-instruction-encoding/
    - Intel: Introduction to x64 Assembly (an official guide)
    https://software.intel.com/en-us/articles/introduction-to-x64-assembly/
    - Determining whether Operand and Address size is 8, 16, 32, or 64 bits
    https://wiki.osdev.org/X86-64_Instruction_Encoding#Operand-size_and_address-size_override_prefix
    - Punching Cards (for FORTRAN programming)
    https://www.youtube.com/watch?v=oaVwzYN6BP4
    - Visual x86, x64, and ARM Emulator
    @@ -1008,15 +1004,13 @@ References:
    https://gist.github.com/mikesmullin/6330894
    - Linux x86_64 Syscall Table
    http://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/
    - Anatomy of a Program in Memory
    https://manybutfinite.com/post/anatomy-of-a-program-in-memory/
    - How Rust encodes exceptions and interrupts
    https://os.phil-opp.com/handling-exceptions/
    - How Debuggers Work w/ Breakpoints
    https://eli.thegreenplace.net/2011/01/27/how-debuggers-work-part-2-breakpoints
    - Ralf Brown's BIOS Interrupt List
    http://www.ctyme.com/rbrown.htm
    - NASM Tutorial
    http://cs.lmu.edu/~ray/notes/nasmtutorial/
    - Intel® 64 and IA-32 Architectures Software Developer Manuals
    https://software.intel.com/en-us/articles/intel-sdm
    - AMD64 Architecture Programmer's Manual
    https://www.amd.com/system/files/TechDocs/24594.pdf
  27. mikesmullin revised this gist Nov 13, 2018. 1 changed file with 3 additions and 5 deletions.
    8 changes: 3 additions & 5 deletions x86-assembly-notes.md
    @@ -900,11 +900,9 @@ Here is some useful trivia about that:
    Though a determined hacker could successfully guess them by looking at example
    code which uses the `.dll`, or via fuzz testing.
    - **Decorated names** or **mangled names** are a symbol naming convention used in the
    `.obj` files. They are a series of ASCII prefix and suffixes which communicate
    additional detail about the function in order to reduce the likelihood that
    a function by the same name but different overloaded signature will not be
    clobbered when merged into the same flat `.obj` table format. The additional data
    mangled into the name includes:
    `.obj` files. They are a series of ASCII prefix and suffixes which guarantee that
    each function is named uniquely when merged into the same flat `.obj` table format.
    The additional data mangled into the name includes:
    - The **function name**.
    - The **class name** that the function is a member of, if it is a member function.
    This may include the class that encloses the class that contains the function, and so on.
  28. mikesmullin revised this gist Nov 13, 2018. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion x86-assembly-notes.md
    @@ -900,7 +900,7 @@ Here is some useful trivia about that:
    Though a determined hacker could successfully guess them by looking at example
    code which uses the `.dll`, or via fuzz testing.
    - **Decorated names** or **mangled names** are a symbol naming convention used in the
    `.obj` files. they are a series of ASCII prefix and suffixes which communicate
    `.obj` files. They are a series of ASCII prefix and suffixes which communicate
    additional detail about the function in order to reduce the likelihood that
    a function by the same name but different overloaded signature will not be
    clobbered when merged into the same flat `.obj` table format. The additional data
  29. mikesmullin revised this gist Nov 13, 2018. 1 changed file with 4 additions and 0 deletions.
    4 changes: 4 additions & 0 deletions x86-assembly-notes.md
    @@ -861,6 +861,10 @@ References:
    http://wwwcdf.pd.infn.it/localdoc/gdbint.pdf
    - Windbg Commands
    http://windbg.info/doc/1-common-cmds.html
    - Hex-Rays Interactive Disassembler (IDA); most professional, but expensive
    https://www.hex-rays.com/products/ida/index.shtml
    - Binary Ninja; less featureful but cheap, modern interactive disassembler
    https://binary.ninja/

    ---

  30. mikesmullin revised this gist Nov 13, 2018. 1 changed file with 3 additions and 3 deletions.
    6 changes: 3 additions & 3 deletions x86-assembly-notes.md
    @@ -882,15 +882,15 @@ Here is some useful trivia about that:
    they are just headers with pointers to address offsets in a `.dll` which must
    match the exact release version and compiler used.
    - `.lib` files may only be used at compile time to build statically linked binaries.
    - `.dll` files are intended to be used only at runtime, as dynamically linked binaries,
    - `.dll` files are intended to be used only at runtime, as dynamically linked binaries.
    - Technically `.dll` files contain enough information that a reverse engineer could
    statically link them without a `.lib`, if they wanted to.
    - If you only have a `.dll`, you may be missing the compile-time
    constants passed as function arguments. These are typically shared in the form of
    a C header (`*.h`) file, as part of an SDK (e.g,
    [windows sdk](https://docs.microsoft.com/en-us/windows/desktop/winmsg/wm-destroy)
    , [opengl sdk](https://www.khronos.org/registry/OpenGL/api/GLES2/gl2.h)
    ), if the developer wants you to have them.
    , [opengl sdk](https://www.khronos.org/registry/OpenGL/api/GLES2/gl2.h)),
    if the developer wants you to have them.
    The other thing you may not have is the documentation about what inputs are valid,
    when, and what effect they have on the `.dll` functions.
    Though a determined hacker could successfully guess them by looking at example