Skip to content

Instantly share code, notes, and snippets.

@djg
Last active March 21, 2025 08:41
Show Gist options
  • Save djg/dcba29cf3c6e0696e493c146dcb3f08c to your computer and use it in GitHub Desktop.
Save djg/dcba29cf3c6e0696e493c146dcb3f08c to your computer and use it in GitHub Desktop.

Revisions

  1. djg revised this gist Aug 24, 2017. 1 changed file with 5 additions and 0 deletions.
    5 changes: 5 additions & 0 deletions reading-list.md
    Original file line number Diff line number Diff line change
    @@ -46,3 +46,8 @@ Let's start meta:
    29. Dijkstra, Lamport et. al-[On-the-fly Garbage Collection: an Exercise in Cooperation](http://lamport.azurewebsites.net/pubs/garbage.pdf) (1978) … introducing tri-color marking. No matter where you stand on GC, this is well worth reading.
    30. Lamport-[Multiple Byte Processing with Full-Word Instructions](http://lamport.azurewebsites.net/pubs/multiple-byte.pdf) (1975) … I thought I wouldn't need to do this anymore when we got byte-wise SIMD instrs on CPUs, and then CUDA came along and it was suddenly profitable on GPUs.
    31. Bientinesi, van de Geijn-[Formal Correctness and Stability of Dense Linear Algebra Algorithms](http://www.cs.utexas.edu/users/flame/pubs/IMACS05.pdf) (2005) … Basically an algorithm(!) for deriving dense LA algorithms. Yields direct+blocked variants, correctness proof, stability analysis.

    Ethics and Computer Science

    Rogaway-[The Moral Character of Cryptographic Work](http://web.cs.ucdavis.edu/~rogaway/papers/moral-fn.pdf) (2015)
    Ceglowski-[Haunted By Data](http://idlewords.com/talks/haunted_by_data.htm) (2015)
  2. djg revised this gist Aug 14, 2017. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion reading-list.md
    Original file line number Diff line number Diff line change
    @@ -1,4 +1,5 @@
    [Summary of Papers 1-10](https://fgiesen.wordpress.com/2017/08/12/papers-i-like-part-1/)
    [Papers I like Pt. 1](https://fgiesen.wordpress.com/2017/08/12/papers-i-like-part-1/)
    [Papers I like Pt. 2](https://fgiesen.wordpress.com/2017/08/14/papers-i-like-part-2/)

    Let's start meta:
    1. Lamport - [State the Problem Before Describing the Solution](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/12/State-the-Problem-Before-Describing-the-Solution.pdf) (1978).
  3. djg revised this gist Aug 13, 2017. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions reading-list.md
    Original file line number Diff line number Diff line change
    @@ -1,3 +1,5 @@
    [Summary of Papers 1-10](https://fgiesen.wordpress.com/2017/08/12/papers-i-like-part-1/)

    Let's start meta:
    1. Lamport - [State the Problem Before Describing the Solution](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/12/State-the-Problem-Before-Describing-the-Solution.pdf) (1978).
    … 1-page memo. Read it.
  4. djg revised this gist Aug 11, 2017. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion reading-list.md
    Original file line number Diff line number Diff line change
    @@ -41,4 +41,5 @@ Let's start meta:
    27. Qureshi, Jaleel et al. - [Adaptive Insertion Policies for High Performance Caching](http://web.ece.ucdavis.edu/~anhttran/files/download/caching/ISCA-2007-Qureshi-SetDuelingControl.pdf) (2007) … this is a HW paper, but the design principles here have been a great inspiration for me in designing adaptive algs in SW.
    28. Thompson-[Reflections on Trusting Trust](https://www.cs.colorado.edu/~jrblack/class/csci6268/s14/p761-thompson.pdf) (1984 Turing Award lecture) … a classic. If you haven't read it, do so!
    29. Dijkstra, Lamport et. al-[On-the-fly Garbage Collection: an Exercise in Cooperation](http://lamport.azurewebsites.net/pubs/garbage.pdf) (1978) … introducing tri-color marking. No matter where you stand on GC, this is well worth reading.
    30. Lamport-[Multiple Byte Processing with Full-Word Instructions](http://lamport.azurewebsites.net/pubs/multiple-byte.pdf) (1975) … I thought I wouldn't need to do this anymore when we got byte-wise SIMD instrs on CPUs, and then CUDA came along and it was suddenly profitable on GPUs.
    30. Lamport-[Multiple Byte Processing with Full-Word Instructions](http://lamport.azurewebsites.net/pubs/multiple-byte.pdf) (1975) … I thought I wouldn't need to do this anymore when we got byte-wise SIMD instrs on CPUs, and then CUDA came along and it was suddenly profitable on GPUs.
    31. Bientinesi, van de Geijn-[Formal Correctness and Stability of Dense Linear Algebra Algorithms](http://www.cs.utexas.edu/users/flame/pubs/IMACS05.pdf) (2005) … Basically an algorithm(!) for deriving dense LA algorithms. Yields direct+blocked variants, correctness proof, stability analysis.
  5. djg revised this gist Aug 10, 2017. 1 changed file with 9 additions and 9 deletions.
    18 changes: 9 additions & 9 deletions reading-list.md
    Original file line number Diff line number Diff line change
    @@ -22,23 +22,23 @@ Let's start meta:
    … Introducing MinBLEPs for bandlimited synthesis. Such an awesome idea!
    10. Veach - [Robust Monte Carlo Methods for Light Transport Simulation](http://graphics.stanford.edu/papers/veach_thesis/) (PhD thesis, 1997)
    … wherein he invents a big chunk of modern light transport sim. A tour de force.
    11. Goto, van de Geijn-[Anatomy of high-performance matrix multiplication](http://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf) (2008) … - how a high-perf GEMM works.
    12. Chazelle-[Filtering search: a new approach to query-answering](https://www.cs.princeton.edu/~chazelle/pubs/FilteringSearch.pdf) (1986) … - balancing preprocessing and query time, as applied to several geometric search problems. Not exactly easy reading but worth your time!
    13. McIllroy-[A Killer Adversary for Quicksort](https://pdfs.semanticscholar.org/dd00/6b6383bd512c7645d8408fa6c97875c27318.pdf) (1999) … - how to do adversarial testing right. Worth studying!
    14. Knuth-[Structured Programming with go to Statements](https://sbel.wisc.edu/Courses/ME964/Literature/knuthProgramming1974.pdf) (1974) … - oft-quoted, rarely read. It may surprise you.
    11. Goto, van de Geijn-[Anatomy of high-performance matrix multiplication](http://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf) (2008) … how a high-perf GEMM works.
    12. Chazelle-[Filtering search: a new approach to query-answering](https://www.cs.princeton.edu/~chazelle/pubs/FilteringSearch.pdf) (1986) … balancing preprocessing and query time, as applied to several geometric search problems. Not exactly easy reading but worth your time!
    13. McIllroy-[A Killer Adversary for Quicksort](https://pdfs.semanticscholar.org/dd00/6b6383bd512c7645d8408fa6c97875c27318.pdf) (1999) … how to do adversarial testing right. Worth studying!
    14. Knuth-[Structured Programming with go to Statements](https://sbel.wisc.edu/Courses/ME964/Literature/knuthProgramming1974.pdf) (1974) … oft-quoted, rarely read. It may surprise you.
    15. Bryant-[Graph-Based Algorithms for Boolean Function Manipulation](http://mtv.ece.ucsb.edu/courses/ece156B_14/randy_obdd86.pdf) (1986) … - introducing ROBDDs and kickstarting a revolution in formal ASIC verification in the process.
    16. Donoghue et al.-[Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding](http://stanford.edu/~boyd/papers/scs.html) (2016) … a great example of good theory making for an elegant, practical algorithm.
    17. Braun et al.-[Simple and Efficient Construction of Static Single Assignment Form](http://pp.ipd.kit.edu/uploads/publikationen/braun13cc.pdf) (2013) … There are many SSA construction algorithms; this is the slickest one I know.
    18. Lamport, Palais-[On the Glitch Phenomenon](http://lamport.azurewebsites.net/pubs/pubs.html#glitch) (1976) … - arbitration is hard. Read Lamport's notes on it too!
    19. Curtsinger, Berger-[Stabilizer: Statistically sound performance evaluation](http://people.cs.umass.edu/~emery/pubs/stabilizer-asplos13.pdf) (2013) - … on how the ubiquity of address-indexed caches distorts program performance measurements, and what to do about it.
    20. Tomasulo-[An efficient algorihm for exploiting multiple arithmetic units](http://www.ece.umd.edu/class/enee646.F2014/Tomasulo.PDF) (1967) … - to my knowledged, the first Out of Order execution implementation that actually shipped - and applied to a small enough problem that it's easy to understand.
    18. Lamport, Palais-[On the Glitch Phenomenon](http://lamport.azurewebsites.net/pubs/pubs.html#glitch) (1976) … arbitration is hard. Read Lamport's notes on it too!
    19. Curtsinger, Berger-[Stabilizer: Statistically sound performance evaluation](http://people.cs.umass.edu/~emery/pubs/stabilizer-asplos13.pdf) (2013) … on how the ubiquity of address-indexed caches distorts program performance measurements, and what to do about it.
    20. Tomasulo-[An efficient algorihm for exploiting multiple arithmetic units](http://www.ece.umd.edu/class/enee646.F2014/Tomasulo.PDF) (1967) … to my knowledged, the first Out of Order execution implementation that actually shipped - and applied to a small enough problem that it's easy to understand.
    21. Wilson, Johnstone, Neely, Boles - [Dynamic storage allocation: A survey and critical review](http://www-2.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213-f98/doc/dsa.pdf) (1995) … Knuth got this one wrong.
    22. Meyer, Tischer-[GLICBAWLS](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131.4042&rep=rep1&type=pdf) (2001) … Two Aussies make a dirty joke for an IOCCC entry, accidentally develop a fairly competitive lossless image coder in the process. It happens.
    23. Hutton et al.-[Improving FPGA Performance and Area Using an Adaptive Logic Module](http://www2.engr.arizona.edu/~ece506/readings/project-reading/6-cad/altera-alm.pdf) (2004) … Mainly geekery for me; feel free to skip this one if you're not interested in FPGA arch, I just thought it was really cool when I first saw it! (2010 or so)
    24. O'Neill-[PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation](http://www.pcg-random.org/pdf/hmc-cs-2014-0905.pdf) (2014) … Introduces a new, pretty cool RNG and makes for a great introduction to the topic in general.
    25. Cook, Podelski, Rybalchenko-[Termination Proofs for Systems Code](http://homedirs.ccs.neu.edu/wahl/Teaching/Software-Model-Checking/2016-Spring/Papers/termination.pdf) … The Halting Problem: SOLVED!* *for a limited but very practically relevant class of programs; e.g. device drivers.
    26. O'Neill - [The Genuine Sieve of Eratosthenes](https://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf) (2009) … The standard example implementation as a pure functional program is not, in fact, the same algorithm, and it's very instructive to see how it fails.
    27. Qureshi, Jaleel et al. - [Adaptive Insertion Policies for High Performance Caching](http://web.ece.ucdavis.edu/~anhttran/files/download/caching/ISCA-2007-Qureshi-SetDuelingControl.pdf) (2007) … - this is a HW paper, but the design principles here have been a great inspiration for me in designing adaptive algs in SW.
    27. Qureshi, Jaleel et al. - [Adaptive Insertion Policies for High Performance Caching](http://web.ece.ucdavis.edu/~anhttran/files/download/caching/ISCA-2007-Qureshi-SetDuelingControl.pdf) (2007) … this is a HW paper, but the design principles here have been a great inspiration for me in designing adaptive algs in SW.
    28. Thompson-[Reflections on Trusting Trust](https://www.cs.colorado.edu/~jrblack/class/csci6268/s14/p761-thompson.pdf) (1984 Turing Award lecture) … a classic. If you haven't read it, do so!
    29. Dijkstra, Lamport et. al-[On-the-fly Garbage Collection: an Exercise in Cooperation](http://lamport.azurewebsites.net/pubs/garbage.pdf) (1978) … introducing
    29. Dijkstra, Lamport et. al-[On-the-fly Garbage Collection: an Exercise in Cooperation](http://lamport.azurewebsites.net/pubs/garbage.pdf) (1978) … introducing tri-color marking. No matter where you stand on GC, this is well worth reading.
    30. Lamport-[Multiple Byte Processing with Full-Word Instructions](http://lamport.azurewebsites.net/pubs/multiple-byte.pdf) (1975) … I thought I wouldn't need to do this anymore when we got byte-wise SIMD instrs on CPUs, and then CUDA came along and it was suddenly profitable on GPUs.
  6. djg revised this gist Aug 10, 2017. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions reading-list.md
    Original file line number Diff line number Diff line change
    @@ -8,9 +8,9 @@ Let's start meta:
    4. Moffat, Turpin - [On the Implementation of Minimum Redundancy Prefix Codes](https://pdfs.semanticscholar.org/bda3/442cc6b1d10e4b36b574af0a34a668492230.pdf) (1997)
    … Much has been written about Huffman coding, a lot of it wrong. Almost everything worth knowing is in this (short!) paper.
    5.
    1. Dybvig, Hieb, Butler - [Destination-Driven Code Generation](https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf)_(1990)
    1. Dybvig, Hieb, Butler - [Destination-Driven Code Generation](https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf) (1990)
    … A simple, beautiful way to improve code gen forfast one-pass compilers.
    2. Milliken - [One-pass Code Generation in V8"](http://cs.au.dk/~mis/dOvs/slides/46b-codegeneration-in-V8.pdf) (year?)
    2. Milliken - [One-pass Code Generation in V8](http://cs.au.dk/~mis/dOvs/slides/46b-codegeneration-in-V8.pdf) (year?)
    … makes a great companion piece.
    6. Valmari - [Fast brief practical DFA minimization](http://www.cs.cmu.edu/~./cdm/pdf/Valmari12.pdf) (2012)
    … Takes a classic algorithm by Hopcroft that textbooks struggled to explain (or mostly omitted entirely) for decades, and simplifies it down to a (not particularly technical) 5-page paper.
  7. djg revised this gist Aug 10, 2017. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion reading-list.md
    Original file line number Diff line number Diff line change
    @@ -33,7 +33,7 @@ Let's start meta:
    19. Curtsinger, Berger-[Stabilizer: Statistically sound performance evaluation](http://people.cs.umass.edu/~emery/pubs/stabilizer-asplos13.pdf) (2013) - … on how the ubiquity of address-indexed caches distorts program performance measurements, and what to do about it.
    20. Tomasulo-[An efficient algorihm for exploiting multiple arithmetic units](http://www.ece.umd.edu/class/enee646.F2014/Tomasulo.PDF) (1967) … - to my knowledged, the first Out of Order execution implementation that actually shipped - and applied to a small enough problem that it's easy to understand.
    21. Wilson, Johnstone, Neely, Boles - [Dynamic storage allocation: A survey and critical review](http://www-2.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213-f98/doc/dsa.pdf) (1995) … Knuth got this one wrong.
    22. Meyer, Tischer-[GLICBAWLS](http://www.academia.edu/download/40157928/Glicbawls_-_Grey_Level_Image_Compression20151118-23218-qu4fld.pdf) (2001) … Two Aussies make a dirty joke for an IOCCC entry, accidentally develop a fairly competitive lossless image coder in the process. It happens.
    22. Meyer, Tischer-[GLICBAWLS](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131.4042&rep=rep1&type=pdf) (2001) … Two Aussies make a dirty joke for an IOCCC entry, accidentally develop a fairly competitive lossless image coder in the process. It happens.
    23. Hutton et al.-[Improving FPGA Performance and Area Using an Adaptive Logic Module](http://www2.engr.arizona.edu/~ece506/readings/project-reading/6-cad/altera-alm.pdf) (2004) … Mainly geekery for me; feel free to skip this one if you're not interested in FPGA arch, I just thought it was really cool when I first saw it! (2010 or so)
    24. O'Neill-[PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation](http://www.pcg-random.org/pdf/hmc-cs-2014-0905.pdf) (2014) … Introduces a new, pretty cool RNG and makes for a great introduction to the topic in general.
    25. Cook, Podelski, Rybalchenko-[Termination Proofs for Systems Code](http://homedirs.ccs.neu.edu/wahl/Teaching/Software-Model-Checking/2016-Spring/Papers/termination.pdf) … The Halting Problem: SOLVED!* *for a limited but very practically relevant class of programs; e.g. device drivers.
  8. djg revised this gist Aug 10, 2017. 1 changed file with 4 additions and 4 deletions.
    8 changes: 4 additions & 4 deletions reading-list.md
    Original file line number Diff line number Diff line change
    @@ -8,10 +8,10 @@ Let's start meta:
    4. Moffat, Turpin - [On the Implementation of Minimum Redundancy Prefix Codes](https://pdfs.semanticscholar.org/bda3/442cc6b1d10e4b36b574af0a34a668492230.pdf) (1997)
    … Much has been written about Huffman coding, a lot of it wrong. Almost everything worth knowing is in this (short!) paper.
    5.
    a.Dybvig, Hieb, Butler - [Destination-Driven Code Generation](https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf)_(1990)
    … A simple, beautiful way to improve code gen forfast one-pass compilers.
    b. Milliken - [One-pass Code Generation in V8"](http://cs.au.dk/~mis/dOvs/slides/46b-codegeneration-in-V8.pdf) (year?)
    … makes a great companion piece.
    1. Dybvig, Hieb, Butler - [Destination-Driven Code Generation](https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf)_(1990)
    … A simple, beautiful way to improve code gen forfast one-pass compilers.
    2. Milliken - [One-pass Code Generation in V8"](http://cs.au.dk/~mis/dOvs/slides/46b-codegeneration-in-V8.pdf) (year?)
    … makes a great companion piece.
    6. Valmari - [Fast brief practical DFA minimization](http://www.cs.cmu.edu/~./cdm/pdf/Valmari12.pdf) (2012)
    … Takes a classic algorithm by Hopcroft that textbooks struggled to explain (or mostly omitted entirely) for decades, and simplifies it down to a (not particularly technical) 5-page paper.
    7. Sarnak, Tarjan - [Planar Point Location Using Persistent Search Trees](http://users.cs.fiu.edu/~giri/teach/6405/s04/SarnakTarjanCACM86.pdf) (1986)
  9. djg revised this gist Aug 10, 2017. 1 changed file with 3 additions and 2 deletions.
    5 changes: 3 additions & 2 deletions reading-list.md
    Original file line number Diff line number Diff line change
    @@ -7,9 +7,10 @@ Let's start meta:
    … 4 pages that *anyone* working on/with complex systems should read.
    4. Moffat, Turpin - [On the Implementation of Minimum Redundancy Prefix Codes](https://pdfs.semanticscholar.org/bda3/442cc6b1d10e4b36b574af0a34a668492230.pdf) (1997)
    … Much has been written about Huffman coding, a lot of it wrong. Almost everything worth knowing is in this (short!) paper.
    5a. Dybvig, Hieb, Butler - [Destination-Driven Code Generation](https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf)_(1990)
    5.
    a.Dybvig, Hieb, Butler - [Destination-Driven Code Generation](https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf)_(1990)
    … A simple, beautiful way to improve code gen forfast one-pass compilers.
    5b. Milliken - [One-pass Code Generation in V8"](http://cs.au.dk/~mis/dOvs/slides/46b-codegeneration-in-V8.pdf) (year?)
    b. Milliken - [One-pass Code Generation in V8"](http://cs.au.dk/~mis/dOvs/slides/46b-codegeneration-in-V8.pdf) (year?)
    … makes a great companion piece.
    6. Valmari - [Fast brief practical DFA minimization](http://www.cs.cmu.edu/~./cdm/pdf/Valmari12.pdf) (2012)
    … Takes a classic algorithm by Hopcroft that textbooks struggled to explain (or mostly omitted entirely) for decades, and simplifies it down to a (not particularly technical) 5-page paper.
  10. djg revised this gist Aug 10, 2017. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion reading-list.md
    Original file line number Diff line number Diff line change
    @@ -1,7 +1,7 @@
    Let's start meta:
    1. Lamport - [State the Problem Before Describing the Solution](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/12/State-the-Problem-Before-Describing-the-Solution.pdf) (1978).
    … 1-page memo. Read it.
    2. Herlihy - [Wait-free synchronization](http://www.diku.dk/OLD/undervisning/2005f/dat-os/skrifter/lockfree.pdf) (year?)
    2. Herlihy - [Wait-free synchronization](http://www.diku.dk/OLD/undervisning/2005f/dat-os/skrifter/lockfree.pdf) (1991)
    … Truly seminal. Lucid + enough good ideas for 4 papers easily.
    3. Cook - [How complex systems fail](https://www.researchgate.net/profile/Richard_Cook3/publication/228797158_How_complex_systems_fail/links/0c96053410db96a89c000000.pdf) (1998)
    … 4 pages that *anyone* working on/with complex systems should read.
  11. djg revised this gist Aug 10, 2017. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion reading-list.md
    Original file line number Diff line number Diff line change
    @@ -1,5 +1,5 @@
    Let's start meta:
    1. Lamport - [State the Problem Before Describing the Solution](https://www.microsoft.com/en-us/research/publication/state-problem-describing-solution/) (1978).
    1. Lamport - [State the Problem Before Describing the Solution](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/12/State-the-Problem-Before-Describing-the-Solution.pdf) (1978).
    … 1-page memo. Read it.
    2. Herlihy - [Wait-free synchronization](http://www.diku.dk/OLD/undervisning/2005f/dat-os/skrifter/lockfree.pdf) (year?)
    … Truly seminal. Lucid + enough good ideas for 4 papers easily.
  12. djg revised this gist Aug 10, 2017. 1 changed file with 3 additions and 18 deletions.
    21 changes: 3 additions & 18 deletions reading-list.md
    Original file line number Diff line number Diff line change
    @@ -1,41 +1,26 @@
    Let's start meta:
    1. Lamport - [State the Problem Before Describing the Solution](https://www.microsoft.com/en-us/research/publication/state-problem-describing-solution/) (1978).
    … 1-page memo. Read it.

    … 1-page memo. Read it.
    2. Herlihy - [Wait-free synchronization](http://www.diku.dk/OLD/undervisning/2005f/dat-os/skrifter/lockfree.pdf) (year?)
    … Truly seminal. Lucid + enough good ideas for 4 papers easily.

    … Truly seminal. Lucid + enough good ideas for 4 papers easily.
    3. Cook - [How complex systems fail](https://www.researchgate.net/profile/Richard_Cook3/publication/228797158_How_complex_systems_fail/links/0c96053410db96a89c000000.pdf) (1998)
    … 4 pages that *anyone* working on/with complex systems should read.

    4. Moffat, Turpin - [On the Implementation of Minimum Redundancy Prefix Codes](https://pdfs.semanticscholar.org/bda3/442cc6b1d10e4b36b574af0a34a668492230.pdf) (1997)
    … Much has been written about Huffman coding, a lot of it wrong. Almost everything worth knowing is in this (short!) paper.

    5a. Dybvig, Hieb, Butler - [Destination-Driven Code Generation](https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf)_(1990)
    … A simple, beautiful way to improve code gen forfast one-pass compilers.

    5b. Milliken - [One-pass Code Generation in V8"](http://cs.au.dk/~mis/dOvs/slides/46b-codegeneration-in-V8.pdf) (year?)
    … makes a great companion piece.

    6. Valmari - [Fast brief practical DFA minimization](http://www.cs.cmu.edu/~./cdm/pdf/Valmari12.pdf) (2012)
    … Takes a classic algorithm by Hopcroft that textbooks struggled to explain (or mostly omitted entirely) for decades, and simplifies it down to a (not particularly technical) 5-page paper.

    7. Sarnak, Tarjan - [Planar Point Location Using Persistent Search Trees](http://users.cs.fiu.edu/~giri/teach/6405/s04/SarnakTarjanCACM86.pdf) (1986)
    … an interesting problem in its own right, and this paper solves it by introducing persistent search trees made efficient using a *beautiful* idea (section 3!).

    8. Porter, Duff - [Compositing Digital Images](https://keithp.com/~keithp/porterduff/p253-porter.pdf) (1984)
    … Obligatory mention. Every half-year or so, another blog post crosses my timeline where someone excitedly sings the praises of premultiplied alpha. We need to get better about teaching this.

    9. Brandt - [Hard Sync Without Aliasing](http://elthariel.free.fr/ebooks/icmc01-hardsync.pdf) (2001)
    … Introducing MinBLEPs for bandlimited synthesis. Such an awesome idea!

    10. Veach - [Robust Monte Carlo Methods for Light Transport Simulation](http://graphics.stanford.edu/papers/veach_thesis/) (PhD thesis, 1997)
    … wherein he invents a big chunk of modern light transport sim. A tour de force.

    That's 10 for today, will continue tomorrow. Note most of these papers are fairly short! These are reading recommendations, and I'm trying not to waste your time. :)

    Ah, fuck it. I got plenty more, so here goes another 10.

    11. Goto, van de Geijn-[Anatomy of high-performance matrix multiplication](http://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf) (2008) … - how a high-perf GEMM works.
    12. Chazelle-[Filtering search: a new approach to query-answering](https://www.cs.princeton.edu/~chazelle/pubs/FilteringSearch.pdf) (1986) … - balancing preprocessing and query time, as applied to several geometric search problems. Not exactly easy reading but worth your time!
    13. McIllroy-[A Killer Adversary for Quicksort](https://pdfs.semanticscholar.org/dd00/6b6383bd512c7645d8408fa6c97875c27318.pdf) (1999) … - how to do adversarial testing right. Worth studying!
    @@ -55,4 +40,4 @@ Ah, fuck it. I got plenty more, so here goes another 10.
    27. Qureshi, Jaleel et al. - [Adaptive Insertion Policies for High Performance Caching](http://web.ece.ucdavis.edu/~anhttran/files/download/caching/ISCA-2007-Qureshi-SetDuelingControl.pdf) (2007) … - this is a HW paper, but the design principles here have been a great inspiration for me in designing adaptive algs in SW.
    28. Thompson-[Reflections on Trusting Trust](https://www.cs.colorado.edu/~jrblack/class/csci6268/s14/p761-thompson.pdf) (1984 Turing Award lecture) … a classic. If you haven't read it, do so!
    29. Dijkstra, Lamport et. al-[On-the-fly Garbage Collection: an Exercise in Cooperation](http://lamport.azurewebsites.net/pubs/garbage.pdf) (1978) … introducing
    30. Lamport-[Multiple Byte Processing with Full-Word Instructions](http://lamport.azurewebsites.net/pubs/multiple-byte.pdf) (1975) … I thought I wouldn't need to do this anymore when we got byte-wise SIMD instrs on CPUs, and then CUDA came along and it was suddenly profitable on GPUs.
    30. Lamport-[Multiple Byte Processing with Full-Word Instructions](http://lamport.azurewebsites.net/pubs/multiple-byte.pdf) (1975) … I thought I wouldn't need to do this anymore when we got byte-wise SIMD instrs on CPUs, and then CUDA came along and it was suddenly profitable on GPUs.
  13. djg revised this gist Aug 10, 2017. 1 changed file with 26 additions and 17 deletions.
    43 changes: 26 additions & 17 deletions reading-list.md
    Original file line number Diff line number Diff line change
    @@ -1,29 +1,38 @@
    Let's start meta:
    1. Lamport-[State the Problem Before Describing the Solution](https://www.microsoft.com/en-us/research/publication/state-problem-describing-solution/) (1978).
    1. Lamport - [State the Problem Before Describing the Solution](https://www.microsoft.com/en-us/research/publication/state-problem-describing-solution/) (1978).
    … 1-page memo. Read it.

    2. Herlihy-[Wait-free synchronization](http://www.diku.dk/OLD/undervisning/2005f/dat-os/skrifter/lockfree.pdf)
    . Truly seminal. Lucid + enough good ideas for 4 papers easily.
    2. Herlihy - [Wait-free synchronization](http://www.diku.dk/OLD/undervisning/2005f/dat-os/skrifter/lockfree.pdf) (year?)
    … Truly seminal. Lucid + enough good ideas for 4 papers easily.

    3. Cook-[How complex systems fail](https://www.researchgate.net/profile/Richard_Cook3/publication/228797158_How_complex_systems_fail/links/0c96053410db96a89c000000.pdf) (1998)
    … 4 pages that *anyone* working on/with complex systems should read.
    3. Cook - [How complex systems fail](https://www.researchgate.net/profile/Richard_Cook3/publication/228797158_How_complex_systems_fail/links/0c96053410db96a89c000000.pdf) (1998)
    … 4 pages that *anyone* working on/with complex systems should read.

    4. Moffat, Turpin-[On the Implementation of Minimum Redundancy Prefix Codes](https://pdfs.semanticscholar.org/bda3/442cc6b1d10e4b36b574af0a34a668492230.pdf) (1997)
    … Much has been written about Huffman coding, a lot of it wrong. Almost everything worth knowing is in this (short!) paper.
    4. Moffat, Turpin - [On the Implementation of Minimum Redundancy Prefix Codes](https://pdfs.semanticscholar.org/bda3/442cc6b1d10e4b36b574af0a34a668492230.pdf) (1997)
    … Much has been written about Huffman coding, a lot of it wrong. Almost everything worth knowing is in this (short!) paper.

    5a. Dybvig, Hieb, Butler-[Destination-Driven Code Generation](https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf)_(1990)
    … A simple, beautiful way to improve code gen forfast one-pass compilers.
    5a. Dybvig, Hieb, Butler - [Destination-Driven Code Generation](https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf)_(1990)
    … A simple, beautiful way to improve code gen forfast one-pass compilers.

    5b. Milliken-[One-pass Code Generation in V8"](http://cs.au.dk/~mis/dOvs/slides/46b-codegeneration-in-V8.pdf) (year?)
    … makes a great companion piece.
    5b. Milliken - [One-pass Code Generation in V8"](http://cs.au.dk/~mis/dOvs/slides/46b-codegeneration-in-V8.pdf) (year?)
    … makes a great companion piece.

    6. Valmari-[Fast brief practical DFA minimization](http://www.cs.cmu.edu/~./cdm/pdf/Valmari12.pdf) (2012) … Takes a classic algorithm by Hopcroft that textbooks struggled to explain (or mostly omitted entirely) for decades, and simplifies it down to a (not particularly technical) 5-page paper.
    7. Sarnak, Tarjan-[Planar Point Location Using Persistent Search Trees](http://users.cs.fiu.edu/~giri/teach/6405/s04/SarnakTarjanCACM86.pdf) (1986) … an interesting problem in its own right, and this paper solves it by introducing persistent search trees made efficient using a *beautiful* idea (section 3!).
    8. Porter, Duff-[Compositing Digital Images](https://keithp.com/~keithp/porterduff/p253-porter.pdf) (1984) … Obligatory mention. Every half-year or so, another blog post crosses my timeline where someone excitedly sings the praises of premultiplied alpha. We need to get better about teaching this.
    9. Brandt-[Hard Sync Without Aliasing](http://elthariel.free.fr/ebooks/icmc01-hardsync.pdf) (2001) … Introducing MinBLEPs for bandlimited synthesis. Such an awesome idea!
    10. Veach-[Robust Monte Carlo Methods for Light Transport Simulation](http://graphics.stanford.edu/papers/veach_thesis/) (PhD thesis, 1997) … wherein he invents a big chunk of modern light transport sim. A tour de force.
    6. Valmari - [Fast brief practical DFA minimization](http://www.cs.cmu.edu/~./cdm/pdf/Valmari12.pdf) (2012)
    … Takes a classic algorithm by Hopcroft that textbooks struggled to explain (or mostly omitted entirely) for decades, and simplifies it down to a (not particularly technical) 5-page paper.

    - that's 10 for today, will continue tomorrow. Note most of these papers are fairly short! These are reading recommendations, and I'm trying not to waste your time. :)
    7. Sarnak, Tarjan - [Planar Point Location Using Persistent Search Trees](http://users.cs.fiu.edu/~giri/teach/6405/s04/SarnakTarjanCACM86.pdf) (1986)
    … an interesting problem in its own right, and this paper solves it by introducing persistent search trees made efficient using a *beautiful* idea (section 3!).

    8. Porter, Duff - [Compositing Digital Images](https://keithp.com/~keithp/porterduff/p253-porter.pdf) (1984)
    … Obligatory mention. Every half-year or so, another blog post crosses my timeline where someone excitedly sings the praises of premultiplied alpha. We need to get better about teaching this.

    9. Brandt - [Hard Sync Without Aliasing](http://elthariel.free.fr/ebooks/icmc01-hardsync.pdf) (2001)
    … Introducing MinBLEPs for bandlimited synthesis. Such an awesome idea!

    10. Veach - [Robust Monte Carlo Methods for Light Transport Simulation](http://graphics.stanford.edu/papers/veach_thesis/) (PhD thesis, 1997)
    … wherein he invents a big chunk of modern light transport sim. A tour de force.

    That's 10 for today, will continue tomorrow. Note most of these papers are fairly short! These are reading recommendations, and I'm trying not to waste your time. :)

    Ah, fuck it. I got plenty more, so here goes another 10.

  14. djg revised this gist Aug 10, 2017. 1 changed file with 14 additions and 4 deletions.
    18 changes: 14 additions & 4 deletions reading-list.md
    Original file line number Diff line number Diff line change
    @@ -1,12 +1,22 @@
    Let's start meta:
    1. Lamport-[State the Problem Before Describing the Solution](https://www.microsoft.com/en-us/research/publication/state-problem-describing-solution/) (1978).
    … 1-page memo. Read it.

    2. Herlihy-[Wait-free synchronization](http://www.diku.dk/OLD/undervisning/2005f/dat-os/skrifter/lockfree.pdf)
    …. Truly seminal. Lucid + enough good ideas for 4 papers easily.
    3. Cook-[How complex systems fail](https://www.researchgate.net/profile/Richard_Cook3/publication/228797158_How_complex_systems_fail/links/0c96053410db96a89c000000.pdf) (1998) … 4 pages that *anyone* working on/with complex systems should read.
    4. Moffat, Turpin-[On the Implementation of Minimum Redundancy Prefix Codes](https://pdfs.semanticscholar.org/bda3/442cc6b1d10e4b36b574af0a34a668492230.pdf) (1997) … Much has been written about Huffman coding, a lot of it wrong. Almost everything worth knowing is in this (short!) paper.
    5a. Dybvig, Hieb, Butler-[Destination-Driven Code Generation](https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf)_(1990) … A simple, beautiful way to improve code gen forfast one-pass compilers.
    5b. Milliken-[One-pass Code Generation in V8"](http://cs.au.dk/~mis/dOvs/slides/46b-codegeneration-in-V8.pdf) (year?) … makes a great companion piece.

    3. Cook-[How complex systems fail](https://www.researchgate.net/profile/Richard_Cook3/publication/228797158_How_complex_systems_fail/links/0c96053410db96a89c000000.pdf) (1998)
    … 4 pages that *anyone* working on/with complex systems should read.

    4. Moffat, Turpin-[On the Implementation of Minimum Redundancy Prefix Codes](https://pdfs.semanticscholar.org/bda3/442cc6b1d10e4b36b574af0a34a668492230.pdf) (1997)
    … Much has been written about Huffman coding, a lot of it wrong. Almost everything worth knowing is in this (short!) paper.

    5a. Dybvig, Hieb, Butler-[Destination-Driven Code Generation](https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf)_(1990)
    … A simple, beautiful way to improve code gen forfast one-pass compilers.

    5b. Milliken-[One-pass Code Generation in V8"](http://cs.au.dk/~mis/dOvs/slides/46b-codegeneration-in-V8.pdf) (year?)
    … makes a great companion piece.

    6. Valmari-[Fast brief practical DFA minimization](http://www.cs.cmu.edu/~./cdm/pdf/Valmari12.pdf) (2012) … Takes a classic algorithm by Hopcroft that textbooks struggled to explain (or mostly omitted entirely) for decades, and simplifies it down to a (not particularly technical) 5-page paper.
    7. Sarnak, Tarjan-[Planar Point Location Using Persistent Search Trees](http://users.cs.fiu.edu/~giri/teach/6405/s04/SarnakTarjanCACM86.pdf) (1986) … an interesting problem in its own right, and this paper solves it by introducing persistent search trees made efficient using a *beautiful* idea (section 3!).
    8. Porter, Duff-[Compositing Digital Images](https://keithp.com/~keithp/porterduff/p253-porter.pdf) (1984) … Obligatory mention. Every half-year or so, another blog post crosses my timeline where someone excitedly sings the praises of premultiplied alpha. We need to get better about teaching this.
  15. djg revised this gist Aug 10, 2017. 1 changed file with 4 additions and 2 deletions.
    6 changes: 4 additions & 2 deletions reading-list.md
    Original file line number Diff line number Diff line change
    @@ -1,6 +1,8 @@
    Let's start meta:
    1. Lamport-[State the Problem Before Describing the Solution](https://www.microsoft.com/en-us/research/publication/state-problem-describing-solution/) (1978). … 1-page memo. Read it.
    2. Herlihy-[Wait-free synchronization](http://www.diku.dk/OLD/undervisning/2005f/dat-os/skrifter/lockfree.pdf) …. Truly seminal. Lucid + enough good ideas for 4 papers easily.
    1. Lamport-[State the Problem Before Describing the Solution](https://www.microsoft.com/en-us/research/publication/state-problem-describing-solution/) (1978).
    … 1-page memo. Read it.
    2. Herlihy-[Wait-free synchronization](http://www.diku.dk/OLD/undervisning/2005f/dat-os/skrifter/lockfree.pdf)
    …. Truly seminal. Lucid + enough good ideas for 4 papers easily.
    3. Cook-[How complex systems fail](https://www.researchgate.net/profile/Richard_Cook3/publication/228797158_How_complex_systems_fail/links/0c96053410db96a89c000000.pdf) (1998) … 4 pages that *anyone* working on/with complex systems should read.
    4. Moffat, Turpin-[On the Implementation of Minimum Redundancy Prefix Codes](https://pdfs.semanticscholar.org/bda3/442cc6b1d10e4b36b574af0a34a668492230.pdf) (1997) … Much has been written about Huffman coding, a lot of it wrong. Almost everything worth knowing is in this (short!) paper.
    5a. Dybvig, Hieb, Butler-[Destination-Driven Code Generation](https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf)_(1990) … A simple, beautiful way to improve code gen forfast one-pass compilers.
  16. djg revised this gist Aug 10, 2017. 1 changed file with 32 additions and 31 deletions.
    63 changes: 32 additions & 31 deletions reading-list.md
    Original file line number Diff line number Diff line change
    @@ -1,36 +1,37 @@
    1. Let's start meta: Lamport-State the Problem Before Describing the Solution (1978). https://www.microsoft.com/en-us/research/publication/state-problem-describing-solution/ … 1-page memo. Read it.
    2. Herlihy-"Wait-free synchronization" http://www.diku.dk/OLD/undervisning/2005f/dat-os/skrifter/lockfree.pdf …. Truly seminal. Lucid + enough good ideas for 4 papers easily.
    3. Cook-"How complex systems fail" (1998) https://www.researchgate.net/profile/Richard_Cook3/publication/228797158_How_complex_systems_fail/links/0c96053410db96a89c000000.pdf … 4 pages that *anyone* working on/with complex systems should read.
    4. Moffat, Turpin-"On the Implementation of Minimum Redundancy Prefix Codes" (1997) https://pdfs.semanticscholar.org/bda3/442cc6b1d10e4b36b574af0a34a668492230.pdf … Much has been written about Huffman coding, a lot of it wrong. Almost everything worth knowing is in this (short!) paper.
    5. Dybvig, Hieb, Butler-"Destination-Driven Code Generation" (1990) https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf … A simple, beautiful way to improve code gen forfast one-pass compilers.
    5b. Milliken-"One-pass Code Generation in V8" (year?) http://cs.au.dk/~mis/dOvs/slides/46b-codegeneration-in-V8.pdf … makes a great companion piece.
    6. Valmari-"Fast brief practical DFA minimization" (2012) http://www.cs.cmu.edu/~./cdm/pdf/Valmari12.pdf … Takes a classic algorithm by Hopcroft that textbooks struggled to explain (or mostly omitted entirely) for decades, and simplifies it down to a (not particularly technical) 5-page paper.
    7. Sarnak, Tarjan-"Planar Point Location Using Persistent Search Trees" (1986) http://users.cs.fiu.edu/~giri/teach/6405/s04/SarnakTarjanCACM86.pdf … an interesting problem in its own right, and this paper solves it by introducing persistent search trees made efficient using a *beautiful* idea (section 3!).
    8. Porter, Duff-"Compositing Digital Images" (1984) https://keithp.com/~keithp/porterduff/p253-porter.pdf … Obligatory mention. Every half-year or so, another blog post crosses my timeline where someone excitedly sings the praises of premultiplied alpha. We need to get better about teaching this.
    9. Brandt-"Hard Sync Without Aliasing" (2001) http://elthariel.free.fr/ebooks/icmc01-hardsync.pdf … Introducing MinBLEPs for bandlimited synthesis. Such an awesome idea!
    10. Veach-"Robust Monte Carlo Methods for Light Transport Simulation" (PhD thesis, 1997) http://graphics.stanford.edu/papers/veach_thesis/ … wherein he invents a big chunk of modern light transport sim. A tour de force.
    Let's start meta:
    1. Lamport-[State the Problem Before Describing the Solution](https://www.microsoft.com/en-us/research/publication/state-problem-describing-solution/) (1978). … 1-page memo. Read it.
    2. Herlihy-[Wait-free synchronization](http://www.diku.dk/OLD/undervisning/2005f/dat-os/skrifter/lockfree.pdf) …. Truly seminal. Lucid + enough good ideas for 4 papers easily.
    3. Cook-[How complex systems fail](https://www.researchgate.net/profile/Richard_Cook3/publication/228797158_How_complex_systems_fail/links/0c96053410db96a89c000000.pdf) (1998) … 4 pages that *anyone* working on/with complex systems should read.
    4. Moffat, Turpin-[On the Implementation of Minimum Redundancy Prefix Codes](https://pdfs.semanticscholar.org/bda3/442cc6b1d10e4b36b574af0a34a668492230.pdf) (1997) … Much has been written about Huffman coding, a lot of it wrong. Almost everything worth knowing is in this (short!) paper.
    5a. Dybvig, Hieb, Butler-[Destination-Driven Code Generation](https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf)_(1990) … A simple, beautiful way to improve code gen forfast one-pass compilers.
    5b. Milliken-[One-pass Code Generation in V8"](http://cs.au.dk/~mis/dOvs/slides/46b-codegeneration-in-V8.pdf) (year?) … makes a great companion piece.
    6. Valmari-[Fast brief practical DFA minimization](http://www.cs.cmu.edu/~./cdm/pdf/Valmari12.pdf) (2012) … Takes a classic algorithm by Hopcroft that textbooks struggled to explain (or mostly omitted entirely) for decades, and simplifies it down to a (not particularly technical) 5-page paper.
    7. Sarnak, Tarjan-[Planar Point Location Using Persistent Search Trees](http://users.cs.fiu.edu/~giri/teach/6405/s04/SarnakTarjanCACM86.pdf) (1986) … an interesting problem in its own right, and this paper solves it by introducing persistent search trees made efficient using a *beautiful* idea (section 3!).
    8. Porter, Duff-[Compositing Digital Images](https://keithp.com/~keithp/porterduff/p253-porter.pdf) (1984) … Obligatory mention. Every half-year or so, another blog post crosses my timeline where someone excitedly sings the praises of premultiplied alpha. We need to get better about teaching this.
    9. Brandt-[Hard Sync Without Aliasing](http://elthariel.free.fr/ebooks/icmc01-hardsync.pdf) (2001) … Introducing MinBLEPs for bandlimited synthesis. Such an awesome idea!
    10. Veach-[Robust Monte Carlo Methods for Light Transport Simulation](http://graphics.stanford.edu/papers/veach_thesis/) (PhD thesis, 1997) … wherein he invents a big chunk of modern light transport sim. A tour de force.

    - that's 10 for today, will continue tomorrow. Note most of these papers are fairly short! These are reading recommendations, and I'm trying not to waste your time. :)

    Ah, fuck it. I got plenty more, so here goes another 10.

    11. Goto, van de Geijn-"Anatomy of high-performance matrix multiplication" (2008) http://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf … - how a high-perf GEMM works.
    12. Chazelle-"Filtering search: a new approach to query-answering" (1986) https://www.cs.princeton.edu/~chazelle/pubs/FilteringSearch.pdf … - balancing preprocessing and query time, as applied to several geometric search problems. Not exactly easy reading but worth your time!
    13. McIllroy-"A Killer Adversary for Quicksort" (1999) https://pdfs.semanticscholar.org/dd00/6b6383bd512c7645d8408fa6c97875c27318.pdf … - how to do adversarial testing right. Worth studying!
    14. Knuth-"Structured Programming with go to Statements" (1974) https://sbel.wisc.edu/Courses/ME964/Literature/knuthProgramming1974.pdf … - oft-quoted, rarely read. It may surprise you.
    15. Bryant-"Graph-Based Algorithms for Boolean Function Manipulation" (1986) http://mtv.ece.ucsb.edu/courses/ece156B_14/randy_obdd86.pdf … - introducing ROBDDs and kickstarting a revolution in formal ASIC verification in the process.
    16. Donoghue et al.-"Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding" (2016) http://stanford.edu/~boyd/papers/scs.html … a great example of good theory making for an elegant, practical algorithm.
    17. Braun et al.-"Simple and Efficient Construction of Static Single Assignment Form" (2013) http://pp.ipd.kit.edu/uploads/publikationen/braun13cc.pdf … There are many SSA construction algorithms; this is the slickest one I know.
    18. Lamport, Palais-"On the Glitch Phenomenon" (1976) http://lamport.azurewebsites.net/pubs/pubs.html#glitch … - arbitration is hard. Read Lamport's notes on it too!
    19. Curtsinger, Berger-"Stabilizer: Statistically sound performance evaluation" (2013) http://people.cs.umass.edu/~emery/pubs/stabilizer-asplos13.pdf- … on how the ubiquity of address-indexed caches distorts program performance measurements, and what to do about it.
    20. Tomasulo-"An efficient algorihm for exploiting multiple arithmetic units" (1967) http://www.ece.umd.edu/class/enee646.F2014/Tomasulo.PDF … - to my knowledged, the first Out of Order execution implementation that actually shipped - and applied to a small enough problem that it's easy to understand.
    21. Wilson, Johnstone, Neely, Boles - "Dynamic storage allocation: A survey and critical review" (1995) http://www-2.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213-f98/doc/dsa.pdf … Knuth got this one wrong.
    22. Meyer, Tischer-"GLICBAWLS" (2001) http://www.academia.edu/download/40157928/Glicbawls_-_Grey_Level_Image_Compression20151118-23218-qu4fld.pdf … Two Aussies make a dirty joke for an IOCCC entry, accidentally develop a fairly competitive lossless image coder in the process. It happens.
    23. Hutton et al.-"Improving FPGA Performance and Area Using an Adaptive Logic Module" (2004) http://www2.engr.arizona.edu/~ece506/readings/project-reading/6-cad/altera-alm.pdf … Mainly geekery for me; feel free to skip this one if you're not interested in FPGA arch, I just thought it was really cool when I first saw it! (2010 or so)
    24. O'Neill-"PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation" (2014) http://www.pcg-random.org/pdf/hmc-cs-2014-0905.pdf … Introduces a new, pretty cool RNG and makes for a great introduction to the topic in general.
    25. Cook, Podelski, Rybalchenko-"Termination Proofs for Systems Code" http://homedirs.ccs.neu.edu/wahl/Teaching/Software-Model-Checking/2016-Spring/Papers/termination.pdf … The Halting Problem: SOLVED!* *for a limited but very practically relevant class of programs; e.g. device drivers.
    26. O'Neill - "The Genuine Sieve of Eratosthenes" (2009) https://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf … The standard example implementation as a pure functional program is not, in fact, the same algorithm, and it's very instructive to see how it fails.
    27. Qureshi, Jaleel et al. - "Adaptive Insertion Policies for High Performance Caching" (2007) http://web.ece.ucdavis.edu/~anhttran/files/download/caching/ISCA-2007-Qureshi-SetDuelingControl.pdf … - this is a HW paper, but the design principles here have been a great inspiration for me in designing adaptive algs in SW.
    28. Thompson-"Reflections on Trusting Trust" (1984 Turing Award lecture) https://www.cs.colorado.edu/~jrblack/class/csci6268/s14/p761-thompson.pdf … a classic. If you haven't read it, do so!
    29. Dijkstra, Lamport et. al-"On-the-fly Garbage Collection: an Exercise in Cooperation" (1978) http://lamport.azurewebsites.net/pubs/garbage.pdf … introducing
    30. Lamport-"Multiple Byte Processing with Full-Word Instructions" (1975) http://lamport.azurewebsites.net/pubs/multiple-byte.pdf … I thought I wouldn't need to do this anymore when we got byte-wise SIMD instrs on CPUs, and then CUDA came along and it was suddenly proftable on GPUs.
    11. Goto, van de Geijn-[Anatomy of high-performance matrix multiplication](http://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf) (2008) … - how a high-perf GEMM works.
    12. Chazelle-[Filtering search: a new approach to query-answering](https://www.cs.princeton.edu/~chazelle/pubs/FilteringSearch.pdf) (1986) … - balancing preprocessing and query time, as applied to several geometric search problems. Not exactly easy reading but worth your time!
    13. McIllroy-[A Killer Adversary for Quicksort](https://pdfs.semanticscholar.org/dd00/6b6383bd512c7645d8408fa6c97875c27318.pdf) (1999) … - how to do adversarial testing right. Worth studying!
    14. Knuth-[Structured Programming with go to Statements](https://sbel.wisc.edu/Courses/ME964/Literature/knuthProgramming1974.pdf) (1974) … - oft-quoted, rarely read. It may surprise you.
    15. Bryant-[Graph-Based Algorithms for Boolean Function Manipulation](http://mtv.ece.ucsb.edu/courses/ece156B_14/randy_obdd86.pdf) (1986) … - introducing ROBDDs and kickstarting a revolution in formal ASIC verification in the process.
    16. Donoghue et al.-[Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding](http://stanford.edu/~boyd/papers/scs.html) (2016) … a great example of good theory making for an elegant, practical algorithm.
    17. Braun et al.-[Simple and Efficient Construction of Static Single Assignment Form](http://pp.ipd.kit.edu/uploads/publikationen/braun13cc.pdf) (2013) … There are many SSA construction algorithms; this is the slickest one I know.
    18. Lamport, Palais-[On the Glitch Phenomenon](http://lamport.azurewebsites.net/pubs/pubs.html#glitch) (1976) … - arbitration is hard. Read Lamport's notes on it too!
    19. Curtsinger, Berger-[Stabilizer: Statistically sound performance evaluation](http://people.cs.umass.edu/~emery/pubs/stabilizer-asplos13.pdf) (2013) - … on how the ubiquity of address-indexed caches distorts program performance measurements, and what to do about it.
    20. Tomasulo-[An efficient algorihm for exploiting multiple arithmetic units](http://www.ece.umd.edu/class/enee646.F2014/Tomasulo.PDF) (1967) … - to my knowledged, the first Out of Order execution implementation that actually shipped - and applied to a small enough problem that it's easy to understand.
    21. Wilson, Johnstone, Neely, Boles - [Dynamic storage allocation: A survey and critical review](http://www-2.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213-f98/doc/dsa.pdf) (1995) … Knuth got this one wrong.
    22. Meyer, Tischer-[GLICBAWLS](http://www.academia.edu/download/40157928/Glicbawls_-_Grey_Level_Image_Compression20151118-23218-qu4fld.pdf) (2001) … Two Aussies make a dirty joke for an IOCCC entry, accidentally develop a fairly competitive lossless image coder in the process. It happens.
    23. Hutton et al.-[Improving FPGA Performance and Area Using an Adaptive Logic Module](http://www2.engr.arizona.edu/~ece506/readings/project-reading/6-cad/altera-alm.pdf) (2004) … Mainly geekery for me; feel free to skip this one if you're not interested in FPGA arch, I just thought it was really cool when I first saw it! (2010 or so)
    24. O'Neill-[PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation](http://www.pcg-random.org/pdf/hmc-cs-2014-0905.pdf) (2014) … Introduces a new, pretty cool RNG and makes for a great introduction to the topic in general.
    25. Cook, Podelski, Rybalchenko-[Termination Proofs for Systems Code](http://homedirs.ccs.neu.edu/wahl/Teaching/Software-Model-Checking/2016-Spring/Papers/termination.pdf) … The Halting Problem: SOLVED!* *for a limited but very practically relevant class of programs; e.g. device drivers.
    26. O'Neill - [The Genuine Sieve of Eratosthenes](https://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf) (2009) … The standard example implementation as a pure functional program is not, in fact, the same algorithm, and it's very instructive to see how it fails.
    27. Qureshi, Jaleel et al. - [Adaptive Insertion Policies for High Performance Caching](http://web.ece.ucdavis.edu/~anhttran/files/download/caching/ISCA-2007-Qureshi-SetDuelingControl.pdf) (2007) … - this is a HW paper, but the design principles here have been a great inspiration for me in designing adaptive algs in SW.
    28. Thompson-[Reflections on Trusting Trust](https://www.cs.colorado.edu/~jrblack/class/csci6268/s14/p761-thompson.pdf) (1984 Turing Award lecture) … a classic. If you haven't read it, do so!
    29. Dijkstra, Lamport et. al-[On-the-fly Garbage Collection: an Exercise in Cooperation](http://lamport.azurewebsites.net/pubs/garbage.pdf) (1978) … introducing
    30. Lamport-[Multiple Byte Processing with Full-Word Instructions](http://lamport.azurewebsites.net/pubs/multiple-byte.pdf) (1975) … I thought I wouldn't need to do this anymore when we got byte-wise SIMD instrs on CPUs, and then CUDA came along and it was suddenly profitable on GPUs.
  17. djg created this gist Aug 10, 2017.
    36 changes: 36 additions & 0 deletions reading-list.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,36 @@
    1. Let's start meta: Lamport-State the Problem Before Describing the Solution (1978). https://www.microsoft.com/en-us/research/publication/state-problem-describing-solution/ … 1-page memo. Read it.
    2. Herlihy-"Wait-free synchronization" http://www.diku.dk/OLD/undervisning/2005f/dat-os/skrifter/lockfree.pdf …. Truly seminal. Lucid + enough good ideas for 4 papers easily.
    3. Cook-"How complex systems fail" (1998) https://www.researchgate.net/profile/Richard_Cook3/publication/228797158_How_complex_systems_fail/links/0c96053410db96a89c000000.pdf … 4 pages that *anyone* working on/with complex systems should read.
    4. Moffat, Turpin-"On the Implementation of Minimum Redundancy Prefix Codes" (1997) https://pdfs.semanticscholar.org/bda3/442cc6b1d10e4b36b574af0a34a668492230.pdf … Much has been written about Huffman coding, a lot of it wrong. Almost everything worth knowing is in this (short!) paper.
    5. Dybvig, Hieb, Butler-"Destination-Driven Code Generation" (1990) https://pdfs.semanticscholar.org/dcb8/8719880e1f76ad71fb1c5aebb118e2ecfe71.pdf … A simple, beautiful way to improve code gen forfast one-pass compilers.
    5b. Milliken-"One-pass Code Generation in V8" (year?) http://cs.au.dk/~mis/dOvs/slides/46b-codegeneration-in-V8.pdf … makes a great companion piece.
    6. Valmari-"Fast brief practical DFA minimization" (2012) http://www.cs.cmu.edu/~./cdm/pdf/Valmari12.pdf … Takes a classic algorithm by Hopcroft that textbooks struggled to explain (or mostly omitted entirely) for decades, and simplifies it down to a (not particularly technical) 5-page paper.
    7. Sarnak, Tarjan-"Planar Point Location Using Persistent Search Trees" (1986) http://users.cs.fiu.edu/~giri/teach/6405/s04/SarnakTarjanCACM86.pdf … an interesting problem in its own right, and this paper solves it by introducing persistent search trees made efficient using a *beautiful* idea (section 3!).
    8. Porter, Duff-"Compositing Digital Images" (1984) https://keithp.com/~keithp/porterduff/p253-porter.pdf … Obligatory mention. Every half-year or so, another blog post crosses my timeline where someone excitedly sings the praises of premultiplied alpha. We need to get better about teaching this.
    9. Brandt-"Hard Sync Without Aliasing" (2001) http://elthariel.free.fr/ebooks/icmc01-hardsync.pdf … Introducing MinBLEPs for bandlimited synthesis. Such an awesome idea!
    10. Veach-"Robust Monte Carlo Methods for Light Transport Simulation" (PhD thesis, 1997) http://graphics.stanford.edu/papers/veach_thesis/ … wherein he invents a big chunk of modern light transport sim. A tour de force.

    - that's 10 for today, will continue tomorrow. Note most of these papers are fairly short! These are reading recommendations, and I'm trying not to waste your time. :)

    Ah, fuck it. I got plenty more, so here goes another 10.

    11. Goto, van de Geijn-"Anatomy of high-performance matrix multiplication" (2008) http://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf … - how a high-perf GEMM works.
    12. Chazelle-"Filtering search: a new approach to query-answering" (1986) https://www.cs.princeton.edu/~chazelle/pubs/FilteringSearch.pdf … - balancing preprocessing and query time, as applied to several geometric search problems. Not exactly easy reading but worth your time!
    13. McIllroy-"A Killer Adversary for Quicksort" (1999) https://pdfs.semanticscholar.org/dd00/6b6383bd512c7645d8408fa6c97875c27318.pdf … - how to do adversarial testing right. Worth studying!
    14. Knuth-"Structured Programming with go to Statements" (1974) https://sbel.wisc.edu/Courses/ME964/Literature/knuthProgramming1974.pdf … - oft-quoted, rarely read. It may surprise you.
    15. Bryant-"Graph-Based Algorithms for Boolean Function Manipulation" (1986) http://mtv.ece.ucsb.edu/courses/ece156B_14/randy_obdd86.pdf … - introducing ROBDDs and kickstarting a revolution in formal ASIC verification in the process.
    16. Donoghue et al.-"Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding" (2016) http://stanford.edu/~boyd/papers/scs.html … a great example of good theory making for an elegant, practical algorithm.
    17. Braun et al.-"Simple and Efficient Construction of Static Single Assignment Form" (2013) http://pp.ipd.kit.edu/uploads/publikationen/braun13cc.pdf … There are many SSA construction algorithms; this is the slickest one I know.
    18. Lamport, Palais-"On the Glitch Phenomenon" (1976) http://lamport.azurewebsites.net/pubs/pubs.html#glitch … - arbitration is hard. Read Lamport's notes on it too!
    19. Curtsinger, Berger-"Stabilizer: Statistically sound performance evaluation" (2013) http://people.cs.umass.edu/~emery/pubs/stabilizer-asplos13.pdf- … on how the ubiquity of address-indexed caches distorts program performance measurements, and what to do about it.
    20. Tomasulo-"An efficient algorihm for exploiting multiple arithmetic units" (1967) http://www.ece.umd.edu/class/enee646.F2014/Tomasulo.PDF … - to my knowledged, the first Out of Order execution implementation that actually shipped - and applied to a small enough problem that it's easy to understand.
    21. Wilson, Johnstone, Neely, Boles - "Dynamic storage allocation: A survey and critical review" (1995) http://www-2.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213-f98/doc/dsa.pdf … Knuth got this one wrong.
    22. Meyer, Tischer-"GLICBAWLS" (2001) http://www.academia.edu/download/40157928/Glicbawls_-_Grey_Level_Image_Compression20151118-23218-qu4fld.pdf … Two Aussies make a dirty joke for an IOCCC entry, accidentally develop a fairly competitive lossless image coder in the process. It happens.
    23. Hutton et al.-"Improving FPGA Performance and Area Using an Adaptive Logic Module" (2004) http://www2.engr.arizona.edu/~ece506/readings/project-reading/6-cad/altera-alm.pdf … Mainly geekery for me; feel free to skip this one if you're not interested in FPGA arch, I just thought it was really cool when I first saw it! (2010 or so)
    24. O'Neill-"PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation" (2014) http://www.pcg-random.org/pdf/hmc-cs-2014-0905.pdf … Introduces a new, pretty cool RNG and makes for a great introduction to the topic in general.
    25. Cook, Podelski, Rybalchenko-"Termination Proofs for Systems Code" http://homedirs.ccs.neu.edu/wahl/Teaching/Software-Model-Checking/2016-Spring/Papers/termination.pdf … The Halting Problem: SOLVED!* *for a limited but very practically relevant class of programs; e.g. device drivers.
    26. O'Neill - "The Genuine Sieve of Eratosthenes" (2009) https://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf … The standard example implementation as a pure functional program is not, in fact, the same algorithm, and it's very instructive to see how it fails.
    27. Qureshi, Jaleel et al. - "Adaptive Insertion Policies for High Performance Caching" (2007) http://web.ece.ucdavis.edu/~anhttran/files/download/caching/ISCA-2007-Qureshi-SetDuelingControl.pdf … - this is a HW paper, but the design principles here have been a great inspiration for me in designing adaptive algs in SW.
    28. Thompson-"Reflections on Trusting Trust" (1984 Turing Award lecture) https://www.cs.colorado.edu/~jrblack/class/csci6268/s14/p761-thompson.pdf … a classic. If you haven't read it, do so!
    29. Dijkstra, Lamport et. al-"On-the-fly Garbage Collection: an Exercise in Cooperation" (1978) http://lamport.azurewebsites.net/pubs/garbage.pdf … introducing
    30. Lamport-"Multiple Byte Processing with Full-Word Instructions" (1975) http://lamport.azurewebsites.net/pubs/multiple-byte.pdf … I thought I wouldn't need to do this anymore when we got byte-wise SIMD instrs on CPUs, and then CUDA came along and it was suddenly proftable on GPUs.