### Debugging an intermittent test failure

I just spent 10+ days on a single bug. Posting these notes in case they're helpful to someone else (or to future me!) in a similar situation.

The bug manifested as an intermittent test failure. This was an end-to-end test using [Playwright](https://playwright.dev/), and it involved multiple automated browsers connecting to a server and doing things concurrently.

Each browser produced hundreds of lines of debug logging, and so did the server; and for most of the time I was trying to solve this problem, I was:

- Running the test multiple times until I got both a failure and a success
- Copying the logs from the browser and from the server into text files
- Running a VS Code macro that did a bunch of search-and-replace work to reduce the visual noise in the logs
- Diffing the good and bad logs in VS Code to see what was happening differently
- Tweaking something in the code (adding logging somewhere, adding pauses, changing the order in which things were done, etc.)
- Repeating all of the above

This was extremely laborious and error-prone: I had lots of opportunities to miss a step, or get distracted, or lose my train of thought altogether.

The only way I managed to get out of this soul-destroying quagmire was to take a bit of time to create some tooling to automate this process so that I could get a fast feedback loop going.

First I needed to get the logging from both the browser and the server into a single stream.

I did this in a Playwright using a base page class that looks like this:

```ts
export class BasePage {
  userName?: string
  teamName?: string

  constructor(readonly page: Page) {
    this.page = page
  }

  // GENERAL

  log(msg: ConsoleMessage) {
    const text: string = msg.text().replace(/(color: #([0-9A-F]{6}))|(color: inherit)|%c/g, '')
    console.log(text)
  }

  async start() {
    await this.page.goto('/')

    // feed browser logs to test output
    this.page.on('console', msg => {
      this.log(msg)
    })

    // enable debug logging
    await this.page.evaluate(`window.localStorage.setItem('debug', '*')`)
    // reload so these take effect
    await pause(500)
    await this.page.reload()
    return this
  }

  // ...
}
```

I'm using the [debug-js](https://github.com/debug-js/debug) library for debug logging, and it took me a while to figure out how to get the debug variable set correctly: You have to add it to local storage, give that a moment to settle, and then reload the page.

Once that's done, you can just listen for console events from the browser and re-log them in the test code that's running in the terminal.

Then I wrote the script shown in `flaky.js`, which automates the steps listed above. It runs the test until it gets one success and one failure, each time piping the output to a text file.

It then runs each set of logs through a process that is intended to make it possible to diff the logs with a minimum of visual noise.

In my case there were lots of hashes and guids that were different on each run; these needed to be replaced with tokens that would be the same across runs.

I also replaced certain key bits of text with emoji, which made the logs easier to scan.

Once I had this set up, I was able to pin down the problem within a few hours.

Here's a screenshot of that diff; in the terminal you can see the test run where it finally stopped failing. 🥳

![screenshot](https://gist.github.com/assets/2136620/977300fd-fcfe-4931-8f46-61153c2bffb1)