@Changochen
Forked from dlaehnemann/flamegraph_rust.md
Created February 7, 2022 02:50
Revisions

  1. @dlaehnemann dlaehnemann revised this gist Aug 7, 2020. 1 changed file with 3 additions and 0 deletions.
    3 changes: 3 additions & 0 deletions flamegraph_rust.md
    Original file line number Diff line number Diff line change
    @@ -1,3 +1,6 @@
    Flamegraphing in Rust can now be done with a new `cargo` subcommand. Please check this out before embarking on the legacy journey below:
    https://github.com/flamegraph-rs/flamegraph

    # flamegraphing rust binaries' cpu usage with perf

    ## One-time setup
  2. @dlaehnemann dlaehnemann revised this gist Feb 26, 2019. 1 changed file with 5 additions and 4 deletions.
    9 changes: 5 additions & 4 deletions flamegraph_rust.md
    Original file line number Diff line number Diff line change
    @@ -12,7 +12,7 @@ http://www.brendangregg.com/perf.html#Prerequisites
    echo "PATH=/path/to/FlameGraph:$PATH" >> .profile
    source .profile
    ```
    3. If you are running an older version of perf (any Linux kernel version before [v4.8-rc1](https://github.com/torvalds/linux/releases/tag/v4.8-rc1)), you should also:
    3. If you are running an older version of perf (any Linux kernel version before [v4.8-rc1](https://github.com/torvalds/linux/releases/tag/v4.8-rc1)), you should also (this will resolve some further mangled names on top of the `c++filt` unmangling):
    1. Clone [`rust-unmangle`](https://github.com/Yamakaky/rust-unmangle) and [add it to your path](https://stackoverflow.com/a/7360945):
    ```
    git clone https://github.com/Yamakaky/rust-unmangle.git
    @@ -71,14 +71,15 @@ sudo sysctl -w kernel.perf_event_paranoid=-1
    The resulting `perf.data` file (perf's default output name, which `perf script` reads by default) can be rather large, depending on the length of your example run and the sampling frequency selected, easily reaching several GB, so make sure you have the space available. Based on this recording, generate the flamegraph with:
    ```
    perf script | stackcollapse-perf.pl | c++filt | rust-unmangle | flamegraph.pl > flame.svg
    perf script | stackcollapse-perf.pl | stackcollapse-recursive.pl | c++filt | rust-unmangle | flamegraph.pl > flame.svg
    ```
    Tools here are:
    * `stackcollapse-perf.pl`: The stackcollapse Perl script by Brendan Gregg, which groups identical levels in stacks together. For installation instructions see above. @tomtung reports, that he [also needed `stackcollapse-recursive.pl` for unmangling of recursive calls](https://github.com/TyOverby/flame/issues/33#issuecomment-449349910), so give it a spin if you suspect this is keeping some unmangling from happening.
    * `stackcollapse-perf.pl`: The stackcollapse Perl script by Brendan Gregg, which groups identical levels in stacks together. For installation instructions see above.
    * `stackcollapse-recursive.pl`: This further collapses some recursive calls, improving readability. @tomtung [reported this useful addition](https://github.com/TyOverby/flame/issues/33#issuecomment-449349910).
    * `c++filt`: This is a `C++` demangler / unmangler that takes care of demangling a lot of the Rust name mangling, as [Rust](https://en.wikipedia.org/wiki/Name_mangling#Rust) also uses [`C++` name mangling](https://github.com/rust-lang/rust/blob/76affa5d6f5d1b8c3afcd4e0c6bbaee1fb0daeb4/src/librustc_trans/back/symbol_names.rs#L358). It should be available in a standard linux installation.
    * `rust-unmangle`: This script nmangles (almost) all remaining names mangled by Rust. It is optional and should not be necessary in versions of perf from Linux kernel [v4.8-rc1](https://github.com/torvalds/linux/releases/tag/v4.8-rc1) onwards, as these should include rust unmangling code (and [it worked without rust-unmangle](https://github.com/TyOverby/flame/issues/33#issuecomment-449349910) for @tomtung) -- but I haven't tested the newer version myself and needed it for my older one, so who knows who else does, as well
    * `rust-unmangle`: This script unmangles some remaining names mangled by Rust. It is optional and should not be necessary in versions of perf from Linux kernel [v4.8-rc1](https://github.com/torvalds/linux/releases/tag/v4.8-rc1) onwards, as these should include Rust unmangling code (and [it worked without rust-unmangle](https://github.com/TyOverby/flame/issues/33#issuecomment-449349910) for @tomtung). I haven't tested the newer version myself and needed it for my older one, so others on older versions may need it as well.
    * `flamegraph.pl`: This Perl script by Brendan Gregg takes the collapsed stacks and renders them into the (interactive) `.svg` format.
    Inspect the `flame.svg` file by opening it in a browser and hovering over individual bars to get the respective function names displayed. You can also search for bars containing certain expressions (top right), click on bars to zoom in on them and reset the view (top left).
  3. @dlaehnemann dlaehnemann revised this gist Dec 29, 2018. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions flamegraph_rust.md
    Original file line number Diff line number Diff line change
    @@ -76,9 +76,9 @@ perf script | stackcollapse-perf.pl | c++filt | rust-unmangle | flamegraph.pl >
    Tools here are:
    * `stackcollapse-pref.pl`: The stackcollapse Perl script by Brendan Gregg, which groups identical levels in stacks together. For installation instructions see above.
    * `stackcollapse-perf.pl`: The stackcollapse Perl script by Brendan Gregg, which groups identical levels in stacks together. For installation instructions see above. @tomtung reports, that he [also needed `stackcollapse-recursive.pl` for unmangling of recursive calls](https://github.com/TyOverby/flame/issues/33#issuecomment-449349910), so give it a spin if you suspect this is keeping some unmangling from happening.
    * `c++filt`: This is a `C++` demangler / unmangler that takes care of demangling a lot of the Rust name mangling, as [Rust](https://en.wikipedia.org/wiki/Name_mangling#Rust) also uses [`C++` name mangling](https://github.com/rust-lang/rust/blob/76affa5d6f5d1b8c3afcd4e0c6bbaee1fb0daeb4/src/librustc_trans/back/symbol_names.rs#L358). It should be available in a standard linux installation.
    * `rust-unmangle`: This script unmangles (almost) all remaining names mangled by Rust. It is optional and should not be necessary in versions of perf from Linux kernel [v4.8-rc1](https://github.com/torvalds/linux/releases/tag/v4.8-rc1) onwards, as these should include rust unmangling code (and [it worked without rust-unmangle](https://github.com/TyOverby/flame/issues/33#issuecomment-449349910) for @tomtung) -- but I haven't tested the newer version myself and needed it for my older one, so who knows who else does, as well
    * `rust-unmangle`: This script nmangles (almost) all remaining names mangled by Rust. It is optional and should not be necessary in versions of perf from Linux kernel [v4.8-rc1](https://github.com/torvalds/linux/releases/tag/v4.8-rc1) onwards, as these should include rust unmangling code (and [it worked without rust-unmangle](https://github.com/TyOverby/flame/issues/33#issuecomment-449349910) for @tomtung) -- but I haven't tested the newer version myself and needed it for my older one, so who knows who else does, as well
    * `flamegraph.pl`: This Perl script by Brendan Gregg takes the collapsed stacks and renders them into the (interactive) `.svg` format.
    Inspect the `flame.svg` file by opening it in a browser and hovering over individual bars to get the respective function names displayed. You can also search for bars containing certain expressions (top right), click on bars to zoom in on them and reset the view (top left).
  4. @dlaehnemann dlaehnemann revised this gist Dec 29, 2018. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion flamegraph_rust.md
    Original file line number Diff line number Diff line change
    @@ -78,7 +78,7 @@ Tools here are:
    * `stackcollapse-pref.pl`: The stackcollapse Perl script by Brendan Gregg, which groups identical levels in stacks together. For installation instructions see above.
    * `c++filt`: This is a `C++` demangler / unmangler that takes care of demangling a lot of the Rust name mangling, as [Rust](https://en.wikipedia.org/wiki/Name_mangling#Rust) also uses [`C++` name mangling](https://github.com/rust-lang/rust/blob/76affa5d6f5d1b8c3afcd4e0c6bbaee1fb0daeb4/src/librustc_trans/back/symbol_names.rs#L358). It should be available in a standard linux installation.
    * `rust-unmangle`: This script unmangles (almost) all remaining names mangled by Rust. It is optional and should not be necessary in versions of perf from Linux kernel [v4.8-rc1](https://github.com/torvalds/linux/releases/tag/v4.8-rc1) onwards, as these should include rust unmangling code -- but I haven't tested the newer version and needed it for my older one, so who knows who else does, as well
    * `rust-unmangle`: This script unmangles (almost) all remaining names mangled by Rust. It is optional and should not be necessary in versions of perf from Linux kernel [v4.8-rc1](https://github.com/torvalds/linux/releases/tag/v4.8-rc1) onwards, as these should include rust unmangling code (and [it worked without rust-unmangle](https://github.com/TyOverby/flame/issues/33#issuecomment-449349910) for @tomtung) -- but I haven't tested the newer version myself and needed it for my older one, so who knows who else does, as well
    * `flamegraph.pl`: This Perl script by Brendan Gregg takes the collapsed stacks and renders them into the (interactive) `.svg` format.
    Inspect the `flame.svg` file by opening it in a browser and hovering over individual bars to get the respective function names displayed. You can also search for bars containing certain expressions (top right), click on bars to zoom in on them and reset the view (top left).
  5. @dlaehnemann dlaehnemann revised this gist Nov 13, 2018. 1 changed file with 12 additions and 0 deletions.
    12 changes: 12 additions & 0 deletions flamegraph_rust.md
    Original file line number Diff line number Diff line change
    @@ -56,6 +56,18 @@ Options here are:
    * `-e cpu-clock`: this selects the event `cpu-clock` for perf sampling. Without it, the `-F` option that follows had no noticeable effect in my environment.
    * `-F 997`: this samples at 997 Hz; the value is deliberately off from a round 1000 to avoid lockstep sampling (see e.g. [Brendan Gregg's blog post from 2014](http://www.brendangregg.com/blog/2014-06-22/perf-cpu-sample.html)).
    Should perf give you errors regarding `sysctl` settings, you can inspect the current values with, e.g.:
    ```
    sysctl -n kernel.perf_event_paranoid
    ```
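    And write new values into them (these last until the next reboot; to make them permanent, add the setting to `/etc/sysctl.conf` or a file under `/etc/sysctl.d/`) with: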
    ```
    sudo sysctl -w kernel.perf_event_paranoid=-1
    ```
    The resulting `perf.data` file (perf's default output name, which `perf script` reads by default) can be rather large, depending on the length of your example run and the sampling frequency selected, easily reaching several GB, so make sure you have the space available. Based on this recording, generate the flamegraph with:
    ```
  6. @dlaehnemann dlaehnemann revised this gist Nov 13, 2018. 1 changed file with 22 additions and 22 deletions.
    44 changes: 22 additions & 22 deletions flamegraph_rust.md
    Original file line number Diff line number Diff line change
    @@ -5,29 +5,29 @@
    1. Install `perf`, using Brendan Gregg's guide:
    http://www.brendangregg.com/perf.html#Prerequisites
    2. Install `flamegraph` from repo:
    1. Clone the repo locally: `git clone https://github.com/brendangregg/FlameGraph`
    2. Add the main directory with all the `*.pl` Perl files to the path:
    ```
    cd
    echo "PATH=/path/to/FlameGraph:$PATH" >> .profile
    source .profile
    ```
    1. Clone the repo locally: `git clone https://github.com/brendangregg/FlameGraph`
    2. Add the main directory with all the `*.pl` Perl files to the path:
    ```
    cd
    echo "PATH=/path/to/FlameGraph:$PATH" >> .profile
    source .profile
    ```
    3. If you are running an older version of perf (any Linux kernel version before [v4.8-rc1](https://github.com/torvalds/linux/releases/tag/v4.8-rc1)), you should also:
    1. Clone [`rust-unmangle`](https://github.com/Yamakaky/rust-unmangle) and [add it to your path](https://stackoverflow.com/a/7360945):
    ```
    git clone https://github.com/Yamakaky/rust-unmangle.git
    ```
    2. Make `rust-unmangle` executable:
    ```
    cd rust-unmangle
    chmod u+x rust-unmangle
    ```
    3. Add it to your path:
    ```
    cd
    echo "PATH=/path/to/rust-unmangle:$PATH" >> .profile
    source .profile
    ```
    1. Clone [`rust-unmangle`](https://github.com/Yamakaky/rust-unmangle) and [add it to your path](https://stackoverflow.com/a/7360945):
    ```
    git clone https://github.com/Yamakaky/rust-unmangle.git
    ```
    2. Make `rust-unmangle` executable:
    ```
    cd rust-unmangle
    chmod u+x rust-unmangle
    ```
    3. Add it to your path:
    ```
    cd
    echo "PATH=/path/to/rust-unmangle:$PATH" >> .profile
    source .profile
    ```
    ## compiling and flamegraphing a binary
  7. @dlaehnemann dlaehnemann revised this gist Oct 25, 2018. 1 changed file with 6 additions and 4 deletions.
    10 changes: 6 additions & 4 deletions flamegraph_rust.md
    Original file line number Diff line number Diff line change
    @@ -1,4 +1,6 @@
    # One-time setup
    # flamegraphing rust binaries' cpu usage with perf

    ## One-time setup

    1. Install `perf`, using Brendan Gregg's guide:
    http://www.brendangregg.com/perf.html#Prerequisites
    @@ -27,7 +29,7 @@ http://www.brendangregg.com/perf.html#Prerequisites
    source .profile
    ```

    # flamegraphing a binary
    ## compiling and flamegraphing a binary

    To turn on debugging information in the binary, to get actual function names in the flamegraph output, temporarily add to `Cargo.toml` (you should remove this for an actual release):

    @@ -77,7 +79,7 @@ perf report

    But really, you want to rather look at the flamegraph... ;)

    # use SVG files in GitHub issues and comments
    ## use SVG files in GitHub issues and comments

    It is not possible to directly include `.svg` files in GitHub issues and comments. However, I found a reasonable work-around:

    @@ -87,7 +89,7 @@ It is not possible to directly include `.svg` files in GitHub issues and comment
    For an example, have a look at this Pull Request:
    <https://github.com/PROSIC/libprosic/pull/48>

    # Sources
    ## Sources

    Many thanks to the people behind the following sources upon which I have built this little howto:
    * <http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html>
  8. @dlaehnemann dlaehnemann created this gist Oct 25, 2018.
    97 changes: 97 additions & 0 deletions flamegraph_rust.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,97 @@
    # One-time setup

    1. Install `perf`, using Brendan Gregg's guide:
    http://www.brendangregg.com/perf.html#Prerequisites
    2. Install `flamegraph` from repo:
    1. Clone the repo locally: `git clone https://github.com/brendangregg/FlameGraph`
    2. Add the main directory with all the `*.pl` Perl files to the path:
    ```
    cd
    echo "PATH=/path/to/FlameGraph:$PATH" >> .profile
    source .profile
    ```
    3. If you are running an older version of perf (any Linux kernel version before [v4.8-rc1](https://github.com/torvalds/linux/releases/tag/v4.8-rc1)), you should also:
    1. Clone [`rust-unmangle`](https://github.com/Yamakaky/rust-unmangle) and [add it to your path](https://stackoverflow.com/a/7360945):
    ```
    git clone https://github.com/Yamakaky/rust-unmangle.git
    ```
    2. Make `rust-unmangle` executable:
    ```
    cd rust-unmangle
    chmod u+x rust-unmangle
    ```
    3. Add it to your path:
    ```
    cd
    echo "PATH=/path/to/rust-unmangle:$PATH" >> .profile
    source .profile
    ```

    # flamegraphing a binary

    To turn on debugging information in the binary, to get actual function names in the flamegraph output, temporarily add to `Cargo.toml` (you should remove this for an actual release):

    ```
    [profile.release]
    debug = true
    ```

    Then compile with the `--release` flag, to get cargo to optimize the resulting binary. Otherwise, any slowness may be due to a lack of compiler optimisations:

    ```
    cargo build --release
    ```

    Run the cpu sampling with:

    ```
    perf record --call-graph dwarf,16384 -e cpu-clock -F 997 target/release/name-of-binary <command-line-arguments>
    ```

    Options here are:

    * `--call-graph dwarf,16384`: `dwarf` unwinds stacks from the DWARF debug information, which gave me correct stacks where the standard frame pointers did not. The `,16384` doubles the stack dump size from the standard value, which has helped me avoid split stacks. I am assuming the smaller size did not suffice for deeper stacks, so those stacks were cut off at the bottom, and the missing lower blocks made a correct merging on those levels impossible. If you get weird split stacks that differ in the number of lower levels, try increasing this value further.
    * `-e cpu-clock`: this selects the event `cpu-clock` for perf sampling. Without it, the `-F` option that follows had no noticeable effect in my environment.
    * `-F 997`: this samples at 997 Hz; the value is deliberately off from a round 1000 to avoid lockstep sampling (see e.g. [Brendan Gregg's blog post from 2014](http://www.brendangregg.com/blog/2014-06-22/perf-cpu-sample.html)).
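
To get a feeling for how the stack dump size and the sampling frequency drive the size of the recording, here is a rough back-of-the-envelope sketch. It assumes that every sample carries a full stack dump of the requested size and that a single thread is on the CPU for the whole run, so it is an upper bound for the stack dumps only; real recordings contain other data as well and shrink when stacks are shallow or the process is idle. `estimate_perf_mib` is a made-up helper name, not part of perf.

```shell
# Rough upper bound, in MiB, on the stack-dump data recorded for one
# always-busy thread: frequency (Hz) * stack dump size (bytes) * seconds.
# Hypothetical helper for illustration only.
estimate_perf_mib() {
    freq=$1
    dump_bytes=$2
    seconds=$3
    echo $(( freq * dump_bytes * seconds / 1048576 ))
}

# The settings from the perf record call above, for a 60-second run.
estimate_perf_mib 997 16384 60   # prints 934
```

So even a one-minute run of a single busy thread can approach 1 GiB under these assumptions, and multi-threaded programs scale this up accordingly.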

    The resulting `perf.data` file (perf's default output name, which `perf script` reads by default) can be rather large, depending on the length of your example run and the sampling frequency selected, easily reaching several GB, so make sure you have the space available. Based on this recording, generate the flamegraph with:

    ```
    perf script | stackcollapse-perf.pl | c++filt | rust-unmangle | flamegraph.pl > flame.svg
    ```

    Tools here are:

    * `stackcollapse-pref.pl`: The stackcollapse Perl script by Brendan Gregg, which groups identical levels in stacks together. For installation instructions see above.
    * `c++filt`: This is a `C++` demangler / unmangler that takes care of demangling a lot of the Rust name mangling, as [Rust](https://en.wikipedia.org/wiki/Name_mangling#Rust) also uses [`C++` name mangling](https://github.com/rust-lang/rust/blob/76affa5d6f5d1b8c3afcd4e0c6bbaee1fb0daeb4/src/librustc_trans/back/symbol_names.rs#L358). It should be available in a standard linux installation.
    * `rust-unmangle`: This script unmangles (almost) all remaining names mangled by Rust. It is optional and should not be necessary in versions of perf from Linux kernel [v4.8-rc1](https://github.com/torvalds/linux/releases/tag/v4.8-rc1) onwards, as these should include rust unmangling code -- but I haven't tested the newer version and needed it for my older one, so who knows who else does, as well
    * `flamegraph.pl`: This Perl script by Brendan Gregg takes the collapsed stacks and renders them into the (interactive) `.svg` format.
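
The pipeline only works if every one of these tools can be found on the `PATH`, and a missing one in the middle of the pipe is easy to overlook. As a minimal sketch of a pre-flight check (the function name `missing_tools` is my own invention; it relies only on the standard shell builtin `command -v`):

```shell
# Print each named tool that cannot be found on the PATH, one per line.
# Hypothetical helper for illustration only.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools used by the pipeline above; add rust-unmangle to the list
# if you need it for an older perf.
missing_tools perf stackcollapse-perf.pl c++filt flamegraph.pl
```

If this prints nothing, all listed tools were found; any name it prints still needs to be installed or added to the `PATH` as described in the setup section.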

    Inspect the `flame.svg` file by opening it in a browser and hovering over individual bars to get the respective function names displayed. You can also search for bars containing certain expressions (top right), click on bars to zoom in on them and reset the view (top left).

    Or inspect the non-collapsed report interactively by issuing:

    ```
    perf report
    ```

    But really, you would rather want to look at the flamegraph... ;)

    # use SVG files in GitHub issues and comments

    It is not possible to directly include `.svg` files in GitHub issues and comments. However, I found a reasonable work-around:

    1. [Create a new GitHub gist while logged in](https://gist.github.com/). You'll need to create a new gist for each image, but you can easily drag-and-drop the file in there.
    2. Use the link to the gist in your GitHub comments, advising users to follow that link and then properly inspect it by right-clicking on the preview and selecting `View Image` (tested in current Firefox).

    For an example, have a look at this Pull Request:
    <https://github.com/PROSIC/libprosic/pull/48>

    # Sources

    Many thanks to the people behind the following sources upon which I have built this little howto:
    * <http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html>
    * <https://carol-nichols.com/2015/12/09/rust-profiling-on-osx-cpu-time/>
    * <https://blog.anp.lol/rust/2016/07/24/profiling-rust-perf-flamegraph/>
    * <https://gist.github.com/KodrAus/97c92c07a90b1fdd6853654357fd557a>
    * <https://www.reddit.com/r/rust/comments/4snw3k/linux_perf_gets_rust_symbol_demangling_support/>