Revisions

  1. @rylev rylev revised this gist Sep 4, 2019. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion rust-in-large-organizations-notes.md
    Original file line number Diff line number Diff line change
    @@ -25,7 +25,7 @@
    - Jeremy F, FB
    - Manish
    - Ben, Google
    - Philip, Cumulo, Rust dev tools + infra
    - Philip, Qumulo, Rust dev tools + infra
    - Remi, Qumulo
    - Sebastian, MS, pushing for Rust adoption from sec pov
    - Thomas Ekerd, MS, site reliability engineer
  2. @rylev rylev revised this gist Sep 4, 2019. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions rust-in-large-organizations-notes.md
    Original file line number Diff line number Diff line change
    @@ -26,7 +26,7 @@
    - Manish
    - Ben, Google
    - Philip, Cumulo, Rust dev tools + infra
    - Remy, Cumulo
    - Remi, Qumulo
    - Sebastian, MS, pushing for Rust adoption from sec pov
    - Thomas Ekerd, MS, site reliability engineer
    - James, MS
    @@ -301,7 +301,7 @@
    - Increasing number of users -- C and C++ wanting to consume Rust APIs
    - Concerns:
    - unwinding
    - Cumulo: basically spent most of the last year preparing to do bidir FFI between Rust and C
    - Qumulo: basically spent most of the last year preparing to do bidir FFI between Rust and C
    - fairly larger codebase in a dialect of C
    - rules you can impose on C side which helps sometimes
    - in one direction (Rust calling C) we have been able to use bindgen
  3. @rylev rylev renamed this gist Sep 2, 2019. 1 changed file with 0 additions and 0 deletions.
    File renamed without changes.
  4. @rylev rylev created this gist Sep 2, 2019.
    428 changes: 428 additions & 0 deletions gistfile1.txt
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,428 @@
    # Rust in Large Organizations

    **Initially taken by Niko Matsakis and lightly edited by Ryan Levick**

    ## Agenda

    - Introductions
    - Cargo inside large build systems
    - FFI
    - Foundations and financial support

    ## Attending

    - Joe, Microsoft, Seattle Rust Meetup
    - Tom at Mozilla, using Rust for sync
    - Lena at Mozilla, sync storage etc
    - Jack Moffit at FB, Libra team
    - Brian Anderson at PingCAP
    - acrichto
    - erickt
    - dtolnay, David Tolnay
    - Raj Vengalil, Azure IoT
    - cuviper, Red Hat
    - Rain, FB
    - Jeremy F, FB
    - Manish
    - Ben, Google
    - Philip, Qumulo, Rust dev tools + infra
    - Remi, Qumulo
    - Sebastian, MS, pushing for Rust adoption from sec pov
    - Thomas Ekerd, MS, site reliability engineer
    - James, MS
    - Brandon Williams, FB
    - JR, Mozilla backend services
    - Phil
    - Will, crash ingestion mozilla
    - Stjepan, Ferrous Systems

    ## cargo

    - FB dev env -- backend services repo -- is mostly C++
    and Java. Very polyglot environment. Glued together with Buck,
    FB's Bazel.
    - Buck: Language agnostic. Supports Rust.
    - rustc drops in quite nicely, basically equivalent to C++ compiler.
    - wanted to use cargo but it just does too much to fit in
    - need to delineate the parts of cargo that are desired from those that conflict with Buck
    - ecosys is big advantage for Rust but hard to separate from cargo
    - current scheme:
    - big cargo.toml including all the things used in internal repo
    - cargo builds artifacts that are presented to buck
    - buck can link against those
    - reasonably successful
    - but approaching 700 crates in transitive dep graph, getting very cumbersome to rebuild etc
    - plus pinned to a specific version of compiler (prebuilt artifacts)
    - works ok but build.rs build scripts are a big complication
    - specific cargo pain points:
    - build scripts
    - "features" feature
    - a lot of crates don't use features the way they're intended -- they're used for exclusive A or B choices
    - this creates the possibility to break the build
    - need some sort of "cfg" feature that represents forks of a crate
    - Google does a similar thing for Fuchsia
    - cargo builds 3rd party artifacts, normal build consumes those
    - problems:
    - handful of 3rd party artifacts depend on things built in tree
    - want to be able to do partial builds, e.g. w/o a feature, or just for some targets
    - developing for a new OS, so we compile some code for host, some for target
    - presently do 2 full builds, but it's a pain
    - don't have as much control over the flags getting passed to rustc as we'd like
    - dep flags + linker flags aren't as specific as we need to distribute deps that are needed for indiv targets
    - prototype using "cargo raze" (use "gn" (from Chrome) to generate ninja files)
    - based on a modification of cargo raze, which generates bazel build files
    - has its own handling of build.rs stuff
    - rather than outputting build files, it outputs a json format that could be the basis for the proposed "cargo build plans" feature
    - would be good to know what inputs etc are needed, how this would fit for Buck
    - can Buck consume internal files?
    - gn is aware of the concept of a Rust target
    - Qumulo build system:
    - doesn't use cargo, invokes rustc directly
    - cargo just builds json
    - build all deps as shared libraries, whether or not they want that
    - `.so` libraries, `.rmeta` files
    - hits a lot of problems
    - ran into problems, notably lack of support for build.rs -- have to reimpl cargo
    - building for 2 different targets
    - have own platform
    - linux target for procedural macros
    - need sometimes to pass flags that are target specific, build a target config map
    - would prefer to use cargo
    - does cargo raze support build.rs?
    - has some builtin support for build.rs?
    - not automatic: you declare purpose of build.rs
    - things that do rustc version detection?
    - sometimes you want to (e.g.) disable build.rs that supply native deps which come from bazel
    - why can't you run build.rs as part of the build tool?
    - fundamental problems:
    - no declared inputs, no declared outputs
    - buck/bazel etc has to know what files the build script is consuming, producing etc
    - also, they are arbitrary execution, which can be a security concern
    - proc macros have some similar concerns.
    - e.g., pest which looks at cargo source dir env variable and finds your grammar def'n file
    - doesn't fit well
    - one thing that was discussed years ago:
    - capability system for build.rs that restrict what scripts can do
    - e.g., read from this directory, write to that one
    - cargo can then audit/sandbox to enforce said rules
    - run build script in a sandbox
    - e.g. crosvm has an impl of this inside of chrome; all crosvm devices run in their own jail
    - nontrivial engineering effort
    - could do at a higher level, sandbox
    - jeremy: build scripts classified into 3 or 4 distinct types, is this complete?
    - doing codegen. read a file, bindgen, etc
    - gateway to some other library, using pkgconfig or something to find the library, or they build it from source
    - feature detection on rustc
    - "scary ones" -- database reads, timestamps
    - plausibly could address those use cases in other ways
    - feature detection is an obvious one, e.g. we had an rfc for compiler versions
    - version compat is a common thing
    - what version of rust are people using?
    - stable
    - "stableish" -- bootstrap
    - nightly
    - who here is using toolchains distributed by rust?
    - ms (partially), mozilla, libra
    - why a custom toolchain?
    - config.toml tweaks
    - use clang's version of some unwinding code
    - custom linker
    - panic=abort
    - custom targets
    - compliance reasons (wanting to build from source for security reasons)
    - bootstrapping + compliance
    - where to get initial rust version?
    - several attempts:
    - most successful is using mrustc at version 1.22 and building from there
    - ms, google did that
    - is there a possibility of long term drift?
    - builds are not *quite* reproducible at present, but almost
    - was a point where build w/ mrustc + build with toolchain had non-matching hashes
    - might have to tweak the paths
    - in principle it can be done, should maybe prioritize it
    - maybe have an approved "how to bootstrap from C" documentation
    - specific reason fb builds from source:
    - want to always have the option to apply a local patch
    - don't want to get stuck with a "we must have this patch yesterday" scenario and have to figure out how to apply patch then
    - in most cases, also building llvm, want to share llvm for cross-lang LTO
    - must have a newer LLVM than what rust ships with
    - some folks have cross-lang LTO working
    - but rustc doesn't want to produce bitcode files
    - pass the linker `/bin/echo`
    - pgo -- coming soon
    - fb uses after the fact binary rewriting
    - splitting out linker was a potential change to rustc or cargo that google wants
    - would be interesting to know "here is what must be passed to gcc to successfully link"
    - another option: give a python script as the linker
    - turns out servo does it, too
    - show of hands survey:
    - "who is interested in a common backend for 'those things'"
    - nobody knows what that means
    - buck needs a "fully specified dep dag", seems like a common thing for other build systems
    - seems like we have to do a few cases to work out the general rules first
    - rudimentary cargo build plan support:
    - gives a dag of rustc executions
    - but it's too low level for buck, also bazel
    - pressure: every once in a while people propose "rewriting cargo.toml" into the tree
    - so far resisted that
    - a possible outcome buck has thought of:
    - buck support for cargo.toml
    - ton of code that's open source, and people (natch) don't want to build it w/ buck out of tree
    - want ability to simultaneously maintain buck/cargo support
    - currently done by hand and horrible
    - internally even people want this for mac/win builds which buck doesn't support
    - google w/ gn does something similar, keeps cargo.toml in order to upstream it
    - in some cases can generate a cargo.toml file programmatically
    - also imp't for IDE support
    - IDE support
    - RLS kind of working with buck
    - knowing laughter :)
    - problematic assumptions: e.g., searching the filesystem for cargo.toml, but it's millions of files
    - symptom of a larger thing
    - cargo is designed for managing rust code
    - assumes source tree is mostly rust code
    - but often rust is embedded in a large source tree with tons of non-rust
    - so having some "root for all rust code" where you search below is problematic
    - top-level directory not gonna work
    - always having to create artificial "root" directories
    - rust-analyzer avoids this by not baking cargo in as deeply
    - but still has this "top level directory" model that contains all the rust code which means a small amount of rust amongst everything else
    - generating a cargo.toml for 1 project works well, but when you have multiple targets that interact
    - Qumulo has a ton of C and Rust code that must be all combined into one big final artifact
    - IDE support that avoids cargo is a must
    - current state of the art: ctags
    - cramertj: cargo.toml is basically the intermediate repr for specifying deps
    - are there other things one might want?
    - build system has its own custom language to do that description
    - can use that to generate cargo.toml files though for IDE etc
    - what changes might one want in a "non-cargo IDE language"?
    - maybe cargo would work fine
    - manish: does this also cause problems for clippy and rustfmt?
    - cargo.toml is also useful for this
    - who uses clippy? most folks
    - rustfmt? most folks
    - fb invokes it on individual files for that
    - libra uses cargo to build
    - "cacheability" (sccache) has gotten worse over time
    - procedural macros aren't getting cached (dylibs)
    - are other people doing anything with this?
    - ff has a distributed cache in the office
    - (buck does caching of everything)
    - native deps? also integrated into buck
    - assume that if a C dep changes, rust must be rebuilt?
    - `-lnative` is not very well-scoped (just to a directory, not specific libs)
    - problem: can't cache link steps as a result
    - maybe also part of the problem with sccache
    - in buck, each lib gets its own directory, sidestepping this problem
    - linker want:
    - ability to specify a specific mapping from link name to the native library
    - option to ignore link directories or transform
    - in buck case, if you have a dep on a native library, you get two options (`-lfoo` and full path to foo)
    - crate features, misuse thereof:
    - people seem to want option to have mutually exclusive features
    - want to have impls clone etc for testing but not in a release build
    - hacked up something using cargo features but doesn't work all the time
    - problems:
    - dev dependency `foo` with feature "testing"
    - sometimes testing gets turned on semi-randomly (???)
    - but you can also accidentally use "testing" in a normal tree
    - deps for build scripts leak through to the real graph, perhaps part of the "semi-random" behavior
    - designing from the wrong direction, perhaps?
    - a lot of requirements coming up that are "above and beyond" existing cargo spec and design
    - contra: goal is to have cargo co-exist with buck/bazel/etc, these are the features needed for that?
    - do we want to build another tool that is not cargo?
    - but everybody already has a tool and wants to use it
    - but how can we do minimal work so that integration of cargo + these other tools is smoother
    - working with rest of rust ecosys
    - de facto standard that crates.io + cargo have created
    - defined entirely by impl of cargo
    - only access at present is through cargo's impl
    - refactoring cargo into indep chunks with better interfaces might be the sol'n (and has been discussed)
    - cargo build plans, but they're not there yet
    - key thing: version resolution, very much in cargo's domain, would be good to specify
    - external dependencies + FFI?
    - can we use FFI to talk to rust?
    - want module boundary between rust things, using ffi
    - today: build scripts in cargo exist, common thing is to build+link to native libraries
    - one of the things that cargo raze does: you can describe the purpose of a build.rs (e.g., primarily to produce that 3rd party lib)
    - but you can translate that to a dep for that native library in your build system
    - summarize + action items?
    - cramertj wants to know what
    - dtolnay is working on potential design ideas for a successor to build.rs
    - cargo metadata description to specify what it is doing, maybe replace build.rs?
    - just listing inputs would be a huge improvement
    - yes but we want something that's *easier* than build.rs today, to incentivize it
    - caching, can we improve it
    - some of it may be low-hanging fruit, e.g. on mac `.a` file has timestamps
    - but part of it is the growing popularity of procedural macros (`.so` are uncachable by sccache)
    - if linker were more predictable, sccache could handle it, but it's not
    - might be able to handle by separating out linking
    - how to translate cargo.toml etc?
    - buck today runs cargo, takes output with dep info + rlib files
    - but new tool goal is to determine from cargo metadata
    - no way of "definitively connecting" resolved deps with unresolved deps
    - cargo vendor tends to be a bit overaggressive
    - lots of things people want, seems to vary between groups
    - when developing procedural macros, could do better job of noticing token stream output hasn't changed..
    - incremental
    - sccache sometimes handles that well (e.g. w/ build.rs)
    - related topic: distributed builds
    - sccache has support for that
    - but maybe sends whole dep folder, not always ok
    - would need more precise dep information to handle that (passing precise info for *transitive* dependencies)
    - `--extern` is precise, but transitive deps are still figured out by rustc
    - related: would be nice if, for rustc, could pass all the sources explicitly
    - in buck do you list all sources?
    - yes but a lot of globs :)
    - would be nice to have a tool that handled all the easy cases, with room for "extra" cases here and there
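
On the "no declared inputs" complaint: Cargo already lets a build script declare the files it reads via `cargo:rerun-if-changed` directives, though nothing enforces the list and outputs remain undeclared. A minimal sketch of what "just listing inputs" looks like today; the file names in `INPUTS` are hypothetical, not from any crate discussed here.

```rust
// build.rs (sketch): declare the files this script reads so Cargo knows
// when to rerun it. This is advisory only; Cargo does not sandbox the script.
use std::path::Path;

/// Hypothetical inputs for illustration; a real script would list its own.
const INPUTS: &[&str] = &["build.rs", "src/grammar.pest", "wrapper.h"];

/// Render one `cargo:rerun-if-changed=<path>` directive per declared input.
fn rerun_directives(inputs: &[&str]) -> Vec<String> {
    inputs
        .iter()
        .map(|p| format!("cargo:rerun-if-changed={}", Path::new(p).display()))
        .collect()
}

fn main() {
    // Cargo reads these directives from the build script's stdout.
    for line in rerun_directives(INPUTS) {
        println!("{}", line);
    }
}
```

An external build system could in principle scrape the same directives, but only by running the script, which is the arbitrary-execution problem noted above.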

    - alex: interested in solving a lot of these issues and have thoughts
    - open to talking later about this stuff
    - a lot of small details, bug fixes, etc -- long road, no silver bullet
    - some kind of "enterprise cargo" place to hold this discussion(s)
    - a lot of needs boil down to:
    - quick fix combined with longer re-architecture
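
On features used for exclusive A-or-B choices: since cargo unifies features across the dep graph, a commonly seen workaround is to reject the bad combination at compile time. A sketch under assumptions; the feature names `backend-a` and `backend-b` are hypothetical and would be declared under `[features]` in the crate's Cargo.toml.

```rust
// Hypothetical feature names `backend-a` and `backend-b`.

// Fail the build loudly if feature unification turns both on at once.
#[cfg(all(feature = "backend-a", feature = "backend-b"))]
compile_error!("features `backend-a` and `backend-b` are mutually exclusive");

#[cfg(feature = "backend-a")]
fn backend_name() -> &'static str {
    "a"
}

#[cfg(all(feature = "backend-b", not(feature = "backend-a")))]
fn backend_name() -> &'static str {
    "b"
}

// Fallback when neither feature is enabled.
#[cfg(not(any(feature = "backend-a", feature = "backend-b")))]
fn backend_name() -> &'static str {
    "none"
}

fn main() {
    println!("selected backend: {}", backend_name());
}
```

This only turns the silent breakage into a build error; it does not give two dependents a way to each pick a different backend, which is the gap the notes describe.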

    ## FFI

    - two distinct languages invoking one another
    - sometimes linked into one process, sometimes cross process (RPC)
    - COM requires symbols to be ABI compatible
    - inline assembly, direct syscalls
    - "C parity"
    - FFI with C and C++
    - FB is doing C++ interop, as is Google
    - FFI beyond C or C++?
    - Java
    - syscalls
    - C# perhaps
    - (Ruby, Python)
    - Bindings to other languages are often mediated through a C layer
    - Increasing number of users -- C and C++ wanting to consume Rust APIs
    - Concerns:
    - unwinding
    - Qumulo: basically spent most of the last year preparing to do bidir FFI between Rust and C
    - fairly large codebase in a dialect of C
    - rules you can impose on C side which helps sometimes
    - in one direction (Rust calling C) we have been able to use bindgen
    - but in the other direction (C calling Rust) we wrote a compiler plugin (uh oh) to generate C headers
    - Specification questions
    - concerned about cross-lang lto revealing a lot of interactions
    - Cross-lang thin lto
    - Dynamic testing and static testing
    - Have aliasing rules proven to be a problem?
    - FB: not so much. Mostly mediating rules through bindgen and trying to set things up to get compilation failures
    - Google: currently checking for changes
    - Google: looking a bit into ways to annotate C and C++ headers so that safe rust signatures can be generated from them
    - might be an interesting thing to standardize on
    - bindgen has a cumbersome mechanism for that (do)
    - would be nice to include small shim layers e.g. to translate to `Result`
    - FB:
    - C++ codebase in FB uses exceptions, have wrappers that captures and converts exceptions, this becomes a `Result` on the Rust side
    - manually annotating noexcept functions? basically all of them can
    - C headers are manually created with a `try { } catch` block in C++
    - the code being interop'd is mostly C++ but have to manually write C APIs for it
    - build with panic=abort? no, unwind
    - also catching Rust exceptions at boundary?
    - C code doesn't call into Rust code that often
    - happy to make it abort though
    - but mozilla wants to handle panics, though it does it by translating it into a swift/java exception
    - usually the purpose is wanting to capture the call stack and report it
    - in theory could panic=abort if could capture java stack
    - FB sets a custom panic handler to report errors, then exits (could use panic=abort)
    - For COM FFI case? how handling virtual dispatch
    - manual adaptation with vtables and things
    - on Rust side, does that "look like" a trait?
    - active area of investigation
    - believe that (with proc macro support) can expose a trait that is actually a struct + vtable
    - similar to what GNOME projects are doing for glib bindings
    - mozilla does it for XPCOM, which is basically same thing
    - various bits of existing crates, but it's mostly nasty
    - Jeremy: one thing I've been thinking about:
    - standard set of library functions corresponding to C++ types
    - e.g. some way to use std-string from within rust code
    - good to have for templated types (unique-ptr, shared-ptr, and so on)
    - all types that can be directly used from Rust in some way
    - quite clunky today to have a C++ function that returns something Rust can use
    - on C side, it'd use the plain C++ types
    - but on Rust side, it'd invoke and do the right things
    - one of the pieces needed for C++ interop
    - instantiate the vec/string/other impls
    - should this be part of bindgen?
    - missing part: manually instantiating separate things for each specialization
    - major topics of FFI
    - being able to "use header files" and get a "reasonably safe" FFI in Rust
    - what are building blocks we'd need to move things to user space?
    - template instantiation list is one building block -- somebody has to write the tool, nothing needed from rustc
    - expectation is that there is always some work to manually bind
    - but what is minimal work we can do to make it easy to translate
    - annotations might be company specific -- fb vs google?
    - maybe? but can we collaborate?
    - different C++ dialects and patterns in use
    - what about from other languages, esp. around C++?
    - closest inspiration might come from Swift
    - rich bindings from Rust to C++ for hashmaps etc
    - because FB uses thrift for RPC mechanism (and sometimes FFI)
    - would be useful to be able to do tricks like that for hashmap and sets perhaps
    - some kind of tool for consuming a C++ header file to automatically produce an interface in Rust
    - complication in some environments: multiple allocators
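
The boundary handling discussed above (errors crossing as status codes and becoming a `Result`, panics caught before they unwind into foreign frames) can be sketched in Rust alone. Everything here is illustrative: the status codes and the function names `checked_divide` and `status_to_result` are hypothetical, and the C++ side with its `try`/`catch` wrapper is not shown.

```rust
use std::panic;

// Hypothetical status codes shared with the C side.
const STATUS_OK: i32 = 0;
const STATUS_ERROR: i32 = 1;
const STATUS_PANIC: i32 = 2;

/// Ordinary Rust logic; integer division by zero panics.
fn divide(a: i32, b: i32) -> i32 {
    a / b
}

/// Entry point meant to be called from C. A panic is caught here and turned
/// into a status code, so it never unwinds into the caller's frames.
/// (A real export would also need a stable, unmangled symbol name.)
pub extern "C" fn checked_divide(a: i32, b: i32, out: *mut i32) -> i32 {
    if out.is_null() {
        return STATUS_ERROR;
    }
    match panic::catch_unwind(move || divide(a, b)) {
        Ok(value) => {
            // SAFETY: `out` is non-null; the caller promises it is valid for writes.
            unsafe { *out = value };
            STATUS_OK
        }
        Err(_) => STATUS_PANIC,
    }
}

/// Shim for the opposite direction: turn a C-style status code into a `Result`.
fn status_to_result(status: i32, value: i32) -> Result<i32, String> {
    match status {
        STATUS_OK => Ok(value),
        STATUS_PANIC => Err("callee panicked".to_string()),
        other => Err(format!("callee failed with status {}", other)),
    }
}

fn main() {
    let mut out = 0;
    let status = checked_divide(7, 0, &mut out);
    println!("{:?}", status_to_result(status, out));
}
```

`catch_unwind` only works when building with `panic=unwind`; under `panic=abort` the process exits instead, which matches the "happy to make it abort" position above.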

    ## use of unsafe

    - ms: would like to know how to control use of unsafe in codebase
    - google: grep
    - servo used the compiler directives to disallow unsafe where possible
    - in some cases, allow unsafe within a specific file
    - integrate with review tool to draw attention
    - unsafe is really many things: sometimes simple, sometimes not
    - C++ code: all unsafe? not reviewed under the same standard?
    - more interesting question is unsafe in dependencies
    - auditing in crate graph in general is a problem
    - geometric growth of deps
    - how do you audit safe code?
    - would be great if there were some central place doing auditing (and getting paid to do it)
    - but we'd also need some mechanism to declare what's been audited etc
    - blessed crates and versions
    - let crates.io metadata include auditing
    - presumably want to know also things like 2fa, review policy, etc
    - attacks these days are very targeted in other ecosystems -- e.g., replacing specific versions of crates to attack specific targets
    - number of deps are in the hundreds, ranging from a few hundred to ~800 depending on project
    - in some cases, can pull in a frozen diff and not update
    - but not all
    - auditing of the compiler itself?
    - would prefer to have two implementations maybe
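
For the "compiler directives" approach to controlling unsafe: rustc's `unsafe_code` lint can be denied broadly and re-allowed in specific places. A small sketch, using module-level attributes so it fits in one file; the module and function names are hypothetical, and a real codebase would more likely put `#![deny(unsafe_code)]` at the crate root.

```rust
// `unsafe` is rejected in this module: adding an `unsafe` block here
// turns into a compile error.
#[deny(unsafe_code)]
mod parser {
    /// Safe code only.
    pub fn first_byte(data: &[u8]) -> Option<u8> {
        data.first().copied()
    }
}

// The one module where `unsafe` is permitted, which makes it easy to
// find with grep and to flag in a review tool.
#[allow(unsafe_code)]
mod raw {
    pub fn first_byte_unchecked(data: &[u8]) -> u8 {
        assert!(!data.is_empty());
        // SAFETY: the assertion above guarantees index 0 is in bounds.
        unsafe { *data.get_unchecked(0) }
    }
}

fn main() {
    println!("{:?}", parser::first_byte(b"hi"));
    println!("{}", raw::first_byte_unchecked(b"hi"));
}
```

Using `forbid` instead of `deny` prevents any nested `allow` from re-enabling unsafe. Neither level says anything about unsafe in dependencies, which is the harder problem raised above.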

    ## "governance"

    - MS: do we know what's going into the compiler?
    - do we know what changes are going in?
    - FB: not been a big concern of ours
    - in some cases, had issues where things got stabilized or bug fixes that broke code
    - would like to be canarying the nightly compiler regularly
    - but having more impl's would increase confidence
    - ways to support?
    - contracting
    - full time hires
    - how can we give $$ to rust org?
    - need a foundation
    - money/resources for Rust CI
    - participating in crater?
    - working on a way to run crater and send back pass/fail
    - ecosystem support
    - filling gaps in ecosystem
    - supporting key crates
    - helping to file GSoC proposals?

    ## will we do this again? how to continue these conversations?

    - don't need super frequent updates
    - most helpful thing is to identify topics and spin off topics
    - try to provide feedback for roadmap
    - organize a regular meeting on zulip to talk about issues
    - quarterly maybe
    - we might want to consider f2f meetings in other conferences or at least in europe
    - maybe rustfest
    - key point:
    - don't want to alienate and separate enterprise from the Rust community at large
    - focusing on working groups and zulip for communication is a win