stevedonovan · July 1, 2018 12:38
diff --git a/common-rust-traits.md b/common-rust-traits.md
@@ -0,0 +1,445 @@
+## What is a Trait?
+
+In Rust, types containing data - structs, enums and any other 'aggregate'
+types like tuples and arrays - are dumb. They may have methods but that
+is just a convenience; they are just functions. Types have no
+relationship with each other.
+
+_Traits_ are the abstract mechanism for adding functionality to types
+and establishing relationships between them.
+
+## Printing Out: Display
+
+For a value to be printed out using "{}", it must _implement_ the [Display](?) trait.
+
+If we're only interested in how a value displays itself, then there are
+two ways to define functions taking such values. In this example, we
+want to print out slices of references to displayable values.
+
+The first is _generic_ where the element
+type of the slice is _any_ type that implements `Display`:
+
+```rust
+use std::fmt::Display;
+
+fn display_items_generic<T: Display>(items: &[&T]) {
+    for item in items.iter() {
+        println!("{}", item);
+    }
+}
+
+display_items_generic(&[&10, &20]);
+```
+Here the trait `Display` is acting as a constraint on a generic type.
+Separate code is generated for each distinct type `T`.  There is no
+direct analog in mainstream languages here - the closest would be C++
+[concepts](https://en.wikipedia.org/wiki/Concepts_(C%2B%2B)), which solve
+the "compile-time duck-typing" problem with C++ templates.
+
+The second is _polymorphic_, where the element type of the slice is a
+reference to `Display`.
+
+```rust
+fn display_items_polymorphic(items: &[&dyn Display]) {
+    for item in items.iter() {
+        println!("{}", item);
+    }
+}
+
+display_items_polymorphic(&[&10, &"hello"]);
+```
+
+Code is only generated once for `display_items_polymorphic`, but we invoke
+different code for each type dynamically.  Note that the slice can now contain
+references to _any_ value that implements `Display`.  Here `Display` is
+acting very much like what is called an _interface_ in Java.
+
+The conversion involved is interesting: a reference to a concrete type
+becomes a _trait object_.  It's non-trivial because the trait object
+has two parts - the original reference and a pointer to a 'virtual method
+table' containing the methods of the trait (a so-called "fat pointer").
+
+```rust
+let d: &dyn Display = &10;
+```
+
+(A little _too_ much magic is happening here, which is why Rust moved to a
+more explicit notation for trait objects, `&dyn Display` etc. The bare
+`&Display` form is deprecated, and is an error from the 2021 edition.)
+
+How to decide between generic and polymorphic?  The second is more flexible,
+but involves going through _virtual methods_ which is slightly slower.
+Generic functions/structs can implement 'zero overhead abstractions'
+since the compiler can inline such functions.  The only honest answer is
+"it depends". Bear in mind that the actual cost of using trait objects
+might be negligible compared to the other work done by a program.  (It's hard
+to make engineering decisions based on micro-benchmarks.)
+
+Defining `Display` for your own types is straightforward but needs to be
+explicit, since the compiler cannot reasonably guess what the
+output format must be (unlike with [Debug](?), which can be derived):
+
+```rust
+use std::fmt;
+
+struct MyType {
+    x: u32,
+    y: u32
+}
+
+impl fmt::Display for MyType {
+    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
+        write!(f, "x={},y={}", self.x, self.y)
+    }
+}
+```
+
+Any type that implements `Display` _automatically_ implements [ToString](?), so
+`42.to_string()`, `"hello".to_string()` all work as expected.
+
+(Rust traits often operate in little groups like this.)
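+
+To make the `Display`/`ToString` pairing concrete, here is a minimal sketch.
+The `Celsius` type and its output format are assumptions made up for this
+example; the only point is that implementing `fmt` is enough to get
+`to_string()` as well.
+
+```rust
+use std::fmt;
+
+// `Celsius` is a hypothetical type used only for illustration.
+struct Celsius(f64);
+
+impl fmt::Display for Celsius {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        // One decimal place, followed by the unit.
+        write!(f, "{:.1} C", self.0)
+    }
+}
+
+fn main() {
+    let t = Celsius(21.5);
+    // `to_string` comes from the blanket `impl<T: Display> ToString for T`.
+    let s: String = t.to_string();
+    assert_eq!(s, "21.5 C");
+    // `println!` and `format!` go through the same `fmt` method.
+    println!("{}", t);
+}
+```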
+
+## Conversion: From and Into
+
+An important pair of traits is `From`/`Into`. The [From](?) trait expresses the conversion
+of one value into another using the `from` method. So we have `String::from("hello")`.
+If `From` is implemented, then the [Into](?) trait is auto-implemented.
+
+Since `String` implements `From<&str>`, `&str` automatically implements `Into<String>`.
+
+```rust
+let s = String::from("hello");  // From
+let s: String = "hello".into(); // Into
+```
+The [json](?) crate provides a nice example. A JSON object is indexed with strings,
+and new fields can be created by inserting `JsonValue` values:
+
+```rust
+obj["surname"] = JsonValue::from("Smith"); // From
+obj["name"] = "Joe".into(); // Into
+obj["age"] = 35.into(); // Into
+```
+Note how convenient it is to use `into` here, instead of `from`! We are doing
+a conversion which Rust will not do implicitly, but `into()` is a small word,
+easy to type and read.
+
+`From` expresses a conversion that _always_ succeeds. It may be relatively expensive, though:
+converting a string slice to a `String` will allocate a buffer and copy the bytes. The
+conversion always takes place by value.
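+
+The same pattern works for your own types. In this sketch `Metres`,
+`Millimetres` and `describe` are hypothetical names; only `From` is
+implemented by hand, and `into()` then comes for free.
+
+```rust
+// `Metres` and `Millimetres` are hypothetical types for illustration.
+#[derive(Debug, PartialEq)]
+struct Metres(u64);
+
+#[derive(Debug, PartialEq)]
+struct Millimetres(u64);
+
+impl From<Metres> for Millimetres {
+    fn from(m: Metres) -> Self {
+        // The argument is taken by value and consumed.
+        Millimetres(m.0 * 1000)
+    }
+}
+
+// Accepts anything that can be converted into `Millimetres`.
+fn describe(len: impl Into<Millimetres>) -> String {
+    let mm: Millimetres = len.into();
+    format!("{} mm", mm.0)
+}
+
+fn main() {
+    let a = Millimetres::from(Metres(2)); // From
+    let b: Millimetres = Metres(2).into(); // Into, auto-implemented
+    assert_eq!(a, b);
+    assert_eq!(describe(Metres(3)), "3000 mm");
+    // Every type converts into itself, so this works too.
+    assert_eq!(describe(Millimetres(5)), "5 mm");
+}
+```
+
+Taking `impl Into<Millimetres>` as the argument type is a matter of taste;
+it moves the `into()` call from the caller into the function.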
+
+`From/Into` has an intimate relationship with Rust error handling.
+
+This statement:
+
+```rust
+let res = returns_some_result()?;
+```
+is basically sugar for this:
+
+```rust
+let res = match returns_some_result() {
+    Ok(r) => r,
+    Err(e) => return Err(e.into())
+};
+```
+That is, any error type which can convert _into_ the returned error type works.
+
+A useful strategy for informal error handling is to make the function return
+`Result<T,Box<dyn Error>>`.  Any type that implements `Error` can be converted
+into the trait object `Box<dyn Error>`.
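+
+As a rough sketch of this strategy (with the boxed error spelled
+`Box<dyn Error>`, the current notation), the hypothetical function
+`parse_bytes` below uses `?` on two unrelated standard error types. Both are
+converted by the `into()` call hidden inside `?`.
+
+```rust
+use std::error::Error;
+use std::str;
+
+// Hypothetical helper: decodes bytes as UTF-8, then parses them as an integer.
+fn parse_bytes(bytes: &[u8]) -> Result<i32, Box<dyn Error>> {
+    let text = str::from_utf8(bytes)?; // Utf8Error -> Box<dyn Error>
+    let n = text.trim().parse::<i32>()?; // ParseIntError -> Box<dyn Error>
+    Ok(n)
+}
+
+fn main() {
+    assert_eq!(parse_bytes(b" 42 ").unwrap(), 42);
+    assert!(parse_bytes(b"forty-two").is_err());
+    assert!(parse_bytes(&[0xff, 0xfe]).is_err());
+    if let Err(e) = parse_bytes(b"x") {
+        // The boxed error still knows how to display itself.
+        println!("error: {}", e);
+    }
+}
+```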
+
+## Making Copies: Clone and Copy
+
+`From` (and its mirror image `Into`) describe how distinct types are converted into
+each other. `Clone` describes how a new value of the same type can be created.
+Rust likes to make any potentially expensive operation obvious, so cloning is
+always spelled out: `val.clone()`.
+
+This can simply involve moving some bits around ("bitwise copy").
+A number is just a bit pattern in memory.
+
+But `String` is different, since as well as size and capacity fields,
+it has dynamically-allocated string data. To clone a string involves
+allocating that buffer and copying the original bytes into it.
+
+Making your types cloneable is easy, as long as every type in a struct or enum
+implements `Clone`:
+
+```rust
+#[derive(Debug,Clone)]
+struct Person {
+    first_name: String,
+    last_name: String,
+}
+```
+
+`Copy` is a _marker trait_ (there are no methods to implement) which says that
+a type may be copied by just moving bits. You can define it for your own
+structs:
+
+```rust
+#[derive(Debug,Clone,Copy)]
+struct Point {
+    x: f32,
+    y: f32,
+    z: f32
+}
+```
+Again, this is only possible if all field types implement `Copy`. You cannot sneak in a
+non-`Copy` type like `String` here!
+
+This trait interacts with a key Rust feature: moving. Moving a value is always
+done by simply moving bits around.  If the value is `Copy`, then the original
+location remains valid.
+
+```rust
+let n1 = 42;
+let n2 = n1;
+// n1 is still fine (i32 is Copy)
+let s1 = "hello".to_string();
+let s2 = s1;
+// value moved into s2, s1 can no longer be used!
+```
+Bad things would happen if `s1` was still valid - both `s1` and `s2` would
+be dropped at the end of scope and their shared buffer would be deallocated twice!
+C++ handles this situation by always copying; in Rust you
+must say `s1.clone()`.
+
+## Fallible Conversions - FromStr
+
+If I have the integer `42`, then it is quite safe to convert this to an owned string,
+which is expressed by `ToString`.  However, if I have the string "42" then in general
+the conversion into `i32` must be prepared to fail.
+
+Implementing [FromStr](?) takes two things: an implementation of the `from_str` method,
+and setting the associated type `Err` to the error type returned when the conversion fails.
+
+Usually it's used implicitly through the string `parse` method. This is a method with
+a generic output type, which needs to be tied down.
+
+E.g. using the so-called turbofish operator:
+
+```rust
+let answer = match "42".parse::<i32>() {
+    Ok(n) => n,
+    Err(_) => panic!("'42' was not 42!"),
+};
+```
+
+Or (more elegantly) in a function where we can use `?`:
+
+```rust
+let answer: i32 = "42".parse()?;
+```
+
+The Rust standard library defines `FromStr` for the numerical types and for network addresses.
+It is of course possible for external crates to define `FromStr` for their types and then
+they will work with `parse` as well.  This is a cool thing about the standard traits - they
+are all open for further extension.
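+
+A minimal sketch of such an extension follows. The `Point` type, its
+`"x,y"` text format and the choice of `String` as the error type are all
+assumptions made for this example, not anything prescribed by the standard
+library.
+
+```rust
+use std::str::FromStr;
+
+// `Point` is a hypothetical type; the "x,y" format is an assumption.
+#[derive(Debug, PartialEq)]
+struct Point {
+    x: i32,
+    y: i32,
+}
+
+impl FromStr for Point {
+    // A plain String keeps the sketch short; real code may prefer an enum.
+    type Err = String;
+
+    fn from_str(s: &str) -> Result<Self, Self::Err> {
+        let mut parts = s.split(',');
+        // Exactly two comma-separated fields are accepted.
+        let (x, y) = match (parts.next(), parts.next(), parts.next()) {
+            (Some(x), Some(y), None) => (x, y),
+            _ => return Err(format!("expected 'x,y', got '{}'", s)),
+        };
+        let x = x.trim().parse::<i32>().map_err(|e| e.to_string())?;
+        let y = y.trim().parse::<i32>().map_err(|e| e.to_string())?;
+        Ok(Point { x, y })
+    }
+}
+
+fn main() {
+    // `parse` now works for `Point` just as it does for `i32`.
+    let p: Point = "3, -4".parse().unwrap();
+    assert_eq!(p, Point { x: 3, y: -4 });
+    assert!("3".parse::<Point>().is_err());
+    assert!("3,four".parse::<Point>().is_err());
+}
+```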
+
+## Reference Conversions - AsRef
+
+[AsRef](?) expresses the situation where a cheap _reference_ conversion is possible
+between two types.
+
+The most common place you will see it in action is with `&Path`. In an ideal world,
+all file systems would enforce UTF-8 names and we could just use `String` to
+store them. However, we have not yet arrived at Utopia and Rust has a dedicated
+type `PathBuf` with specialized path handling methods, backed by `OsString`,
+which represents untrusted text from the OS. `&Path` is the borrowed counterpart
+to `PathBuf`. It is cheap to get a `&Path` reference from regular Rust strings
+so `AsRef` is appropriate:
+
+```rust
+// asref.rs
+use std::path::{Path, PathBuf};
+
+fn exists(p: impl AsRef<Path>) -> bool {
+    p.as_ref().exists()
+}
+
+assert!(exists("asref.rs"));
+assert!(exists(Path::new("asref.rs")));
+let ps = String::from("asref.rs");
+assert!(exists(&ps));
+assert!(exists(PathBuf::from("asref.rs")));
+```
+
+This allows any function or method working with file system paths to be conveniently
+called with any type that implements `AsRef<Path>`.  From the documentation:
+
+```rust
+impl AsRef<Path> for Path
+impl AsRef<Path> for OsStr
+impl AsRef<Path> for OsString
+impl AsRef<Path> for str
+impl AsRef<Path> for String
+impl AsRef<Path> for PathBuf
+```
+
+Follow this pattern when defining a public API, because people are accustomed to
+this little convenience.
+
+`AsRef<str>` is implemented for `String`, so we can also say:
+
+```rust
+fn is_hello(s: impl AsRef<str>) {
+    assert_eq!("hello", s.as_ref());
+}
+
+is_hello("hello");
+is_hello(String::from("hello"));
+```
+This seems attractive, but using this is very much a matter of taste. Idiomatic Rust code
+prefers to declare string arguments as `&str` and lean on _deref coercion_
+for convenient passing of `&String` references.
+
+## Deref
+
+Many string methods in Rust are not actually defined on `String`. The methods
+explicitly defined typically _mutate_ the string, like `push` and `push_str`.
+But something like `starts_with` applies to string slices as well.
+
+At one point in Rust's history, this had to be done explicitly, so if you
+had a `String` called `s`, you would have to say `s.as_str().starts_with("hello")`.
+You will occasionally see `as_str()`, but mostly method resolution happens
+through the magic of _deref coercion_.
+
+The [Deref](?) trait is actually used to implement the "dereference" operator `*`.
+This has the same meaning as in C - extract the value which the reference is
+pointing to - although it doesn't appear explicitly as often. If `r` is a reference,
+then you say `r.foo()`, but if you did want the value, you have to say `*r`.
+(In this respect Rust references are more like C pointers than C++ references,
+which try to be indistinguishable from C++ values.)
+
+`String` implements `Deref` with `str` as its target; if `s` is a `String`, the type of `&*s` is `&str`.
+
+Deref coercion means that `&String` will implicitly convert into `&str`:
+
+```rust
+let s: String = "hello".into();
+let rs: &str = &s;
+```
+"Coercion" is a strong word, but this is one of the few places in Rust
+where type conversion happens silently. `&String` is a very
+different type to `&str`! I still remember my
+confusion when the compiler insisted that these types were distinct,
+especially with operators where the convenience of deref coercion
+does not happen.  The match operator matches types explicitly
+and this is where `s.as_str()` is still necessary - `&s` would not work:
+
+```rust
+let s = "hello".to_string();
+// ...
+match s.as_str() {
+    "hello" => {},
+    "dolly" => {},
+    // ....
+    _ => {},
+}
+```
+
+It's idiomatic to use string slices in function arguments, knowing that
+`&String` will convert to `&str`.
+
+Deref coercion is also used to resolve methods - if the method isn't defined
+on `String`, then we try `&str`.
+
+A similar relationship holds between `Vec<T>` and `&[T]`. Likewise, it's
+not idiomatic to have `&Vec<T>` as a function argument type, since `&[T]`
+is more flexible and `&Vec<T>` will convert to `&[T]`.
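+
+A small sketch of this convention, using two hypothetical functions `shout`
+and `total` that take borrowed slices and are called with owned values:
+
+```rust
+// Hypothetical helpers taking slices rather than `&String` / `&Vec<T>`.
+fn shout(s: &str) -> String {
+    s.to_uppercase()
+}
+
+fn total(xs: &[i32]) -> i32 {
+    xs.iter().sum()
+}
+
+fn main() {
+    let owned = String::from("hello");
+    let v = vec![1, 2, 3];
+
+    // Deref coercion: &String -> &str and &Vec<i32> -> &[i32].
+    assert_eq!(shout(&owned), "HELLO");
+    assert_eq!(total(&v), 6);
+
+    // The same functions accept string literals and arrays directly.
+    assert_eq!(shout("dolly"), "DOLLY");
+    assert_eq!(total(&[4, 5]), 9);
+}
+```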
+
+## Ownership: Borrow
+
+Ownership is an important concept in Rust; we have types like `String` that
+"own" their data, and types like `&str` that can "borrow" data from
+an owned type.
+
+The [Borrow](?) trait solves a sticky problem with associative maps and sets.
+Typically we would keep owned strings in a `HashSet` to avoid borrowing blues.
+But we really don't want to _create_ a `String` to query set membership!
+
+```rust
+use std::collections::HashSet;
+
+let mut set = HashSet::new();
+set.insert("one".to_string());
+// set is now HashSet<String>
+if set.contains("two") {
+    println!("got two!");
+}
+```
+The borrowed type `&str` can be used instead of `&String` here, because
+`String` implements `Borrow<str>`!
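+
+The same applies to maps. In this sketch the helper `age_of` is a
+hypothetical name; it relies on `HashMap::get` accepting any key type that
+the stored key can be borrowed as.
+
+```rust
+use std::collections::HashMap;
+
+// Hypothetical helper: looks up an age by name without allocating a `String`.
+fn age_of(ages: &HashMap<String, u32>, name: &str) -> Option<u32> {
+    // `get` accepts `&Q` for any `Q` where `String: Borrow<Q>`;
+    // `String: Borrow<str>`, so a `&str` is accepted as the key.
+    ages.get(name).copied()
+}
+
+fn main() {
+    let mut ages = HashMap::new();
+    ages.insert("Joe".to_string(), 35);
+
+    assert_eq!(age_of(&ages, "Joe"), Some(35));
+    assert_eq!(age_of(&ages, "Ann"), None);
+
+    // An owned key still works, since every `T` implements `Borrow<T>`.
+    let key = String::from("Joe");
+    assert!(ages.contains_key(&key));
+}
+```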
+
+## Iteration: Iterator and IntoIterator
+
+The [Iterator](?) trait is interesting. You are only required to implement
+one method - `next()` - and all that method must do is return an
+`Option` value each time it's called. When that value is `None` we
+are finished.
+
+However, there are a lot of _provided_ methods which have default
+implementations in `Iterator`. You get `map`, `filter`, etc. for free.
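+
+As an illustration, here is a sketch of a hand-written iterator. `Countdown`
+is a hypothetical type; only `next()` is implemented, and the provided
+methods are then available on it.
+
+```rust
+// `Countdown` is a hypothetical iterator used only for illustration.
+struct Countdown {
+    n: u32,
+}
+
+impl Iterator for Countdown {
+    type Item = u32;
+
+    // Yields n, n-1, ..., 1 and then `None`.
+    fn next(&mut self) -> Option<u32> {
+        if self.n == 0 {
+            None
+        } else {
+            self.n -= 1;
+            Some(self.n + 1)
+        }
+    }
+}
+
+fn main() {
+    // `filter`, `map` and `collect` are provided methods.
+    let evens_squared: Vec<u32> = Countdown { n: 5 }
+        .filter(|n| n % 2 == 0)
+        .map(|n| n * n)
+        .collect();
+    assert_eq!(evens_squared, vec![16, 4]);
+
+    let total: u32 = Countdown { n: 3 }.sum();
+    assert_eq!(total, 6);
+}
+```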
+
+This is the verbose way to use an iterator:
+
+```rust
+let mut iter = [10, 20, 30].iter();
+while let Some(n) = iter.next() {
+    println!("got {}", n);
+}
+```
+The `for` statement provides a shortcut:
+
+```rust
+for n in [10, 20, 30].iter() {
+    println!("got {}", n);
+}
+```
+The expression here actually is _anything that can convert into an iterator_,
+which is expressed by `IntoIterator`.  So `for n in &[10, 20, 30] {...}` works
+as well - a slice is definitely not an iterator, but it implements
+`IntoIterator`.  Similarly, `for i in 0..10 {...}` involves a range expression
+implicitly converting into an iterator.  Iterators implement `IntoIterator`
+(trivially).
+
+So the `for` statement in Rust is specifically tied to a single trait.
+
+Iterators in Rust are a zero-overhead abstraction, which means that _usually_
+you do not pay a run-time penalty for using them. In fact, if you wrote out
+a loop over slice elements explicitly using indexing, it could be slower because
+of run-time index range checks.
+
+The most general way to pass a sequence of values to a function is
+to use `IntoIterator`. Just using `&[T]` is too limited and requires the caller
+to build up a buffer (which could be both awkward and expensive), while
+`Iterator<Item=T>` itself requires the caller to call `iter()` etc.
+
+```rust
+fn sum (ii: impl IntoIterator<Item=i32>) -> i32 {
+    ii.into_iter().sum()
+}
+
+println!("{}", sum(0..9));
+println!("{}", sum(vec![1,2,3]));
+// cloned() here makes an iterator over i32 from an iterator over &i32
+println!("{}", sum([1,2,3].iter().cloned()));
+```
+
+## Conclusion: Why are there So Many Ways to Create a String?
+
+```rust
+let s = "hello".to_string();  // ToString
+let s = String::from("hello"); // From
+let s: String = "hello".into(); // Into
+let s = "hello".to_owned();  // ToOwned
+```
+This is a common complaint at first - people like to have one idiomatic way of
+doing common operations.  And curiously enough - none of these are actual
+`String` methods!
+
+But all these traits are needed, since they make truly generic programming possible;
+when you create strings in code, just pick one way and use it consistently.
+
+A consequence of Rust's dependence on traits is that it can take a while
+to [learn to read the documentation](https://stevedonovan.github.io/rust-gentle-intro/5-stdlib-containers.html).
+Knowing what methods can be called on a type depends on what traits are implemented for that type.
+
+However, Rust traits are not sneaky. They have to be brought into scope before their
+methods can be called. For instance, you need `use std::error::Error` before you can
+call `description()` on a type implementing `Error`.  A _lot_ of traits are brought
+in by default by the Rust prelude, however.
+
+
+