@jamescrowley
Forked from wincent/gist:598fa75e22bdfa44cf47
Last active August 29, 2015 14:25
2015-01-29 Unofficial Relay FAQ

Disclaimer: I work on Relay at Facebook. Relay is a complex system on which we're iterating aggressively. I'll do my best here to provide accurate, useful answers, but the details are subject to change. I may also be wrong. Feedback and additional questions are welcome.

What is Relay?

Relay is a new framework from Facebook that provides data-fetching functionality for React applications.

Each component specifies its own data dependencies declaratively using a query language called GraphQL. The data are made available to the component via properties on `this.props`.

Developers compose these React components naturally, and Relay takes care of composing the data queries into efficient batches, providing each component with exactly the data that it requested (and no more), updating those components when the data changes, and maintaining a client-side store (cache) of all data.

What is GraphQL?

GraphQL is a data querying language designed to describe the complex, nested data dependencies of modern applications. It's been in production use in Facebook's native apps for several years.

GraphQL itself is an engine for mapping queries to the code that actually fetches the data, so it is agnostic about the underlying storage. Relay uses GraphQL as its query language, but it is not tied to a specific implementation of GraphQL.

What's the value proposition?

By co-locating the queries with the view code, the developer can reason about what a component is doing by looking at it in isolation; it's not necessary to consider the context where the component was rendered in order to understand it. Components can be moved anywhere in a render hierarchy without having to apply a cascade of modifications to parent components or to the server code which prepares the data payload.

Co-location leads developers to fall into the "pit of success", because they get exactly the data they asked for and the data they asked for is explicitly defined right next to where it is used. This means that performance becomes the default (it becomes much harder to accidentally over-fetch), and components are more robust (under-fetching is also less likely for the same reason, so components won't try to render missing data and blow up at runtime).

Relay provides a predictable environment for developers by maintaining an invariant: a component won't be rendered until all the data it requested is available. Additionally, queries are defined statically (i.e. we can extract queries from a component tree before rendering) and the GraphQL schema provides an authoritative description of which queries are valid, so we can validate queries early and fail fast when the developer makes a mistake.
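
To make the "validate early" idea concrete, here is a minimal sketch of checking a statically extracted query against a schema before anything renders. The `Schema` and `Selection` shapes and the `validate` function are assumptions made up for this illustration; they are not Relay's or GraphQL's actual data structures.

```typescript
// Hypothetical schema and query shapes, for illustration only.
type TypeFields = Record<string, string | undefined>; // field name -> field type
type Schema = Record<string, TypeFields | undefined>; // type name -> its fields

interface Selection {
  name: string;
  fields?: Selection[]; // nested selection, if any
}

const schema: Schema = {
  User: { name: "String", bestFriend: "User" },
};

// Collect an error for every selected field that the schema does not define.
function validate(typeName: string, selection: Selection[], path: string = typeName): string[] {
  const errors: string[] = [];
  const typeFields: TypeFields = schema[typeName] ?? {};
  for (const node of selection) {
    const fieldType = typeFields[node.name];
    if (fieldType === undefined) {
      errors.push(`Unknown field ${path}.${node.name}`);
    } else if (node.fields) {
      errors.push(...validate(fieldType, node.fields, `${path}.${node.name}`));
    }
  }
  return errors;
}

const errors = validate("User", [
  { name: "name" },
  { name: "bestFriend", fields: [{ name: "nmae" }] }, // typo caught without rendering anything
]);
console.log(errors);
```

Because the query is plain data, a check like this can run at build time rather than when a user hits the broken view.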

The other thing Relay does to prevent errors is "masking" the data it passes into each component. Only the fields of an object that a component explicitly asks for are accessible to that component, even if other fields are known and cached in the store (because another component requested them). This makes it impossible for implicit data dependency bugs to exist latently in the system.
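
As a rough illustration of masking, the sketch below filters a cached record down to the fields one component requested. The `StoreRecord` type and `maskRecord` function are hypothetical names for this example, not part of Relay.

```typescript
// Hypothetical types and names, for illustration only; this is not Relay's API.
type StoreRecord = Record<string, unknown>;

// Return a copy of `record` that exposes only the fields the component requested.
function maskRecord(record: StoreRecord, requestedFields: string[]): StoreRecord {
  const masked: StoreRecord = {};
  for (const field of requestedFields) {
    if (field in record) {
      masked[field] = record[field];
    }
  }
  return masked;
}

// The store holds every field that any component has fetched for this user.
const cachedUser: StoreRecord = { id: "4", name: "Alice", email: "alice@example.com" };

// A component that asked only for `name` never sees `email`.
const propsForNameTag = maskRecord(cachedUser, ["name"]);
console.log(propsForNameTag); // { name: 'Alice' }
```

If the component later starts reading `email` without requesting it, the value is simply absent, so the missing dependency shows up immediately instead of working by accident.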

Note that co-location in itself isn't the end goal here. At the moment our queries reside, explicitly, in the components, but through the power of static analysis (specifically, Flow and the type information encoded in the GraphQL schema), you can imagine a state where the queries can be inferred via an analysis of which subcomponents a component renders (and therefore, which subqueries need to be composed into the component query) and which properties it itself accesses. If we can get there, then over-fetching and under-fetching will go from "unlikely" to outright impossible.

By handling all data-fetching via a single abstraction, we're able to handle a bunch of things that would otherwise have to be dealt with repeatedly and pervasively across the application:

  • Performance: All queries flow through the framework code, where what would otherwise be inefficient "N+1" query patterns are automatically collapsed and batched into efficient, minimal queries. Likewise, the framework knows which data have already been fetched and which requests are currently "in flight", so queries can be automatically de-duplicated and reduced to the minimum needed.
  • Subscriptions: All data flows into a single store, and all reads from the store are via the framework, so the framework knows which components care about which data and should be re-rendered when data changes; components never have to set up individual subscriptions.
  • Common patterns: We can make common patterns such as pagination easy (this is the example that Jing gave at the conference); if you have 10 records initially, getting the next page just means declaring you want 15 records in total, and the framework automatically constructs the minimal query to grab the delta between what you have and what you need, requests it, and re-renders your view when the data becomes available.
  • Simplified server implementation: Rather than having a proliferation of end-points (per action, per route), a single GraphQL endpoint can serve as a facade for any number of underlying resources.
  • Uniform mutations: There is one consistent pattern for performing mutations (writes), and it is conceptually baked into the data querying model itself. You can think of a mutation as a query with side-effects: you provide some parameters that describe the change to be made (eg. attaching a comment to a record) and a query that specifies the data you'll need to update your view of the world after the mutation completes (eg. the comment count on the record), and the data flows through the system using the normal flow. We can do an immediate "optimistic" update on the client (ie. update the view under the assumption that the write will succeed), and finally commit it or roll it back in the event of an error when the server payload comes back.
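
The optimistic update described in the last bullet can be sketched as follows. `SketchStore` and its methods are assumptions invented for this example, and the snapshot-based rollback is a deliberate simplification: it would not handle several overlapping optimistic updates to the same record.

```typescript
// Hypothetical store, for illustration only; names are assumptions, not Relay's API.
type Fields = Record<string, number | string>;

class SketchStore {
  private records = new Map<string, Fields>();

  // Merge new field values into a record.
  set(id: string, fields: Fields): void {
    this.records.set(id, { ...this.records.get(id), ...fields });
  }

  get(id: string): Fields | undefined {
    return this.records.get(id);
  }

  // Apply an optimistic change and return a function that undoes it.
  applyOptimistic(id: string, fields: Fields): () => void {
    const previous = this.records.get(id);
    this.set(id, fields);
    return () => {
      if (previous === undefined) {
        this.records.delete(id);
      } else {
        this.records.set(id, previous);
      }
    };
  }
}

const store = new SketchStore();
store.set("story:1", { commentCount: 2 });

// Update the view immediately, assuming the write will succeed.
const rollback = store.applyOptimistic("story:1", { commentCount: 3 });

// If the server reports an error, restore the previous state.
rollback();
// On success, the server payload would instead be written with store.set(...).
```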

How does it relate to Flux?

In some ways Relay is inspired by Flux, but the mental model is much simpler. Instead of multiple stores, there is one central store that caches all GraphQL data. Instead of explicit subscriptions, the framework itself can track which data each component requests, and which components should be updated whenever the data change. Instead of actions, modifications take the form of "mutations".
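
One way to picture "no explicit subscriptions" is a store that records who reads what, and notifies only those readers on a write. The `TrackingStore` class and the key names below are hypothetical; this is a sketch of the idea, not how Relay is implemented.

```typescript
// Hypothetical names throughout; a sketch of the idea, not Relay's implementation.
type Listener = () => void;

class TrackingStore {
  private data = new Map<string, unknown>();
  private readers = new Map<string, Set<Listener>>();

  // Reading a key on behalf of a component records that component as a reader.
  read(key: string, onChange: Listener): unknown {
    let listeners = this.readers.get(key);
    if (!listeners) {
      listeners = new Set<Listener>();
      this.readers.set(key, listeners);
    }
    listeners.add(onChange);
    return this.data.get(key);
  }

  // Writing a key notifies only the components that previously read it.
  write(key: string, value: unknown): void {
    this.data.set(key, value);
    const listeners = this.readers.get(key);
    if (listeners) {
      listeners.forEach((notify) => notify());
    }
  }
}

const trackingStore = new TrackingStore();
const rerendered: string[] = [];

trackingStore.read("user:4.name", () => { rerendered.push("NameTag"); });
trackingStore.read("story:1.commentCount", () => { rerendered.push("CommentCounter"); });

trackingStore.write("story:1.commentCount", 3);
// Only CommentCounter is notified; NameTag did not read this key.
console.log(rerendered);
```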

(Note: I am somewhat of a Flux newbie, so my use of the terminology might not be totally correct here.)

What about routing?

Relay does have a notion of routes and routing, but it's one of the APIs that we're currently improving so I'll keep away from details (which may change) and try to speak in generalities.

Relay uses routes to determine which data to fetch to render a given component (a component may be composed into any number of different views).

The data required to render a particular view is a function of the route and any query params that may be supplied in the route or by the component itself.

You can think of the route as a URI, which itself may contain "query params" (not necessarily part of a URI query string; the params may be embedded as path components within the URI).

When can we have it?

We're working very hard right now to get this ready for public consumption and we are super excited about sharing it with you, but we can't say yet when that will be. We'll keep you posted!

How does GraphQL work?

I'm an engineer on the Relay team, not on GraphQL itself, so I'll only answer this in a crude way.

The GraphQL engine parses queries into an AST representation that describes the requested object (obtained via what we call a "root call") and the desired fields from that object. Fields themselves may contain arbitrarily nested objects with their own fields.

Given this tree, the engine traverses the nodes, invoking an executor that knows how to retrieve objects and how to access fields on the retrieved objects (for example, a field may map to a property on the object, or to a function that computes derived data or itself performs an arbitrary call to another service).
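
The traversal can be sketched in a few lines. Everything here is an assumption made for illustration: the `FieldNode` shape, the `execute` function, and the choice to key resolvers by field name only (a real engine would key them by type as well, and would resolve asynchronously).

```typescript
// Hypothetical AST and resolver shapes, for illustration only; not GraphQL's internals.
interface FieldNode {
  name: string;
  fields?: FieldNode[]; // nested selection, if the field is an object
}

type Resolver = (parent: any) => unknown;
type ResolverMap = Record<string, Resolver | undefined>;

// Walk the requested fields, resolving each one against the parent object.
function execute(parent: any, selection: FieldNode[], resolvers: ResolverMap): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const node of selection) {
    // Use a custom resolver if one exists; otherwise read the property directly.
    const resolve: Resolver = resolvers[node.name] ?? ((p: any) => p[node.name]);
    const value = resolve(parent);
    result[node.name] =
      node.fields && value !== null && typeof value === "object"
        ? execute(value, node.fields, resolvers)
        : value;
  }
  return result;
}

const user = {
  firstName: "Ada",
  lastName: "Lovelace",
  bestFriend: { firstName: "Charles", lastName: "Babbage" },
};

const resolvers: ResolverMap = {
  // A derived field computed by a function rather than read from a property.
  fullName: (u) => `${u.firstName} ${u.lastName}`,
};

const query: FieldNode[] = [
  { name: "fullName" },
  { name: "bestFriend", fields: [{ name: "fullName" }] },
];

const response = execute(user, query, resolvers);
// response: { fullName: "Ada Lovelace", bestFriend: { fullName: "Charles Babbage" } }
```

Note that the response mirrors the shape of the query: only the requested fields appear, at the nesting the query asked for.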

We create definition classes that contain rich metadata, and which are used to provide two things: (1) a schema that describes all the possible fields, relationships and types that can be represented in a valid query; and (2) the mapping from fields to the actual retrieval mechanism. These classes can be used to wrap business objects, or services, or any other data source we wish to expose to GraphQL.

This is made performant by a combination of two things: pervasive caching (so that if the same data is requested multiple times at different places in the tree, the actual data is fetched only once) and extensive use of asynchronous primitives (`async`/`await`) which enable us to effectively parallelize and batch operations.
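
A minimal sketch of those two ingredients together, assuming a hypothetical `CachingLoader` class and a fake data source (neither comes from GraphQL itself): the cache stores promises, so a repeated request reuses the earlier fetch even while it is still in flight, and `Promise.all` lets independent fetches proceed concurrently.

```typescript
// Hypothetical loader, for illustration only; the names and the fake fetch are assumptions.
class CachingLoader<T> {
  private cache = new Map<string, Promise<T>>();
  private fetchOne: (id: string) => Promise<T>;
  fetchCount = 0;

  constructor(fetchOne: (id: string) => Promise<T>) {
    this.fetchOne = fetchOne;
  }

  // Return the cached promise if this id was already requested, even if still in flight.
  load(id: string): Promise<T> {
    let pending = this.cache.get(id);
    if (!pending) {
      this.fetchCount += 1;
      pending = this.fetchOne(id);
      this.cache.set(id, pending);
    }
    return pending;
  }
}

// Stand-in for a real data source.
const loader = new CachingLoader(async (id: string) => ({ id, name: `User ${id}` }));

async function loadFriends(): Promise<string[]> {
  // The loads run concurrently; "1" appears twice but is fetched only once.
  const users = await Promise.all([loader.load("1"), loader.load("2"), loader.load("1")]);
  return users.map((u) => u.name);
}

loadFriends().then((names) => console.log(names, loader.fetchCount));
```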

Why aren't query params component state?

Jing's slide showed us modifying query params using a `this.setQueryParams` call, rather than React's `this.setState`.

This is because setting query params is an inherently async operation. Setting query params may trigger a network request. Multiple `setQueryParams` calls may be issued before the results of prior calls arrive. Later calls should supersede prior calls. The corresponding fetches may fail or need to be retried. This complexity needs to be abstracted away.

The framework preserves an invariant that it won't try to render a component until all the data it requested are available, so it wouldn't do to only have, say, 10 objects in a list and to trigger a render when the query params have been updated to request 15 objects but the next 5 objects haven't arrived yet.

So, the `setQueryParams` API provides us with an abstraction behind which to hide the details of all this asynchrony. The framework can track both "current" and "pending" values of query params, and make sure that the component always sees the right value for any given query param (i.e. the one that reflects the reality of what we have in the store at render time).
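
The bookkeeping might look something like the sketch below. `QueryParamTracker` and its methods are hypothetical names invented for this example; this is not the actual `setQueryParams` implementation, and it simplifies by ignoring the results of any superseded call outright.

```typescript
// Hypothetical names; a sketch of the bookkeeping, not the real setQueryParams.
type Params = Record<string, number>;

class QueryParamTracker {
  private current: Params;
  private pending: Params | null = null;
  private latestRequest = 0;

  constructor(initial: Params) {
    this.current = initial;
  }

  // Record the requested params and return a token identifying this request.
  request(next: Params): number {
    this.pending = { ...this.current, ...next };
    this.latestRequest += 1;
    return this.latestRequest;
  }

  // Called when the data for a request arrives. Superseded requests are ignored.
  resolve(token: number): boolean {
    if (token !== this.latestRequest || this.pending === null) {
      return false;
    }
    this.current = this.pending;
    this.pending = null;
    return true;
  }

  // What a component sees at render time: only params whose data has arrived.
  visible(): Params {
    return this.current;
  }
}

const tracker = new QueryParamTracker({ count: 10 });
const first = tracker.request({ count: 15 });
const second = tracker.request({ count: 20 }); // supersedes the first call

tracker.resolve(first);  // ignored: a later call superseded it; visible count stays 10
tracker.resolve(second); // applied: visible count is now 20
```

Until `resolve` succeeds, `visible()` keeps returning the old params, which is what preserves the invariant that a component never renders with params whose data has not arrived.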

This is an API that we're actively working on right now, so it may change between now and the open source release.

Where is Relay being used?

Relay is being used in a few places in production. It's being used with React Native in the Facebook Groups app (currently on the iOS app store) and on an experimental new version of the Facebook mobile website that is currently rolled out to a small number of test users.
