{
  # Here's the list of organization IDs.
  # I don't know if there's a limit to these lists.
  # Any time you see a plural ("nodes") it *usually* means we're
  # about to loop over a result set. As far as I can tell this is
  # implementer convention: "nodes" and "edges" come from the Relay
  # connection pattern that GitHub follows, and they aren't reserved
  # words in GraphQL itself.
  # For the moment, I've only put two organization IDs in here.
  nodes(ids: ["MDEyOk9yZ2FuaXphdGlvbjYyMzM5OTQ=", "MDEyOk9yZ2FuaXphdGlvbjY0MzA3MA=="]) {
    # Every node has an id. The "... on Organization" part is an
    # inline fragment: it asks for fields that only exist when the
    # node is an Organization object.
    id
    ... on Organization {
      name # The organization name
      # Further nesting: Repos.
      # Only asking for 100 because that's a GitHub-set limit.
      # When you want to filter the set of objects returned, it
      # seems to be done in this kind of way - as arguments to
      # the field that returns the list. So here, we're filtering
      # for public repos.
      repositories(first: 100, privacy: PUBLIC) {
        nodes {
          # Metadata about the repos. I'm only fetching a few values
          # but I could, in theory, go wild with sub-queries about
          # contributors, pull requests, etc.
          # This is ALL coming back in this ONE query.
          name
          createdAt
          url
          homepageUrl
          languages(first: 5) {
            nodes {
              name
            }
          }
          pullRequests(first: 5, states: [OPEN]) {
            nodes {
              author {
                login
              }
              title
            }
          }
        }
      }
    }
  }

  # All the useful repo data is up above.
  # This following section is cost data about the query itself.
  # The GitHub GraphQL API gives API users a limited hourly budget,
  # in "points", to stop the system being overloaded. Every time
  # you execute a query, it subtracts the cost from your remaining
  # budget for the hour.
  # If you're experimenting with big queries, I recommend adding
  # this block to the *beginning* of your query, so you can keep
  # an eye on things. Remember that even a low query cost adds up
  # when you're running a few hundred of these an hour.
  # For more information: https://developer.github.com/v4/guides/resource-limitations/
  rateLimit {
    limit     # Your maximum budget. Your budget is reset to this every hour.
    cost      # The cost of this query.
    remaining # How much of your API budget remains.
    resetAt   # The time (in UTC epoch seconds) when your budget will reset.
  }
}

# If it wasn't for the 100-node-per-list limit, we'd be able to get
# *all* the repos for each organization with this query. Instead,
# we need to do a series of queries which page through the repos,
# a hundred at a time. However, note that we're actually getting up
# to 200 repos back with this query - 100 per org. So the current
# plan is to provide the org IDs for each gov org in GitHub (30 or so)
# and list them all in the query, so we get up to 3000 repos back.
#
# So it's kind of slicing horizontally across the org structure.
#
# (We've had some conversations with GitHub folks and they say that
# this kind of query behaviour is fine, as long as we keep an eye
# on our query cost limits.)

# Future work:
#
# Would it be possible to provide the list of organization IDs in
# a GraphQL variable rather than hardcoded into the query?
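#
# A minimal sketch of how that might look, assuming the client sends a
# "variables" JSON object alongside the query text. The names OrgRepos
# and $orgIds are made up for illustration, and the [ID!]! type is my
# assumption about what the "nodes" field expects. It's left commented
# out because a document that contains an anonymous query, like the one
# above, can't also contain a second operation.
#
# query OrgRepos($orgIds: [ID!]!) {
#   nodes(ids: $orgIds) {
#     id
#     ... on Organization {
#       name
#       repositories(first: 100, privacy: PUBLIC) {
#         nodes {
#           name
#           url
#         }
#       }
#     }
#   }
# }
#
# The variables would then be sent separately from the query text:
#
# {
#   "orgIds": ["MDEyOk9yZ2FuaXphdGlvbjYyMzM5OTQ=", "MDEyOk9yZ2FuaXphdGlvbjY0MzA3MA=="]
# }
#
# That would keep the query text fixed while the list of orgs grows,
# so only the variables object needs to change between runs.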