Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save djd0723/08f32691ccac03e56e966d28695ce92f to your computer and use it in GitHub Desktop.
Save djd0723/08f32691ccac03e56e966d28695ce92f to your computer and use it in GitHub Desktop.
How to turn a list of JSON objects into a Datasette

How to turn a list of JSON objects into a Datasette

This repository has a dataset of 184.879 crimes committed in Buenos Aires: https://github.com/ramadis/delitos-caba

Download the raw data like this:

wget 'https://github.com/ramadis/delitos-caba/releases/download/3.0/delitos.json'

Now use Pandas to load that into a dataframe:

>>> import pandas as pd
>>> df = pd.read_json(open('delitos.json'))
>>> df
                   barrio  cantidad_vehiculos  cantidad_victimas     comuna  \
0                FLORESTA                   2                  0  Comuna 10   
1                 PALERMO                   1                  0  Comuna 14   
2            VILLA CRESPO                   3                  0  Comuna 15   
3       PARQUE AVELLANEDA                   1                  0   Comuna 9   
4        VILLA GRAL MITRE                   1                  0  Comuna 11   
5                FLORESTA                   2                  0  Comuna 10   
6        PARQUE CHACABUCO                   1                  0   Comuna 7   
7                RECOLETA                   4                  0   Comuna 2   
8                  FLORES                   2                  0   Comuna 7   
9           NUEVA POMPEYA                   2                  0   Comuna 4   

Pandas can save that file to a SQLite database like so:

>>> import sqlite3
>>> conn = sqlite3.connect('delitos.db')
>>> df.to_sql('delitos', conn)

Now you can deploy that SQLite file as a web-based API using datasette publish. You'll need to install Zeit Now and Datasette first.

Include source information with command-line options:

datasette publish now delitos.db --source_url="https://github.com/ramadis/delitos-caba" --source="ramadis/delitos-caba"
> Deploying /private/var/folders/jj/fngnv0810tn2lt_kd3911pdc0000gp/T/tmpcagpkk5f/datasette under simonw
> Ready! https://datasette-tkgtsfyjyj.now.sh (copied to clipboard) [6s]
> Synced 2 files (399B) [0ms] 
> ...

A few seconds later: https://datasette-tkgtsfyjyj.now.sh

Let's give it a nice URL:

now alias https://datasette-tkgtsfyjyj.now.sh delitos-caba.now.sh

Now you can visit it at https://delitos-caba.now.sh/

Here's an example SQL query showing the most common barrios: https://delitos-caba.now.sh/delitos-6395324?sql=select+%22barrio%22%2C+count%28%2A%29+as+%22count%22+from+delitos++group+by+%22barrio%22+order+by+%22count%22+desc+limit+100

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment