Just a few lines to show how to use the NYC TLC Taxi data in a DuckDB database in R.
It is assumed that you already have some .parquet data in your ./data directory, taken from here.
I am using all the data for the period 01-2019 to 07-2021, which is about 120M data points.
The aims here are:
- to show how fast operations are using DuckDB
- to show that one can use dplyr verbs to operate on DuckDB, as well as SQL of course
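
To make the two aims concrete, here is a minimal sketch of the general pattern, not the exact code used later. So that it runs without the real files, it writes a tiny toy Parquet file to a temporary directory; the table `toy`, the view `trips`, and the two columns are illustrative assumptions, not the full TLC schema. The dplyr part assumes the dbplyr package is installed.

```r
library(DBI)
library(duckdb)
library(dplyr)

# In-memory DuckDB database (nothing is persisted to disk)
con <- dbConnect(duckdb::duckdb())

# Toy stand-in for the real data (hypothetical columns), written out as Parquet
parquet_path <- file.path(tempdir(), "toy_trips.parquet")
toy <- data.frame(
  passenger_count = c(1L, 2L, 1L, 3L),
  trip_distance   = c(1.2, 3.4, 0.8, 5.0)
)
dbWriteTable(con, "toy", toy)
dbExecute(con, sprintf("COPY toy TO '%s' (FORMAT PARQUET)", parquet_path))

# Expose the Parquet file as a view; DuckDB reads it lazily at query time
dbExecute(con, sprintf(
  "CREATE VIEW trips AS SELECT * FROM read_parquet('%s')", parquet_path
))

# Plain SQL against the view
n_sql <- dbGetQuery(con, "SELECT count(*) AS n FROM trips")

# The same source queried with dplyr verbs; dbplyr translates them to SQL
# and collect() pulls only the small aggregated result into R
by_passengers <- tbl(con, "trips") %>%
  group_by(passenger_count) %>%
  summarise(n = n(), mean_distance = mean(trip_distance, na.rm = TRUE)) %>%
  arrange(passenger_count) %>%
  collect()

dbDisconnect(con, shutdown = TRUE)
```

With the real data, the same view could presumably be pointed at a glob such as `read_parquet('data/*.parquet')`, and wrapping a query in `system.time()` is one simple way to see how long it takes.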