jhofman created this gist
Jan 20, 2016
library(dplyr)

# create a dummy dataframe with 100,000 groups and 1,000,000 rows
# and partition by group_id
df <- data.frame(group_id=sample(1:1e5, 1e6, replace=TRUE),
                 val=sample(1:100, 1e6, replace=TRUE)) %>%
  group_by(group_id)

# filter rows with a value of 1 naively
system.time(df %>% filter(val == 1))
#   user  system elapsed
#  1.447   0.017   1.476

# ungroup before filtering for a huge speedup
system.time(df %>% ungroup() %>% filter(val == 1))
#   user  system elapsed
#  0.007   0.003   0.010
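Ungrouping is safe here because the condition `val == 1` is evaluated row by row and does not depend on any per-group summary, so both pipelines should select the same rows. The sketch below checks that on a smaller dummy data frame and then restores the grouping. The names `small_df`, `grouped_result`, `ungrouped_result`, `same_rows` and `regrouped`, the seed, and the reduced sizes are assumptions made for illustration and are not part of the original gist.

```r
library(dplyr)

set.seed(42)  # hypothetical seed, only to make the sketch reproducible

# smaller dummy data than the gist uses, so the check runs quickly
small_df <- data.frame(group_id = sample(1:1e3, 1e4, replace = TRUE),
                       val = sample(1:100, 1e4, replace = TRUE)) %>%
  group_by(group_id)

# the two pipelines being compared
grouped_result   <- small_df %>% filter(val == 1)
ungrouped_result <- small_df %>% ungroup() %>% filter(val == 1)

# compare as plain data frames so the grouping attribute is ignored
same_rows <- isTRUE(all.equal(as.data.frame(ungroup(grouped_result)),
                              as.data.frame(ungrouped_result)))
same_rows

# restore the grouping afterwards if later steps depend on it
regrouped <- ungrouped_result %>% group_by(group_id)
```

If the filter condition used a per-group value, for example `val == max(val)`, the two pipelines would no longer be equivalent and the `ungroup()` shortcut would not apply. The timings quoted in the gist are from 2016, so the size of the speedup will likely vary with the dplyr version and the machine.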