curl -L https://github.com/docker/machine/releases/download/v0.8.2/docker-machine-`uname -s`-`uname -m` >/usr/local/bin/docker-machine && \
chmod +x /usr/local/bin/docker-machine
docker-machine version
docker-machine ls
docker-machine create --driver virtualbox default
docker-machine ls
docker-machine env default
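Note that `docker-machine env default` only prints `export` statements (for `DOCKER_HOST` and related variables); it does not change the current shell. A minimal sketch of applying them, assuming the machine is named `default` as above. The `load_machine_env` helper name is hypothetical and not part of docker-machine:

```shell
# Hypothetical helper: load the Docker environment for a machine into this shell.
load_machine_env() {
  machine=${1:-default}
  if ! command -v docker-machine >/dev/null 2>&1; then
    echo "docker-machine not found on PATH" >&2
    return 1
  fi
  # Capture the export lines first so a failing 'env' call is not masked by eval.
  env_out=$(docker-machine env "$machine") || return 1
  # Apply the exported variables (DOCKER_HOST etc.) to the current shell.
  eval "$env_out"
}

# Usage (not run here): load_machine_env default && docker ps
```

Capturing the output before `eval` keeps the function's exit status meaningful when the machine does not exist or is stopped.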
###### development tools
sudo apt-get install build-essential python-dev git nodejs-legacy npm gnome-tweak-tool openjdk-8-jdk
### Python packages
sudo apt-get install python-pip python-virtualenv python-numpy python-matplotlib
### pip packages
pip install django flask django-widget-tweaks django-ckeditor beautifulsoup4 requests classifier SymPy ipython
The steps below all assume you have installed Docker. I used the Kitematic tool for OS X, and it worked great. My local container VM IP is 192.168.99.100; replace it in the commands with your own local IP!
Let's Set up Zeppelin
I am using this Docker image, https://github.com/dylanmei/docker-zeppelin, to fire up Zeppelin and Spark. Note that it is slow to start because there are so many processes (Spark Master, Spark Worker, Zeppelin) to bring up!
docker run -d --name zeppelin -p 8080:8080 dylanmei/zeppelin
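Because the container takes a while to start, it can help to poll the Zeppelin port until it answers rather than guessing. A minimal sketch, assuming `curl` is installed and that Zeppelin is reachable on port 8080 at the VM IP mentioned above. The `wait_for_url` helper and its default retry values are hypothetical:

```shell
# Hypothetical helper: poll a URL until it responds, or give up after N attempts.
wait_for_url() {
  url=$1
  attempts=${2:-30}   # how many times to try
  delay=${3:-5}       # seconds to sleep between tries
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -s keeps it quiet.
    if curl -fs -o /dev/null --max-time 5 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (not run here):
# wait_for_url http://192.168.99.100:8080 && echo "Zeppelin is up"
```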
-- Query request times over time by user
select distinct TheDay, user_name
, (min_dat / 1000) as min_sec, (max_dat / 1000) as max_sec
, (avg_dat / 1000) as avg_sec, (median_dat / 1000) as median_sec
, query_cnt
from (
select DATE(end_timestamp::timestamp) as TheDay, user_name
, min(request_duration_ms) over(partition by DATE(end_timestamp::timestamp), user_name ) min_dat
, max(request_duration_ms) over(partition by DATE(end_timestamp::timestamp), user_name ) max_dat
, avg(request_duration_ms) over(partition by DATE(end_timestamp::timestamp), user_name ) avg_dat
# see github repos & package documentation
# - http://github.com/apache/spark/tree/master/R
# - http://spark.apache.org/docs/latest/api/R/
# install the SparkR package
devtools::install_github("apache/spark", ref="master", subdir="R/pkg")
# load the SparkR & ggplot2 packages
library('SparkR')
library('ggplot2')
# preliminaries
library("ggplot2")
library("zoo")
set.seed(111)
# generate plot of survival curve
x <- sort(dexp(seq(0, 1, 0.01)), decreasing = TRUE)
ggplot(data.frame(x = c(0, 5)), aes(x)) + stat_function(fun = dexp, args = list(rate = 1)) + scale_x_continuous(labels=c(expression(t["0"], t["1"], t["2"], t["3"], t["4"], t["5"]))) + labs(x = "Time", y = expression(y = P(T > t["i"])), title = "Survival Function")
# simulate subscription data
Typing vagrant from the command line will display a list of all available commands.
Be sure that you are in the same directory as the Vagrantfile when running these commands!
vagrant up -- starts vagrant environment (also provisions only on the FIRST vagrant up)
vagrant status -- outputs status of the vagrant machine
vagrant halt -- stops the vagrant machine
vagrant reload -- restarts vagrant machine, loads new Vagrantfile configuration
vagrant provision -- forces reprovisioning of the vagrant machine
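Since these commands expect to be run next to the Vagrantfile, a small guard can catch the mistake before Vagrant does. A minimal sketch; the `vg` wrapper name is hypothetical and not part of Vagrant:

```shell
# Hypothetical wrapper: refuse to call vagrant unless a Vagrantfile is in this directory.
vg() {
  if [ ! -f ./Vagrantfile ]; then
    echo "No Vagrantfile in $(pwd); cd to the project directory first" >&2
    return 1
  fi
  # Pass all arguments straight through, e.g. 'vg status' or 'vg reload'.
  vagrant "$@"
}

# Usage (not run here): vg up
```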
This post examines the features of [R Markdown](http://www.rstudio.org/docs/authoring/using_markdown)
using [knitr](http://yihui.name/knitr/) in RStudio 0.96.
This combination of tools provides an exciting improvement in usability for
[reproducible analysis](http://stats.stackexchange.com/a/15006/183).
Specifically, this post
(1) discusses getting started with R Markdown and `knitr` in RStudio 0.96;
(2) provides a basic example of producing console output and plots using R Markdown;
(3) highlights several code chunk options such as caching and controlling how input and output are displayed;
(4) demonstrates use of standard Markdown notation as well as the extended features of formulas and tables; and
(5) discusses the implications of R Markdown.
hadoop fs -cat /Work/lon_text/lon_order_data_t/cdw320_lon_order_data_t.1.txt | head -100 | gzip > test.csv.gz
cat cdw320_lon_order_data_t.1.txt | head -100 | gzip > ../../tsnyder/cdw320_lon_order_data_t.1.txt.gz
hadoop fs -cat /Work/tsnyder/cdw320_lon_order_data_t.1.txt.gz | gunzip
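The commands above share one pattern: take the first 100 lines, gzip them, and read them back with gunzip. A minimal local sketch of that round trip, using a generated stand-in file instead of the HDFS paths above. The `sample_gz` helper and the `orders.txt` file name are hypothetical:

```shell
# Hypothetical helper: write the first N lines of a file to a gzip archive.
sample_gz() {
  src=$1
  dest=$2
  lines=${3:-100}
  head -n "$lines" "$src" | gzip > "$dest"
}

workdir=$(mktemp -d)
seq 1 500 > "$workdir/orders.txt"   # stand-in for the real data file
sample_gz "$workdir/orders.txt" "$workdir/orders.txt.gz" 100

# Decompress to stdout and count lines to confirm the sample size.
gunzip -c "$workdir/orders.txt.gz" | wc -l
```

Counting the decompressed lines is a quick check that the sample was written completely before copying it anywhere else.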