Revisions
tobilg revised this gist
Mar 14, 2016 · 1 changed file with 5 additions and 1 deletion.
@@ -26,4 +26,8 @@

    sc.hadoopConfiguration.set("fs.s3a.connection.ssl.enabled", "false");

You can use s3a urls like this:

    s3a://<<BUCKET>>/<<FOLDER>>/<<FILE>>

Also, it is possible to use the credentials in the path:

    s3a://<<ACCESS_KEY>>:<<SECRET_KEY>>@<<BUCKET>>/<<FOLDER>>/<<FILE>>
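The credentials-in-path form added in this revision can be used directly in `spark-shell`. A minimal sketch, with hypothetical bucket and key values; note that embedding keys in the URL can leak them into logs, so the `fs.s3a.access.key`/`fs.s3a.secret.key` settings are generally preferable:

```scala
// Hypothetical access key, secret, bucket, and path — placeholders only.
// Assumes spark-shell was started with the hadoop-aws package loaded.
val lines = sc.textFile("s3a://AKIAEXAMPLE:secretExample@my-bucket/my-folder/my-file.txt")
println(lines.count())
```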
tobilg revised this gist
Mar 14, 2016 · 1 changed file with 5 additions and 1 deletion.
@@ -1,6 +1,10 @@

# Custom S3 endpoints with Spark

To be able to use custom endpoints with the latest Spark distribution, one needs to add an external package (`hadoop-aws`). Then, custom endpoints can be configured according to the [docs](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html).

## Use the `hadoop-aws` package

    bin/spark-shell --packages org.apache.hadoop:hadoop-aws:2.7.2

## SparkContext configuration
tobilg created this gist
Mar 14, 2016
@@ -0,0 +1,25 @@

## Use the `hadoop-aws` package

    bin/spark-shell --packages org.apache.hadoop:hadoop-aws:2.7.1

## SparkContext configuration

Add this to your application, or in the `spark-shell`:

```scala
sc.hadoopConfiguration.set("fs.s3a.endpoint", "<<ENDPOINT>>");
sc.hadoopConfiguration.set("fs.s3a.access.key", "<<ACCESS_KEY>>");
sc.hadoopConfiguration.set("fs.s3a.secret.key", "<<SECRET_KEY>>");
```

If your endpoint doesn't support HTTPS, then you'll need the following:

```scala
sc.hadoopConfiguration.set("fs.s3a.connection.ssl.enabled", "false");
```

## S3 url usage

You can use s3a urls like this:

    s3a://<<bucket>>/<<folder>>/<<file>>
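Putting the original gist's pieces together, an end-to-end session might look like the following sketch. The endpoint, bucket, and file names are hypothetical placeholders, and it assumes `spark-shell` was started with `--packages org.apache.hadoop:hadoop-aws:2.7.1` as shown above:

```scala
// Configure the S3A connector against a custom (non-AWS) endpoint.
// "s3.example.com" and the credentials are placeholders, not real values.
sc.hadoopConfiguration.set("fs.s3a.endpoint", "s3.example.com")
sc.hadoopConfiguration.set("fs.s3a.access.key", "<<ACCESS_KEY>>")
sc.hadoopConfiguration.set("fs.s3a.secret.key", "<<SECRET_KEY>>")

// Read a text file through the s3a:// scheme and count its lines.
val lines = sc.textFile("s3a://my-bucket/my-folder/my-file.txt")
println(lines.count())
```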