Revisions
-
jaceklaskowski revised this gist
Sep 28, 2016 . 1 changed file with 6 additions and 6 deletions.
@@ -9,6 +9,12 @@
## Spark SQL
8. Creating custom [Encoder](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoder)
    * [SPARK-17668 Support representing structs with case classes and tuples in spark sql udf inputs](https://issues.apache.org/jira/browse/SPARK-17668)
    * Create an encoder between your custom domain object of type `T` and JSON or CSV
    * See [Encoders](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoders$) for available encoders.
    * Read [Encoders - Internal Row Converters](https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-sql-Encoder.html)
    * (advanced/integration) Create an encoder for [Apache Arrow](https://arrow.apache.org/) (esp. after the [arrow-0.1.0 RC0](http://mail-archives.apache.org/mod_mbox/arrow-dev/201609.mbox/%3CCAO%2Bvc4BCBFY_3ZoASQ9UcMjOX_OjDg2nE9rTCoC3G5CiKqUC1w%40mail.gmail.com%3E) release candidate has recently been announced) and [ARROW-288 Implement Arrow adapter for Spark Datasets](https://issues.apache.org/jira/browse/ARROW-288).
1. Custom format, i.e. `spark.read.format(...)` or `spark.write.format(...)`
2. Multiline JSON reader / writer
2. `SQLQueryTestSuite` - this is a very fresh thing in Spark 2.0 to write tests for Spark SQL
@@ -18,12 +24,6 @@
7. (done) Developing a custom [RuleExecutor](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala#L46) and enabling it in Spark
    * Answering [Extending Spark Catalyst optimizer with own rules](http://stackoverflow.com/q/36152173/1305344) on StackOverflow
    * [Sparkathon - Developing Spark Extensions in Scala](http://www.meetup.com/WarsawScala/events/234156519/) on Sep 28th
## Spark MLlib
-
jaceklaskowski revised this gist
Sep 28, 2016 . 1 changed file with 3 additions and 2 deletions.
@@ -15,8 +15,9 @@
    * [Changelog](https://github.com/apache/spark/commits/master/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala)
4. http://stackoverflow.com/questions/39073602/i-am-running-gbt-in-spark-ml-for-ctr-prediction-i-am-getting-exception-because
6. [ExecutionListenerManager](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.util.ExecutionListenerManager)
7. (done) Developing a custom [RuleExecutor](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala#L46) and enabling it in Spark
    * Answering [Extending Spark Catalyst optimizer with own rules](http://stackoverflow.com/q/36152173/1305344) on StackOverflow
    * [Sparkathon - Developing Spark Extensions in Scala](http://www.meetup.com/WarsawScala/events/234156519/) on Sep 28th
8. Creating custom [Encoder](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoder)
    * [SPARK-17668 Support representing structs with case classes and tuples in spark sql udf inputs](https://issues.apache.org/jira/browse/SPARK-17668)
    * Create an encoder between your custom domain object of type `T` and JSON or CSV
-
jaceklaskowski revised this gist
Sep 28, 2016 . 1 changed file with 1 addition and 0 deletions.
@@ -18,6 +18,7 @@
7. Developing a custom [RuleExecutor](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala#L46) and enabling it in Spark
    * Answering [Extending Spark Catalyst optimizer with own rules](http://stackoverflow.com/q/36152173/1305344) on StackOverflow
8. Creating custom [Encoder](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoder)
    * [SPARK-17668 Support representing structs with case classes and tuples in spark sql udf inputs](https://issues.apache.org/jira/browse/SPARK-17668)
    * Create an encoder between your custom domain object of type `T` and JSON or CSV
    * See [Encoders](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoders$) for available encoders.
    * Read [Encoders - Internal Row Converters](https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-sql-Encoder.html)
-
jaceklaskowski revised this gist
Sep 23, 2016 . 1 changed file with 1 addition and 1 deletion.
@@ -13,10 +13,10 @@
2. Multiline JSON reader / writer
2. `SQLQueryTestSuite` - this is a very fresh thing in Spark 2.0 to write tests for Spark SQL
    * [Changelog](https://github.com/apache/spark/commits/master/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala)
4. http://stackoverflow.com/questions/39073602/i-am-running-gbt-in-spark-ml-for-ctr-prediction-i-am-getting-exception-because
6. [ExecutionListenerManager](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.util.ExecutionListenerManager)
7. Developing a custom [RuleExecutor](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala#L46) and enabling it in Spark
    * Answering [Extending Spark Catalyst optimizer with own rules](http://stackoverflow.com/q/36152173/1305344) on StackOverflow
8. Creating custom [Encoder](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoder)
    * Create an encoder between your custom domain object of type `T` and JSON or CSV
    * See [Encoders](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoders$) for available encoders.
-
jaceklaskowski revised this gist
Sep 22, 2016 . 1 changed file with 1 addition and 1 deletion.
@@ -21,7 +21,7 @@
    * Create an encoder between your custom domain object of type `T` and JSON or CSV
    * See [Encoders](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoders$) for available encoders.
    * Read [Encoders - Internal Row Converters](https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-sql-Encoder.html)
    * (advanced/integration) Create an encoder for [Apache Arrow](https://arrow.apache.org/) (esp. after the [arrow-0.1.0 RC0](http://mail-archives.apache.org/mod_mbox/arrow-dev/201609.mbox/%3CCAO%2Bvc4BCBFY_3ZoASQ9UcMjOX_OjDg2nE9rTCoC3G5CiKqUC1w%40mail.gmail.com%3E) release candidate has recently been announced) and [ARROW-288 Implement Arrow adapter for Spark Datasets](https://issues.apache.org/jira/browse/ARROW-288).
## Spark MLlib
-
jaceklaskowski revised this gist
Sep 22, 2016 . 1 changed file with 1 addition and 1 deletion.
@@ -21,7 +21,7 @@
    * Create an encoder between your custom domain object of type `T` and JSON or CSV
    * See [Encoders](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoders$) for available encoders.
    * Read [Encoders - Internal Row Converters](https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-sql-Encoder.html)
    * (advanced) Create an encoder for [Apache Arrow](https://arrow.apache.org/) (esp. after the [arrow-0.1.0 RC0](http://mail-archives.apache.org/mod_mbox/arrow-dev/201609.mbox/%3CCAO%2Bvc4BCBFY_3ZoASQ9UcMjOX_OjDg2nE9rTCoC3G5CiKqUC1w%40mail.gmail.com%3E) release candidate has recently been announced).
## Spark MLlib
-
jaceklaskowski revised this gist
Sep 22, 2016 . 1 changed file with 3 additions and 0 deletions.
@@ -18,7 +18,10 @@
6. [ExecutionListenerManager](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.util.ExecutionListenerManager)
7. Developing a custom [RuleExecutor](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala#L46) and enabling it in Spark
8. Creating custom [Encoder](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoder)
    * Create an encoder between your custom domain object of type `T` and JSON or CSV
    * See [Encoders](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoders$) for available encoders.
    * Read [Encoders - Internal Row Converters](https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-sql-Encoder.html)
    * (advanced) Create an encoder for [Apache Arrow](https://arrow.apache.org/) (esp. after the [arrow-0.1.0 RC0](http://mail-archives.apache.org/mod_mbox/arrow-dev/201609.mbox/browser) release candidate has recently been announced).
## Spark MLlib
-
jaceklaskowski revised this gist
Sep 16, 2016 . 1 changed file with 3 additions and 1 deletion.
@@ -34,7 +34,9 @@
## Misc
1. Develop a new Scala-only TCP-based [Apache Kafka](http://kafka.apache.org/) client
    * [A Guide To The Kafka Protocol](https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol)
    * [KAFKA-3360 Add a protocol page/section to the official Kafka documentation](https://issues.apache.org/jira/browse/KAFKA-3360)
    * See [Scala Kafka Client](https://github.com/cakesolutions/scala-kafka-client) for inspiration yet it's just _"a thin Scala wrapper over the official Apache Kafka Java Driver"_
9. Working on Issues reported in [TensorFrames](https://github.com/databricks/tensorframes/issues).
10. Review open issues in [Spark's JIRA](https://issues.apache.org/jira/browse/SPARK-17375?jql=project%20%3D%20SPARK%20AND%20status%20%3D%20Open) and pick one to work on.
-
jaceklaskowski revised this gist
Sep 15, 2016 . 1 changed file with 2 additions and 0 deletions.
@@ -34,5 +34,7 @@
## Misc
1. Develop a new Scala-only Kafka client
    * See [Scala Kafka Client](https://github.com/cakesolutions/scala-kafka-client) for inspiration yet it's just _"a thin Scala wrapper over the official Apache Kafka Java Driver"_
9. Working on Issues reported in [TensorFrames](https://github.com/databricks/tensorframes/issues).
10. Review open issues in [Spark's JIRA](https://issues.apache.org/jira/browse/SPARK-17375?jql=project%20%3D%20SPARK%20AND%20status%20%3D%20Open) and pick one to work on.
-
jaceklaskowski revised this gist
Sep 14, 2016 . 1 changed file with 2 additions and 2 deletions.
@@ -30,9 +30,9 @@
## Core
1. Monitoring executors (metrics, e.g. memory usage) using [SparkListener.onExecutorMetricsUpdate](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.scheduler.SparkListener@onExecutorMetricsUpdate(executorMetricsUpdate:org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate):Unit).
## Misc
9. Working on Issues reported in [TensorFrames](https://github.com/databricks/tensorframes/issues).
10. Review open issues in [Spark's JIRA](https://issues.apache.org/jira/browse/SPARK-17375?jql=project%20%3D%20SPARK%20AND%20status%20%3D%20Open) and pick one to work on.
-
jaceklaskowski revised this gist
Sep 14, 2016 . 1 changed file with 4 additions and 0 deletions.
@@ -28,6 +28,10 @@
    * The problem is saving a Pipeline with this Transformer, then loading and using it.
8. Spark MLlib 2.0 Activator
## Core
1. Monitoring executors (metrics, e.g. memory usage) using [SparkListener.onExecutorMetricsUpdate](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.scheduler.SparkListener@onExecutorMetricsUpdate(executorMetricsUpdate:org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate):Unit)
## Misc
9. Working on Issues reported in [TensorFrames](https://github.com/databricks/tensorframes/issues)
-
jaceklaskowski revised this gist
Sep 12, 2016 . 1 changed file with 2 additions and 2 deletions.
@@ -4,8 +4,8 @@
1. Developing a custom [StreamSourceProvider](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.sources.StreamSourceProvider)
2. Migrating TextSocketStream to SparkSession (currently uses SQLContext)
3. Developing Sink and Source for [Apache Kafka](http://kafka.apache.org/)
4. JDBC support (with PostgreSQL as the database)
## Spark SQL
-
jaceklaskowski revised this gist
Sep 8, 2016 . 1 changed file with 1 addition and 0 deletions.
@@ -10,6 +10,7 @@
## Spark SQL
1. Custom format, i.e. `spark.read.format(...)` or `spark.write.format(...)`
2. Multiline JSON reader / writer
2. `SQLQueryTestSuite` - this is a very fresh thing in Spark 2.0 to write tests for Spark SQL
    * [Changelog](https://github.com/apache/spark/commits/master/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala)
    * Filipe
-
jaceklaskowski revised this gist
Sep 6, 2016 . 1 changed file with 1 addition and 1 deletion.
@@ -16,7 +16,7 @@
4. http://stackoverflow.com/questions/39073602/i-am-running-gbt-in-spark-ml-for-ctr-prediction-i-am-getting-exception-because
6. [ExecutionListenerManager](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.util.ExecutionListenerManager)
7. Developing a custom [RuleExecutor](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala#L46) and enabling it in Spark
8. Creating custom [Encoder](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoder)
    * See [Encoders](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoders$) for available encoders.
## Spark MLlib
-
jaceklaskowski revised this gist
Sep 6, 2016 . 1 changed file with 2 additions and 0 deletions.
@@ -16,6 +16,8 @@
4. http://stackoverflow.com/questions/39073602/i-am-running-gbt-in-spark-ml-for-ctr-prediction-i-am-getting-exception-because
6. [ExecutionListenerManager](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.util.ExecutionListenerManager)
7. Developing a custom [RuleExecutor](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala#L46) and enabling it in Spark
8. Creating custom [Encoder](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoder).
    * See [Encoders](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Encoders$) for available encoders.
## Spark MLlib
-
jaceklaskowski revised this gist
Sep 6, 2016 . 1 changed file with 2 additions and 0 deletions.
@@ -4,6 +4,8 @@
1. Developing a custom [StreamSourceProvider](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.sources.StreamSourceProvider)
2. Migrating TextSocketStream to SparkSession (currently uses SQLContext)
3. [Apache Kafka](http://kafka.apache.org/) support
4. JDBC support
## Spark SQL
-
jaceklaskowski revised this gist
Sep 6, 2016 . 1 changed file with 1 addition and 1 deletion.
@@ -23,7 +23,7 @@
    * The problem is saving a Pipeline with this Transformer, then loading and using it.
8. Spark MLlib 2.0 Activator
## Misc
9. Working on Issues reported in [TensorFrames](https://github.com/databricks/tensorframes/issues)
10. Review open issues in [Spark's JIRA](https://issues.apache.org/jira/browse/SPARK-17375?jql=project%20%3D%20SPARK%20AND%20status%20%3D%20Open) and pick one to work on.
-
jaceklaskowski revised this gist
Sep 6, 2016 . 1 changed file with 1 addition and 1 deletion.
@@ -1,4 +1,4 @@
# Spark-a-thon - Development Activities
## Structured Streaming
-
jaceklaskowski revised this gist
Sep 6, 2016 . 1 changed file with 4 additions and 6 deletions.
@@ -1,13 +1,11 @@
# Spark-a-thon -- Development Activities
## Structured Streaming
1. Developing a custom [StreamSourceProvider](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.sources.StreamSourceProvider)
2. Migrating TextSocketStream to SparkSession (currently uses SQLContext)
## Spark SQL
1. Custom format, i.e. `spark.read.format(...)` or `spark.write.format(...)`
2. `SQLQueryTestSuite` - this is a very fresh thing in Spark 2.0 to write tests for Spark SQL
@@ -17,7 +15,7 @@
6. [ExecutionListenerManager](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.util.ExecutionListenerManager)
7. Developing a custom [RuleExecutor](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala#L46) and enabling it in Spark
## Spark MLlib
5. Creating custom Transformer
    * Example: [Tokenizer](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.feature.Tokenizer)
-
jaceklaskowski revised this gist
Sep 6, 2016 . 1 changed file with 17 additions and 9 deletions.
@@ -1,23 +1,31 @@
# Spark-a-thon
## Topics
### Structured Streaming
1. Developing a custom [StreamSourceProvider](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.sources.StreamSourceProvider)
2. Migrating TextSocketStream to SparkSession (currently uses SQLContext)
### Spark SQL
1. Custom format, i.e. `spark.read.format(...)` or `spark.write.format(...)`
2. `SQLQueryTestSuite` - this is a very fresh thing in Spark 2.0 to write tests for Spark SQL
    * [Changelog](https://github.com/apache/spark/commits/master/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala)
    * Filipe
4. http://stackoverflow.com/questions/39073602/i-am-running-gbt-in-spark-ml-for-ctr-prediction-i-am-getting-exception-because
6. [ExecutionListenerManager](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.util.ExecutionListenerManager)
7. Developing a custom [RuleExecutor](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala#L46) and enabling it in Spark
### Spark MLlib
5. Creating custom Transformer
    * Example: [Tokenizer](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.feature.Tokenizer)
    * Jonatan + Kuba + ladies (Justyna + Magda)
    * The problem is saving a Pipeline with this Transformer, then loading and using it.
8. Spark MLlib 2.0 Activator
### Misc
9. Working on Issues reported in [TensorFrames](https://github.com/databricks/tensorframes/issues)
10. Review open issues in [Spark's JIRA](https://issues.apache.org/jira/browse/SPARK-17375?jql=project%20%3D%20SPARK%20AND%20status%20%3D%20Open) and pick one to work on.
-
jaceklaskowski revised this gist
Sep 6, 2016 . 1 changed file with 3 additions and 1 deletion.
@@ -2,7 +2,9 @@
## Agenda Proposal
1. (Structured Streaming) Developing a custom [StreamSourceProvider](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.sources.StreamSourceProvider)
2. (Structured Streaming) Migrating TextSocketStream to SparkSession (currently uses SQLContext)
1. (Spark SQL) Custom MF format
2. `SQLQueryTestSuite` - this is a very fresh thing in Spark 2.0 to write tests for Spark SQL
    * [Changelog](https://github.com/apache/spark/commits/master/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala)
    * Filipe
-
jaceklaskowski renamed this gist
Sep 2, 2016 . 1 changed file with 7 additions and 4 deletions.
@@ -1,3 +1,7 @@
# Spark-a-thon
## Agenda Proposal
1. Custom MF format
2. `SQLQueryTestSuite` - this is a very fresh thing in Spark 2.0 to write tests for Spark SQL
    * [Changelog](https://github.com/apache/spark/commits/master/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala)
@@ -11,8 +15,7 @@
    * Jonatan + Kuba + ladies (Justyna + Magda)
    * The problem is saving a Pipeline with this Transformer, then loading and using it.
6. [ExecutionListenerManager](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.util.ExecutionListenerManager)
7. Developing a custom [RuleExecutor](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala#L46) and enabling it in Spark
8. Spark MLlib 2.0 Activator
9. Working on Issues reported in [TensorFrames](https://github.com/databricks/tensorframes/issues)
10. Review open issues in [Spark's JIRA](https://issues.apache.org/jira/browse/SPARK-17375?jql=project%20%3D%20SPARK%20AND%20status%20%3D%20Open) and pick one to work on.
-
jaceklaskowski revised this gist
Aug 24, 2016 . 1 changed file with 1 addition and 0 deletions.
@@ -4,6 +4,7 @@
    * Filipe
3. https://issues.apache.org/jira/browse/SPARK-17156
    * Jacek
    * [The complete working example in Scala (with sbt)](https://github.com/jaceklaskowski/spark-workshop/tree/master/solutions/multinomial-logistic-regression)
4. http://stackoverflow.com/questions/39073602/i-am-running-gbt-in-spark-ml-for-ctr-prediction-i-am-getting-exception-because
5. Creating custom Transformer
    * Example: [Tokenizer](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.feature.Tokenizer)
-
jaceklaskowski revised this gist
Aug 24, 2016 . 1 changed file with 1 addition and 0 deletions.
@@ -12,5 +12,6 @@
6. [ExecutionListenerManager](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.util.ExecutionListenerManager)
7. Developing RuleExecutor
8. Spark MLlib 2.0 Activator
9. TensorFlow
Mateusz without a tie...is thinking
-
jaceklaskowski revised this gist
Aug 24, 2016 . 1 changed file with 1 addition and 0 deletions.
@@ -11,5 +11,6 @@
    * The problem is saving a Pipeline with this Transformer, then loading and using it.
6. [ExecutionListenerManager](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.util.ExecutionListenerManager)
7. Developing RuleExecutor
8. Spark MLlib 2.0 Activator
Mateusz without a tie...is thinking
-
jaceklaskowski revised this gist
Aug 24, 2016 . 1 changed file with 1 addition and 1 deletion.
@@ -7,7 +7,7 @@
4. http://stackoverflow.com/questions/39073602/i-am-running-gbt-in-spark-ml-for-ctr-prediction-i-am-getting-exception-because
5. Creating custom Transformer
    * Example: [Tokenizer](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.feature.Tokenizer)
    * Jonatan + Kuba + ladies (Justyna + Magda)
    * The problem is saving a Pipeline with this Transformer, then loading and using it.
6. [ExecutionListenerManager](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.util.ExecutionListenerManager)
7. Developing RuleExecutor
-
jaceklaskowski revised this gist
Aug 24, 2016 . 1 changed file with 2 additions and 1 deletion.
@@ -1,5 +1,6 @@
1. Custom MF format
2. `SQLQueryTestSuite` - this is a very fresh thing in Spark 2.0 to write tests for Spark SQL
    * [Changelog](https://github.com/apache/spark/commits/master/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala)
    * Filipe
3. https://issues.apache.org/jira/browse/SPARK-17156
    * Jacek
-
jaceklaskowski revised this gist
Aug 24, 2016 . 1 changed file with 1 addition and 0 deletions.
@@ -7,6 +7,7 @@
5. Creating custom Transformer
    * Example: [Tokenizer](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.feature.Tokenizer)
    * Jonatan + Kuba + ladies
    * The problem is saving a Pipeline with this Transformer, then loading and using it.
6. [ExecutionListenerManager](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.util.ExecutionListenerManager)
7. Developing RuleExecutor
-
jaceklaskowski revised this gist
Aug 24, 2016 . 1 changed file with 0 additions and 1 deletion.
@@ -1,5 +1,4 @@
1. Custom MF format
2. SQLQueryTestSuite - see gmail
    * Filipe
3. https://issues.apache.org/jira/browse/SPARK-17156
-
jaceklaskowski revised this gist
Aug 24, 2016 . 1 changed file with 4 additions and 4 deletions.
@@ -1,13 +1,13 @@
1. Custom MF format
2. SQLQueryTestSuite - see gmail
    * Filipe
3. https://issues.apache.org/jira/browse/SPARK-17156
    * Jacek
4. http://stackoverflow.com/questions/39073602/i-am-running-gbt-in-spark-ml-for-ctr-prediction-i-am-getting-exception-because
5. Creating custom Transformer
    * Example: [Tokenizer](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.feature.Tokenizer)
    * Jonatan + Kuba + ladies
6. [ExecutionListenerManager](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.util.ExecutionListenerManager)
7. Developing RuleExecutor