Revisions
-------------

  1. vinodkc revised this gist Apr 7, 2020. 1 changed file with 3 additions and 3 deletions.
    6 changes: 3 additions & 3 deletions Spark HWC integration - HDP 3 Secure cluster.md
    @@ -20,9 +20,9 @@ Basic testing :
    -------------

    1. Create a table employee in hive and load some data
    eg:
    Create table
    ----------------

    eg:
    Create table
    ```
    CREATE TABLE IF NOT EXISTS employee ( eid int, name String, salary String, destination String)
    COMMENT 'Employee details'
  2. vinodkc revised this gist Apr 7, 2020. 1 changed file with 5 additions and 5 deletions.
    10 changes: 5 additions & 5 deletions Spark HWC integration - HDP 3 Secure cluster.md
    @@ -102,7 +102,7 @@ hive.executeQuery("select * from employee").show

    a. add following property in Custom livy2-conf

    `livy.file.local-dir-whitelist=/usr/hdp/current/hive_warehouse_connector/`
    ```livy.file.local-dir-whitelist=/usr/hdp/current/hive_warehouse_connector/```

    b. Add hive-site.xml to /usr/hdp/current/spark2-client/conf on all cluster nodes.

    @@ -124,16 +124,16 @@ Note: Ensure to change the version of hive-warehouse-connector-assembly to match

    e. in first paragraph add

    ```
    %livy2
    ```
    %livy2
    import com.hortonworks.hwc.HiveWarehouseSession
    val hive = HiveWarehouseSession.session(spark).build()
    ```

    f. in second paragraph add

    ```
    %livy2
    ```
    %livy2
    hive.executeQuery("select * from employee").show
    ```

  3. vinodkc revised this gist Apr 7, 2020. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions Spark HWC integration - HDP 3 Secure cluster.md
    @@ -101,6 +101,7 @@ hive.executeQuery("select * from employee").show
    7. To integrate HWC in Livy2

    a. add following property in Custom livy2-conf

    `livy.file.local-dir-whitelist=/usr/hdp/current/hive_warehouse_connector/`
    b. Add hive-site.xml to /usr/hdp/current/spark2-client/conf on all cluster nodes.
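A minimal sketch of step 7.b, assuming the Hive client configuration lives under /etc/hive/conf (the usual HDP layout; the source path is an assumption, not from the gist):

```
# Copy the Hive client config into the Spark2 conf dir; run on every node.
# Source path assumes the default HDP client-config location.
cp /etc/hive/conf/hive-site.xml /usr/hdp/current/spark2-client/conf/
```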
  4. vinodkc revised this gist Apr 7, 2020. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion Spark HWC integration - HDP 3 Secure cluster.md
    @@ -101,7 +101,8 @@ hive.executeQuery("select * from employee").show
    7. To integrate HWC in Livy2

    a. add following property in Custom livy2-conf
    livy.file.local-dir-whitelist=/usr/hdp/current/hive_warehouse_connector/
    `livy.file.local-dir-whitelist=/usr/hdp/current/hive_warehouse_connector/`

    b. Add hive-site.xml to /usr/hdp/current/spark2-client/conf on all cluster nodes.

    c. Ensure hadoop.proxyuser.hive.hosts=* exists in core-site.xml ; refer Custom core-site section in HDFS confs
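For reference, the proxy-user setting from step 7.c written out as a core-site.xml property (name and value taken verbatim from the line above):

```
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>*</value>
</property>
```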
  5. vinodkc revised this gist Apr 7, 2020. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion Spark HWC integration - HDP 3 Secure cluster.md
    @@ -71,8 +71,9 @@ spark.hadoop.hive.zookeeper.quorum=c420-node2.squadron-labs.com:2181,c420-node3.

    5. Run Spark-shell

    ```
    spark-shell --master yarn --conf "spark.security.credentials.hiveserver2.enabled=false" --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar

    ```
    Note: Common properties are read from spark default properties

    Pyspark example :
  6. vinodkc revised this gist Apr 7, 2020. 1 changed file with 1 addition and 2 deletions.
    3 changes: 1 addition & 2 deletions Spark HWC integration - HDP 3 Secure cluster.md
    @@ -71,8 +71,7 @@ spark.hadoop.hive.zookeeper.quorum=c420-node2.squadron-labs.com:2181,c420-node3.

    5. Run Spark-shell

    ```spark-shell --master yarn --conf "spark.security.credentials.hiveserver2.enabled=false" --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar
    ```
    spark-shell --master yarn --conf "spark.security.credentials.hiveserver2.enabled=false" --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar

    Note: Common properties are read from spark default properties

  7. vinodkc revised this gist Apr 7, 2020. 1 changed file with 11 additions and 0 deletions.
    11 changes: 11 additions & 0 deletions Spark HWC integration - HDP 3 Secure cluster.md
    @@ -70,20 +70,27 @@ spark.hadoop.hive.zookeeper.quorum=c420-node2.squadron-labs.com:2181,c420-node3.
    ```

    5. Run Spark-shell

    ```spark-shell --master yarn --conf "spark.security.credentials.hiveserver2.enabled=false" --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar
    ```

    Note: Common properties are read from spark default properties

    Pyspark example :

    ```
    pyspark --master yarn --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar --py-files /usr/hdp/current/hive_warehouse_connector/pyspark_hwc-1.0.0.3.0.1.0-187.zip --conf spark.security.credentials.hiveserver2.enabled=false
    ```

Paste this code into the shell

    ```
    from pyspark_llap.sql.session import HiveWarehouseSession
    hive = HiveWarehouseSession.session(spark).build()
    ```
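Once the session builds, the Python API mirrors the Scala calls used elsewhere in this gist; a minimal sketch, assuming the employee table from step 1 exists:

```
# execute/executeQuery return Spark DataFrames, so .show() works as in Scala
hive.execute("show tables").show()
hive.executeQuery("select * from employee").show()
```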

    6. run following code in scala shell to view the hive table data

    ```
    import com.hortonworks.hwc.HiveWarehouseSession
    val hive = HiveWarehouseSession.session(spark).build()
    @@ -100,6 +107,7 @@ hive.executeQuery("select * from employee").show
    c. Ensure hadoop.proxyuser.hive.hosts=* exists in core-site.xml ; refer Custom core-site section in HDFS confs

    d. Login to Zeppelin and in livy2 interpreter settings add following

    ```
    livy.spark.hadoop.hive.llap.daemon.service.hosts @llap0
    livy.spark.security.credentials.hiveserver2.enabled true
    @@ -113,12 +121,15 @@ Note: Ensure to change the version of hive-warehouse-connector-assembly to match
    d. Restart livy2 interpreter

    e. in first paragraph add

    ```
    %livy2
    import com.hortonworks.hwc.HiveWarehouseSession
    val hive = HiveWarehouseSession.session(spark).build()
    ```

    f. in second paragraph add

    ```
    %livy2
    hive.executeQuery("select * from employee").show
  8. vinodkc revised this gist Apr 7, 2020. 1 changed file with 37 additions and 28 deletions.
    65 changes: 37 additions & 28 deletions Spark HWC integration - HDP 3 Secure cluster.md
    @@ -19,7 +19,7 @@ spark.datasource.hive.warehouse.metastoreUri thrift://c420-node3.squadron-labs.c
    Basic testing :
    -------------

    1) Create a table employee in hive and load some data
    1. Create a table employee in hive and load some data
    eg:
    Create table
    ----------------
    @@ -32,7 +32,7 @@ LINES TERMINATED BY '\n'
    STORED AS TEXTFILE;
    ```
    Load data data.txt file into hdfs
    ---------------

    ```
    1201,Vinod,45000,Technical manager
    1202,Manisha,45000,Proof reader
    @@ -43,77 +43,86 @@ Load data data.txt file into hdfs
    ```
    LOAD DATA INPATH '/tmp/data.txt' OVERWRITE INTO TABLE employee;
    ```
    2) kinit to the spark user and run
    2. kinit to the spark user and run

    ```
    spark-shell --master yarn --conf "spark.security.credentials.hiveserver2.enabled=false" --conf "spark.sql.hive.hiveserver2.jdbc.url=jdbc:hive2://c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive;principal=hive/[email protected]" --conf "spark.datasource.hive.warehouse.metastoreUri=thrift://c420-node3.squadron-labs.com:9083" --conf "spark.datasource.hive.warehouse.load.staging.dir=/tmp/" --conf "spark.hadoop.hive.llap.daemon.service.hosts=@llap0" --conf "spark.hadoop.hive.zookeeper.quorum=c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181" --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar
    ```
    Note: `spark.security.credentials.hiveserver2.enabled` should be set to false for YARN client deploy mode, and true for YARN cluster deploy mode (by default). This configuration is required for a Kerberized cluster

    Note: spark.security.credentials.hiveserver2.enabled should be set to false for YARN client deploy mode, and true for YARN cluster deploy mode (by default). This configuration is required for a Kerberized cluster

    3) run following code in scala shell to view the table data
    3. run following code in scala shell to view the table data
    ```
    import com.hortonworks.hwc.HiveWarehouseSession
    val hive = HiveWarehouseSession.session(spark).build()
    hive.execute("show tables").show
    hive.executeQuery("select * from employee").show
    ```


    4. To apply common properties by default, add following setting into ambari spark2 custom conf

    4) To apply common properties by default, add following setting into ambari spark2 custom conf


    ```
    spark.sql.hive.hiveserver2.jdbc.url=jdbc:hive2://c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive;principal=hive/[email protected]
    spark.datasource.hive.warehouse.metastoreUri=thrift://c420-node3.squadron-labs.com:9083
    spark.datasource.hive.warehouse.load.staging.dir=/tmp/
    spark.hadoop.hive.llap.daemon.service.hosts=@llap0
    spark.hadoop.hive.zookeeper.quorum=c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181
    ```


    5) spark-shell --master yarn --conf "spark.security.credentials.hiveserver2.enabled=false" --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar
    5. Run Spark-shell
    ```spark-shell --master yarn --conf "spark.security.credentials.hiveserver2.enabled=false" --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar
    ```
    Note: Common properties are read from spark default properties

    Pyspark example :
    ```
    pyspark --master yarn --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar --py-files /usr/hdp/current/hive_warehouse_connector/pyspark_hwc-1.0.0.3.0.1.0-187.zip --conf spark.security.credentials.hiveserver2.enabled=false

    ```
Paste this code into the shell
    ```
    from pyspark_llap.sql.session import HiveWarehouseSession
    hive = HiveWarehouseSession.session(spark).build()

    6) run following code in scala shell to view the hive table data

    ```
    6. run following code in scala shell to view the hive table data
    ```
    import com.hortonworks.hwc.HiveWarehouseSession
    val hive = HiveWarehouseSession.session(spark).build()
    hive.execute("show tables").show
    hive.executeQuery("select * from employee").show
    ```

    7. To integrate HWC in Livy2

    7) To integrate HWC in Livy2

    a) add following property in Custom livy2-conf
    a. add following property in Custom livy2-conf
    livy.file.local-dir-whitelist=/usr/hdp/current/hive_warehouse_connector/
    b) Add hive-site.xml to /usr/hdp/current/spark2-client/conf on all cluster nodes.
    b. Add hive-site.xml to /usr/hdp/current/spark2-client/conf on all cluster nodes.

    c)Ensure hadoop.proxyuser.hive.hosts=* exists in core-site.xml ; refer Custom core-site section in HDFS confs
    c. Ensure hadoop.proxyuser.hive.hosts=* exists in core-site.xml ; refer Custom core-site section in HDFS confs

    d) Login to Zeppelin and in livy2 interpreter settings add following

    d. Login to Zeppelin and in livy2 interpreter settings add following
    ```
    livy.spark.hadoop.hive.llap.daemon.service.hosts @llap0
    livy.spark.security.credentials.hiveserver2.enabled true
    livy.spark.sql.hive.hiveserver2.jdbc.url jdbc:hive2://c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive
    livy.spark.sql.hive.hiveserver2.jdbc.url.principal hive/[email protected]
    livy.spark.yarn.security.credentials.hiveserver2.enabled true
    livy.spark.jars file:///usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar

    ```
Note: Ensure that you change the version of hive-warehouse-connector-assembly to match your HWC version

    d) Restart livy2 interpreter
    d. Restart livy2 interpreter

    e) in first paragraph add
    e. in first paragraph add
    ```
    %livy2
    import com.hortonworks.hwc.HiveWarehouseSession
    val hive = HiveWarehouseSession.session(spark).build()

    f) in second paragraph add
    ```
    f. in second paragraph add
    ```
    %livy2
    hive.executeQuery("select * from employee").show

    ```

Note: There is an Ambari defect, AMBARI-22801, which resets the proxy configs on keytab regeneration/service addition. Please follow step 7.c again in such scenarios

  9. vinodkc revised this gist Apr 7, 2020. 1 changed file with 8 additions and 7 deletions.
    15 changes: 8 additions & 7 deletions Spark HWC integration - HDP 3 Secure cluster.md
    @@ -23,25 +23,26 @@ Basic testing :
    eg:
    Create table
    ----------------

    ```
    CREATE TABLE IF NOT EXISTS employee ( eid int, name String, salary String, destination String)
    COMMENT 'Employee details'
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    STORED AS TEXTFILE;

    Load data data.txt file into hdfs
    ---------------
    ```
    Load data data.txt file into hdfs
    ---------------
    ```
    1201,Vinod,45000,Technical manager
    1202,Manisha,45000,Proof reader
    1203,Masthanvali,40000,Technical writer
    1204,Kiran,40000,Hr Admin
    1205,Kranthi,30000,Op Admin


    ```
    ```
    LOAD DATA INPATH '/tmp/data.txt' OVERWRITE INTO TABLE employee;

    ```
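The LOAD DATA statement above expects data.txt to already be staged in HDFS under /tmp; a minimal sketch of that staging step, assuming the file sits in the local working directory:

```
hdfs dfs -put data.txt /tmp/data.txt
```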
    2) kinit to the spark user and run

    spark-shell --master yarn --conf "spark.security.credentials.hiveserver2.enabled=false" --conf "spark.sql.hive.hiveserver2.jdbc.url=jdbc:hive2://c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive;principal=hive/_[email protected]" --conf "spark.datasource.hive.warehouse.metastoreUri=thrift://c420-node3.squadron-labs.com:9083" --conf "spark.datasource.hive.warehouse.load.staging.dir=/tmp/" --conf "spark.hadoop.hive.llap.daemon.service.hosts=@llap0" --conf "spark.hadoop.hive.zookeeper.quorum=c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181" --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar
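Step 2's kinit presumes a valid Kerberos ticket for the spark user before launching spark-shell; a minimal sketch, where the keytab path and principal are placeholders rather than values from the gist:

```
# Keytab path follows the common HDP convention; substitute your own principal
kinit -kt /etc/security/keytabs/spark.headless.keytab <spark-principal>@HWX.COM
```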
  10. vinodkc revised this gist Apr 7, 2020. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion Spark HWC integration - HDP 3 Secure cluster.md
    @@ -9,7 +9,8 @@ Prerequisites :
    * Get following details from hive for spark or try this
    [HWC Quick Test Script](https://gist.github.com/vinodkc/523f6cae8afb77887130c7e0c10306b4)

    ```spark.hadoop.hive.llap.daemon.service.hosts @llap0
    ```
    spark.hadoop.hive.llap.daemon.service.hosts @llap0
    spark.sql.hive.hiveserver2.jdbc.url jdbc:hive2://c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive
    spark.datasource.hive.warehouse.metastoreUri thrift://c420-node3.squadron-labs.com:9083
    ```
  11. vinodkc revised this gist Apr 7, 2020. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions Spark HWC integration - HDP 3 Secure cluster.md
    @@ -9,10 +9,10 @@ Prerequisites :
    * Get following details from hive for spark or try this
    [HWC Quick Test Script](https://gist.github.com/vinodkc/523f6cae8afb77887130c7e0c10306b4)

    spark.hadoop.hive.llap.daemon.service.hosts @llap0
    ```spark.hadoop.hive.llap.daemon.service.hosts @llap0
    spark.sql.hive.hiveserver2.jdbc.url jdbc:hive2://c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive
    spark.datasource.hive.warehouse.metastoreUri thrift://c420-node3.squadron-labs.com:9083

    ```


    Basic testing :
  12. vinodkc revised this gist Apr 7, 2020. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions Spark HWC integration - HDP 3 Secure cluster.md
    @@ -16,6 +16,7 @@ spark.datasource.hive.warehouse.metastoreUri thrift://c420-node3.squadron-labs.c


    Basic testing :
    -------------

    1) Create a table employee in hive and load some data
    eg:
  13. vinodkc revised this gist Apr 7, 2020. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion Spark HWC integration - HDP 3 Secure cluster.md
    @@ -6,7 +6,7 @@ Prerequisites :

    * Enable hive interactive server in hive

    >>Get following details from hive for spark or try this
    * Get following details from hive for spark or try this
    [HWC Quick Test Script](https://gist.github.com/vinodkc/523f6cae8afb77887130c7e0c10306b4)

    spark.hadoop.hive.llap.daemon.service.hosts @llap0
  14. vinodkc revised this gist Apr 7, 2020. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions Spark HWC integration - HDP 3 Secure cluster.md
    @@ -2,9 +2,9 @@ Spark HWC integration - HDP 3 Secure cluster
    =============
    Prerequisites :
    -------------
    Kerberized Cluster
    * Kerberized Cluster

    Enable hive interactive server in hive
    * Enable hive interactive server in hive

    >>Get following details from hive for spark or try this
    [HWC Quick Test Script](https://gist.github.com/vinodkc/523f6cae8afb77887130c7e0c10306b4)
  15. vinodkc revised this gist Apr 7, 2020. 1 changed file with 6 additions and 3 deletions.
    9 changes: 6 additions & 3 deletions Spark HWC integration - HDP 3 Secure cluster.md
    @@ -1,7 +1,10 @@
    #Prerequisites :
    >> Kerberized Cluster
    Spark HWC integration - HDP 3 Secure cluster
    =============
    Prerequisites :
    -------------
    Kerberized Cluster

    >>Enable hive interactive server in hive
    Enable hive interactive server in hive

    >>Get following details from hive for spark or try this
    [HWC Quick Test Script](https://gist.github.com/vinodkc/523f6cae8afb77887130c7e0c10306b4)
  16. vinodkc renamed this gist Apr 7, 2020. 1 changed file with 0 additions and 0 deletions.
  17. vinodkc revised this gist Apr 7, 2020. 1 changed file with 0 additions and 10 deletions.
    10 changes: 0 additions & 10 deletions Spark HWC integration - HDP 3 Secure cluster
    @@ -1,14 +1,4 @@
    #Prerequisites :

    ---

    Paragraph
    text `Inline Code` text
    ~~Mistaken text.~~
    *Italics*
    **Bold**

    ---
    >> Kerberized Cluster

    >>Enable hive interactive server in hive
  18. vinodkc revised this gist Apr 7, 2020. 1 changed file with 10 additions and 0 deletions.
    10 changes: 10 additions & 0 deletions Spark HWC integration - HDP 3 Secure cluster
    @@ -1,4 +1,14 @@
    #Prerequisites :

    ---

    Paragraph
    text `Inline Code` text
    ~~Mistaken text.~~
    *Italics*
    **Bold**

    ---
    >> Kerberized Cluster

    >>Enable hive interactive server in hive
  19. vinodkc revised this gist Apr 7, 2020. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion Spark HWC integration - HDP 3 Secure cluster
    @@ -1,4 +1,4 @@
    Prerequisites :
    #Prerequisites :
    >> Kerberized Cluster

    >>Enable hive interactive server in hive
  20. vinodkc revised this gist Apr 7, 2020. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion Spark HWC integration - HDP 3 Secure cluster
    @@ -3,7 +3,8 @@ Prerequisites :

    >>Enable hive interactive server in hive

    >>Get following details from hive for spark or try this [HWC Quick Test Script](https://gist.github.com/vinodkc/523f6cae8afb77887130c7e0c10306b4)
    >>Get following details from hive for spark or try this
    [HWC Quick Test Script](https://gist.github.com/vinodkc/523f6cae8afb77887130c7e0c10306b4)

    spark.hadoop.hive.llap.daemon.service.hosts @llap0
    spark.sql.hive.hiveserver2.jdbc.url jdbc:hive2://c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive
  21. vinodkc revised this gist Apr 7, 2020. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion Spark HWC integration - HDP 3 Secure cluster
    @@ -3,7 +3,7 @@ Prerequisites :

    >>Enable hive interactive server in hive

    >>Get following details from hive for spark
    >>Get following details from hive for spark or try this [HWC Quick Test Script](https://gist.github.com/vinodkc/523f6cae8afb77887130c7e0c10306b4)

    spark.hadoop.hive.llap.daemon.service.hosts @llap0
    spark.sql.hive.hiveserver2.jdbc.url jdbc:hive2://c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive
  22. vinodkc revised this gist Jun 21, 2019. 1 changed file with 4 additions and 2 deletions.
    6 changes: 4 additions & 2 deletions Spark HWC integration - HDP 3 Secure cluster
    @@ -62,11 +62,13 @@ spark.hadoop.hive.zookeeper.quorum=c420-node2.squadron-labs.com:2181,c420-node3.

    5) spark-shell --master yarn --conf "spark.security.credentials.hiveserver2.enabled=false" --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar
    Note: Common properties are read from spark default properties

    Pyspark example :
    pyspark --master yarn --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar --py-files /usr/hdp/current/hive_warehouse_connector/pyspark_hwc-1.0.0.3.0.1.0-187.zip --conf spark.security.credentials.hiveserver2.enabled=false

    `from pyspark_llap.sql.session import HiveWarehouseSession
    hive = HiveWarehouseSession.session(spark).build()`
    from pyspark_llap.sql.session import HiveWarehouseSession
    hive = HiveWarehouseSession.session(spark).build()

    6) run following code in scala shell to view the hive table data

    import com.hortonworks.hwc.HiveWarehouseSession
  23. vinodkc revised this gist Jun 21, 2019. 1 changed file with 3 additions and 2 deletions.
    5 changes: 3 additions & 2 deletions Spark HWC integration - HDP 3 Secure cluster
    @@ -64,8 +64,9 @@ spark.hadoop.hive.zookeeper.quorum=c420-node2.squadron-labs.com:2181,c420-node3.
    Note: Common properties are read from spark default properties
    Pyspark example :
    pyspark --master yarn --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar --py-files /usr/hdp/current/hive_warehouse_connector/pyspark_hwc-1.0.0.3.0.1.0-187.zip --conf spark.security.credentials.hiveserver2.enabled=false
    ```from pyspark_llap.sql.session import HiveWarehouseSession
    hive = HiveWarehouseSession.session(spark).build()```

    `from pyspark_llap.sql.session import HiveWarehouseSession
    hive = HiveWarehouseSession.session(spark).build()`
    6) run following code in scala shell to view the hive table data

    import com.hortonworks.hwc.HiveWarehouseSession
  24. vinodkc revised this gist Jun 21, 2019. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion Spark HWC integration - HDP 3 Secure cluster
    @@ -64,7 +64,8 @@ spark.hadoop.hive.zookeeper.quorum=c420-node2.squadron-labs.com:2181,c420-node3.
    Note: Common properties are read from spark default properties
    Pyspark example :
    pyspark --master yarn --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar --py-files /usr/hdp/current/hive_warehouse_connector/pyspark_hwc-1.0.0.3.0.1.0-187.zip --conf spark.security.credentials.hiveserver2.enabled=false

    ```from pyspark_llap.sql.session import HiveWarehouseSession
    hive = HiveWarehouseSession.session(spark).build()```
    6) run following code in scala shell to view the hive table data

    import com.hortonworks.hwc.HiveWarehouseSession
  25. vinodkc revised this gist Jun 14, 2019. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion Spark HWC integration - HDP 3 Secure cluster
    @@ -27,7 +27,7 @@ STORED AS TEXTFILE;

    Load data data.txt file into hdfs
    ---------------
    1201,Gopal,45000,Technical manager
    1201,Vinod,45000,Technical manager
    1202,Manisha,45000,Proof reader
    1203,Masthanvali,40000,Technical writer
    1204,Kiran,40000,Hr Admin
  26. vinodkc revised this gist Mar 26, 2019. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion Spark HWC integration - HDP 3 Secure cluster
    @@ -104,5 +104,5 @@ val hive = HiveWarehouseSession.session(spark).build()
    hive.executeQuery("select * from employee").show



Note: There is an Ambari defect, AMBARI-22801, which resets the proxy configs on keytab regeneration/service addition. Please follow step 7.c again in such scenarios

  27. vinodkc revised this gist Mar 26, 2019. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion Spark HWC integration - HDP 3 Secure cluster
    @@ -79,7 +79,9 @@ hive.executeQuery("select * from employee").show
    livy.file.local-dir-whitelist=/usr/hdp/current/hive_warehouse_connector/
    b) Add hive-site.xml to /usr/hdp/current/spark2-client/conf on all cluster nodes.

    c) Login to Zeppelin and in livy2 interpreter settings add following
    c)Ensure hadoop.proxyuser.hive.hosts=* exists in core-site.xml ; refer Custom core-site section in HDFS confs

    d) Login to Zeppelin and in livy2 interpreter settings add following

    livy.spark.hadoop.hive.llap.daemon.service.hosts @llap0
    livy.spark.security.credentials.hiveserver2.enabled true
  28. vinodkc revised this gist Feb 25, 2019. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions Spark HWC integration - HDP 3 Secure cluster
    @@ -88,6 +88,8 @@ livy.spark.sql.hive.hiveserver2.jdbc.url.principal hive/[email protected]
    livy.spark.yarn.security.credentials.hiveserver2.enabled true
    livy.spark.jars file:///usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar

Note: Ensure that you change the version of hive-warehouse-connector-assembly to match your HWC version

    d) Restart livy2 interpreter

    e) in first paragraph add
  29. vinodkc revised this gist Feb 12, 2019. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions Spark HWC integration - HDP 3 Secure cluster
    @@ -62,6 +62,8 @@ spark.hadoop.hive.zookeeper.quorum=c420-node2.squadron-labs.com:2181,c420-node3.

    5) spark-shell --master yarn --conf "spark.security.credentials.hiveserver2.enabled=false" --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar
    Note: Common properties are read from spark default properties
    Pyspark example :
    pyspark --master yarn --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar --py-files /usr/hdp/current/hive_warehouse_connector/pyspark_hwc-1.0.0.3.0.1.0-187.zip --conf spark.security.credentials.hiveserver2.enabled=false

    6) run following code in scala shell to view the hive table data

  30. vinodkc revised this gist Jan 15, 2019. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions Spark HWC integration - HDP 3 Secure cluster
    @@ -1,4 +1,6 @@
    Prerequisites :
    >> Kerberized Cluster

    >>Enable hive interactive server in hive

    >>Get following details from hive for spark