
@asears
Last active January 24, 2024 07:16

Revisions

  1. asears revised this gist Apr 8, 2016. 1 changed file with 7 additions and 2 deletions.
    9 changes: 7 additions & 2 deletions YARNNodeLabels.md
    @@ -11,16 +11,21 @@ hadoop fs -chmod -R 700 /user/yarn
    ```
    2. Enable node labels and set the label HDFS location in the YARN config
    yarn.node-labels.fs-store.root-dir =
    - ```hdfs://mycluster:8020/yarn/node-labels```
    + ```
    + hdfs://mycluster:8020/yarn/node-labels
    + ```

    3. Create labels (these should probably be machine names or machine groups; I called them exclusive/shared…)

    ```
    yarn rmadmin -addToClusterNodeLabels "exclusive(exclusive=true),shared(exclusive=false)"
    ```

    4. Assign labels to nodes
    - ```yarn rmadmin -replaceLabelsOnNode "node3.hadoop.local=exclusive node4.hadoop.local=shared"```
    + ```
    + yarn rmadmin -replaceLabelsOnNode "node3.hadoop.local=exclusive node4.hadoop.local=shared"
    + ```
    5. Configure capacity scheduler XML (tweak these settings to tune the cluster, especially per-user limits).
    ```
    yarn.scheduler.capacity.maximum-am-resource-percent=0.2
  2. asears revised this gist Apr 8, 2016. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion YARNNodeLabels.md
    @@ -15,7 +15,9 @@ yarn.node-labels.fs-store.root-dir =

    3. Create labels (these should probably be machine names or machine groups; I called them exclusive/shared…)

    - ```yarn rmadmin -addToClusterNodeLabels "exclusive(exclusive=true),shared(exclusive=false)"```
    + ```
    + yarn rmadmin -addToClusterNodeLabels "exclusive(exclusive=true),shared(exclusive=false)"
    + ```

    4. Assign labels to nodes
    ```yarn rmadmin -replaceLabelsOnNode "node3.hadoop.local=exclusive node4.hadoop.local=shared"```
  3. asears revised this gist Apr 8, 2016. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion YARNNodeLabels.md
    @@ -10,9 +10,11 @@ hadoop fs -chown -R yarn:yarn /user/yarn
    hadoop fs -chmod -R 700 /user/yarn
    ```
    2. Enable node labels and set the label HDFS location in the YARN config
    - yarn.node-labels.fs-store.root-dir = ```hdfs://mycluster:8020/yarn/node-labels```
    + yarn.node-labels.fs-store.root-dir =
    + ```hdfs://mycluster:8020/yarn/node-labels```

    3. Create labels (these should probably be machine names or machine groups; I called them exclusive/shared…)

    ```yarn rmadmin -addToClusterNodeLabels "exclusive(exclusive=true),shared(exclusive=false)"```

    4. Assign labels to nodes
  4. asears revised this gist Apr 8, 2016. 1 changed file with 3 additions and 3 deletions.
    6 changes: 3 additions & 3 deletions YARNNodeLabels.md
    @@ -10,13 +10,13 @@ hadoop fs -chown -R yarn:yarn /user/yarn
    hadoop fs -chmod -R 700 /user/yarn
    ```
    2. Enable node labels and set the label HDFS location in the YARN config
    - yarn.node-labels.fs-store.root-dir = hdfs://mycluster:8020/yarn/node-labels
    + yarn.node-labels.fs-store.root-dir = ```hdfs://mycluster:8020/yarn/node-labels```

    3. Create labels (these should probably be machine names or machine groups; I called them exclusive/shared…)
    - yarn rmadmin -addToClusterNodeLabels "exclusive(exclusive=true),shared(exclusive=false)"
    + ```yarn rmadmin -addToClusterNodeLabels "exclusive(exclusive=true),shared(exclusive=false)"```

    4. Assign labels to nodes
    - a. yarn rmadmin -replaceLabelsOnNode "node3.hadoop.local=exclusive node4.hadoop.local=shared"
    + ```yarn rmadmin -replaceLabelsOnNode "node3.hadoop.local=exclusive node4.hadoop.local=shared"```
    5. Configure capacity scheduler XML (tweak these settings to tune the cluster, especially per-user limits).
    ```
    yarn.scheduler.capacity.maximum-am-resource-percent=0.2
  5. asears revised this gist Apr 8, 2016. 1 changed file with 4 additions and 2 deletions.
    6 changes: 4 additions & 2 deletions YARNNodeLabels.md
    @@ -1,13 +1,14 @@
    Creating subclusters and node groups within YARN queues using node labels.

    1. Create directories in HDFS for node labels
    + ```
    hadoop fs -mkdir -p /yarn/node-labels
    hadoop fs -chown -R yarn:yarn /yarn
    hadoop fs -chmod -R 700 /yarn
    hadoop fs -mkdir -p /user/yarn
    hadoop fs -chown -R yarn:yarn /user/yarn
    hadoop fs -chmod -R 700 /user/yarn

    + ```
    2. Enable node labels and set the label HDFS location in the YARN config
    yarn.node-labels.fs-store.root-dir = hdfs://mycluster:8020/yarn/node-labels

    @@ -79,9 +80,10 @@ yarn.scheduler.capacity.root.queues=default,hive1,hive2
    http://spark.apache.org/docs/latest/running-on-yarn.html

    Spark Properties
    + ```
    spark.yarn.am.nodeLabelExpression
    spark.yarn.executor.nodeLabelExpression
    spark.yarn.tags

    + ```
    https://community.hortonworks.com/articles/11434/yarn-node-labels-1.html
    http://www.slideshare.net/Hadoop_Summit/node-labels-in-yarn-49792443
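
    A hedged spark-submit sketch using the label properties above (the --queue and --conf flags and the SparkPi example class are standard Spark-on-YARN usage per the linked docs; the example jar path is HDP's usual location and an assumption here):

    ```
    spark-submit --master yarn --queue default \
      --conf spark.yarn.am.nodeLabelExpression=shared \
      --conf spark.yarn.executor.nodeLabelExpression=shared \
      --class org.apache.spark.examples.SparkPi \
      /usr/hdp/current/spark-client/lib/spark-examples.jar 100
    ```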
  6. asears revised this gist Apr 8, 2016. 1 changed file with 9 additions and 1 deletion.
    10 changes: 9 additions & 1 deletion YARNNodeLabels.md
    @@ -74,6 +74,14 @@ yarn.scheduler.capacity.root.queues=default,hive1,hive2
    ```
    6. Refresh YARN Queues / Restart YARN / Restart Hive
    7. Set Hive default queues to Hive1,Hive2. Set default Tez sessions to 2. Restart Hive.
    - 8. Test nodes with example commands.
    + 8. Test nodes with jobs…
    +
    + http://spark.apache.org/docs/latest/running-on-yarn.html
    +
    + Spark Properties
    + spark.yarn.am.nodeLabelExpression
    + spark.yarn.executor.nodeLabelExpression
    + spark.yarn.tags
    +
    + https://community.hortonworks.com/articles/11434/yarn-node-labels-1.html
    http://www.slideshare.net/Hadoop_Summit/node-labels-in-yarn-49792443
  7. asears revised this gist Apr 8, 2016. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions YARNNodeLabels.md
    @@ -17,7 +17,7 @@ yarn rmadmin -addToClusterNodeLabels "exclusive(exclusive=true),shared(exclusive
    4. Assign labels to nodes
    a. yarn rmadmin -replaceLabelsOnNode "node3.hadoop.local=exclusive node4.hadoop.local=shared"
    5. Configure capacity scheduler XML (tweak these settings to tune the cluster, especially per-user limits).
    -
    + ```
    yarn.scheduler.capacity.maximum-am-resource-percent=0.2
    yarn.scheduler.capacity.maximum-applications=10000
    yarn.scheduler.capacity.node-locality-delay=40
    @@ -71,7 +71,7 @@ yarn.scheduler.capacity.root.hive2.ordering-policy.fair.enable-size-based-weight
    yarn.scheduler.capacity.root.hive2.state=RUNNING
    yarn.scheduler.capacity.root.hive2.user-limit-factor=4
    yarn.scheduler.capacity.root.queues=default,hive1,hive2
    -
    + ```
    6. Refresh YARN Queues / Restart YARN / Restart Hive
    7. Set Hive default queues to Hive1,Hive2. Set default Tez sessions to 2. Restart Hive.
    8. Test nodes with example commands.
  8. asears created this gist Apr 8, 2016.
    79 changes: 79 additions & 0 deletions YARNNodeLabels.md
    @@ -0,0 +1,79 @@
    Creating subclusters and node groups within YARN queues using node labels.

    1. Create directories in HDFS for node labels
    hadoop fs -mkdir -p /yarn/node-labels
    hadoop fs -chown -R yarn:yarn /yarn
    hadoop fs -chmod -R 700 /yarn
    hadoop fs -mkdir -p /user/yarn
    hadoop fs -chown -R yarn:yarn /user/yarn
    hadoop fs -chmod -R 700 /user/yarn

    2. Enable node labels and set the label HDFS location in the YARN config
    yarn.node-labels.fs-store.root-dir = hdfs://mycluster:8020/yarn/node-labels
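
    For clusters where yarn-site.xml is edited by hand rather than through Ambari, a minimal sketch of the same setting might look like the following; yarn.node-labels.enabled is the standard companion property, and the mycluster host/port is just this gist's example address.

    ```
    <property>
      <name>yarn.node-labels.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>yarn.node-labels.fs-store.root-dir</name>
      <value>hdfs://mycluster:8020/yarn/node-labels</value>
    </property>
    ```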

    3. Create labels (these should probably be machine names or machine groups; I called them exclusive/shared…)
    yarn rmadmin -addToClusterNodeLabels "exclusive(exclusive=true),shared(exclusive=false)"
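
    To confirm the labels registered with the ResourceManager (on Hadoop versions that ship the yarn cluster subcommand, e.g. HDP 2.3+):

    ```
    yarn cluster --list-node-labels
    ```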

    4. Assign labels to nodes
    a. yarn rmadmin -replaceLabelsOnNode "node3.hadoop.local=exclusive node4.hadoop.local=shared"
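
    A quick way to spot-check an assignment is the node status report, which should list a Node-Labels field; 45454 is assumed here as the default HDP NodeManager port.

    ```
    yarn node -status node3.hadoop.local:45454
    ```
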
    5. Configure capacity scheduler XML (tweak these settings to tune the cluster, especially per-user limits).

    yarn.scheduler.capacity.maximum-am-resource-percent=0.2
    yarn.scheduler.capacity.maximum-applications=10000
    yarn.scheduler.capacity.node-locality-delay=40
    yarn.scheduler.capacity.queue-mappings-override.enable=false
    yarn.scheduler.capacity.root.accessible-node-labels=shared,exclusive
    yarn.scheduler.capacity.root.accessible-node-labels.exclusive.capacity=100
    yarn.scheduler.capacity.root.accessible-node-labels.exclusive.maximum-capacity=100
    yarn.scheduler.capacity.root.accessible-node-labels.shared.capacity=100
    yarn.scheduler.capacity.root.accessible-node-labels.shared.maximum-capacity=100
    yarn.scheduler.capacity.root.acl_administer_queue=*
    yarn.scheduler.capacity.root.capacity=100
    yarn.scheduler.capacity.root.default.accessible-node-labels=*
    yarn.scheduler.capacity.root.default.accessible-node-labels.exclusive.capacity=0
    yarn.scheduler.capacity.root.default.accessible-node-labels.exclusive.maximum-capacity=100
    yarn.scheduler.capacity.root.default.accessible-node-labels.shared.capacity=50
    yarn.scheduler.capacity.root.default.accessible-node-labels.shared.maximum-capacity=100
    yarn.scheduler.capacity.root.default.acl_submit_applications=*
    yarn.scheduler.capacity.root.default.capacity=50
    yarn.scheduler.capacity.root.default.default-node-label-expression=shared
    yarn.scheduler.capacity.root.default.maximum-capacity=100
    yarn.scheduler.capacity.root.default.state=RUNNING
    yarn.scheduler.capacity.root.default.user-limit-factor=2
    yarn.scheduler.capacity.root.hive1.accessible-node-labels=*
    yarn.scheduler.capacity.root.hive1.accessible-node-labels.exclusive.capacity=100
    yarn.scheduler.capacity.root.hive1.accessible-node-labels.exclusive.maximum-capacity=100
    yarn.scheduler.capacity.root.hive1.accessible-node-labels.shared.capacity=25
    yarn.scheduler.capacity.root.hive1.accessible-node-labels.shared.maximum-capacity=100
    yarn.scheduler.capacity.root.hive1.acl_administer_queue=*
    yarn.scheduler.capacity.root.hive1.acl_submit_applications=*
    yarn.scheduler.capacity.root.hive1.capacity=25
    yarn.scheduler.capacity.root.hive1.default-node-label-expression=exclusive
    yarn.scheduler.capacity.root.hive1.maximum-capacity=100
    yarn.scheduler.capacity.root.hive1.minimum-user-limit-percent=100
    yarn.scheduler.capacity.root.hive1.ordering-policy=fair
    yarn.scheduler.capacity.root.hive1.ordering-policy.fair.enable-size-based-weight=false
    yarn.scheduler.capacity.root.hive1.state=RUNNING
    yarn.scheduler.capacity.root.hive1.user-limit-factor=4
    yarn.scheduler.capacity.root.hive2.accessible-node-labels=*
    yarn.scheduler.capacity.root.hive2.accessible-node-labels.exclusive.capacity=0
    yarn.scheduler.capacity.root.hive2.accessible-node-labels.exclusive.maximum-capacity=100
    yarn.scheduler.capacity.root.hive2.accessible-node-labels.shared.capacity=25
    yarn.scheduler.capacity.root.hive2.accessible-node-labels.shared.maximum-capacity=100
    yarn.scheduler.capacity.root.hive2.acl_administer_queue=*
    yarn.scheduler.capacity.root.hive2.acl_submit_applications=*
    yarn.scheduler.capacity.root.hive2.capacity=25
    yarn.scheduler.capacity.root.hive2.default-node-label-expression=shared
    yarn.scheduler.capacity.root.hive2.maximum-capacity=25
    yarn.scheduler.capacity.root.hive2.minimum-user-limit-percent=100
    yarn.scheduler.capacity.root.hive2.ordering-policy=fair
    yarn.scheduler.capacity.root.hive2.ordering-policy.fair.enable-size-based-weight=false
    yarn.scheduler.capacity.root.hive2.state=RUNNING
    yarn.scheduler.capacity.root.hive2.user-limit-factor=4
    yarn.scheduler.capacity.root.queues=default,hive1,hive2
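
    The key=value pairs above paste directly into Ambari's scheduler config; when maintaining capacity-scheduler.xml by hand, each pair becomes one property element, e.g. (one representative entry shown):

    ```
    <property>
      <name>yarn.scheduler.capacity.root.default.default-node-label-expression</name>
      <value>shared</value>
    </property>
    ```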

    6. Refresh YARN Queues / Restart YARN / Restart Hive
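
    For scheduler-only changes, a full restart is often unnecessary; queue configuration can usually be reloaded in place with:

    ```
    yarn rmadmin -refreshQueues
    ```
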
    7. Set Hive default queues to Hive1,Hive2. Set default Tez sessions to 2. Restart Hive.
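
    The underlying HiveServer2 properties (set via Ambari in this walkthrough; values mirror the queues defined above) are:

    ```
    hive.server2.tez.default.queues=hive1,hive2
    hive.server2.tez.sessions.per.default.queue=2
    ```
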
    8. Test nodes with example commands.
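
    One smoke test, assuming the stock distributed-shell example jar that ships with Hadoop (the path shown is HDP's usual location; adjust for your layout), pins containers to a label so their placement can be checked in the RM UI:

    ```
    yarn org.apache.hadoop.yarn.applications.distributedshell.Client \
      -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar \
      -shell_command "sleep 60" \
      -num_containers 2 \
      -node_label_expression exclusive
    ```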

    http://www.slideshare.net/Hadoop_Summit/node-labels-in-yarn-49792443