@selvanponraj
Forked from AtlasPilotPuppy/hbase_rdd.scala
Created February 29, 2016 12:51
Accessing HBase from Apache Spark
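import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.SparkContext
import org.apache.spark.rdd.NewHadoopRDD

val sparkContext = new SparkContext("local", "Simple App")

// Builds an HBase configuration from a site file and selects the table to scan.
val hbaseConfiguration = (hbaseConfigFileName: String, tableName: String) => {
  val conf = HBaseConfiguration.create()
  // Wrap the name in a Path so it is read from the file system;
  // addResource(String) would look it up on the classpath instead.
  conf.addResource(new Path(hbaseConfigFileName))
  conf.set(TableInputFormat.INPUT_TABLE, tableName)
  conf
}

// Each record is a (row key, Result) pair read through TableInputFormat.
val rdd = new NewHadoopRDD(
  sparkContext,
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result],
  hbaseConfiguration("/path/to/hbase-site.xml", "table-with-data")
)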