@tomduhourq
Created December 24, 2018 20:05
Example of REST calls with mapPartitions
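import org.apache.spark.rdd.RDD
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global // assumption: any implicit ExecutionContext in scope works
import scala.concurrent.duration._

// The declared element type implies each Service.call yields a List[ServiceResponse]
val serviceResponsesRDD: RDD[List[ServiceResponse]] = requestsRDD.mapPartitions { requestIterator =>
  val requests = requestIterator.toList
  logger.info(s"[START] Query service with ${requests.length} requests")
  // Provides a CloseableHttpClient with back off should the service fail
  val httpClient = HttpClientBuilder.withBackoff()
  try {
    val responsesToAwait = requests.map(request => Service.call(request, httpClient))
    val sequenceResponses = Future.sequence(responsesToAwait)
    // Block until every call in this partition completes, or fail after 2 minutes
    val responses = Await.result(sequenceResponses, 2.minutes)
    logger.info(s"[FINISH] Query service with ${requests.length} requests")
    responses.iterator
  } finally {
    // Release the client even when a call fails or times out
    httpClient.close()
  }
}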
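
The per-partition pattern in the snippet above (materialize the partition, start every call, combine the futures, block once) can be tried without a Spark cluster. The following is a minimal sketch under stated assumptions, not the gist author's code: `Request`, `Response`, `call`, and `processPartition` are hypothetical stand-ins for the gist's request type, `ServiceResponse`, and `Service.call`, and the global `ExecutionContext` is assumed for simplicity.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object PartitionBatchSketch {
  // Hypothetical stand-ins for the gist's request and response types
  final case class Request(id: Int)
  final case class Response(id: Int, body: String)

  // Hypothetical stand-in for Service.call: returns a Future without blocking
  def call(request: Request): Future[Response] =
    Future(Response(request.id, s"ok-${request.id}"))

  // Same shape as the function passed to mapPartitions: Iterator in, Iterator out
  def processPartition(requests: Iterator[Request]): Iterator[Response] = {
    val batch = requests.toList             // materialize the partition
    val inFlight = batch.map(call)          // start every call before waiting
    val all = Future.sequence(inFlight)     // List[Future[A]] => Future[List[A]], order preserved
    Await.result(all, 2.minutes).iterator   // block once per partition
  }
}
```

With real data, a function of this shape would be passed as `requestsRDD.mapPartitions(PartitionBatchSketch.processPartition)`. Because the whole partition is held in memory as a `List`, partition sizes may need to be kept modest.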
@rsanchezTAG

Hi Tom, where does the logging in lines 3 and 11 (the [START] and [FINISH] logger.info calls) land? I have similar code, and its output does not show up in the log files for my Spark app. I do see everything logged from code outside the mapPartitions lambda function.