To use custom S3 endpoints with the latest Spark distribution, you need to add an external package (hadoop-aws). Custom endpoints can then be configured as described in the docs.
bin/spark-shell --packages org.apache.hadoop:hadoop-aws:2.7.2
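For example, assuming a custom S3-compatible endpoint at http://localhost:9000 and placeholder credentials (endpoint and keys are hypothetical), the S3A settings can be passed as Hadoop configuration when starting the shell:

bin/spark-shell --packages org.apache.hadoop:hadoop-aws:2.7.2 \
  --conf spark.hadoop.fs.s3a.endpoint=http://localhost:9000 \
  --conf spark.hadoop.fs.s3a.access.key=ACCESS_KEY \
  --conf spark.hadoop.fs.s3a.secret.key=SECRET_KEY

Paths such as s3a://&lt;bucket&gt;/&lt;key&gt; then resolve against the custom endpoint.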
import { Table, Vector, Field, Utf8, Type, Schema } from 'apache-arrow';

/**
 * Cast all columns with complex data types in an Apache Arrow Table to strings
 * @param {Table} table - The Apache Arrow Table
 * @returns {Table} - A new Table with all complex data type columns cast to strings
 */
function castComplexColumnsToString(table: Table): Table {
  const schemaFields = table.schema.fields;
#!/bin/bash
# List the CloudWatch log groups and streams a Lambda@Edge function has written
# in every region (replica log groups are named /aws/lambda/us-east-1.<function-name>).
FUNCTION_NAME=$1
for region in $(aws --output text ec2 describe-regions | cut -f 4)
do
  echo "Checking $region"
  for loggroup in $(aws --output text logs describe-log-groups --log-group-name-prefix "/aws/lambda/us-east-1.$FUNCTION_NAME" --region $region --query 'logGroups[].logGroupName')
  do
    echo "Found '$loggroup' in region $region"
    for logstream in $(aws --output text logs describe-log-streams --log-group-name $loggroup --region $region --query 'logStreams[].logStreamName')
    do
      echo "  $logstream"
    done
  done
done
{
  "Type": "AWS::IAM::Role",
  "Properties": {
    "AssumeRolePolicyDocument": {
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Principal": {
          "Service": [
            "lambda.amazonaws.com",
{
  "us-east-1": {
    "city": "Ashburn",
    "state": "Virginia",
    "country": "United States",
    "countryCode": "US",
    "latitude": 38.9445,
    "longitude": -77.4558029,
    "region": "North America",
    "iataCode": "IAD"
#!/bin/bash
# Print the repositories of the Docker Hub user/org given as $1 as CSV (user, name, description, stars, pulls)
curl -s https://hub.docker.com/v2/repositories/$1/\?page_size\=1000 | jq -r '["user", "name", "description", "star_count", "pull_count"] as $fields | $fields, (.results[] | [.[$fields[]]]) | @csv'
var phantom = require('phantom');
var async = require('async');

// [url, port] pairs to process
var pagesToCall = [
  ['http://www.google.com', 8000],
  ['http://www.allthingsd.com', 8001],
  ['http://www.wired.com', 8002],
  ['http://www.mashable.com', 8003],
  ['http://www.stackoverflow.com', 8004]
];