Vinit Shah vinitshah24

import subprocess
# Define the BTEQ script
bteq_script = '''
.LOGON <Teradata_Hostname>/<Username>,<Password>;
SELECT InfoData FROM DBC.DBCInfo WHERE InfoKey = 'RELEASE';
.LOGOFF;
.EXIT;
'''
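The preview above defines the script but stops before running it. A minimal sketch of one way to execute it, assuming the `bteq` client is installed and on `PATH` and that it reads its commands from standard input; the helper name `run_bteq` and its parameters are hypothetical, not part of the original gist:

```python
import subprocess


def run_bteq(script, executable="bteq", timeout=None):
    """Feed `script` to the BTEQ client on stdin and return the CompletedProcess.

    `executable` is an assumption: it must name a BTEQ binary reachable on PATH.
    """
    return subprocess.run(
        [executable],         # no shell involved, so no quoting issues
        input=script,         # the BTEQ commands are sent on standard input
        text=True,            # treat stdin/stdout/stderr as str, not bytes
        capture_output=True,  # collect stdout and stderr for inspection
        timeout=timeout,
        check=False,          # the caller decides how to handle a non-zero exit
    )
```

A call such as `result = run_bteq(bteq_script)` would then expose `result.returncode`, `result.stdout`, and `result.stderr`. Because the logon line carries a password, reading the credentials from the environment rather than hard-coding them in the script is usually the safer choice.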
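import org.apache.hadoop.fs.{FileSystem, Path}

// Deletes every path under `directory` whose name starts with `prefix`,
// keeping only the first `n` paths in lexicographic order.
// Assumes a `sparkContext` value is already in scope.
def removeFilesExceptFirstN(directory: String, prefix: String, n: Int): Unit = {
  val fs = FileSystem.get(sparkContext.hadoopConfiguration)
  val files = fs.globStatus(new Path(s"$directory/$prefix*")).map(_.getPath.toString).sorted
  files.drop(n).foreach { file =>
    fs.delete(new Path(file), true)
  }
}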
vinitshah24 / randomizeElements.py
Last active July 13, 2023 02:52
Balance and randomize the categories or elements within the list
import random
# Original array
array = [
    "category1/1", "category1/2", "category1/3", "category1/4", "category1/5", "category1/6",
    "category2/1", "category2/2", "category2/3", "category2/4",
    "category3/1", "category3/2"
]
# Separate elements by category
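The preview is cut off at this point, so the author's actual approach is not visible. As a sketch of one possible reading of "balance and randomize", the function below groups elements by the text before the `/`, shuffles each group, and then interleaves the groups round-robin so that no category is clustered at the front. The name `balance_and_shuffle` and the `seed` parameter are hypothetical:

```python
import random
from collections import defaultdict
from itertools import zip_longest


def balance_and_shuffle(items, seed=None):
    """Shuffle within each category, then interleave categories round-robin."""
    rng = random.Random(seed)  # a private generator keeps runs reproducible

    # Separate elements by category (the part before the first "/").
    groups = defaultdict(list)
    for item in items:
        groups[item.split("/", 1)[0]].append(item)

    # Randomize the order inside every category.
    for members in groups.values():
        rng.shuffle(members)

    # Take one element per category per round; shorter groups run out early
    # and are padded with None, which is filtered out.
    result = []
    for round_items in zip_longest(*groups.values()):
        result.extend(x for x in round_items if x is not None)
    return result
```

With the array above, each of the first two rounds contains one element from every category; after that only the larger categories remain, so the tail of the result is dominated by `category1`.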
vinitshah24 / ycsb_hbase.sh
Created September 29, 2020 21:29 — forked from ashrithr/ycsb_hbase.sh
YCSB stress test hbase commands
#!/usr/bin/env bash
#
# Simulates mixed workload on HBase using YCSB
# Author: Ashrith (ashrith at cloudwick dot com)
# Date: Wed, 16 2014
#
#
# You may want to tweak these variables to change the workload's behavior
#