@kingspp
Last active August 4, 2025 14:13

Revisions

  1. kingspp revised this gist Nov 21, 2024. 1 changed file with 16 additions and 0 deletions.
    16 changes: 16 additions & 0 deletions databricks.md
    Original file line number Diff line number Diff line change
    @@ -37,4 +37,20 @@ print(views)
    ```python
    databricksURL = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiUrl().getOrElse(None)
    myToken = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().getOrElse(None)
    ```

    ### Alter Schema of a column
    ```sql
    -- Try
    ALTER TABLE <catalog>.<schema>.<table_name> SET TBLPROPERTIES ('delta.columnMapping.mode' = 'name')
    -- If the above does not work use,
    ALTER TABLE <catalog>.<schema>.<table_name> SET TBLPROPERTIES (
    'delta.columnMapping.mode' = 'name',
    'delta.minReaderVersion' = '2',
    'delta.minWriterVersion' = '5')

    -- Change column name to make sure the schema evolution works in the subsequent run
    ALTER TABLE <catalog>.<schema>.<table_name> RENAME COLUMN <col_of_interest> TO <col_of_interest>r_v1;

    -- Merge once the run is successful, if necessary
    ```
  2. kingspp revised this gist Oct 20, 2024. 1 changed file with 18 additions and 3 deletions.
    21 changes: 18 additions & 3 deletions databricks.md
    Original file line number Diff line number Diff line change
    @@ -10,16 +10,31 @@
    # dbutils.library.restartPython()
    ```

    # Load a JAR after cluster start
    ### Load a JAR after cluster start

    ```sql
    ADD jar dbfs:/<path to jar>.jar
    ```

    # Mount a bucket
    ```
    ### Mount a bucket
    ```python
    aws_bucket_name = "bucket_name"
    mount_name = "research"
    dbutils.fs.mount(f"s3a://{aws_bucket_name}", f"/mnt/{mount_name}")
    display(dbutils.fs.ls(f"/mnt/{mount_name}/"))
    ```

    ### Fetch Views from a catalog
    ```python
    catalog = "prod"
    schema = "ml"
    spark.sql(f"USE CATALOG {catalog}")
    views = [f.viewName for f in spark.sql(f"SHOW VIEWS IN {catalog}.{schema}").collect()]
    print(views)
    ```

    ### Get token
    ```python
    databricksURL = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiUrl().getOrElse(None)
    myToken = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().getOrElse(None)
    ```
  3. kingspp revised this gist Aug 23, 2024. 1 changed file with 3 additions and 0 deletions.
    3 changes: 3 additions & 0 deletions macos-snippets.sh
    Original file line number Diff line number Diff line change
    @@ -50,3 +50,6 @@ sudo spctl --master-disable

    # Install Tree View for listing file structure, Linux Core Utils
    brew install tree coreutils

    # Identify thermal throttling
    pmset -g thermlog
  4. kingspp revised this gist Nov 3, 2023. 1 changed file with 10 additions and 2 deletions.
    12 changes: 10 additions & 2 deletions databricks.md
    Original file line number Diff line number Diff line change
    @@ -5,13 +5,21 @@


    ```python
    # !/databricks/python3/bin/python -m pip uninstall cmxdt -y
    # !/databricks/python3/bin/python -m pip install -e /dbfs/code/
    # !/databricks/python3/bin/python -m pip uninstall <package> -y
    # !/databricks/python3/bin/python -m pip install -e /dbfs/<path>/
    # dbutils.library.restartPython()
    ```

    # Load a JAR after cluster start

    ```sql
    ADD jar dbfs:/<path to jar>.jar
    ```

    # Mount a bucket
    ```
    aws_bucket_name = "bucket_name"
    mount_name = "research"
    dbutils.fs.mount(f"s3a://{aws_bucket_name}", f"/mnt/{mount_name}")
    display(dbutils.fs.ls(f"/mnt/{mount_name}/"))
    ```
  5. kingspp revised this gist Aug 2, 2023. 1 changed file with 7 additions and 1 deletion.
    8 changes: 7 additions & 1 deletion databricks.md
    Original file line number Diff line number Diff line change
    @@ -1,11 +1,17 @@
    # Databricks Snippets


    ### Load A python library dynamically
    ### Load A python library dynamically (Cluster restarts not required)


    ```python
    # !/databricks/python3/bin/python -m pip uninstall cmxdt -y
    # !/databricks/python3/bin/python -m pip install -e /dbfs/code/
    # dbutils.library.restartPython()
    ```

    # Load a JAR after cluster start

    ```sql
    ADD jar dbfs:/<path to jar>.jar
    ```
  6. kingspp revised this gist Aug 2, 2023. 1 changed file with 11 additions and 0 deletions.
    11 changes: 11 additions & 0 deletions databricks.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,11 @@
    # Databricks Snippets


    ### Load A python library dynamically


    ```python
    # !/databricks/python3/bin/python -m pip uninstall cmxdt -y
    # !/databricks/python3/bin/python -m pip install -e /dbfs/code/
    # dbutils.library.restartPython()
    ```
  7. kingspp revised this gist Jul 28, 2022. 1 changed file with 9 additions and 1 deletion.
    10 changes: 9 additions & 1 deletion py-snippet.py
    Original file line number Diff line number Diff line change
    @@ -639,4 +639,12 @@ def internal_writer(self):
        def close(self):
            self.queue.join()
            self.finished = True
            self.filewriter.close()


    # Timeout while loop
    import time
    timeout = 10  # [seconds]
    timeout_start = time.time()
    while time.time() < timeout_start + timeout:
        pass
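A minimal sketch of the same timeout idea wrapped in a reusable helper. The names `wait_until`, `predicate`, and `poll_interval` are hypothetical and not part of the original snippet; `time.monotonic()` is swapped in for `time.time()` because it is not affected by system clock adjustments.

```python
import time


def wait_until(predicate, timeout=10.0, poll_interval=0.1):
    """Poll `predicate` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(poll_interval)  # sleep between checks instead of busy-waiting
    return False


# Hypothetical condition that becomes true on the third check.
calls = []


def ready():
    calls.append(1)
    return len(calls) >= 3


result = wait_until(ready, timeout=2.0, poll_interval=0.01)
```

Sleeping between checks keeps the loop from pinning a CPU core, which the bare `pass` body would do for the full timeout.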
  8. kingspp revised this gist Mar 1, 2022. 1 changed file with 26 additions and 1 deletion.
    27 changes: 26 additions & 1 deletion py-snippet.py
    Original file line number Diff line number Diff line change
    @@ -614,4 +614,29 @@ def batch(iterable, n=1):
    finder.run_script("./main.py")
    for name, mod in finder.modules.items():
    print(name)



    # Thread Safe Writer
    from queue import Queue, Empty
    from threading import Thread

    class SafeWriter:
        def __init__(self, *args):
            self.filewriter = open(*args)
            self.queue = Queue()
            self.finished = False
            Thread(name="SafeWriter", target=self.internal_writer).start()

        def write(self, data):
            self.queue.put(data)

        def internal_writer(self):
            # Drain the queue until close() sets the finished flag
            while not self.finished:
                try:
                    data = self.queue.get(True, 1)
                except Empty:
                    continue
                self.filewriter.write(data)
                self.queue.task_done()

        def close(self):
            # Wait for all queued data to be written before closing the file
            self.queue.join()
            self.finished = True
            self.filewriter.close()
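A possible variant of the queue-backed writer, sketched under assumptions: it uses a sentinel object to stop the worker instead of polling a flag once per second, so `close()` returns as soon as the queue is drained. `SentinelWriter`, `_STOP`, and the temporary file path are hypothetical names for illustration only.

```python
import os
import tempfile
from queue import Queue
from threading import Thread

_STOP = object()  # sentinel that tells the worker thread to exit


class SentinelWriter:
    def __init__(self, path, mode="w"):
        self._file = open(path, mode)
        self._queue = Queue()
        self._thread = Thread(name="SentinelWriter", target=self._drain, daemon=True)
        self._thread.start()

    def write(self, data):
        # Safe to call from any thread: Queue does its own locking.
        self._queue.put(data)

    def _drain(self):
        # Only this thread touches the file, so writes never interleave.
        while True:
            item = self._queue.get()
            if item is _STOP:
                break
            self._file.write(item)

    def close(self):
        self._queue.put(_STOP)  # queued after all pending data
        self._thread.join()
        self._file.close()


path = os.path.join(tempfile.mkdtemp(), "out.txt")
writer = SentinelWriter(path)
producers = [Thread(target=writer.write, args=(f"line {i}\n",)) for i in range(5)]
for p in producers:
    p.start()
for p in producers:
    p.join()
writer.close()

with open(path) as fh:
    lines = sorted(fh.read().splitlines())
```

The order in which the producer threads enqueue their lines is not deterministic, which is why the example sorts the lines before inspecting them.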
  9. kingspp revised this gist Jul 7, 2021. 1 changed file with 5 additions and 0 deletions.
    5 changes: 5 additions & 0 deletions docker_utils.sh
    Original file line number Diff line number Diff line change
    @@ -6,6 +6,11 @@ sudo docker load -i [image_name:tag].tar.gz

    # Delete all containers, including their volumes
    docker rm -vf $(docker ps -a -q)
    docker rm -vf $(docker-compose ps -q)

    # Delete all the images
    docker rmi -f $(docker images -a -q)
    docker rmi -f $(docker-compose images -q)

    # Delete images and containers from docker compose
    DOCKER_CONTAINERS=$(docker-compose ps -q); DOCKER_IMAGES=$(docker-compose images -q); docker rm -vf $DOCKER_CONTAINERS && docker rmi -f $DOCKER_IMAGES
  10. kingspp revised this gist Jul 5, 2021. 1 changed file with 7 additions and 1 deletion.
    8 changes: 7 additions & 1 deletion docker_utils.sh
    Original file line number Diff line number Diff line change
    @@ -2,4 +2,10 @@
    sudo docker save [image_name:tag] > [image_name:tag].tar.gz

    # Load Image
    sudo docker load -i [image_name:tag].tar.gz
    sudo docker load -i [image_name:tag].tar.gz

    # Delete all containers, including their volumes
    docker rm -vf $(docker ps -a -q)

    # Delete all the images
    docker rmi -f $(docker images -a -q)
  11. kingspp revised this gist Jun 19, 2020. 1 changed file with 6 additions and 1 deletion.
    7 changes: 6 additions & 1 deletion py-snippet.py
    Original file line number Diff line number Diff line change
    @@ -608,5 +608,10 @@ def batch(iterable, n=1):
    for x in batch(list(range(0, 10)), 3):
        print(x)


    # Find all the imported modules
    from modulefinder import ModuleFinder
    finder = ModuleFinder()
    finder.run_script("./main.py")
    for name, mod in finder.modules.items():
        print(name)

  12. kingspp revised this gist Jun 19, 2020. 1 changed file with 13 additions and 1 deletion.
    14 changes: 13 additions & 1 deletion tensorflow-utils.py
    Original file line number Diff line number Diff line change
    @@ -12,4 +12,16 @@ def get_available_devices(cpu: bool = True, gpu: bool = True):
    return devices

    # Check CUDA Installation and GPU Availability
    print(tf.config.list_physical_devices('GPU'))
    print(tf.config.list_physical_devices('GPU'))



    # Fetch row indices
    x = tf.random.normal([3,2])
    x = tf.convert_to_tensor(x)

    indices = tf.convert_to_tensor([0,1,0])
    col_ids = tf.expand_dims(indices, 1)
    row_ids = tf.expand_dims(tf.range(tf.shape(indices)[0]), 1)  # renamed to avoid shadowing the built-in range
    ind = tf.concat([row_ids, col_ids], 1)
    tf.gather_nd(x, ind)
  13. kingspp revised this gist Jun 9, 2020. 1 changed file with 4 additions and 0 deletions.
    4 changes: 4 additions & 0 deletions latex.tex
    Original file line number Diff line number Diff line change
    @@ -67,3 +67,7 @@
    \end{aligned}
    \end{equation}
    \end{minipage}

    % Add Paragraph break
    \par%
    \bigskip
  14. kingspp revised this gist Jun 8, 2020. 1 changed file with 69 additions and 0 deletions.
    69 changes: 69 additions & 0 deletions latex.tex
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,69 @@
    % Required Packages
    % Package Declarations
    \usepackage{arxiv}
    \usepackage[utf8]{inputenc} % allow utf-8 input
    \usepackage[T1]{fontenc} % use 8-bit T1 fonts
    \usepackage{hyperref} % hyperlinks
    \usepackage{url} % simple URL typesetting
    \usepackage{booktabs} % professional-quality tables
    \usepackage{amsfonts} % blackboard math symbols
    \usepackage{nicefrac} % compact symbols for 1/2, etc.
    \usepackage{microtype} % microtypography
    \usepackage{lipsum} % Lorem Ipsum fill text
    \usepackage{multicol} % Support for Multi columns for tables
    \usepackage{multirow} % Support for Multi rows for tables
    \usepackage{mathtools} % Advanced mathtools
    \usepackage{caption} % Advanced caption configuration
    \usepackage{amsmath} % Math package for equations
    \usepackage{titlesec} % Title section
    \usepackage{graphicx} % For including graphics (\includegraphics)
    \usepackage{stackrel} % For adding labels to parts of the equation
    \usepackage[ruled,vlined]{algorithm2e} % For Algorithms
    \usepackage{algorithm}
    \usepackage{algpseudocode}


    % Theme Configurations

    % Set section depth to 4
    % \setcounter{secnumdepth}{4}

    % \titleformat{\paragraph}
    % {\normalfont\normalsize\bfseries}{\theparagraph}{1em}{}
    % \titlespacing*{\paragraph}
    % {0pt}{3.25ex plus 1ex minus .2ex}{1.5ex plus .2ex}

    % Add padding to text below table
    \captionsetup[table]{skip=10pt}

    % Redefine \hat so that its argument is set in bold
    \let\oldhat\hat
    \renewcommand{\hat}[1]{\oldhat{\mathbf{#1}}}

    % Image

    \begin{figure}[H] % the [H] specifier requires \usepackage{float}
    \centering
    \includegraphics[scale=0.2]{assets/sdk.png}
    \captionof{figure}{Comparison between Tensorflow and Keras}
    \label{fig:workflow}
    \end{figure}


    % Two column [Image | Text]
    \noindent\begin{minipage}{.45\textwidth}
    \centering
    \includegraphics[scale=0.09]{SimpleFFN.png}
    \captionof{figure}{Feed Forward Network}
    \label{fig:SimpleFFN.png}
    \end{minipage}
    \begin{minipage}{.45\textwidth}
    \begin{equation}
    \label{eq:ffn_math_representation}
    \begin{aligned}
    Dense_{1} &= \sigma(Input \cdot \hat{W}_{1} + \hat{b}_{1}) &\\
    Dense_{2} &= \sigma(Dense_{1} \cdot \hat{W}_{2} + \hat{b}_{2}) &\\
    Dense_{3} &= \sigma(Dense_{2} \cdot \hat{W}_{3} + \hat{b}_{3})
    \end{aligned}
    \end{equation}
    \end{minipage}
  15. kingspp revised this gist Jun 5, 2020. 1 changed file with 4 additions and 1 deletion.
    5 changes: 4 additions & 1 deletion tensorflow-utils.py
    Original file line number Diff line number Diff line change
    @@ -9,4 +9,7 @@ def get_available_devices(cpu: bool = True, gpu: bool = True):
    devices = [x.name for x in local_device_protos if x.device_type == 'CPU']
    if gpu:
    devices += [x.name for x in local_device_protos if x.device_type == 'GPU']
    return devices
    return devices

    # Check CUDA Installation and GPU Availability
    print(tf.config.list_physical_devices('GPU'))
  16. kingspp revised this gist Jun 5, 2020. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions linux_snippets.sh
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,2 @@
    # Fix unmet dependencies error
    sudo apt-get -o Dpkg::Options::="--force-overwrite" install --fix-broken
  17. kingspp revised this gist Jun 4, 2020. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions jupyter_snippets.py
    Original file line number Diff line number Diff line change
    @@ -10,6 +10,7 @@
    env = gym.make('Breakout-v0')
    env.reset()
    img = plt.imshow(env.render(mode='rgb_array')) # only call this once
    plt.xticks([]),plt.yticks([])
    for _ in range(100):
        img.set_data(env.render(mode='rgb_array')) # just update the data
        display.display(plt.gcf())
  18. kingspp revised this gist Jun 4, 2020. 1 changed file with 4 additions and 1 deletion.
    5 changes: 4 additions & 1 deletion bash-snippet.sh
    Original file line number Diff line number Diff line change
    @@ -194,4 +194,7 @@ lsof -i tcp:<port number>
    shuf -n N input > output

    # Generate a random number between 1-10
    echo $(( ( RANDOM % 10 ) + 1 ))

    # SSH Based Port forwarding
    ssh -N -f -L localhost:3000:localhost:3000 username@ip
  19. kingspp revised this gist Jun 4, 2020. 2 changed files with 21 additions and 0 deletions.
    18 changes: 18 additions & 0 deletions jupyter_snippets.py
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,18 @@
    # Notebook Snippets

    # Render environment
    import gym
    from IPython import display
    import matplotlib
    import matplotlib.pyplot as plt
    %matplotlib inline

    env = gym.make('Breakout-v0')
    env.reset()
    img = plt.imshow(env.render(mode='rgb_array')) # only call this once
    for _ in range(100):
        img.set_data(env.render(mode='rgb_array')) # just update the data
        display.display(plt.gcf())
        display.clear_output(wait=True)
        action = env.action_space.sample()
        env.step(action)
    3 changes: 3 additions & 0 deletions py-snippet.py
    Original file line number Diff line number Diff line change
    @@ -607,3 +607,6 @@ def batch(iterable, n=1):

    for x in batch(list(range(0, 10)), 3):
        print(x)



  20. kingspp revised this gist Apr 5, 2020. 1 changed file with 4 additions and 1 deletion.
    5 changes: 4 additions & 1 deletion pandas.py
    Original file line number Diff line number Diff line change
    @@ -6,4 +6,7 @@
    df.columns =df.columns.map(lambda x: str(float(x)))

    # Convert a column to String
    df['ColumnID'] = df['ColumnID'].astype(str)
    df['ColumnID'] = df['ColumnID'].astype(str)

    # Check for missing values (NaN)
    df.isnull().sum().sum()
  21. kingspp revised this gist Jan 29, 2020. 1 changed file with 8 additions and 0 deletions.
    8 changes: 8 additions & 0 deletions py-snippet.py
    Original file line number Diff line number Diff line change
    @@ -599,3 +599,11 @@ def custom_metric():
    logger.error('My code logged an error')
    assert 'My code logged an error' in str(logs)

    # Simplest form of batching
    def batch(iterable, n=1):
        l = len(iterable)
        for ndx in range(0, l, n):
            yield iterable[ndx:min(ndx + n, l)]

    for x in batch(list(range(0, 10)), 3):
        print(x)
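The slicing version above needs `len()` and indexing, so it only works on sequences. As a sketch under that observation, a variant built on `itertools.islice` can batch any iterable, including generators. The name `batch_iter` is hypothetical and not part of the original snippet.

```python
from itertools import islice


def batch_iter(iterable, n=1):
    """Yield lists of up to `n` items from any iterable."""
    if n < 1:
        raise ValueError("n must be at least 1")
    it = iter(iterable)
    while True:
        chunk = list(islice(it, n))  # consume the next n items
        if not chunk:
            return  # iterator exhausted
        yield chunk


# Works on a generator, which has no len() and cannot be sliced.
batches = list(batch_iter((i for i in range(10)), 3))
```

The last batch is shorter when the input length is not a multiple of `n`, matching the behaviour of the slicing version.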
  22. kingspp revised this gist Oct 14, 2019. 1 changed file with 4 additions and 1 deletion.
    5 changes: 4 additions & 1 deletion bash-snippet.sh
    Original file line number Diff line number Diff line change
    @@ -191,4 +191,7 @@ set -o xtrace # to revert to normal - set +o xtrace
    lsof -i tcp:<port number>

    #Select random lines from a file
    shuf -n N input > output
    shuf -n N input > output

    # Generate a random number between 1-10
    echo $(( ( RANDOM % 10 ) + 1 ))
  23. kingspp revised this gist Oct 4, 2019. 2 changed files with 6 additions and 3 deletions.
    5 changes: 4 additions & 1 deletion bash-snippet.sh
    Original file line number Diff line number Diff line change
    @@ -188,4 +188,7 @@ set -o xtrace # to revert to normal - set +o xtrace
    # or bash -x myscript.sh

    #list process on port
    lsof -i tcp:<port number>
    lsof -i tcp:<port number>

    #Select random lines from a file
    shuf -n N input > output
    4 changes: 2 additions & 2 deletions macos-snippets.sh
    Original file line number Diff line number Diff line change
    @@ -48,5 +48,5 @@ sudo codesign -f -s - /Library/Frameworks/Python.framework/Versions/3.6/Resource
    # Disable Gatekeeper for installing apps from unidentified devs
    sudo spctl --master-disable

    # Install Tree View for listing file structure
    brew install tree
    # Install Tree View for listing file structure, Linux Core Utils
    brew install tree coreutils
  24. kingspp revised this gist Oct 4, 2019. 1 changed file with 3 additions and 0 deletions.
    3 changes: 3 additions & 0 deletions macos-snippets.sh
    Original file line number Diff line number Diff line change
    @@ -47,3 +47,6 @@ sudo codesign -f -s - /Library/Frameworks/Python.framework/Versions/3.6/Resource

    # Disable Gatekeeper for installing apps from unidentified devs
    sudo spctl --master-disable

    # Install Tree View for listing file structure
    brew install tree
  25. kingspp revised this gist Aug 28, 2019. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions macos-snippets.sh
    Original file line number Diff line number Diff line change
    @@ -45,3 +45,5 @@ caffeinate -u -t <seconds>
    # Stop the "allow incoming connections" prompt for Python.app (PyCharm)
    sudo codesign -f -s - /Library/Frameworks/Python.framework/Versions/3.6/Resources/Python.app/

    # Disable Gatekeeper for installing apps from unidentified devs
    sudo spctl --master-disable
  26. kingspp revised this gist May 6, 2019. 1 changed file with 18 additions and 0 deletions.
    18 changes: 18 additions & 0 deletions mongo_utils.js
    Original file line number Diff line number Diff line change
    @@ -2,3 +2,21 @@
    // false for upsert
    // true for multiple documents
    db.<collection_name>.update({"<original_key.h1.h2>": {$exists: true}}, {$rename:{"<original_key.h1.h2>":"<new_name>"}}, false, true);

    // Merge two collections (moves every document from c1 into c2)
    db.c1.find().forEach(function(item) {
        db.c2.insert(item);
        db.c1.remove(item);
    });

    // Get unique keys in a collection
    mr = db.runCommand({
        "mapreduce" : "my_collection",
        "map" : function() {
            for (var key in this) { emit(key, null); }
        },
        "reduce" : function(key, stuff) { return null; },
        "out": "my_collection" + "_keys"
    })
    db[mr.result].distinct("_id")
    // => ["foo", "bar", "baz", "_id", ...]
  27. kingspp revised this gist May 4, 2019. 1 changed file with 4 additions and 0 deletions.
    4 changes: 4 additions & 0 deletions mongo_utils.js
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,4 @@
    // Rename fields
    // false for upsert
    // true for multiple documents
    db.<collection_name>.update({"<original_key.h1.h2>": {$exists: true}}, {$rename:{"<original_key.h1.h2>":"<new_name>"}}, false, true);
  28. kingspp revised this gist Mar 18, 2019. 1 changed file with 9 additions and 0 deletions.
    9 changes: 9 additions & 0 deletions py-snippet.py
    Original file line number Diff line number Diff line change
    @@ -590,3 +590,12 @@ def custom_metric():
    import tensorflow as tf
    print(tf.metrics)

    # Test if a function is printing the required log
    import logging
    from testfixtures import LogCapture
    logger = logging.getLogger('')
    with LogCapture() as logs:
        # my awesome code
        logger.error('My code logged an error')
        assert 'My code logged an error' in str(logs)
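If `testfixtures` is not available, roughly the same check can be sketched with the standard library alone by attaching a handler that stores records in a list. `ListHandler` and the logger name `snippet_demo` are hypothetical names used only for this illustration.

```python
import logging


class ListHandler(logging.Handler):
    """Collect every log record emitted while the handler is attached."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("snippet_demo")
handler = ListHandler()
logger.addHandler(handler)
try:
    # my awesome code
    logger.error("My code logged an error")
finally:
    logger.removeHandler(handler)  # always detach, even if the code raises

messages = [record.getMessage() for record in handler.records]
```

Inside a `unittest.TestCase`, `self.assertLogs()` covers the same need without a custom handler.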

  29. kingspp revised this gist Feb 20, 2019. 3 changed files with 6 additions and 2 deletions.
    3 changes: 3 additions & 0 deletions bash-snippet.sh
    Original file line number Diff line number Diff line change
    @@ -186,3 +186,6 @@ exit 0
    # To log every command before execution
    set -o xtrace # to revert to normal - set +o xtrace
    # or bash -x myscript.sh

    #list process on port
    lsof -i tcp:<port number>
    2 changes: 0 additions & 2 deletions mac.sh
    Original file line number Diff line number Diff line change
    @@ -1,2 +0,0 @@
    #list process on port
    lsof -i tcp:<port number>
    3 changes: 3 additions & 0 deletions macos-snippets.sh
    Original file line number Diff line number Diff line change
    @@ -42,3 +42,6 @@ caffeinate
    # Timeout
    caffeinate -u -t <seconds>

    # Stop the "allow incoming connections" prompt for Python.app (PyCharm)
    sudo codesign -f -s - /Library/Frameworks/Python.framework/Versions/3.6/Resources/Python.app/

  30. kingspp revised this gist Feb 15, 2019. 1 changed file with 23 additions and 0 deletions.
    23 changes: 23 additions & 0 deletions py-snippet.py
    Original file line number Diff line number Diff line change
    @@ -567,3 +567,26 @@ def __getitem__(self, item):

    # Check if -ve no in python list
    any(n < 0 for n in any_list)

    # Temporarily disable a module attribute (e.g. tf.metrics) inside a block
    from contextlib import contextmanager

    @contextmanager
    def custom_metric():
        import tensorflow
        t = tensorflow.metrics
        delattr(tensorflow, "metrics")
        try:
            yield None
        finally:
            # Restore the attribute even if the with-block raises
            tensorflow.metrics = t


    with custom_metric():
        import tensorflow as tf
        try:
            print(tf.metrics)
        except AttributeError:
            print("Cannot import metrics inside custom metric")

    import tensorflow as tf
    print(tf.metrics)
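The same hide-and-restore pattern can be sketched without TensorFlow installed. The example below is an illustration only: `hidden_attribute` is a hypothetical helper, and `fake_tf` is a stand-in module object created for the demo, not the real library.

```python
import types
from contextlib import contextmanager


@contextmanager
def hidden_attribute(module, name):
    """Remove `module.name` for the duration of the with-block."""
    saved = getattr(module, name)
    delattr(module, name)
    try:
        yield
    finally:
        # Restore the attribute even if the block raises.
        setattr(module, name, saved)


# Stand-in module so the example runs without any third-party package.
fake_tf = types.ModuleType("fake_tf")
fake_tf.metrics = "metrics-api"

with hidden_attribute(fake_tf, "metrics"):
    hidden = not hasattr(fake_tf, "metrics")

restored = fake_tf.metrics
```

Taking the module and attribute name as parameters lets one helper cover any attribute, rather than hard-coding `tensorflow.metrics`.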