Farshid Varno fvarno

🏟️ Building something phenomenal ...
@fvarno
fvarno / gemm_tiling.py
Created February 12, 2024 16:25
GeMM with tiling
import torch

# multiply an MxN matrix by an NxK matrix
M, N, K = 20, 10, 30
A = torch.randn(M, N)
B = torch.randn(N, K)
untiled_res = torch.matmul(A, B)

# split the shared N dimension (and B's K dimension) into tiles of size 5
# A_ : (M, 1, N//tile_size, 1, tile_size), B_ : (K//tile_size, N//tile_size, tile_size, tile_size)
tile_size = 5
A_ = A.reshape(M, 1, N // tile_size, 1, tile_size)
B_ = B.reshape(N // tile_size, tile_size, K // tile_size, tile_size).permute(2, 0, 1, 3)
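The preview cuts off before the tiled product is actually formed; one plausible way to finish it (the batched matmul over tiles and the final check are my reconstruction, not lines taken from the gist):
# (M, 1, N//t, 1, t) @ (K//t, N//t, t, t) broadcasts to (M, K//t, N//t, 1, t);
# summing over the N//t axis accumulates the per-tile partial products
tiled_res = torch.matmul(A_, B_).sum(dim=2).reshape(M, K)
assert torch.allclose(untiled_res, tiled_res, atol=1e-5)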
@fvarno
fvarno / total_activation_size_mobilenetv2.py
Created September 7, 2023 13:42
Gives the total number of activations in MobileNetV2
import torch
from torchvision.models import MobileNetV2


def main():
    model = MobileNetV2()
    total = [0]

    # count every element of every module's output via forward hooks
    def _forward_hook(module, input, output):
        total[0] += output.numel()

    # note: named_modules() yields container modules too, so their outputs are counted as well
    for _, module in model.named_modules():
        module.register_forward_hook(_forward_hook)
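The preview stops before anything is run through the model; the continuation below is my guess at the missing lines (the 1x3x224x224 dummy input and the print are assumptions, not taken from the gist):
    # one dummy forward pass fires every hook, then report the accumulated count
    with torch.no_grad():
        model(torch.randn(1, 3, 224, 224))
    print(f"total activations: {total[0]}")


if __name__ == "__main__":
    main()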
@fvarno
fvarno / fix_gcp_nvidia_fail.sh
Created August 28, 2023 13:38
Fix GCP error "NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver."
# remove the broken driver packages first
sudo apt-get purge 'nvidia-*'
sudo apt-get update
sudo apt-get autoremove
# then stop and start the VM again
# reinstall the driver with the GCP compute-gpu-installation script
curl https://raw.githubusercontent.com/GoogleCloudPlatform/compute-gpu-installation/main/linux/install_gpu_driver.py --output install_gpu_driver.py
sudo python3 install_gpu_driver.py
# verify the fix with: nvidia-smi
@fvarno
fvarno / Matrix.md
Created June 15, 2023 20:14 — forked from nadavrot/Matrix.md
Efficient matrix multiplication

High-Performance Matrix Multiplication

This is a short post that explains how to write a high-performance matrix multiplication program on modern processors. In this tutorial I will use a single core of the Skylake-client CPU with AVX2, but the principles in this post also apply to other processors with different instruction sets (such as AVX512).

Intro

Matrix multiplication is a mathematical operation that defines the product of two matrices.
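As a baseline for what the post goes on to optimize, the definition written out as a naive triple loop (plain Python, my own illustration rather than anything from the forked post):
# textbook O(M*N*K) multiply: C[i][j] = sum over k of A[i][k] * B[k][j]
def naive_matmul(A, B):
    M, N, K = len(A), len(B), len(B[0])
    C = [[0.0] * K for _ in range(M)]
    for i in range(M):
        for j in range(K):
            for k in range(N):
                C[i][j] += A[i][k] * B[k][j]
    return C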

@fvarno
fvarno / jupyter.yaml
Created September 30, 2021 19:44
Polyaxon Jupyter service with an optional number of GPUs
version: 1.1
kind: operation
component:
  name: notebook
  inputs:
  - name: gpus
    isOptional: true
    type: int
    value: 0
  - name: image
@fvarno
fvarno / vscode.yaml
Created September 30, 2021 19:42
Polyaxon VS Code service with an optional number of GPUs
version: 1.1
kind: operation
component:
  name: vscode
  inputs:
  - name: context
    description: The workspace context, defaults to the current run's outputs
    isOptional: true
    type: str
@fvarno
fvarno / data_loader.py
Created November 5, 2018 13:51 — forked from kevinzakka/data_loader.py
Train, Validation and Test Split for torchvision Datasets
"""
Create train, valid, test iterators for CIFAR-10 [1].
Easily extended to MNIST, CIFAR-100 and Imagenet.
[1]: https://discuss.pytorch.org/t/feedback-on-pytorch-for-kaggle-competitions/2252/4
"""
import torch
import numpy as np
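The preview ends at the imports; below is a rough sketch of the split the docstring describes, using SubsetRandomSampler (the function name, the 0.1 validation fraction, and the ./data path are my assumptions, not the forked gist's actual code):
from torch.utils.data import DataLoader, SubsetRandomSampler
from torchvision import datasets, transforms


def get_cifar10_loaders(data_dir="./data", batch_size=64, valid_frac=0.1, seed=0):
    transform = transforms.ToTensor()
    train_set = datasets.CIFAR10(data_dir, train=True, download=True, transform=transform)
    test_set = datasets.CIFAR10(data_dir, train=False, download=True, transform=transform)

    # shuffle the training indices once (np is imported above), then carve off a validation slice
    indices = np.random.RandomState(seed).permutation(len(train_set)).tolist()
    split = int(valid_frac * len(train_set))
    valid_idx, train_idx = indices[:split], indices[split:]

    train_loader = DataLoader(train_set, batch_size=batch_size, sampler=SubsetRandomSampler(train_idx))
    valid_loader = DataLoader(train_set, batch_size=batch_size, sampler=SubsetRandomSampler(valid_idx))
    test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)
    return train_loader, valid_loader, test_loader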
@fvarno
fvarno / meta-learning-timeline.csv
Last active October 17, 2018 22:05
meta-learning timeline
name or nickname,author(s),year,category,description
"Evolutionary principles in self-referential learning or on learning how to learn: the meta-meta",Schmidhuber,1987,?,?
"Meta-neural networks that learn by learning.",Naik et al.,1992,?,?