Skip to content

Instantly share code, notes, and snippets.

View netankit's full-sized avatar

Ankit Bahuguna netankit

View GitHub Profile
@thomwolf
thomwolf / loading_wikipedia.py
Last active January 12, 2025 13:34
Load full English Wikipedia dataset in HuggingFace nlp library
import os; import psutil; import timeit
from datasets import load_dataset
mem_before = psutil.Process(os.getpid()).memory_info().rss >> 20
wiki = load_dataset("wikipedia", "20200501.en", split='train')
mem_after = psutil.Process(os.getpid()).memory_info().rss >> 20
print(f"RAM memory used: {(mem_after - mem_before)} MB")
s = """batch_size = 1000
for i in range(0, len(wiki), batch_size):
@mmellison
mmellison / grpc_asyncio.py
Last active August 6, 2024 01:23
gRPC Servicer with Asyncio (Python 3.6+)
import asyncio
from concurrent import futures
import functools
import inspect
import threading
from grpc import _server
def _loop_mgr(loop: asyncio.AbstractEventLoop):
@bastman
bastman / docker-cleanup-resources.md
Created March 31, 2016 05:55
docker cleanup guide: containers, images, volumes, networks

Docker - How to cleanup (unused) resources

Once in a while, you may need to cleanup resources (containers, volumes, images, networks) ...

delete volumes

// see: https://github.com/chadoe/docker-cleanup-volumes

$ docker volume rm $(docker volume ls -qf dangling=true)

$ docker volume ls -qf dangling=true | xargs -r docker volume rm

import theano
from pylearn2.models import mlp
from pylearn2.training_algorithms import sgd
from pylearn2.termination_criteria import EpochCounter
from pylearn2.datasets.dense_design_matrix import DenseDesignMatrix
import numpy as np
from random import randint
class XOR(DenseDesignMatrix):
@bwhite
bwhite / rank_metrics.py
Created September 15, 2012 03:23
Ranking Metrics
"""Information Retrieval metrics
Useful Resources:
http://www.cs.utexas.edu/~mooney/ir-course/slides/Evaluation.ppt
http://www.nii.ac.jp/TechReports/05-014E.pdf
http://www.stanford.edu/class/cs276/handouts/EvaluationNew-handout-6-per.pdf
http://hal.archives-ouvertes.fr/docs/00/72/67/60/PDF/07-busa-fekete.pdf
Learning to Rank for Information Retrieval (Tie-Yan Liu)
"""
import numpy as np
@entaroadun
entaroadun / gist:1653794
Created January 21, 2012 20:10
Recommendation and Ratings Public Data Sets For Machine Learning

Movies Recommendation:

Music Recommendation: