Skip to content

Instantly share code, notes, and snippets.

@xmfbit
xmfbit / model.py
Last active March 5, 2023 17:03
ResNet-164 training experiment on CIFAR10 using PyTorch, see the paper: Identity Mappings in Deep Residual Networks
import torch
import torch.nn as nn
import math
## the model definition
# see HeKaiming's implementation using torch:
# https://github.com/KaimingHe/resnet-1k-layers/blob/master/README.md
class Bottleneck(nn.Module):
expansion = 4 # # output cahnnels / # input channels

A Tour of PyTorch Internals (Part I)

The fundamental unit in PyTorch is the Tensor. This post will serve as an overview for how we implement Tensors in PyTorch, such that the user can interact with it from the Python shell. In particular, we want to answer four main questions:

  1. How does PyTorch extend the Python interpreter to define a Tensor type that can be manipulated from Python code?
  2. How does PyTorch wrap the C libraries that actually define the Tensor's properties and methods?
  3. How does PyTorch cwrap work to generate code for Tensor methods?
  4. How does PyTorch's build system take all of these components to compile and generate a workable application?

Extending the Python Interpreter

PyTorch defines a new package torch. In this post we will consider the ._C module. This module is known as an "extension module" - a Python module written in C. Such modules allow us to define new built-in object types (e.g. the Tensor) and to call C/C++ functions.

@honkskillet
honkskillet / byte-sizetuts.md
Last active August 22, 2024 14:19
A series of golang tutorials with youtube videos.
@neubig
neubig / lstm-lm.py
Last active August 23, 2017 09:18
This is a minimal implementation of training for a language model using long short-term memory (LSTM) neural networks
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# This is a simplified implementation of the LSTM language model (by Graham Neubig)
#
# LSTM Neural Networks for Language Modeling
# Martin Sundermeyer, Ralf Schlüter, Hermann Ney
# InterSpeech 2012
#
# The structure of the model is extremely simple. At every time step we