Nils Rethmeier (NilsRethmeier)

:shipit:
| coding > leading > planning | < ☕
View GitHub Profile
@NilsRethmeier
NilsRethmeier / multi_dimensional_crossentropy_pytorch.py
Created February 10, 2023 23:04
How to use torch.nn.CrossEntropyLoss with more than 2 dimensions -- i.e. permute the input to (batch_size, num_classes, dim3, dim4). E.g. per-token classification in sequence labeling
import torch


def multi_dim_cross_entropy_test():
    """torch.nn.CrossEntropyLoss needs the class dimension second, not last.
    Instead of (bs, seq_len, num_labels) it expects:
        input:  (bs, num_labels, seq_len)  -- the logits
        target: (bs, seq_len)              -- one label in [0, num_labels) per seq_len position
    """
    # Example of target with class indices
    torch.manual_seed(0)
    loss = torch.nn.CrossEntropyLoss(reduction='none')
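The gist preview is cut off above; a minimal runnable sketch of the intended usage (the sizes bs=2, seq_len=5, num_labels=4 and the names logits/target are illustrative assumptions, not taken from the gist):

import torch

torch.manual_seed(0)
bs, seq_len, num_labels = 2, 5, 4                      # illustrative sizes
logits = torch.randn(bs, seq_len, num_labels)          # typical model output layout
target = torch.randint(0, num_labels, (bs, seq_len))   # one class index per token position

loss_fn = torch.nn.CrossEntropyLoss(reduction='none')
# CrossEntropyLoss wants class scores on dim 1, so permute to (bs, num_labels, seq_len)
per_token_loss = loss_fn(logits.permute(0, 2, 1), target)
print(per_token_loss.shape)                            # torch.Size([2, 5]), one loss per position

With reduction='none' the result keeps the (bs, seq_len) layout, which is convenient for masking out padding tokens before averaging.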
@NilsRethmeier
NilsRethmeier / bash_history_to_zsh_history.sh
Created September 13, 2022 15:08
copy bash history to zsh history using awk -- found in a user comment
# zsh extended-history lines look like ": <start-time>:<elapsed>;<command>"; note that sort | uniq deduplicates but discards chronological order
sort ~/.bash_history | uniq | awk '{print ": 0:0;"$0}' >> ~/.zsh_history
@NilsRethmeier
NilsRethmeier / gradient_accumulation.py
Created September 1, 2020 13:22 — forked from thomwolf/gradient_accumulation.py
PyTorch gradient accumulation training loop
model.zero_grad()                                   # Reset gradients tensors
for i, (inputs, labels) in enumerate(training_set):
    predictions = model(inputs)                     # Forward pass
    loss = loss_function(predictions, labels)       # Compute loss function
    loss = loss / accumulation_steps                # Normalize our loss (if averaged)
    loss.backward()                                 # Backward pass
    if (i+1) % accumulation_steps == 0:             # Wait for several backward steps
        optimizer.step()                            # Now we can do an optimizer step
        model.zero_grad()                           # Reset gradients tensors
        if (i+1) % evaluation_steps == 0:           # Evaluate the model when we...
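The fork's preview also cuts off above. A self-contained sketch of the same accumulation pattern, where the toy model, dummy training_set, optimizer, and the value of accumulation_steps are assumptions for illustration rather than part of the original gist:

import torch

torch.manual_seed(0)
model = torch.nn.Linear(10, 2)                                   # toy model, purely illustrative
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_function = torch.nn.CrossEntropyLoss()
training_set = [(torch.randn(4, 10), torch.randint(0, 2, (4,)))  # dummy data
                for _ in range(8)]
accumulation_steps = 4                                           # one optimizer step per 4 mini-batches

model.zero_grad()
for i, (inputs, labels) in enumerate(training_set):
    loss = loss_function(model(inputs), labels) / accumulation_steps
    loss.backward()                                              # gradients accumulate across iterations
    if (i + 1) % accumulation_steps == 0:
        optimizer.step()                                         # step once per accumulation window
        model.zero_grad()                                        # then clear the accumulated gradients

Dividing each mini-batch loss by accumulation_steps keeps the accumulated gradient equivalent to a mean-reduced loss over one large batch made of accumulation_steps mini-batches.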