Daniel Vila Suero dvsrepo

Building data tools at Hugging Face (previously CEO & co-founder of Argilla, acquired by Hugging Face)

dvsrepo / zephyr-7b-spin-iter1-v0-Nous.md

Created March 7, 2024 11:07

Model	AGIEval	GPT4All	TruthfulQA	Bigbench
zephyr-7b-spin-iter1-v0	Error: File does not exist	Error: File does not exist	Error: File does not exist	Error: File does not exist

Average: Error: File does not exist%

dvsrepo / zephyr-7b-spin-iter1-v0-Nous.md

Created March 7, 2024 11:03

Model	AGIEval	GPT4All	TruthfulQA	Bigbench
zephyr-7b-spin-iter1-v0	Error: File does not exist	Error: File does not exist	Error: File does not exist	Error: File does not exist

Average: Error: File does not exist%

dvsrepo / rubrix-stanza.ipynb

Created December 19, 2021 21:59


dvsrepo / rubrix_shap_example.py

Last active September 22, 2022 09:29

	import transformers
	from datasets import load_dataset

	from sklearn.preprocessing import MinMaxScaler

	import shap

	from rubrix import TextClassificationRecord, TokenAttributions

	import rubrix as rb

dvsrepo / rubrix_interpret.py

Last active October 23, 2021 11:11

	from transformers import AutoModelForSequenceClassification, AutoTokenizer
	from transformers_interpret import SequenceClassificationExplainer
	from datasets import load_dataset

	import rubrix as rb
	from rubrix import TokenAttributions

	# Load Stanford sentiment treebank test set
	dataset = load_dataset("sst", "default", split="test")

dvsrepo / huggingface_rubrix_example_load_train.py

Last active June 2, 2021 11:17

	from datasets import Dataset
	import rubrix as rb

	# load rubrix dataset
	df = rb.load('unlabelled_dataset_zeroshot')

	# inputs can be dicts to support multifield classifiers, we just use the text here.
	df['text'] = df.inputs.transform(lambda r: r['text'])

	# we flatten the annotations and create a dict for turning labels into numeric ids
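The preview above stops at the comment about turning labels into numeric ids. As a minimal sketch of that idea, assuming one string label per record, a label-to-id mapping could look like the following. The `annotations` list and the names `label2id` and `label_ids` are hypothetical and are not taken from the gist.

```python
# Hypothetical annotations, one label per record (not taken from the gist).
annotations = ["Sports", "World", "Business", "Sports", "Sci/Tech"]

# Sort the unique labels so the mapping is deterministic across runs.
label2id = {label: idx for idx, label in enumerate(sorted(set(annotations)))}

# Replace each string label with its numeric id.
label_ids = [label2id[label] for label in annotations]

print(label2id)
print(label_ids)
```

Sorting before enumerating is a design choice here: iterating a plain `set` would give an ordering that may change between runs, which would make the ids unstable.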

dvsrepo / huggingface_rubrix_example.py

Last active June 2, 2021 11:22

	from transformers import AutoModelForSequenceClassification
	from transformers import AutoTokenizer
	from transformers import Trainer

	# from here, it's just regular fine-tuning with 🤗 transformers
	tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
	model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=4)

	def tokenize_function(examples):
	    return tokenizer(examples["text"], padding="max_length", truncation=True)

dvsrepo / rubrix_ner.py

Last active June 1, 2021 21:39

	import rubrix as rb

	text = "I love the song Computer Love from Kraftwerk"

	# Entity spans are (label, start_char, end_char) offsets into `text`
	record = rb.TokenClassificationRecord(
	    text=text,
	    tokens=text.split(" "),
	    prediction=[("SONG", 16, 29), ("BAND", 35, 44)],
	    prediction_agent="my_ner_model_v1",
	)
	rb.log(record, name="ner_bands_dataset")
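The predictions in the snippet above are given as `(label, start, end)` character offsets. A minimal sketch of how such offsets could be derived from the text instead of counted by hand is shown below. The helper `find_span` is hypothetical and is not part of rubrix; it only uses Python's built-in `str.find`.

```python
def find_span(text, phrase, label):
    """Return a (label, start, end) tuple for the first occurrence of phrase."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in text")
    # The end offset is exclusive, so text[start:end] == phrase.
    return (label, start, start + len(phrase))


text = "I love the song Computer Love from Kraftwerk"
spans = [
    find_span(text, "Computer Love", "SONG"),
    find_span(text, "Kraftwerk", "BAND"),
]
print(spans)  # [('SONG', 16, 29), ('BAND', 35, 44)]
```

The resulting tuples match the hard-coded offsets in the record, and slicing the text with them returns the original phrases.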

dvsrepo / rubrix-example.py

Last active June 1, 2021 20:11

	from transformers import pipeline
	from datasets import load_dataset
	import rubrix as rb

	model = pipeline('zero-shot-classification', model="typeform/squeezebert-mnli")
	dataset = load_dataset("ag_news", split='test')
	# Labels are: 'World', 'Sports', 'Business', 'Sci/Tech'
	labels = dataset.features["label"].names

	for example in dataset:

dvsrepo / example_0D_duration.ttl

Last active April 22, 2021 08:07

	@base <https://www.food.com/recipe/> .
	@prefix ind: <http://purl.org/heals/ingredient/> .
	@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
	@prefix wtm: <http://purl.org/heals/food/> .
	@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

	ind:Almond a wtm:Ingredient ;
	    skos:definition "the nutlike kernel of the fruit of either of two trees, Prunus dulcis (sweet almond) or P. dulcis amara (bitter almond), which grow in warm temperate regions" ;
	    skos:prefLabel "almond" .