feihugis / matmul_fp8.py

Created February 7, 2023 23:28

	import numpy as np
	import time
	import torch
	import triton
	import triton.language as tl
	import pytest

feihugis / tf_func_perf.py

Last active June 11, 2020 06:25

	import tensorflow as tf
	import tensorflow_hub as hub
	import numpy as np

	import enum
	import os
	import time
	import timeit

	class InferMode(enum.Enum):

feihugis / vscode_shortcuts.md

Created March 16, 2020 15:51 — forked from bradtraversy/vscode_shortcuts.md

Helpful shortcuts for VSCode

VSCode Shortcuts

List of helpful shortcuts for faster coding

If you have any other helpful shortcuts, feel free to add them in the comments of this gist :)

Official List of all commands

Windows
Mac

feihugis / profiling_trace_intel_tf_1.14.0.json

Created October 11, 2019 22:48

LSTM Profiling on TensorFlow 1.14.0

	{
	"traceEvents": [
	{
	"name": "process_name",
	"ph": "M",
	"pid": 0,
	"args": {
	"name": "Allocators"
	}
	},

feihugis / Log for the test_case_7 (num_parallel_calls=1, num_thread=2) of ParallelMapDatasetOp

Created April 8, 2019 18:46

	exec ${PAGER:-/usr/bin/less} "$0" || exit 1
	Executing tests from //tensorflow/core/kernels/data:parallel_map_dataset_op_test
	-----------------------------------------------------------------------------
	2019-04-08 18:44:50.418358: I tensorflow/core/platform/cloud/gcs_file_system.cc:688] GCS cache max size = 16777216 ; block size = 16777216 ; max staleness = 0
	2019-04-08 18:44:50.419029: I ./tensorflow/core/platform/cloud/ram_file_block_cache.h:63] GCS file block cache is enabled
	2019-04-08 18:44:50.419044: I tensorflow/core/platform/cloud/gcs_file_system.cc:728] GCS DNS cache is disabled, because GCS_RESOLVE_REFRESH_SECS = 0 (or is not set)
	2019-04-08 18:44:50.419054: I tensorflow/core/platform/cloud/gcs_file_system.cc:758] GCS additional header DISABLED. No environment variable set.
	2019-04-08 18:44:50.432563: I tensorflow/core/platform/cloud/gcs_file_system.cc:688] GCS cache max size = 16777216 ; block size = 16777216 ; max staleness = 0
	2019-04-08 18:44:50.432601: I ./tensorflow/core/platform/cloud/ram_fil

feihugis / Log for the test_case_7 (num_parallel_calls=num_thread=2) of ParallelMapDatasetOp

Last active April 8, 2019 18:41

	exec ${PAGER:-/usr/bin/less} "$0" || exit 1
	Executing tests from //tensorflow/core/kernels/data:parallel_map_dataset_op_test
	-----------------------------------------------------------------------------
	2019-04-08 17:36:26.766535: I tensorflow/core/platform/cloud/gcs_file_system.cc:688] GCS cache max size = 16777216 ; block size = 16777216 ; max staleness = 0
	2019-04-08 17:36:26.767225: I ./tensorflow/core/platform/cloud/ram_file_block_cache.h:63] GCS file block cache is enabled
	2019-04-08 17:36:26.767242: I tensorflow/core/platform/cloud/gcs_file_system.cc:728] GCS DNS cache is disabled, because GCS_RESOLVE_REFRESH_SECS = 0 (or is not set)
	2019-04-08 17:36:26.767254: I tensorflow/core/platform/cloud/gcs_file_system.cc:758] GCS additional header DISABLED. No environment variable set.
	2019-04-08 17:36:26.783000: I tensorflow/core/platform/cloud/gcs_file_system.cc:688] GCS cache max size = 16777216 ; block size = 16777216 ; max staleness = 0
	2019-04-08 17:36:26.783048: I ./tensorflow/core/platform/cloud/ram_fil

feihugis / Logs

Last active February 1, 2019 00:06

Logs from `bazel test -c dbg //tensorflow/python/data/kernel_tests:map_test`

	INFO: From Linking tensorflow/libtensorflow_framework.so [for host]:
	ld: warning: text-based stub file /System/Library/Frameworks//IOKit.framework/IOKit.tbd and library file /System/Library/Frameworks//IOKit.framework/IOKit are out of sync. Falling back to library file for linking.
	INFO: From Linking tensorflow/libtensorflow_framework.so:
	ld: warning: text-based stub file /System/Library/Frameworks//IOKit.framework/IOKit.tbd and library file /System/Library/Frameworks//IOKit.framework/IOKit are out of sync. Falling back to library file for linking.
	[5,815 / 5,816] Linking tensorflow/python/_pywrap_tensorflow_internal.so; 54s local

	......

	ld: warning: cannot export hidden symbol std::__1::__function::__base<tensorflow::Status (tensorflow::ReaderInterface**)>::~__base() from bazel-out/darwin-dbg/bin/tensorflow/core/kernels/liblmdb_reader_op.lo(lmdb_reader_op.pic.o)
	ld: warning: cannot export hidden symbol std::__1::__function::__base<tensorflow::ReaderInterface* ()>::~__base() from bazel-out/darwin-dbg/bin/te

feihugis / building_tensorflow.md

Created September 21, 2018 05:50 — forked from kmhofmann/building_tensorflow.md

Building TensorFlow from source

The official instructions on building TensorFlow are here: https://www.tensorflow.org/install/install_sources

Prerequisites

We assume a build with CUDA support and SIMD optimizations (SSE3, SSE4, AVX, AVX2, FMA) on a Debian-like system (e.g. Ubuntu Linux).

On a new system, you will have to install CUDA, cuDNN, and the following dependencies:

feihugis / TENSORFLOW_DEBUG.md

Created September 21, 2018 05:42 — forked from Mistobaan/TENSORFLOW_DEBUG.md

TensorFlow Internals Debugging Techniques

Machine Setup August 2016

Linux Ubuntu 2016.

1080 GTX
SDK 8.0
CuDNN 5.1

ENABLE Core dumps

ulimit -c unlimited
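
When the process to debug is launched from Python rather than from a shell, a similar effect can be sketched with the standard `resource` module (Unix only). This is not part of the original gist: the helper name `enable_core_dumps` is hypothetical, and the sketch only raises the soft limit to the current hard limit, so it cannot exceed whatever the system already allows.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit to the current hard limit.

    Hypothetical helper for illustration. Unlike `ulimit -c unlimited`,
    it does not try to raise the hard limit, which may need privileges.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # Setting the soft limit equal to the hard limit is always permitted.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


new_soft, new_hard = enable_core_dumps()
print("core limit (soft, hard):", new_soft, new_hard)
```

If the hard limit is already 0, core dumps stay disabled and the limit has to be raised from the shell or system configuration instead.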

Fei Hu feihugis
