@Jaid
Last active June 20, 2024 00:15

Revisions

  1. Jaid revised this gist Jun 20, 2024. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion best-open-source-ai-models.md
    Original file line number Diff line number Diff line change
    @@ -18,4 +18,5 @@ Text to Speech 🇺🇸|[Parler](https://github.com/huggingface/parler-tts)|0.6
    Speech to Text 🇺🇸|[Whisper Large v3](https://huggingface.co/openai/whisper-large-v3)|1.54
    Text to Speech 🇩🇪|
    Speech to Text 🇩🇪|
    Speech to Language|[Lang ID](https://huggingface.co/speechbrain/lang-id-commonlanguage_ecapa)
    Speech to Language|[Lang ID](https://huggingface.co/speechbrain/lang-id-commonlanguage_ecapa)
    Speaking animation for portraits|[Hallo](https://github.com/fudan-generative-vision/hallo)
  2. Jaid revised this gist Jun 19, 2024. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions best-open-source-ai-models.md
    Original file line number Diff line number Diff line change
    @@ -6,8 +6,8 @@ Image Object Detection|[DETR-DC5 R101](https://huggingface.co/facebook/detr-resn
    Image Masking|[Segment Anything + ViT Huge](https://huggingface.co/facebook/sam-vit-huge)|0.641
    Image Depth Map Creation|[Depth Anything v2 Huge](https://huggingface.co/collections/depth-anything/depth-anything-v2-666b22412f18a6dbfde23a93)|1.3|not released yet, only Small to Large
    Caption to Image|[Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)|2.6
    Caption to Video|[Stable Video Diffusion XT](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt), [Open-Sora](https://github.com/hpcaitech/Open-Sora#model-weights)
    Image to Video|
    Caption to Video|[Open-Sora](https://github.com/hpcaitech/Open-Sora#model-weights)
    Image to Video|[Stable Video Diffusion XT](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)
    Caption to Sound|
    Image to 3D|[InstantMesh](https://huggingface.co/TencentARC/InstantMesh)||taken from [3D Arena](https://huggingface.co/spaces/dylanebert/3d-arena) / maybe [SV3D](https://huggingface.co/stabilityai/sv3d)
    Coding LLM (Instruct)|[DeepSeek Coder v2 Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Instruct)|236
  3. Jaid revised this gist Jun 19, 2024. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion best-open-source-ai-models.md
    Original file line number Diff line number Diff line change
    @@ -6,7 +6,7 @@ Image Object Detection|[DETR-DC5 R101](https://huggingface.co/facebook/detr-resn
    Image Masking|[Segment Anything + ViT Huge](https://huggingface.co/facebook/sam-vit-huge)|0.641
    Image Depth Map Creation|[Depth Anything v2 Huge](https://huggingface.co/collections/depth-anything/depth-anything-v2-666b22412f18a6dbfde23a93)|1.3|not released yet, only Small to Large
    Caption to Image|[Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)|2.6
    Caption to Video|[Stable Video Diffusion XT](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)
    Caption to Video|[Stable Video Diffusion XT](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt), [Open-Sora](https://github.com/hpcaitech/Open-Sora#model-weights)
    Image to Video|
    Caption to Sound|
    Image to 3D|[InstantMesh](https://huggingface.co/TencentARC/InstantMesh)||taken from [3D Arena](https://huggingface.co/spaces/dylanebert/3d-arena) / maybe [SV3D](https://huggingface.co/stabilityai/sv3d)
  4. Jaid revised this gist Jun 18, 2024. 1 changed file with 10 additions and 8 deletions.
    18 changes: 10 additions & 8 deletions best-open-source-ai-models.md
    Original file line number Diff line number Diff line change
    @@ -3,17 +3,19 @@
    Task|Model|Params (billions)|Notes
    ---|---|--:|---
    Image Object Detection|[DETR-DC5 R101](https://huggingface.co/facebook/detr-resnet-101)|0.607
    Depth Map Creation|[Depth Anything v2 Huge](https://huggingface.co/collections/depth-anything/depth-anything-v2-666b22412f18a6dbfde23a93)|1.3|not released yet, only Small to Large
    Image Masking|[Segment Anything + ViT Huge](https://huggingface.co/facebook/sam-vit-huge)|0.641
    Image Depth Map Creation|[Depth Anything v2 Huge](https://huggingface.co/collections/depth-anything/depth-anything-v2-666b22412f18a6dbfde23a93)|1.3|not released yet, only Small to Large
    Caption to Image|[Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)|2.6
    Caption to Video|
    Caption to Video|[Stable Video Diffusion XT](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)
    Image to Video|
    Caption to Sound|
    Masking|[Segment Anything + ViT Huge](https://huggingface.co/facebook/sam-vit-huge)|0.641
    Image to 3D|[InstantMesh](https://huggingface.co/TencentARC/InstantMesh)
    Image to 3D|[InstantMesh](https://huggingface.co/TencentARC/InstantMesh)||taken from [3D Arena](https://huggingface.co/spaces/dylanebert/3d-arena) / maybe [SV3D](https://huggingface.co/stabilityai/sv3d)
    Coding LLM (Instruct)|[DeepSeek Coder v2 Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Instruct)|236
    Code LLM (Completion / Filling holes)|[DeepSeek Coder v2 Base](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Base)|236
    General LLM (Instruct)|[Llama 3](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)|70.6
    General LLM (Completion / Filling holes)|[Roberta Large](https://huggingface.co/FacebookAI/roberta-large)|0.355
    Text to Speech 🇺🇸|
    Speech to Text (en)|[Whisper Large v3](https://huggingface.co/openai/whisper-large-v3)|1.54
    Text to Speech (de)|
    Speech to Text (de)|
    Text to Speech 🇺🇸|[Parler](https://github.com/huggingface/parler-tts)|0.6
    Speech to Text 🇺🇸|[Whisper Large v3](https://huggingface.co/openai/whisper-large-v3)|1.54
    Text to Speech 🇩🇪|
    Speech to Text 🇩🇪|
    Speech to Language|[Lang ID](https://huggingface.co/speechbrain/lang-id-commonlanguage_ecapa)
  5. Jaid created this gist Jun 18, 2024.
    19 changes: 19 additions & 0 deletions best-open-source-ai-models.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,19 @@
    # Current best open-source AI models

    Task|Model|Params (billions)|Notes
    ---|---|--:|---
    Image Object Detection|[DETR-DC5 R101](https://huggingface.co/facebook/detr-resnet-101)|0.607
    Depth Map Creation|[Depth Anything v2 Huge](https://huggingface.co/collections/depth-anything/depth-anything-v2-666b22412f18a6dbfde23a93)|1.3|not released yet, only Small to Large
    Caption to Image|[Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)|2.6
    Caption to Video|
    Caption to Sound|
    Masking|[Segment Anything + ViT Huge](https://huggingface.co/facebook/sam-vit-huge)|0.641
    Image to 3D|[InstantMesh](https://huggingface.co/TencentARC/InstantMesh)
    Coding LLM (Instruct)|[DeepSeek Coder v2 Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Instruct)|236
    Code LLM (Completion / Filling holes)|[DeepSeek Coder v2 Base](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Base)|236
    General LLM (Instruct)|[Llama 3](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)|70.6
    General LLM (Completion / Filling holes)|[Roberta Large](https://huggingface.co/FacebookAI/roberta-large)|0.355
    Text to Speech 🇺🇸|
    Speech to Text (en)|[Whisper Large v3](https://huggingface.co/openai/whisper-large-v3)|1.54
    Text to Speech (de)|
    Speech to Text (de)|