Jaid · June 20, 2024 00:15 · Jun 20, 2024 · Jun 19, 2024 · Jun 19, 2024 · Jun 18, 2024
diff --git a/best-open-source-ai-models.md b/best-open-source-ai-models.md
@@ -18,4 +18,5 @@ Text to Speech 🇺🇸|[Parler](https://github.com/huggingface/parler-tts)|0.6
 Speech to Text 🇺🇸|[Whisper Large v3](https://huggingface.co/openai/whisper-large-v3)|1.54
 Text to Speech 🇩🇪|
 Speech to Text 🇩🇪|
-Speech to Language|[Lang ID](https://huggingface.co/speechbrain/lang-id-commonlanguage_ecapa)
+Speech to Language|[Lang ID](https://huggingface.co/speechbrain/lang-id-commonlanguage_ecapa)
+Speaking animation for portraits|[Hallo](https://github.com/fudan-generative-vision/hallo)
diff --git a/best-open-source-ai-models.md b/best-open-source-ai-models.md
@@ -6,8 +6,8 @@ Image Object Detection|[DETR-DC5 R101](https://huggingface.co/facebook/detr-resn
 Image Masking|[Segment Anything + ViT Huge](https://huggingface.co/facebook/sam-vit-huge)|0.641
 Image Depth Map Creation|[Depth Anything v2 Huge](https://huggingface.co/collections/depth-anything/depth-anything-v2-666b22412f18a6dbfde23a93)|1.3|not released yet, only Small to Large
 Caption to Image|[Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)|2.6
-Caption to Video|[Stable Video Diffusion XT](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt), [Open-Sora](https://github.com/hpcaitech/Open-Sora#model-weights)
-Image to Video|
+Caption to Video|[Open-Sora](https://github.com/hpcaitech/Open-Sora#model-weights)
+Image to Video|[Stable Video Diffusion XT](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)
 Caption to Sound|
 Image to 3D|[InstantMesh](https://huggingface.co/TencentARC/InstantMesh)||taken from [3D Arena](https://huggingface.co/spaces/dylanebert/3d-arena) / maybe [SV3D](https://huggingface.co/stabilityai/sv3d)
 Coding LLM (Instruct)|[DeepSeek Coder v2 Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Instruct)|236

diff --git a/best-open-source-ai-models.md b/best-open-source-ai-models.md
@@ -6,7 +6,7 @@ Image Object Detection|[DETR-DC5 R101](https://huggingface.co/facebook/detr-resn
 Image Masking|[Segment Anything + ViT Huge](https://huggingface.co/facebook/sam-vit-huge)|0.641
 Image Depth Map Creation|[Depth Anything v2 Huge](https://huggingface.co/collections/depth-anything/depth-anything-v2-666b22412f18a6dbfde23a93)|1.3|not released yet, only Small to Large
 Caption to Image|[Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)|2.6
-Caption to Video|[Stable Video Diffusion XT](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)
+Caption to Video|[Stable Video Diffusion XT](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt), [Open-Sora](https://github.com/hpcaitech/Open-Sora#model-weights)
 Image to Video|
 Caption to Sound|
 Image to 3D|[InstantMesh](https://huggingface.co/TencentARC/InstantMesh)||taken from [3D Arena](https://huggingface.co/spaces/dylanebert/3d-arena) / maybe [SV3D](https://huggingface.co/stabilityai/sv3d)

diff --git a/best-open-source-ai-models.md b/best-open-source-ai-models.md
@@ -3,17 +3,19 @@
 Task|Model|Params (billions)|Notes
 ---|---|--:|---
 Image Object Detection|[DETR-DC5 R101](https://huggingface.co/facebook/detr-resnet-101)|0.607
-Depth Map Creation|[Depth Anything v2 Huge](https://huggingface.co/collections/depth-anything/depth-anything-v2-666b22412f18a6dbfde23a93)|1.3|not released yet, only Small to Large
+Image Masking|[Segment Anything + ViT Huge](https://huggingface.co/facebook/sam-vit-huge)|0.641
+Image Depth Map Creation|[Depth Anything v2 Huge](https://huggingface.co/collections/depth-anything/depth-anything-v2-666b22412f18a6dbfde23a93)|1.3|not released yet, only Small to Large
 Caption to Image|[Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)|2.6
-Caption to Video|
+Caption to Video|[Stable Video Diffusion XT](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)
+Image to Video|
 Caption to Sound|
-Masking|[Segment Anything + ViT Huge](https://huggingface.co/facebook/sam-vit-huge)|0.641
-Image to 3D|[InstantMesh](https://huggingface.co/TencentARC/InstantMesh)
+Image to 3D|[InstantMesh](https://huggingface.co/TencentARC/InstantMesh)||taken from [3D Arena](https://huggingface.co/spaces/dylanebert/3d-arena) / maybe [SV3D](https://huggingface.co/stabilityai/sv3d)
 Coding LLM (Instruct)|[DeepSeek Coder v2 Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Instruct)|236
 Code LLM (Completion / Filling holes)|[DeepSeek Coder v2 Base](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Base)|236
 General LLM (Instruct)|[Llama 3](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)|70.6
 General LLM (Completion / Filling holes)|[Roberta Large](https://huggingface.co/FacebookAI/roberta-large)|0.355
-Text to Speech 🇺🇸|
-Speech to Text (en)|[Whisper Large v3](https://huggingface.co/openai/whisper-large-v3)|1.54
-Text to Speech (de)|
-Speech to Text (de)|
+Text to Speech 🇺🇸|[Parler](https://github.com/huggingface/parler-tts)|0.6
+Speech to Text 🇺🇸|[Whisper Large v3](https://huggingface.co/openai/whisper-large-v3)|1.54
+Text to Speech 🇩🇪|
+Speech to Text 🇩🇪|
+Speech to Language|[Lang ID](https://huggingface.co/speechbrain/lang-id-commonlanguage_ecapa)
diff --git a/best-open-source-ai-models.md b/best-open-source-ai-models.md
@@ -0,0 +1,19 @@
+# Current best open-source AI models
+
+Task|Model|Params (billions)|Notes
+---|---|--:|---
+Image Object Detection|[DETR-DC5 R101](https://huggingface.co/facebook/detr-resnet-101)|0.607
+Depth Map Creation|[Depth Anything v2 Huge](https://huggingface.co/collections/depth-anything/depth-anything-v2-666b22412f18a6dbfde23a93)|1.3|not released yet, only Small to Large
+Caption to Image|[Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)|2.6
+Caption to Video|
+Caption to Sound|
+Masking|[Segment Anything + ViT Huge](https://huggingface.co/facebook/sam-vit-huge)|0.641
+Image to 3D|[InstantMesh](https://huggingface.co/TencentARC/InstantMesh)
+Coding LLM (Instruct)|[DeepSeek Coder v2 Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Instruct)|236
+Code LLM (Completion / Filling holes)|[DeepSeek Coder v2 Base](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Base)|236
+General LLM (Instruct)|[Llama 3](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)|70.6
+General LLM (Completion / Filling holes)|[Roberta Large](https://huggingface.co/FacebookAI/roberta-large)|0.355
+Text to Speech 🇺🇸|
+Speech to Text (en)|[Whisper Large v3](https://huggingface.co/openai/whisper-large-v3)|1.54
+Text to Speech (de)|
+Speech to Text (de)|