# RHAIIS docs improvements

## Develop user guides for building AI agents with LangChain/LangGraph

Create practical guides showing common end-user scenarios for building AI agents using LangChain and LangGraph frameworks with RHAIIS.
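As a concrete starting point, such a guide could open with a minimal sketch like the one below. It assumes a RHAIIS (vLLM) server is already running with an OpenAI-compatible endpoint and that the `langchain-openai` package is installed; the endpoint URL and model name are placeholders.

```python
# Minimal sketch: pointing LangChain at a RHAIIS (vLLM) OpenAI-compatible endpoint.
# Assumes `pip install langchain-openai` and a server already running;
# the base_url and model name below are placeholders.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",       # assumed RHAIIS/vLLM endpoint
    api_key="EMPTY",                           # vLLM ignores the key unless one is configured
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model name
)

response = llm.invoke("Summarize what an inference server does in one sentence.")
print(response.content)
```

The same `llm` object can then be handed to a LangGraph agent, which keeps the RHAIIS-specific setup in the guide down to a single block.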
## Create Jupyter notebook examples for RHAIIS enablement

Develop interactive Jupyter notebooks demonstrating key RHAIIS features and workflows, similar to the reference example (https://colab.research.google.com/drive/1JnVdTtIPC2M0ybD2Tz06HEctiLEak0Vw).

## Add model/accelerator compatibility matrix to RHAIIS documentation

Create a comprehensive table in RHAIIS docs showing model and accelerator support, similar to vLLM's supported models page but enhanced with accelerator information.

* Matrix shows RHAIIS version, accelerator type, and vLLM version
* Compatibility notes and limitations documented

Reference: https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-text-only-language-models

Related issue: https://issues.redhat.com/browse/RHAIRFE-576
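The layout could be sketched roughly as follows; every value in this table is a placeholder rather than real compatibility data:

| RHAIIS version | Accelerator | vLLM version | Supported models | Notes/limitations |
| --- | --- | --- | --- | --- |
| x.y (placeholder) | Accelerator A (placeholder) | a.b.c (placeholder) | ... | ... |
| x.y (placeholder) | Accelerator B (placeholder) | a.b.c (placeholder) | ... | ... |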
## Expand vLLM and LLM Compressor docs with optimal parameters for priority models

Enhance documentation for vLLM and LLM Compressor focusing on priority models (Llama3, Qwen2, etc.) with optimal running methods, parameters, and configurations.

* Optimal parameters specified for each model
* Running methods and best practices included
* Performance benchmarks provided where applicable
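Each model page could pair its recommended settings with a runnable snippet. A hedged sketch using vLLM's offline Python API follows; the model name and every parameter value are illustrative, not validated recommendations:

```python
# Illustrative sketch using vLLM's offline Python API; the values below are
# placeholders, not validated recommendations for any particular model.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder priority model
    max_model_len=8192,            # cap context length to bound KV-cache memory
    gpu_memory_utilization=0.90,   # fraction of GPU memory vLLM may claim
    tensor_parallel_size=1,        # raise to shard the model across GPUs
)

outputs = llm.generate(
    ["What is speculative decoding?"],
    SamplingParams(temperature=0.7, max_tokens=128),
)
print(outputs[0].outputs[0].text)
```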
## Document runtime and memory requirements per model and hardware configuration

* Memory requirements documented per model
* Runtime expectations provided for common hardware
* Minimum/recommended hardware specs specified
* Troubleshooting tips for memory issues included
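To ground the memory figures, the section could include a back-of-the-envelope estimate like the sketch below. The model shape is hypothetical (an 8B-parameter model in FP16), and real usage adds activation memory and framework overhead on top:

```python
# Back-of-the-envelope GPU memory estimate: model weights plus KV cache.
# Figures are illustrative for a hypothetical 8B-parameter model served in FP16.
params_billion = 8
bytes_per_param = 2                      # FP16/BF16 weights
weights_gib = params_billion * 1e9 * bytes_per_param / 2**30

# KV cache per token = 2 (K and V) * layers * kv_heads * head_dim * bytes.
layers, kv_heads, head_dim = 32, 8, 128  # hypothetical model shape
kv_per_token = 2 * layers * kv_heads * head_dim * bytes_per_param
context_tokens = 8192
kv_gib = kv_per_token * context_tokens / 2**30

print(f"weights ~{weights_gib:.1f} GiB, KV cache for one {context_tokens}-token "
      f"sequence ~{kv_gib:.2f} GiB")
```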
## Improve vLLM server arguments documentation with customer use cases for RHAIIS

https://issues.redhat.com/browse/INFERENG-2006

RHAIIS docs lack clear guidance on how to configure vLLM server arguments for real-world customer scenarios. To address this, we should expand the vLLM server arguments documentation by mapping arguments and configurations to customer use cases. The goal is to provide practical, Red Hat–specific value that helps customers confidently deploy and tune inference workloads on RHAIIS.
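One possible shape for the mapping: pair each scenario with the server arguments it motivates. The flags below are real `vllm serve` arguments, but the pairings and values are illustrative starting points, not tuned recommendations:

```python
# Sketch of the use-case-to-arguments mapping the docs could formalize.
# The flags exist in vLLM; the pairings and values are illustrative only.
use_cases = {
    "long-document summarization": [
        "--max-model-len", "32768",          # admit long inputs
        "--gpu-memory-utilization", "0.95",  # claim more GPU memory for the larger KV cache
    ],
    "many concurrent short chats": [
        "--max-num-seqs", "256",             # allow more in-flight requests
        "--enable-prefix-caching",           # reuse shared system-prompt KV blocks
    ],
}

for scenario, args in use_cases.items():
    print(f"{scenario}: vllm serve <model> {' '.join(args)}")
```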
## Document how to use tool calling and mount custom chat templates for RHAIIS

https://issues.redhat.com/browse/INFERENG-2434

Document how to mount and include custom chat templates, and document the existing supported chat templates. Must include details for the gpt-oss and Qwen3 models. Also document tool calling for supported models and hardware:

* Spyre: https://www.ibm.com/granite/docs/run/granite-with-vllm-containerized#4-enabling-tool-calling-and-other-extended-capabilities
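A sketch of what the tool-calling portion could demonstrate from the client side. The server launch shown in the comment, the parser choice, the tool definition, and the model name are all placeholders; the correct `--tool-call-parser` value depends on the model family:

```python
# Client-side sketch of tool calling against a RHAIIS/vLLM server started with
# something like:
#   vllm serve <model> --enable-auto-tool-choice --tool-call-parser hermes \
#       --chat-template /path/to/custom_template.jinja
# The parser, template path, and model name are placeholders; the right parser
# depends on the model family (e.g. Qwen3, gpt-oss).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="Qwen/Qwen3-8B",  # placeholder model name
    messages=[{"role": "user", "content": "What's the weather in Brno?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)
```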