Create practical guides showing common end-user scenarios for building AI agents using LangChain and LangGraph frameworks with RHAIIS.
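For example, such a guide might open with a sketch like the one below, which points LangChain at an RHAIIS endpoint (RHAIIS serves vLLM's OpenAI-compatible API) and wraps it in a LangGraph ReAct agent. The base URL, model name, and tool are illustrative placeholders, not a validated configuration:

```python
# Sketch: a minimal LangChain + LangGraph agent backed by an RHAIIS endpoint.
# base_url, api_key, and model are placeholders; RHAIIS exposes vLLM's
# OpenAI-compatible API, so the standard OpenAI-style client settings apply.
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",        # assumed RHAIIS server address
    api_key="EMPTY",                            # vLLM accepts any key unless auth is configured
    model="meta-llama/Llama-3.1-8B-Instruct",   # whatever model the server is running
)

@tool
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

# One-tool ReAct agent built with LangGraph's prebuilt helper; assumes the
# server was started with tool calling enabled.
agent = create_react_agent(llm, tools=[word_count])
result = agent.invoke(
    {"messages": [("user", "How many words are in 'Red Hat AI Inference Server'?")]}
)
print(result["messages"][-1].content)
```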
Develop interactive Jupyter notebooks demonstrating key RHAIIS features and workflows, similar to the reference example (https://colab.research.google.com/drive/1JnVdTtIPC2M0ybD2Tz06HEctiLEak0Vw).
Create a comprehensive table in the RHAIIS docs showing model and accelerator support, similar to vLLM's supported models page but enhanced with accelerator information (a possible layout is sketched below).
- Matrix shows RHAIIS version, accelerator type, and vLLM version
- Compatibility notes and limitations documented
Reference: https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-text-only-language-models
Related issue: https://issues.redhat.com/browse/RHAIRFE-576
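A hedged sketch of the matrix layout; all cell values are placeholders, not actual support statements:

| RHAIIS version | Accelerator type | vLLM version | Compatibility notes / limitations |
| --- | --- | --- | --- |
| x.y | e.g., NVIDIA GPUs (CUDA) | a.b.c | ... |
| x.y | e.g., AMD GPUs (ROCm) | a.b.c | ... |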
Enhance documentation for vLLM and LLM Compressor, focusing on priority models (Llama3, Qwen2, etc.) with optimal running methods, parameters, and configurations, as sketched after this item's criteria.
- Optimal parameters specified for each model
- Running methods and best practices included
- Performance benchmarks provided where applicable
- Memory requirements documented per model
- Runtime expectations provided for common hardware
- Minimum/recommended hardware specs specified
- Troubleshooting tips for memory issues included
https://issues.redhat.com/browse/INFERENG-2006
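For example, a per-model entry could pair a priority model with a starting configuration like the sketch below, using vLLM's offline Python API (the same knobs map to `vllm serve` flags). The model name and values are illustrative, not validated recommendations:

```python
# Sketch: a documented starting configuration for one priority model.
# Values are examples only; real docs would give per-model, per-hardware numbers.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # example priority model
    tensor_parallel_size=1,        # shard the model across this many GPUs
    max_model_len=8192,            # cap context length, bounding KV-cache memory
    gpu_memory_utilization=0.90,   # fraction of GPU memory vLLM may claim
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Explain KV caching in one paragraph."], params)
print(outputs[0].outputs[0].text)
```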
RHAIIS docs lack clear guidance on how to configure vLLM server arguments for real-world customer scenarios.
To address this, we should expand the vLLM server arguments documentation by mapping arguments and configurations to customer use cases. The goal is to provide practical, Red Hat–specific value that helps customers confidently deploy and tune inference workloads on RHAIIS.
https://issues.redhat.com/browse/INFERENG-2434
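One possible shape for that mapping, sketched in Python: the argument names are real vLLM engine options, but the scenario pairings and values are illustrative assumptions, not tuning guidance:

```python
# Hypothetical use-case-to-arguments mapping. Argument names are real vLLM
# options; the values and scenario pairings are illustrative only.
SCENARIOS = {
    "low-latency chat": {
        "max_model_len": 4096,         # short contexts keep time-to-first-token low
        "max_num_seqs": 32,            # cap concurrent sequences to limit queuing
        "gpu_memory_utilization": 0.85,
    },
    "high-throughput batch": {
        "max_model_len": 8192,
        "max_num_seqs": 256,           # larger batches trade latency for throughput
        "gpu_memory_utilization": 0.95,
    },
}

def to_cli_flags(args: dict) -> str:
    """Render an argument dict as `vllm serve`-style flags."""
    return " ".join(f"--{k.replace('_', '-')} {v}" for k, v in args.items())

for name, args in SCENARIOS.items():
    print(f"{name}: {to_cli_flags(args)}")
```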
Document how to mount and include custom chat templates, and document existing supported chat templates. Must include details for gpt-oss and Qwen3 models.
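For instance, the chat-template section could include a sketch like the following, which loads a custom Jinja template and passes it to vLLM's offline chat API; the template path and model are placeholders, and the server-side equivalent is the `--chat-template` flag on `vllm serve`:

```python
# Sketch: apply a custom chat template with vLLM's offline chat API.
# The template path and model name are placeholders.
from pathlib import Path
from vllm import LLM, SamplingParams

custom_template = Path("templates/my_chat_template.jinja").read_text()  # hypothetical file

llm = LLM(model="Qwen/Qwen3-8B")  # example model
outputs = llm.chat(
    [{"role": "user", "content": "Say hello."}],
    SamplingParams(max_tokens=64),
    chat_template=custom_template,  # overrides the template bundled with the model
)
print(outputs[0].outputs[0].text)
```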
Also document tool calling for supported models and hardware.
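A minimal client-side sketch of what that section could show, assuming an RHAIIS server launched with tool calling enabled (in vLLM, via flags such as `--enable-auto-tool-choice` and `--tool-call-parser`); the endpoint, model, and weather tool are placeholders:

```python
# Sketch: tool calling against an RHAIIS (vLLM) OpenAI-compatible endpoint.
# Assumes the server was started with tool calling enabled; the endpoint,
# model, and get_weather tool below are illustrative placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="Qwen/Qwen3-8B",  # example model with tool-calling support
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)
```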