Create practical guides showing common end-user scenarios for building AI agents using LangChain and LangGraph frameworks with RHAIIS.
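For example, such a guide might open with a sketch like the one below, which points LangChain at an RHAIIS endpoint (RHAIIS serves vLLM's OpenAI-compatible API) and wraps it in a LangGraph ReAct agent. The base URL, model name, and tool are illustrative placeholders, not a validated configuration:

```python
# Sketch: a minimal LangChain + LangGraph agent backed by an RHAIIS endpoint.
# base_url, api_key, and model are placeholders; RHAIIS exposes vLLM's
# OpenAI-compatible API, so the standard OpenAI-style client settings apply.
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",        # assumed RHAIIS server address
    api_key="EMPTY",                            # vLLM accepts any key unless auth is configured
    model="meta-llama/Llama-3.1-8B-Instruct",   # whatever model the server is running
)

@tool
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

# One-tool ReAct agent built with LangGraph's prebuilt helper; assumes the
# server was started with tool calling enabled.
agent = create_react_agent(llm, tools=[word_count])
result = agent.invoke(
    {"messages": [("user", "How many words are in 'Red Hat AI Inference Server'?")]}
)
print(result["messages"][-1].content)
```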
Develop interactive Jupyter notebooks demonstrating key RHAIIS features and workflows, similar to the reference example (https://colab.research.google.com/drive/1JnVdTtIPC2M0ybD2Tz06HEctiLEak0Vw).
Create a comprehensive table in the RHAIIS docs showing model and accelerator support, similar to vLLM's supported models page but enhanced with accelerator information (a possible layout is sketched below).
- Matrix shows RHAIIS version, accelerator type, and vLLM version
- Compatibility notes and limitations documented
Reference: https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-text-only-language-models
Related issue: https://issues.redhat.com/browse/RHAIRFE-576
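A hedged sketch of the matrix layout; all cell values are placeholders, not actual support statements:

| RHAIIS version | Accelerator type | vLLM version | Compatibility notes / limitations |
| --- | --- | --- | --- |
| x.y | e.g., NVIDIA GPUs (CUDA) | a.b.c | ... |
| x.y | e.g., AMD GPUs (ROCm) | a.b.c | ... |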
Enhance documentation for vLLM and LLM Compressor, focusing on priority models (Llama3, Qwen2, etc.) with optimal running methods, parameters, and configurations, as sketched after this item's criteria.
- Optimal parameters specified for each model
- Running methods and best practices included
- Performance benchmarks provided where applicable
- Memory requirements documented per model
- Runtime expectations provided for common hardware
- Minimum/recommended hardware specs specified
- Troubleshooting tips for memory issues included
https://issues.redhat.com/browse/INFERENG-2006
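For example, a per-model entry could pair a priority model with a starting configuration like the sketch below, using vLLM's offline Python API (the same knobs map to `vllm serve` flags). The model name and values are illustrative, not validated recommendations:

```python
# Sketch: a documented starting configuration for one priority model.
# Values are examples only; real docs would give per-model, per-hardware numbers.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # example priority model
    tensor_parallel_size=1,        # shard the model across this many GPUs
    max_model_len=8192,            # cap context length, bounding KV-cache memory
    gpu_memory_utilization=0.90,   # fraction of GPU memory vLLM may claim
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Explain KV caching in one paragraph."], params)
print(outputs[0].outputs[0].text)
```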
RHAIIS docs lack clear guidance on how to configure vLLM server arguments for real-world customer scenarios.
To address this, we should expand the vLLM server arguments documentation by mapping arguments and configurations to customer use cases. The goal is to provide practical, Red Hat–specific value that helps customers confidently deploy and tune inference workloads on RHAIIS.
https://issues.redhat.com/browse/INFERENG-2434
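One possible shape for that mapping, sketched in Python: the argument names are real vLLM engine options, but the scenario pairings and values are illustrative assumptions, not tuning guidance:

```python
# Hypothetical use-case-to-arguments mapping. Argument names are real vLLM
# options; the values and scenario pairings are illustrative only.
SCENARIOS = {
    "low-latency chat": {
        "max_model_len": 4096,         # short contexts keep time-to-first-token low
        "max_num_seqs": 32,            # cap concurrent sequences to limit queuing
        "gpu_memory_utilization": 0.85,
    },
    "high-throughput batch": {
        "max_model_len": 8192,
        "max_num_seqs": 256,           # larger batches trade latency for throughput
        "gpu_memory_utilization": 0.95,
    },
}

def to_cli_flags(args: dict) -> str:
    """Render an argument dict as `vllm serve`-style flags."""
    return " ".join(f"--{k.replace('_', '-')} {v}" for k, v in args.items())

for name, args in SCENARIOS.items():
    print(f"{name}: {to_cli_flags(args)}")
```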
Document how to mount and include custom chat templates, and document existing supported chat templates. Must include details for gpt-oss and Qwen3 models.
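For instance, the chat-template section could include a sketch like the following, which loads a custom Jinja template and passes it to vLLM's offline chat API; the template path and model are placeholders, and the server-side equivalent is the `--chat-template` flag on `vllm serve`:

```python
# Sketch: apply a custom chat template with vLLM's offline chat API.
# The template path and model name are placeholders.
from pathlib import Path
from vllm import LLM, SamplingParams

custom_template = Path("templates/my_chat_template.jinja").read_text()  # hypothetical file

llm = LLM(model="Qwen/Qwen3-8B")  # example model
outputs = llm.chat(
    [{"role": "user", "content": "Say hello."}],
    SamplingParams(max_tokens=64),
    chat_template=custom_template,  # overrides the template bundled with the model
)
print(outputs[0].outputs[0].text)
```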
Also document tool calling for supported models and hardware.
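A minimal client-side sketch of what that section could show, assuming an RHAIIS server launched with tool calling enabled (in vLLM, via flags such as `--enable-auto-tool-choice` and `--tool-call-parser`); the endpoint, model, and weather tool are placeholders:

```python
# Sketch: tool calling against an RHAIIS (vLLM) OpenAI-compatible endpoint.
# Assumes the server was started with tool calling enabled; the endpoint,
# model, and get_weather tool below are illustrative placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="Qwen/Qwen3-8B",  # example model with tool-calling support
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)
```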