@aireilly
Last active October 7, 2025 11:11
RHAIIS docs improvements JIRAs


Develop user guides for building AI agents with LangChain/LangGraph

Create practical guides showing common end-user scenarios for building AI agents using LangChain and LangGraph frameworks with RHAIIS.
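A guide like this would likely start from the fact that RHAIIS exposes vLLM's OpenAI-compatible API, which LangChain's `ChatOpenAI` (and therefore LangGraph agents) can target through its `base_url` parameter. As a minimal, framework-free sketch of that contract, the following stdlib-only client shows the request shape an agent would send. The endpoint URL and model name are assumptions for illustration; adjust both for the actual deployment.

```python
import json
from urllib import request

# Assumed values for illustration only: RHAIIS serving a vLLM
# OpenAI-compatible API locally with a Llama 3 model loaded.
BASE_URL = "http://localhost:8000/v1"
MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"


def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    """Build an OpenAI-compatible chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }


def chat(prompt: str) -> str:
    """POST the payload to the RHAIIS endpoint and return the reply text."""
    req = request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

A LangChain-based guide would replace this client with `ChatOpenAI(base_url="http://localhost:8000/v1", model=MODEL)`, but the payload on the wire is the same, which is why the guides can share one deployed endpoint.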

Create Jupyter notebook examples for RHAIIS enablement

Develop interactive Jupyter notebooks demonstrating key RHAIIS features and workflows, similar to the reference example (https://colab.research.google.com/drive/1JnVdTtIPC2M0ybD2Tz06HEctiLEak0Vw).

Add model/accelerator compatibility matrix to RHAIIS documentation

Create a comprehensive table in the RHAIIS docs showing model and accelerator support, similar to vLLM's supported models page but enhanced with accelerator information.

  • Matrix shows RHAIIS version, accelerator type, and vLLM version
  • Compatibility notes and limitations documented

Reference: https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-text-only-language-models

Related issue: https://issues.redhat.com/browse/RHAIRFE-576

Expand vLLM and LLM Compressor docs with optimal parameters for priority models

Enhance the documentation for vLLM and LLM Compressor, focusing on priority models (Llama 3, Qwen2, and so on) and covering optimal running methods, parameters, and configurations.

  • Optimal parameters specified for each model
  • Running methods and best practices included
  • Performance benchmarks provided where applicable

Document runtime and memory requirements per model and hardware configuration

  • Memory requirements documented per model
  • Runtime expectations provided for common hardware
  • Minimum/recommended hardware specs specified
  • Troubleshooting tips for memory issues included
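The per-model memory figures could be anchored by the standard back-of-envelope rule: weight memory is roughly parameter count times bytes per parameter, with KV cache and activation overhead on top. A sketch of that rule (the function name and the overhead caveat are mine, not from the source):

```python
def estimate_weight_memory_gib(num_params_b: float, bytes_per_param: float = 2) -> float:
    """Rough GPU memory needed for model weights alone, in GiB.

    num_params_b: parameter count in billions (e.g. 8 for Llama-3-8B).
    bytes_per_param: 2 for fp16/bf16, 1 for int8, 0.5 for 4-bit quantization.

    Note: this covers weights only. Real deployments also need headroom
    for the KV cache and activations, which grows with context length
    and concurrent sequences.
    """
    return num_params_b * 1e9 * bytes_per_param / 2**30
```

For example, an 8B model in fp16 needs roughly 15 GiB for weights alone, which is why such models are typically documented as requiring a 24 GiB-class accelerator once KV cache headroom is included.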

Improve vLLM server arguments documentation with customer use cases for RHAIIS

https://issues.redhat.com/browse/INFERENG-2006

RHAIIS docs lack clear guidance on how to configure vLLM server arguments for real-world customer scenarios.

To address this, we should expand the vLLM server arguments documentation by mapping arguments and configurations to customer use cases. The goal is to provide practical, Red Hat–specific value that helps customers confidently deploy and tune inference workloads on RHAIIS.
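One way to structure such a mapping is a table from customer scenario to argument set, rendered into a `vllm serve` command. The flag names below are real vLLM server options, but the scenario groupings and values are illustrative starting points of my own, not tuned recommendations:

```python
# Hypothetical mapping from customer scenarios to vLLM server arguments.
# Flag names are real vLLM options; groupings and values are illustrative.
USE_CASE_ARGS = {
    "long-context-rag": {
        "--max-model-len": "32768",
        "--gpu-memory-utilization": "0.90",
    },
    "high-throughput-batch": {
        "--max-num-seqs": "256",
        "--enable-prefix-caching": None,  # boolean flag, takes no value
    },
    "multi-gpu": {
        "--tensor-parallel-size": "2",
    },
}


def render_serve_command(model: str, use_case: str) -> str:
    """Render a `vllm serve` command line for a documented use case."""
    parts = ["vllm", "serve", model]
    for flag, value in USE_CASE_ARGS[use_case].items():
        parts.append(flag if value is None else f"{flag} {value}")
    return " ".join(parts)
```

Documenting each scenario next to its rendered command would give customers a copy-paste starting point plus the reasoning for each argument choice.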

Document how to use tool calling and mounting custom chat templates for RHAIIS

https://issues.redhat.com/browse/INFERENG-2434

Document how to mount and use custom chat templates, and list the chat templates that are already supported. Include details for the gpt-oss and Qwen3 models.

Also document tool calling for supported models and hardware:

Spyre: https://www.ibm.com/granite/docs/run/granite-with-vllm-containerized#4-enabling-tool-calling-and-other-extended-capabilities
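The tool-calling guide would presumably show the OpenAI-compatible request body that clients send once the server is started with vLLM's `--enable-auto-tool-choice` and a `--tool-call-parser` matching the model family (the correct parser per model should be verified against the vLLM docs). A sketch of that request body, with a hypothetical `get_weather` tool for illustration:

```python
def build_tool_call_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat request that offers the model a tool.

    The "get_weather" tool below is a hypothetical example; real guides
    would substitute the customer's own function schema.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Look up current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        "tool_choice": "auto",
    }
```

With `tool_choice` set to `auto`, the model decides whether to answer directly or return a `tool_calls` entry for the client to execute, which is the loop the documentation would need to walk through for each supported model.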
