@aireilly
Last active October 7, 2025 11:11
    # RHAIIS docs improvements

    ## Develop user guides for building AI agents with LangChain/LangGraph

Create practical guides covering common end-user scenarios for building AI agents with the LangChain and LangGraph frameworks on RHAIIS.
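
Any such guide hinges on one fact: RHAIIS serves the vLLM OpenAI-compatible API, so LangChain and LangGraph connect to it the same way they connect to OpenAI, just with a custom base URL. A minimal stdlib sketch of the request an agent framework issues under the hood (the endpoint URL and model name are illustrative assumptions, not RHAIIS defaults):

```python
# Sketch only: the chat-completion request an agent framework sends when
# pointed at the OpenAI-compatible API that RHAIIS (vLLM) serves.
# Endpoint and model name below are placeholders, not verified defaults.
import json

def build_chat_request(messages, model="meta-llama/Llama-3.1-8B-Instruct",
                       base_url="http://localhost:8000/v1"):
    """Return the URL and JSON body for an OpenAI-compatible chat completion."""
    url = f"{base_url}/chat/completions"
    body = json.dumps({"model": model, "messages": messages, "temperature": 0.2})
    return url, body

url, body = build_chat_request([{"role": "user", "content": "What is RHAIIS?"}])
print(url)  # http://localhost:8000/v1/chat/completions
```

A LangChain guide would wrap exactly this call, typically by passing the same base URL and model name to its OpenAI-compatible chat model class.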

    ## Create Jupyter notebook examples for RHAIIS enablement

    Develop interactive Jupyter notebooks demonstrating key RHAIIS features and workflows, similar to the reference example (https://colab.research.google.com/drive/1JnVdTtIPC2M0ybD2Tz06HEctiLEak0Vw).

    ## Add model/accelerator compatibility matrix to RHAIIS documentation

    Create a comprehensive table in RHAIIS docs showing model and accelerator support, similar to vLLM's supported models page but enhanced with accelerator information.
    Reference: https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-text-only-language-models
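
One possible shape for such a table, sketched with placeholder values (the version numbers and support entries below are illustrative, not verified support claims):

```markdown
| Model family | Accelerator    | Supported since (RHAIIS) | Notes |
|--------------|----------------|--------------------------|-------|
| Llama 3.x    | NVIDIA (CUDA)  | x.y                      | ...   |
| Qwen2        | AMD (ROCm)     | x.y                      | ...   |
```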

    ## Expand vLLM and LLM Compressor docs with optimal parameters for priority models

Enhance documentation for vLLM and LLM Compressor, focusing on priority models (Llama3, Qwen2, etc.), with optimal running methods, parameters, and configurations.

    * Optimal parameters specified for each model
    * Running methods and best practices included
    * Performance benchmarks provided where applicable
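
As a sketch of the level of detail such pages could provide, a single-GPU launch for one priority model might look like the following (the flags are real vLLM server arguments, but the values are illustrative starting points, not tuned recommendations):

```bash
# Illustrative only: starting-point flags for a single-GPU deployment.
# Values must be tuned per model, accelerator, and workload.
vllm serve meta-llama/Llama-3.1-8B-Instruct \
  --dtype auto \
  --max-model-len 8192 \
  --gpu-memory-utilization 0.90 \
  --tensor-parallel-size 1
```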

    ## Document runtime and memory requirements per model and hardware configuration

* Memory requirements documented per model
* Runtime expectations provided for common hardware
* Minimum/recommended hardware specs specified
* Troubleshooting tips for memory issues included
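
A worked example of the arithmetic this section would document. The figures below assume Llama 3.1 8B (about 8B parameters, 32 layers, 8 KV heads, head dimension 128) served in 16-bit precision; treat this as an estimation method, not as measured requirements:

```python
# Back-of-envelope GPU memory estimate for serving a transformer model.
# Model figures are for Llama 3.1 8B with 16-bit (2-byte) weights and KV
# cache; real usage adds activations, CUDA graphs, and framework overhead.
BYTES = 2                      # fp16/bf16
params = 8.0e9                 # ~8B parameters
layers, kv_heads, head_dim = 32, 8, 128

weights_gb = params * BYTES / 1e9
# KV cache per token: 2 (K and V) * layers * kv_heads * head_dim * bytes
kv_bytes_per_token = 2 * layers * kv_heads * head_dim * BYTES
kv_gb_8k = kv_bytes_per_token * 8192 / 1e9   # one 8192-token sequence

print(f"weights: ~{weights_gb:.0f} GB")   # ~16 GB
print(f"KV cache: {kv_bytes_per_token} B/token, ~{kv_gb_8k:.1f} GB at 8k context")
```

The same two numbers (weight footprint, KV-cache bytes per token) are what a troubleshooting section would use to explain out-of-memory errors at long context lengths or high concurrency.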

    ## Document vLLM version compatibility matrix per accelerator and RHAIIS version

    Create documentation showing which vLLM version is supported for each accelerator type across different RHAIIS versions.

Related issue: https://issues.redhat.com/browse/RHAIRFE-576

* Matrix shows RHAIIS version, accelerator type, and vLLM version
* Compatibility notes and limitations documented
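
A possible skeleton for that matrix, with placeholder values (the version numbers below are illustrative, not real compatibility data):

```markdown
| RHAIIS version | Accelerator    | vLLM version | Notes |
|----------------|----------------|--------------|-------|
| x.y            | NVIDIA (CUDA)  | a.b.c        | ...   |
| x.y            | AMD (ROCm)     | a.b.c        | ...   |
```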

    ## Improve vLLM server arguments documentation with customer use cases for RHAIIS

    https://issues.redhat.com/browse/INFERENG-2006

    RHAIIS docs lack clear guidance on how to configure vLLM server arguments for real-world customer scenarios.

    To address this, we should expand the vLLM server arguments documentation by mapping arguments and configurations to customer use cases. The goal is to provide practical, Red Hat–specific value that helps customers confidently deploy and tune inference workloads on RHAIIS.
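
A possible starting structure for that mapping (the arguments are real vLLM server flags; the pairings are illustrative and would need validation against actual customer scenarios):

```markdown
| Customer use case             | Relevant vLLM server arguments                |
|-------------------------------|-----------------------------------------------|
| Long-context RAG              | `--max-model-len`, `--enable-chunked-prefill` |
| Multi-GPU serving             | `--tensor-parallel-size`                      |
| Memory-constrained deployment | `--gpu-memory-utilization`, `--quantization`  |
| High request concurrency      | `--max-num-seqs`, `--max-num-batched-tokens`  |
```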

    ## Document how to use tool calling and mounting custom chat templates for RHAIIS

    https://issues.redhat.com/browse/INFERENG-2434

Document how to mount and use custom chat templates, and list the chat templates that are already supported. Must include details for the gpt-oss and Qwen3 models.
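
For the mounting part, the shape of the answer is roughly the following. The `--chat-template` flag is the standard vLLM mechanism; the image name and paths are placeholders, and this assumes the container entrypoint forwards its arguments to `vllm serve`:

```bash
# Illustrative only: mount a custom Jinja chat template into the container
# and point vLLM at it. Image name, model, and paths are placeholders.
podman run --rm -p 8000:8000 \
  -v ./my_template.jinja:/app/my_template.jinja:Z \
  <rhaiis-image> \
  --model Qwen/Qwen3-8B \
  --chat-template /app/my_template.jinja
```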

    Also document tool calling for supported models and hardware:

* Spyre: https://www.ibm.com/granite/docs/run/granite-with-vllm-containerized#4-enabling-tool-calling-and-other-extended-capabilities
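
On the vLLM side, tool calling is typically enabled at launch with `--enable-auto-tool-choice` and a model-appropriate `--tool-call-parser`; the request then carries an OpenAI-style `tools` array. A stdlib sketch of that payload (the `get_weather` function schema is a made-up example, and the model name is illustrative):

```python
# Build an OpenAI-style chat request that offers the model one callable tool.
# The tool itself (get_weather) is a hypothetical example, not a real API.
import json

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

body = json.dumps({
    "model": "Qwen/Qwen3-8B",   # illustrative model name
    "messages": [{"role": "user", "content": "Weather in Cork?"}],
    "tools": tools,
    "tool_choice": "auto",
})
print(json.loads(body)["tools"][0]["function"]["name"])  # get_weather
```

The documentation task would then cover which tool-call parsers apply to which supported models (for example gpt-oss and Qwen3) and hardware.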