@dims
Last active August 29, 2025 12:16
Extract Endpoint information from audit logs

Kubernetes API Audit Log Analysis

Tools for analyzing Kubernetes API coverage from E2E test audit logs.

Copy the two Python scripts (audit_log_parser.py and kubernetes_api_analysis.py) from https://github.com/kubernetes/test-infra/tree/master/experiment/audit to your local directory.

The example below uses kubernetes/kubernetes#133132, a PR that adds a new DRA test to conformance.

Quick Start

  1. Download audit logs from GCS:

    gsutil -m cp -R gs://kubernetes-ci-logs/pr-logs/pull/133132/pull-kubernetes-audit-kind-conformance/1960293546019786752/artifacts/audit/ .
  2. Parse audit logs:

    python3 audit_log_parser.py --audit-logs audit/audit*.log --output audit/audit-endpoints.txt --audit-operations-json audit/audit-operations.json
    Loading ineligible endpoints from: https://raw.githubusercontent.com/kubernetes/kubernetes/master/test/conformance/testdata/ineligible_endpoints.yaml
    Loaded 99 ineligible endpoints
    Loading Swagger specification from: https://raw.githubusercontent.com/kubernetes/kubernetes/refs/heads/master/api/openapi-spec/swagger.json
    Using cached Swagger specification
    Extracting resource types from Swagger specification...
    Extracted 68 resource types from Swagger spec
    Building path to operation mapping...
    Loaded 1050 API operations from Swagger spec
    Found 203 deprecated operations
    Parsing 2 audit log file(s):
    [1/2] audit/audit-2025-08-26T12-04-42.943.log
    [2/2] audit/audit.log
    
    Processing file 1/2: audit/audit-2025-08-26T12-04-42.943.log
    Processed 10000 lines from audit/audit-2025-08-26T12-04-42.943.log...
    Processed 20000 lines from audit/audit-2025-08-26T12-04-42.943.log...
    Processed 30000 lines from audit/audit-2025-08-26T12-04-42.943.log...
    Processed 40000 lines from audit/audit-2025-08-26T12-04-42.943.log...
    Processed 50000 lines from audit/audit-2025-08-26T12-04-42.943.log...
    Processed 60000 lines from audit/audit-2025-08-26T12-04-42.943.log...
    Processed 70000 lines from audit/audit-2025-08-26T12-04-42.943.log...
    Processed 80000 lines from audit/audit-2025-08-26T12-04-42.943.log...
    Processed 90000 lines from audit/audit-2025-08-26T12-04-42.943.log...
    Processed 100000 lines from audit/audit-2025-08-26T12-04-42.943.log...
    Completed audit/audit-2025-08-26T12-04-42.943.log: 106851 entries processed
    Processing file 2/2: audit/audit.log
    Processed 10000 lines from audit/audit.log...
    Processed 20000 lines from audit/audit.log...
    Processed 30000 lines from audit/audit.log...
    Processed 40000 lines from audit/audit.log...
    Processed 50000 lines from audit/audit.log...
    Processed 60000 lines from audit/audit.log...
    Processed 70000 lines from audit/audit.log...
    Processed 80000 lines from audit/audit.log...
    Completed audit/audit.log: 89464 entries processed
    
    Parsing complete:
    Total log entries: 196315
    Swagger-based matches: 87105
    Fallback matches: 731
    Unique endpoints found: 715
    Total API calls: 87836
    Skipped entries: 108479
    
    Results written to: audit/audit-endpoints.txt
    Generated audit/audit-operations.json with 602 operations and 1739 sample audit entries   
    
  3. Compare against CI baseline:

    python3 kubernetes_api_analysis.py --pull-audit-endpoints audit/audit-endpoints.txt
    Kubernetes API Operations Analysis
    ==================================
    
    Step 1: Extracting operationIds from swagger.json...
    Swagger URL: https://raw.githubusercontent.com/kubernetes/kubernetes/refs/heads/master/api/openapi-spec/swagger.json
    Output file: swagger_operations.txt
    Downloading swagger specification...
    Extracted 1062 operationIds to swagger_operations.txt
    
    No CI audit endpoints file specified, auto-discovering latest from GCS...
    Searching for latest CI audit run...
    Enumerating directories in gs://kubernetes-ci-logs/logs/ci-kubernetes-audit-kind-conformance...
    Found directory with finished.json: gs://kubernetes-ci-logs/logs/ci-kubernetes-audit-kind-conformance/1960318661684105216/
    Found audit file at: gs://kubernetes-ci-logs/logs/ci-kubernetes-audit-kind-conformance/1960318661684105216/artifacts/audit/audit-endpoints.txt
    Downloaded to: ci-audit-kind-conformance-audit-endpoints.txt
    
    Step 2: Comparing audit endpoint files...
    CI File: ci-audit-kind-conformance-audit-endpoints.txt
    Pull File: audit/audit-endpoints.txt
    
    Extracting operations from audit files (filtering by swagger operations)...
    SUMMARY
    =======
    Total Operations in Swagger:  1062
    Operations in CI:             508
    Operations in Pull:           517
    Operations Added:             9
    Operations Removed:           0
    Net Change:                   +9
    
    OPERATIONS ADDED IN PULL (NOT IN CI)
    ====================================
    Count: 9
    
      1. createResourceV1DeviceClass
      2. createResourceV1NamespacedResourceClaim
      3. createResourceV1NamespacedResourceClaimTemplate
      4. createResourceV1ResourceSlice
      5. deleteResourceV1DeviceClass
      6. deleteResourceV1NamespacedResourceClaim
      7. readResourceV1NamespacedResourceClaim
      8. replaceResourceV1NamespacedResourceClaim
      9. replaceResourceV1NamespacedResourceClaimStatus
    
    OPERATIONS REMOVED FROM PULL (IN CI BUT NOT PULL)
    =================================================
    Count: 0
    
    No operations removed.
    
    STABLE ENDPOINTS NOT FOUND IN PULL AUDIT LOG
    ============================================
    Count: 29
    
    These are stable, non-deprecated API endpoints defined in the Swagger spec
    but not exercised in the pull request audit log:
    
      1. connectCoreV1PostNamespacedPodExec
      2. connectCoreV1PostNamespacedPodPortforward
      3. createStorageV1VolumeAttributesClass
      4. deleteResourceV1CollectionDeviceClass
      5. deleteResourceV1NamespacedResourceClaimTemplate
      6. deleteResourceV1ResourceSlice
      7. deleteStorageV1CollectionVolumeAttributesClass
      8. deleteStorageV1VolumeAttributesClass
      9. getResourceV1APIResources
     10. listResourceV1ResourceClaimTemplateForAllNamespaces
     11. listStorageV1VolumeAttributesClass
     12. patchCoreV1NamespacedPodResize
     13. patchResourceV1DeviceClass
     14. patchResourceV1NamespacedResourceClaim
     15. patchResourceV1NamespacedResourceClaimStatus
     16. patchResourceV1NamespacedResourceClaimTemplate
     17. patchResourceV1ResourceSlice
     18. patchStorageV1VolumeAttributesClass
     19. readCoreV1NamespacedPodResize
     20. readResourceV1DeviceClass
     21. readResourceV1NamespacedResourceClaimStatus
     22. readResourceV1NamespacedResourceClaimTemplate
     23. readResourceV1ResourceSlice
     24. readStorageV1VolumeAttributesClass
     25. replaceCoreV1NamespacedPodResize
     26. replaceResourceV1DeviceClass
     27. replaceResourceV1NamespacedResourceClaimTemplate
     28. replaceResourceV1ResourceSlice
     29. replaceStorageV1VolumeAttributesClass
    
    Analysis complete!
    Generated files:
    - swagger_operations.txt (swagger operations list)
    

Outputs

audit_log_parser.py

  • audit/audit-endpoints.txt: Human-readable report (602 operations, 79K API calls)
  • audit/audit-operations.json: JSON with up to 5 audit samples per operation

kubernetes_api_analysis.py

  • Console output: Comparison showing added/removed operations vs CI baseline
  • swagger_operations.txt: Complete list of 1062 Swagger operations
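
As a rough illustration of what producing swagger_operations.txt involves, the sketch below collects operationId values from a Swagger 2.0 document. It is not the script's actual code: the function name extract_operation_ids and the inline sample spec are hypothetical, chosen only to show the traversal.

```python
import json
from typing import List

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options"}


def extract_operation_ids(spec: dict) -> List[str]:
    """Return the sorted, de-duplicated operationIds found under spec["paths"]."""
    ids = set()
    for path_item in spec.get("paths", {}).values():
        for method, operation in path_item.items():
            # Path items also carry non-method keys such as "parameters"; skip them.
            if method.lower() in HTTP_METHODS and isinstance(operation, dict):
                op_id = operation.get("operationId")
                if op_id:
                    ids.add(op_id)
    return sorted(ids)


# Hypothetical, heavily trimmed spec; the real swagger.json is far larger.
sample_spec = json.loads("""
{
  "paths": {
    "/apis/resource.k8s.io/v1/deviceclasses": {
      "post": {"operationId": "createResourceV1DeviceClass"},
      "parameters": []
    },
    "/apis/resource.k8s.io/v1/deviceclasses/{name}": {
      "get": {"operationId": "readResourceV1DeviceClass"},
      "delete": {"operationId": "deleteResourceV1DeviceClass"}
    }
  }
}
""")

operation_ids = extract_operation_ids(sample_spec)
print("\n".join(operation_ids))
```

Sorting the IDs keeps the resulting list stable between runs, which makes later diffs easier to read.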

How It Works

Each operation flows through this pipeline:

swagger.json → audit_log_parser.py → audit-endpoints.txt → kubernetes_api_analysis.py
     ↓              ↓                    ↓                    ↓
   POST /apis/   requestURI match    Line 98: | 1        "OPERATIONS ADDED"
   resource...   → operationId       (1 API call)        (new in this PR)

Example: createResourceV1NamespacedResourceClaimTemplate

  • Swagger: POST /apis/resource.k8s.io/v1/namespaces/{namespace}/resourceclaimtemplates
  • Audit Log: requestURI: "/apis/resource.k8s.io/v1/namespaces/dra-9508/resourceclaimtemplates"
  • Parser: Maps URI → operation ID via pattern matching
  • Output: Shows as "ADDED" (found in PR, missing from CI baseline)
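
The URI-to-operation mapping in the example above can be sketched as follows. This is a minimal illustration under stated assumptions, not the parser's real implementation: the helper names (template_to_regex, match_operation), the verb table, and the two-entry operations dictionary are hypothetical, and verbs such as list, watch, and deletecollection are deliberately left out.

```python
import re
from typing import Dict, Optional, Tuple

# Assumed mapping from audit verbs to the HTTP methods used as keys in Swagger "paths".
VERB_TO_METHOD = {
    "create": "post",
    "update": "put",
    "patch": "patch",
    "delete": "delete",
    "get": "get",
}


def template_to_regex(template: str) -> "re.Pattern[str]":
    """Compile a path template so that each {param} matches exactly one path segment."""
    parts = re.split(r"(\{[^}]+\})", template)
    pattern = "".join(
        "[^/]+" if part.startswith("{") else re.escape(part) for part in parts
    )
    return re.compile("^" + pattern + "$")


def match_operation(
    verb: str, request_uri: str, operations: Dict[Tuple[str, str], str]
) -> Optional[str]:
    """Return the operationId whose method and path template match, or None."""
    method = VERB_TO_METHOD.get(verb)
    path = request_uri.split("?", 1)[0]  # ignore any query string
    for (op_method, template), operation_id in operations.items():
        if op_method == method and template_to_regex(template).match(path):
            return operation_id
    return None


# Hypothetical two-entry table; a real one would be built from swagger.json.
operations = {
    ("post", "/apis/resource.k8s.io/v1/namespaces/{namespace}/resourceclaimtemplates"):
        "createResourceV1NamespacedResourceClaimTemplate",
    ("get", "/apis/resource.k8s.io/v1/namespaces/{namespace}/resourceclaims/{name}"):
        "readResourceV1NamespacedResourceClaim",
}

print(match_operation(
    "create",
    "/apis/resource.k8s.io/v1/namespaces/dra-9508/resourceclaimtemplates",
    operations,
))
```

Restricting each placeholder to a single path segment keeps a template for a collection from accidentally matching a URI for a named object or subresource.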
# Kubernetes Conformance Audit
The Kubernetes conformance audit system ensures that new API endpoints reaching General Availability (GA) are properly covered by conformance tests, preventing technical debt accumulation and maintaining API testing integrity across the project.
## Overview
The conformance audit process automatically tracks API endpoint usage through audit logs and compares it against conformance test coverage, flagging gaps that need attention from contributors and maintainers.
### Key Components
1. **Swagger/OpenAPI Specification**: The authoritative definition of Kubernetes APIs in [`swagger.json`](https://github.com/kubernetes/kubernetes/blob/master/api/openapi-spec/swagger.json)
2. **Audit Log Analysis**: Scripts that parse Kubernetes audit logs to identify API endpoint usage
3. **Endpoint Tracking Files**: YAML files that categorize endpoints as pending, ineligible, or conformance-ready
4. **CI Jobs**: Automated jobs that run the analysis and report on compliance
## CI Jobs
### Periodic Job: `ci-kubernetes-audit-kind-conformance`
**Purpose**: Establishes the baseline for API endpoint coverage by running conformance tests and generating audit logs.
**When it runs**: Scheduled periodically to maintain current baseline data
**What it does**:
- Creates a KIND (Kubernetes in Docker) cluster
- Runs all 425+ conformance tests
- Generates audit logs of API endpoint usage during test execution
- Parses logs to create endpoint usage reports
**Artifacts generated**:
- `audit*.log`: Raw audit log files (~90-100MB)
- `audit-endpoints.txt`: Human-readable endpoint usage summary (~27KB)
- `audit-operations.json`: JSON mapping of operations to audit entries (~2MB)
- `policy.yaml`: Audit policy configuration
**Monitoring**: [TestGrid Dashboard](https://testgrid.k8s.io/sig-arch-conformance#kind-conformance-audit)
### Presubmit Job: `pull-kubernetes-audit-kind-conformance`
**Purpose**: Analyzes changes in pull requests to identify new API endpoints that need conformance test coverage.
**When it runs**: Triggered automatically when `swagger.json` is modified in a pull request
**What it does**:
- Runs the same conformance tests as the periodic job
- Compares audit results against the latest periodic job baseline
- Identifies newly added, removed, or stable-but-untested endpoints
- Validates endpoint categorization in tracking files
**Additional analysis**:
- Highlights API operation differences between PR and baseline
- Checks for proper categorization of endpoints
- Reports on conformance test gaps
**Monitoring**: [TestGrid Dashboard](https://testgrid.k8s.io/sig-arch-conformance#presubmit-kind-conformance-audit)
## Endpoint Classification System
### Conformance-Eligible Endpoints
Endpoints that should eventually be covered by conformance tests but are not yet tested.
**File**: [`pending_eligible_endpoints.yaml`](https://github.com/kubernetes/kubernetes/blob/master/test/conformance/testdata/pending_eligible_endpoints.yaml)
**Contains**: API endpoints awaiting conformance test development, typically for:
- Recently added GA features
- Resource management operations (pod resizing, device classes)
- Core API operations not yet covered
### Ineligible Endpoints
Endpoints explicitly excluded from conformance testing for valid technical or policy reasons.
**File**: [`ineligible_endpoints.yaml`](https://github.com/kubernetes/kubernetes/blob/master/test/conformance/testdata/ineligible_endpoints.yaml)
**Categories**:
- **Deprecated endpoints**: Soon-to-be-removed functionality
- **Optional features**: Components not required in all Kubernetes distributions (NetworkPolicy, HPA)
- **Debug features**: Development and troubleshooting tools (port forwarding, pod attach)
- **Administrative endpoints**: Operations that distributions may restrict for security
- **Unstable features**: APIs that lack stable implementations across providers
Each entry includes the endpoint name, exclusion reason, and link to relevant issue discussion.
## Audit Analysis Scripts
### `audit_log_parser.py`
**Purpose**: Parses Kubernetes audit logs and maps entries to official API specifications.
**Key functions**:
- Downloads current `swagger.json` specification
- Processes audit log JSON entries
- Maps log entries to OpenAPI operation IDs
- Generates usage statistics and operation samples
**Outputs**:
- Console report with endpoint counts and matches
- `audit-endpoints.txt`: Sorted list of endpoints with usage counts
- `audit-operations.json`: Sample audit entries for each operation
**Common errors**:
- Network issues downloading swagger specification
- Malformed JSON in audit log files
- Missing or unreadable audit log files
- Unexpected log entry formats
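
The parsing step and its failure modes can be sketched as below. This is an illustrative reading of the behaviour described above, not the script's code: the function name `count_requests` is hypothetical, and counting only `ResponseComplete` events is an assumption made here to avoid counting one request several times.

```python
import io
import json
from collections import Counter
from typing import Iterable, Tuple


def count_requests(lines: Iterable[str]) -> Tuple[Counter, int]:
    """Count (verb, path) pairs from JSON-lines audit events.

    Returns the counts plus the number of skipped lines (malformed JSON,
    unexpected shapes, or events in a stage other than ResponseComplete).
    """
    counts: Counter = Counter()
    skipped = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            skipped += 1  # malformed JSON: skip the line instead of aborting
            continue
        if not isinstance(event, dict):
            skipped += 1  # valid JSON, but not an audit event object
            continue
        verb = event.get("verb")
        uri = event.get("requestURI")
        if event.get("stage") != "ResponseComplete" or not verb or not uri:
            skipped += 1
            continue
        counts[(verb, uri.split("?", 1)[0])] += 1
    return counts, skipped


# Hypothetical four-line log: one early-stage event, two completed calls, one bad line.
sample_log = io.StringIO(
    '{"stage": "RequestReceived", "verb": "create", "requestURI": "/apis/resource.k8s.io/v1/deviceclasses"}\n'
    '{"stage": "ResponseComplete", "verb": "create", "requestURI": "/apis/resource.k8s.io/v1/deviceclasses"}\n'
    '{"stage": "ResponseComplete", "verb": "create", "requestURI": "/apis/resource.k8s.io/v1/deviceclasses?fieldManager=e2e"}\n'
    'not json\n'
)

request_counts, skipped_lines = count_requests(sample_log)
print(request_counts, skipped_lines)
```

Skipping bad lines and reporting how many were dropped lets a long log finish processing while still surfacing format problems.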
### `kubernetes_api_analysis.py`
**Purpose**: Compares audit results between pull requests and CI baseline to identify API changes.
**Key functions**:
- Downloads latest CI audit data from Google Cloud Storage
- Compares local PR audit results against CI baseline
- Identifies added, removed, and stable-but-unused operations
- Provides detailed change analysis
**Analysis types**:
- **New operations**: API endpoints introduced in the PR
- **Removed operations**: Previously audited endpoints no longer called
- **Stable unused**: GA endpoints not exercised by conformance tests
**Common errors**:
- Missing `gsutil` CLI tool
- Permissions issues accessing GCS buckets
- Network connectivity problems
- Swagger specification parsing failures
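
The three analysis types reduce to set differences. The sketch below shows that idea only; `compare_operations` and the sample sets are hypothetical, and the real script may apply further filtering (for example, by deprecation or ineligibility) before comparing.

```python
from typing import Dict, List, Set


def compare_operations(
    ci_ops: Set[str], pull_ops: Set[str], stable_ops: Set[str]
) -> Dict[str, List[str]]:
    """Classify operations by comparing the PR run with the CI baseline."""
    return {
        # Exercised in the PR run but absent from the CI baseline.
        "added": sorted(pull_ops - ci_ops),
        # Present in the CI baseline but no longer exercised in the PR run.
        "removed": sorted(ci_ops - pull_ops),
        # Stable operations that the PR run never called.
        "stable_unused": sorted(stable_ops - pull_ops),
    }


# Made-up inputs; real ones come from the audit endpoint files and swagger.json.
ci_ops = {"createCoreV1NamespacedPod", "readCoreV1NamespacedPod"}
pull_ops = ci_ops | {"createResourceV1DeviceClass", "deleteResourceV1DeviceClass"}
stable_ops = pull_ops | {"patchResourceV1DeviceClass"}

report = compare_operations(ci_ops, pull_ops, stable_ops)
net_change = len(report["added"]) - len(report["removed"])

for category, names in report.items():
    print(f"{category}: {len(names)}")
    for name in names:
        print(f"  {name}")
print(f"net change: {net_change:+d}")
```

The net change is the number added minus the number removed, which is why a run with 9 additions and 0 removals reports +9.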
## Enforcement Process
### For Pull Request Authors
When the presubmit job identifies issues, contributors must:
1. **Stable endpoints not in `pending_eligible_endpoints.yaml`**:
   - Add the endpoint to `pending_eligible_endpoints.yaml`
   - Include justification for why conformance testing is deferred
   - Create or reference issue for future conformance test development
2. **Endpoints in `pending_eligible_endpoints.yaml` that are now tested**:
   - Remove the endpoint from `pending_eligible_endpoints.yaml`
   - Verify the endpoint is properly covered by new conformance tests
3. **New ineligible endpoints**:
   - Add to `ineligible_endpoints.yaml` with clear reasoning
   - Link to discussion or issue explaining the exclusion
   - Ensure the exclusion follows established patterns
### Error Messages and Resolution
**"Stable endpoint X not found in pending_eligible_endpoints.yaml"**
- **Cause**: New GA endpoint lacks conformance test coverage
- **Resolution**: Add endpoint to pending list or create conformance test
**"Endpoint X found in pending_eligible_endpoints.yaml but is being tested"**
- **Cause**: Endpoint now has conformance test coverage
- **Resolution**: Remove from pending list as no longer needed
**"Operation count mismatch between PR and CI baseline"**
- **Cause**: API endpoints added/removed without proper categorization
- **Resolution**: Review endpoint changes and update tracking files accordingly
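
The first two messages suggest a pair of set checks against the tracking files. The sketch below is inferred from the message wording alone, not from the job's source: `check_tracking` is a hypothetical name, the sets are passed in memory rather than loaded from YAML, and the sample membership is invented.

```python
from typing import List, Set


def check_tracking(
    stable_ops: Set[str],
    tested_ops: Set[str],
    pending: Set[str],
    ineligible: Set[str],
) -> List[str]:
    """Return one message per tracking-file inconsistency."""
    problems = []
    # Stable, untested, and not accounted for in either tracking file.
    for op in sorted(stable_ops - tested_ops - pending - ineligible):
        problems.append(
            f"Stable endpoint {op} not found in pending_eligible_endpoints.yaml"
        )
    # Still listed as pending although the tests now exercise it.
    for op in sorted(pending & tested_ops):
        problems.append(
            f"Endpoint {op} found in pending_eligible_endpoints.yaml but is being tested"
        )
    return problems


# Invented sample data, for illustration only.
stable_ops = {
    "createResourceV1DeviceClass",
    "patchResourceV1DeviceClass",
    "readResourceV1DeviceClass",
    "connectCoreV1PostNamespacedPodExec",
}
tested_ops = {"createResourceV1DeviceClass", "readResourceV1DeviceClass"}
pending = {"readResourceV1DeviceClass"}
ineligible = {"connectCoreV1PostNamespacedPodExec"}

tracking_problems = check_tracking(stable_ops, tested_ops, pending, ineligible)
for message in tracking_problems:
    print(message)
```

An empty result means every stable endpoint is either tested, pending, or explicitly excluded.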
## Troubleshooting
### CI Job Failures
**Build failures**: Check build logs for:
- KIND cluster creation issues
- Test execution failures
- Script execution errors
- Artifact generation problems
**Analysis failures**: Common issues:
- Swagger specification download problems
- Audit log parsing errors
- GCS access permission issues
- File format inconsistencies
### Script Execution Issues
**Local development**:
```bash
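# Parse the audit logs into an endpoint report and per-operation samples
python3 audit_log_parser.py --audit-logs audit/audit*.log --output audit/audit-endpoints.txt --audit-operations-json audit/audit-operations.json
# Compare the PR report against the latest CI baseline (auto-discovered from GCS when no CI file is given)
python3 kubernetes_api_analysis.py --pull-audit-endpoints audit/audit-endpoints.txt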
```
**Requirements**:
- Python 3.x
- Network access for downloading specifications
- `gsutil` installed for GCS operations
- Valid audit log files in JSON format
### Common Resolution Steps
1. **Check network connectivity** for specification downloads
2. **Verify file permissions** for audit logs and output files
3. **Validate JSON format** of audit log entries
4. **Ensure gsutil authentication** for GCS access
5. **Review script error output** for specific failure details
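
Several of these checks can be automated before running the scripts. The following is a hedged sketch of such a preflight, not part of the audit tooling: `preflight` and its `find_tool` parameter are hypothetical, and it inspects only the first line of each log rather than validating the whole file.

```python
import json
import os
import shutil
import tempfile
from typing import Callable, List, Optional, Sequence


def preflight(
    audit_paths: Sequence[str],
    find_tool: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of problems likely to break a local audit run."""
    problems = []
    if find_tool("gsutil") is None:
        problems.append("gsutil not found on PATH")
    for path in audit_paths:
        if not os.path.isfile(path) or not os.access(path, os.R_OK):
            problems.append(f"{path}: missing or unreadable")
            continue
        with open(path, encoding="utf-8") as handle:
            first_line = handle.readline()
        try:
            json.loads(first_line)
        except json.JSONDecodeError:
            problems.append(f"{path}: first line is not valid JSON")
    return problems


# Demo with throwaway files; gsutil is reported absent on purpose so the
# result does not depend on the machine running the example.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "audit.log")
    bad = os.path.join(tmp, "broken.log")
    with open(good, "w", encoding="utf-8") as handle:
        handle.write('{"verb": "get", "requestURI": "/api/v1/pods"}\n')
    with open(bad, "w", encoding="utf-8") as handle:
        handle.write("not json\n")
    demo_problems = preflight(
        [good, bad, os.path.join(tmp, "missing.log")],
        find_tool=lambda name: None,
    )

for problem in demo_problems:
    print(problem)
```

Passing the tool lookup as a parameter keeps the check testable without requiring `gsutil` to be installed.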
## Best Practices
### For API Developers
- **Plan conformance tests early**: Consider test coverage during API design
- **Update tracking files promptly**: Don't let endpoints accumulate in pending status
- **Provide clear justifications**: Document reasons for ineligible classifications
- **Test locally**: Run audit scripts before submitting PRs
### For Maintainers
- **Monitor CI job health**: Address persistent failures promptly
- **Review categorization**: Ensure endpoint classifications remain accurate
- **Update documentation**: Keep troubleshooting guides current
- **Coordinate with SIGs**: Work with feature owners on conformance test development
## Resources
- [Conformance Test Requirements](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [CI Job Configuration](https://github.com/kubernetes/test-infra/blob/master/config/jobs/kubernetes/sig-arch/conformance-audit.yaml)
- [Audit Scripts](https://github.com/kubernetes/test-infra/tree/master/experiment/audit)
- [TestGrid Monitoring](https://testgrid.k8s.io/sig-arch-conformance)
- [Script Usage Examples](https://gist.github.com/dims/1fc254d0e4d043ef18bbc84596785de2/raw/73ae40e831367a757e8e2b47ff10836286db7d03/README.md)
The conformance audit system keeps conformance coverage in step with the API surface: every endpoint that reaches GA is either covered by a conformance test, tracked as pending, or explicitly marked ineligible.