Dynamic Test Grouping System

This document describes the dynamic test grouping system that automatically optimizes test execution in CI by balancing test groups based on execution time and test count.

Overview

The dynamic test grouping system replaces hardcoded test groups with an intelligent algorithm that:

  • Automatically discovers all test files in the repository
  • Uses a greedy algorithm to balance test execution times across groups
  • Caches results to avoid unnecessary regeneration
  • Learns from execution data to improve future groupings
  • Targets 10-13 minutes of execution time per group

Architecture

Core Components

  1. Dynamic Test Grouper (scripts/dynamic_test_grouper.py)

    • Main script implementing the grouping algorithm
    • Handles caching, timing collection, and group generation
  2. Makefile Integration

    • Dynamic targets that replace hardcoded test groups
    • Automatic cache invalidation and regeneration
  3. CI Workflow Integration

    • GitHub Actions steps for cache management
    • Automatic timing data collection
    • Performance monitoring and optimization
  4. Management Scripts

    • Update script for CI environments
    • Performance analysis and reporting

Algorithm Details

The system uses a greedy bin-packing algorithm:

1. Discover all test files automatically
2. Load historical timing data (if available)
3. Sort tests by execution time (descending)
4. For each test:
   - Find group with minimum total time
   - Assign test to that group
   - Update group's total time
5. Cache results for future use
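
A minimal sketch of this greedy strategy, assuming timings map test paths to seconds and that tests without data fall back to the 30-second default described under Configuration:

def build_groups(test_files, timings, num_groups, default_time=30):
    """Greedy bin-packing: place the next-longest test into the currently lightest group."""
    groups = [{"tests": [], "total_time": 0.0} for _ in range(num_groups)]

    # Longest tests first, falling back to the default for unknown tests
    ordered = sorted(test_files, key=lambda t: timings.get(t, default_time), reverse=True)

    for test in ordered:
        lightest = min(groups, key=lambda g: g["total_time"])  # group with minimum total time
        lightest["tests"].append(test)
        lightest["total_time"] += timings.get(test, default_time)

    return groups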

File Structure

.test_groups_cache/           # Cache directory (git-ignored)
├── unit_groups.json         # Cached unit test groups
├── unit_groups.mk          # Makefile format for unit groups
├── unit_timings.json       # Historical timing data for unit tests
├── integration_groups.json # Cached integration test groups
├── integration_groups.mk  # Makefile format for integration groups
└── integration_timings.json # Historical timing data for integration tests

scripts/
└── dynamic_test_grouper.py  # Main grouping algorithm

.github/
├── scripts/
│   └── update_test_groups.sh # CI update script
└── workflows/
    ├── pull_request.yml      # Updated CI workflow
    └── test-groups-management.yml # Management workflow
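
The exact contents of the cache files are an implementation detail of the grouper script. As an illustration only, a cached groups file might hold a fingerprint plus a list of groups and could be inspected along these lines (the layout shown is an assumption, not the script's actual format):

import json
from pathlib import Path

# Hypothetical layout: {"fingerprint": "...", "groups": [{"tests": [...], "total_time": 612.4}, ...]}
cache = json.loads(Path(".test_groups_cache/unit_groups.json").read_text())
for i, group in enumerate(cache["groups"], start=1):
    print(f"Group {i}: {len(group['tests'])} tests, ~{group['total_time'] / 60:.1f}m")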

Usage

Manual Group Generation

# Generate unit test groups (6 groups)
python3 scripts/dynamic_test_grouper.py --type=unit --groups=6

# Generate integration test groups (9 groups)  
python3 scripts/dynamic_test_grouper.py --type=integration --groups=9

# Force regeneration (ignore cache)
python3 scripts/dynamic_test_grouper.py --type=unit --groups=6 --force

# Update timing data from JUnit results
python3 scripts/dynamic_test_grouper.py --type=unit --update-timings --junit-dir=tests/unit/outputs

Makefile Targets

# Run a specific test group (uses dynamic groups)
make unit_test_group TEST_GROUP=1
make integration_test_group TEST_GROUP=3

# Force regenerate all groups
make regenerate_test_groups

# Clean cache
make clean_test_groups

CI Integration

The system automatically:

  1. Checks for new tests before each CI run
  2. Updates groups if tests have been added/removed
  3. Collects timing data after test execution
  4. Improves groupings over time
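
Timing collection parses the JUnit XML produced by each run (the --update-timings path shown above). The real parsing lives in dynamic_test_grouper.py and may differ; the sketch below only illustrates the idea, assuming standard JUnit attributes such as classname and time:

import glob
import xml.etree.ElementTree as ET

def collect_timings(junit_dir):
    """Accumulate execution time per test module from JUnit XML reports."""
    timings = {}
    for report in glob.glob(f"{junit_dir}/**/*.xml", recursive=True):
        for case in ET.parse(report).getroot().iter("testcase"):
            key = case.get("file") or case.get("classname", "unknown")
            timings[key] = timings.get(key, 0.0) + float(case.get("time", 0.0))
    return timings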

Configuration

Target Execution Time

The default target is 10-13 minutes per group. To modify:

# In scripts/dynamic_test_grouper.py
MAX_GROUP_TIME = 13 * 60  # 13 minutes in seconds

Default Test Time

For new tests without timing data:

# In scripts/dynamic_test_grouper.py
DEFAULT_TEST_TIME = 30    # 30 seconds

Number of Groups

Configured in CI workflow and Makefile:

  • Unit tests: 6 groups
  • Integration tests: 9 groups

To change, update:

  • .github/workflows/pull_request.yml (matrix strategy)
  • Makefile calls to the grouper script

Monitoring & Analysis

Performance Reports

The system generates statistics showing:

# Group Statistics:
# Group 1: 12 tests, 8m 45s
# Group 2: 10 tests, 9m 12s
# Group 3: 11 tests, 8m 56s
# ...
# Max group time: 9.2m, Min: 8.7m, Avg: 9.0m
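
The same numbers can be reproduced from the group data. A small sketch, reusing the group dictionaries from the build_groups example under Algorithm Details:

def print_report(groups):
    """Per-group counts and times plus max/min/avg, mirroring the report above."""
    minutes = [g["total_time"] / 60 for g in groups]
    for i, group in enumerate(groups, start=1):
        m, s = divmod(int(group["total_time"]), 60)
        print(f"Group {i}: {len(group['tests'])} tests, {m}m {s}s")
    print(f"Max group time: {max(minutes):.1f}m, Min: {min(minutes):.1f}m, "
          f"Avg: {sum(minutes) / len(minutes):.1f}m")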

Automated Monitoring

  • Weekly optimization runs via GitHub Actions
  • Performance reports generated as artifacts
  • Automatic issue creation for balance problems

Manual Analysis

# View current group statistics
./.github/scripts/update_test_groups.sh --stats

# Force regeneration with analysis
make regenerate_test_groups

Troubleshooting

Cache Issues

If groups seem incorrect:

# Clear cache and regenerate
make clean_test_groups
make regenerate_test_groups

Missing Timing Data

For new repositories or after major changes:

The system uses the 30-second default time for every test at first. After the first CI run, timing data is collected automatically and used for subsequent groupings.

Group Imbalance

If groups exceed target time:

  1. Increase number of groups in CI configuration
  2. Identify slow tests for optimization
  3. Use the management workflow for analysis

CI Integration Issues

Check that:

  1. Cache keys in CI workflow match current setup
  2. Python 3 is available in the CI environment
  3. File permissions allow script execution

Cache Invalidation

The cache is invalidated when:

  1. New tests are added or existing tests removed
  2. The grouper script (dynamic_test_grouper.py) is modified
  3. Force regeneration is requested
  4. Cache files are corrupted or missing
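
One way to implement these checks is to store a fingerprint of the discovered test list and of the grouper script alongside the cached groups. The sketch below illustrates that approach; it is not the exact logic used by dynamic_test_grouper.py, and the "fingerprint" field is an assumption:

import hashlib
import json
from pathlib import Path

GROUPER = Path("scripts/dynamic_test_grouper.py")

def cache_is_stale(cache_file, test_files):
    """True when the cache is missing or corrupted, the test set changed,
    or the grouper script itself was modified."""
    cache_path = Path(cache_file)
    if not cache_path.exists():
        return True
    fingerprint = hashlib.sha256()
    for path in sorted(test_files):           # added/removed tests change the hash
        fingerprint.update(path.encode())
    fingerprint.update(GROUPER.read_bytes())  # script edits also invalidate
    try:
        cached = json.loads(cache_path.read_text())
    except (json.JSONDecodeError, OSError):
        return True
    return cached.get("fingerprint") != fingerprint.hexdigest()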

Performance Benefits

Before (Hardcoded Groups)

  • Manual group assignment
  • Unbalanced execution times
  • New tests easily forgotten and left out of CI
  • No optimization over time

After (Dynamic Groups)

  • Automatic test discovery
  • Optimal load balancing
  • Self-improving over time
  • Target execution time control
  • Zero maintenance required

Development

Adding New Features

To extend the system:

  1. Modify the Python script for algorithm changes
  2. Update CI workflows for new caching strategies
  3. Test with various scenarios before deployment

Algorithm Improvements

Possible enhancements:

  • Machine learning for better time prediction
  • Dependency-aware grouping for related tests
  • Dynamic group sizing based on available runners
  • Test prioritization for faster feedback

Limitations

  1. Initial runs use estimated times (30s default)
  2. Python 3 dependency for the grouping script
  3. The groups cache consumes some CI cache storage
  4. JUnit XML format required for timing collection
