Dynamic Test Grouping System#

This document describes the dynamic test grouping system that automatically optimizes test execution in CI by balancing test groups based on execution time and test count.

Overview#

The dynamic test grouping system replaces hardcoded test groups with an intelligent algorithm that:

  • Automatically discovers all test files in the repository
  • Uses a greedy algorithm to balance test execution times across groups
  • Caches results to avoid unnecessary regeneration
  • Learns from execution data to improve future groupings
  • Targets 10-13 minutes of execution time per group

Architecture#

Core Components#

  1. Dynamic Test Grouper (scripts/dynamic_test_grouper.py)

    • Main script implementing the grouping algorithm
    • Handles caching, timing collection, and group generation
  2. Makefile Integration

    • Dynamic targets that replace hardcoded test groups
    • Automatic cache invalidation and regeneration
  3. CI Workflow Integration

    • GitHub Actions steps for cache management
    • Automatic timing data collection
    • Performance monitoring and optimization
  4. Management Scripts

    • Update script for CI environments
    • Performance analysis and reporting

Algorithm Details#

The system uses a greedy bin-packing algorithm:

1. Discover all test files automatically
2. Load historical timing data (if available)
3. Sort tests by execution time (descending)
4. For each test:
   - Find group with minimum total time
   - Assign test to that group
   - Update group's total time
5. Cache results for future use
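
The sketch below illustrates the greedy step in Python. It is illustrative only: function names such as build_groups and the shape of the timings dictionary are assumptions, not the actual API of dynamic_test_grouper.py.

# Illustrative greedy bin-packing sketch (not the script's exact code)
import heapq

DEFAULT_TEST_TIME = 30  # seconds assumed for tests with no timing data

def build_groups(test_files, timings, num_groups):
    """Assign test files to num_groups so total times stay balanced."""
    # Sort tests by (estimated) execution time, longest first
    ordered = sorted(test_files,
                     key=lambda t: timings.get(t, DEFAULT_TEST_TIME),
                     reverse=True)
    # Min-heap of (total_time, group_index): the emptiest group is popped first
    heap = [(0.0, i) for i in range(num_groups)]
    heapq.heapify(heap)
    groups = [[] for _ in range(num_groups)]
    for test in ordered:
        total, idx = heapq.heappop(heap)          # group with minimum total time
        groups[idx].append(test)                  # assign test to that group
        total += timings.get(test, DEFAULT_TEST_TIME)
        heapq.heappush(heap, (total, idx))        # update the group's total time
    return groups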

File Structure#

.test_groups_cache/           # Cache directory (git-ignored)
├── unit_groups.json          # Cached unit test groups
├── unit_groups.mk            # Makefile format for unit groups
├── unit_timings.json         # Historical timing data for unit tests
├── integration_groups.json   # Cached integration test groups
├── integration_groups.mk     # Makefile format for integration groups
└── integration_timings.json  # Historical timing data for integration tests

scripts/
└── dynamic_test_grouper.py  # Main grouping algorithm

.github/
├── scripts/
│   └── update_test_groups.sh # CI update script
└── workflows/
    ├── pull_request.yml      # Updated CI workflow
    └── test-groups-management.yml # Management workflow
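
The exact schema of the cached JSON files is internal to the grouper script. As a purely hypothetical illustration, a unit_groups.json file could hold something like the following (shown as a Python literal; the real schema may differ):

# Hypothetical shape of a cached group file -- the real schema may differ
example_unit_groups = {
    "generated_at": "2024-01-01T00:00:00Z",   # when the cache was written
    "num_groups": 6,
    "groups": {
        "1": ["tests/unit/test_foo.py", "tests/unit/test_bar.py"],
        "2": ["tests/unit/test_baz.py"],
    },
}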

Usage#

Manual Group Generation#

# Generate unit test groups (6 groups)
python3 scripts/dynamic_test_grouper.py --type=unit --groups=6

# Generate integration test groups (9 groups)  
python3 scripts/dynamic_test_grouper.py --type=integration --groups=9

# Force regeneration (ignore cache)
python3 scripts/dynamic_test_grouper.py --type=unit --groups=6 --force

# Update timing data from JUnit results
python3 scripts/dynamic_test_grouper.py --type=unit --update-timings --junit-dir=tests/unit/outputs
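
Timing collection reads the time attribute that JUnit XML reports attach to each testcase element. A rough sketch of how per-file durations could be aggregated from such reports (not the script's actual implementation; the file attribute is not emitted by every test runner, so classname is used as a fallback):

# Rough sketch of aggregating per-file durations from JUnit XML reports
import glob
import os
import xml.etree.ElementTree as ET
from collections import defaultdict

def collect_timings(junit_dir):
    """Sum <testcase time="..."> values per source file."""
    timings = defaultdict(float)
    for report in glob.glob(os.path.join(junit_dir, "*.xml")):
        root = ET.parse(report).getroot()
        for case in root.iter("testcase"):
            # "file" is not always present; fall back to the classname attribute
            key = case.get("file") or case.get("classname", "unknown")
            timings[key] += float(case.get("time", 0.0))
    return dict(timings)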

Makefile Targets#

# Run a specific test group (uses dynamic groups)
make unit_test_group TEST_GROUP=1
make integration_test_group TEST_GROUP=3

# Force regenerate all groups
make regenerate_test_groups

# Clean cache
make clean_test_groups

CI Integration#

The system automatically:

  1. Checks for new tests before each CI run
  2. Updates groups if tests have been added/removed
  3. Collects timing data after test execution
  4. Improves groupings over time

Configuration#

Target Execution Time#

The default target is 10-13 minutes per group. To modify:

# In scripts/dynamic_test_grouper.py
MAX_GROUP_TIME = 13 * 60  # 13 minutes in seconds

Default Test Time#

For new tests without timing data:

# In scripts/dynamic_test_grouper.py
DEFAULT_TEST_TIME = 30    # 30 seconds
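
Together these two constants give a rough lower bound on how many groups are needed: the sum of known (or default) test times divided by MAX_GROUP_TIME. An illustrative calculation (the helper name is an assumption, not part of the script):

# Illustrative: estimate how many groups a set of tests needs
import math

MAX_GROUP_TIME = 13 * 60   # 13 minutes in seconds
DEFAULT_TEST_TIME = 30     # seconds assumed for tests without timing data

def estimate_group_count(test_files, timings):
    total = sum(timings.get(t, DEFAULT_TEST_TIME) for t in test_files)
    return max(1, math.ceil(total / MAX_GROUP_TIME))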

Number of Groups#

Configured in CI workflow and Makefile:

  • Unit tests: 6 groups
  • Integration tests: 9 groups

To change, update:

  • .github/workflows/pull_request.yml (matrix strategy)
  • Makefile calls to the grouper script

Monitoring & Analysis#

Performance Reports#

The system generates statistics showing:

# Group Statistics:
# Group 1: 12 tests, 8m 45s
# Group 2: 10 tests, 9m 12s
# Group 3: 11 tests, 8m 56s
# ...
# Max group time: 9.2m, Min: 8.7m, Avg: 9.0m
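
A small sketch of how such a report can be derived from a group assignment plus timing data (illustrative only; the real script may format its output differently):

# Illustrative: print per-group statistics from groups and timing data
def print_group_stats(groups, timings, default_time=30):
    totals = []
    for i, tests in enumerate(groups, start=1):
        total = sum(timings.get(t, default_time) for t in tests)
        totals.append(total)
        print(f"# Group {i}: {len(tests)} tests, {int(total // 60)}m {int(total % 60)}s")
    print(f"# Max group time: {max(totals) / 60:.1f}m, "
          f"Min: {min(totals) / 60:.1f}m, Avg: {sum(totals) / len(totals) / 60:.1f}m")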

Automated Monitoring#

  • Weekly optimization runs via GitHub Actions
  • Performance reports generated as artifacts
  • Automatic issue creation for balance problems

Manual Analysis#

# View current group statistics
./.github/scripts/update_test_groups.sh --stats

# Force regeneration with analysis
make regenerate_test_groups

Troubleshooting#

Cache Issues#

If groups seem incorrect:

# Clear cache and regenerate
make clean_test_groups
make regenerate_test_groups

Missing Timing Data#

For new repositories or after major changes:

# The system will use default times initially
# After the first CI run, timing data will be collected automatically

Group Imbalance#

If groups exceed target time:

  1. Increase number of groups in CI configuration
  2. Identify slow tests for optimization
  3. Use the management workflow for analysis

CI Integration Issues#

Check that:

  1. Cache keys in CI workflow match current setup
  2. Python 3 is available in the CI environment
  3. File permissions allow script execution

Cache Invalidation#

The cache is invalidated when:

  1. New tests are added or existing tests removed
  2. Script is modified (dynamic_test_grouper.py)
  3. Force regeneration is requested
  4. Cache files are corrupted or missing
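
One plausible way to implement the first two checks is to fingerprint the sorted list of discovered test files together with the grouper script itself and compare that digest against the one stored in the cache. This is an assumption about the mechanism, not a description of the actual code:

# Illustrative cache-invalidation check based on a content fingerprint
import hashlib
import json
from pathlib import Path

def cache_is_stale(test_files, cache_file, script_path="scripts/dynamic_test_grouper.py"):
    """Return True when the cached groups should be regenerated."""
    fingerprint = hashlib.sha256()
    for path in sorted(test_files):                     # new/removed tests change the digest
        fingerprint.update(path.encode())
    fingerprint.update(Path(script_path).read_bytes())  # script changes also invalidate
    try:
        cached = json.loads(Path(cache_file).read_text())
    except (OSError, json.JSONDecodeError):
        return True                                     # missing or corrupted cache
    return cached.get("fingerprint") != fingerprint.hexdigest()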

Performance Benefits#

Before (Hardcoded Groups)#

  • Manual group assignment
  • Unbalanced execution times
  • New tests forgotten in CI
  • No optimization over time

After (Dynamic Groups)#

  • Automatic test discovery
  • Optimal load balancing
  • Self-improving over time
  • Target execution time control
  • Zero maintenance required

Development#

Adding New Features#

To extend the system:

  1. Modify the Python script for algorithm changes
  2. Update CI workflows for new caching strategies
  3. Test with various scenarios before deployment

Algorithm Improvements#

Possible enhancements:

  • Machine learning for better time prediction
  • Dependency-aware grouping for related tests
  • Dynamic group sizing based on available runners
  • Test prioritization for faster feedback

Limitations#

  1. Initial runs use estimated times (30s default)
  2. Python 3 dependency for the grouping script
  3. The group cache consumes some CI storage
  4. JUnit XML format required for timing collection
