AI Preflight Checks (gist by @zdrummond, created January 13, 2026)
# Preflight Checker - Complete Check Reference
**Document Objective**: Comprehensive categorized list of all checks performed by the preflight checker for H100 ML deployment quality assurance.
**Version**: 1.11.7
**Last Updated**: 2025-11-30
---
## Check Count Summary
| Category | Pattern-Based | AST-Based | Total |
|----------|---------------|-----------|-------|
| 1. GPU Efficiency | 5 | 6 | **11** |
| 2. Training Issues | 8 | 3 | **11** |
| 3. Loss Function | 7 | 3 | **10** |
| 4. Evaluation Mode | 3 | 1 | **4** |
| 5. In-Place Operations | - | 4 | **4** |
| 6. Performance/CPU | - | 6 | **6** |
| 7. File I/O | - | 3 | **3** |
| 8. Code Quality | 17 | 9 | **26** |
| 9. Import Validation | - | 7 | **7** |
| 10. Test Quality | 7 | 2 | **9** |
| 11. Fixture Analysis | - | 14 | **14** |
| 12. Constants Enforcement | - | 1 | **1** |
| 13. External Tools (Ruff) | - | 1 | **1** |
| **TOTAL** | **47** | **60** | **107** |
---
## Table of Contents
1. [GPU Efficiency Checks](#1-gpu-efficiency-checks)
2. [Training Issues](#2-training-issues)
3. [Loss Function Checks](#3-loss-function-checks)
4. [Evaluation Mode Checks](#4-evaluation-mode-checks)
5. [PyTorch In-Place Operation Detection](#5-pytorch-in-place-operation-detection)
6. [Performance & CPU Bottleneck Checks](#6-performance--cpu-bottleneck-checks)
7. [File I/O Bottleneck Detection](#7-file-io-bottleneck-detection)
8. [Code Quality Checks](#8-code-quality-checks)
9. [Import Validation](#9-import-validation)
10. [Test Quality Checks](#10-test-quality-checks)
11. [Fixture Analysis](#11-fixture-analysis)
12. [Constants Enforcement](#12-constants-enforcement)
13. [External Tool Integration](#13-external-tool-integration)
---
## 1. GPU Efficiency Checks
**Category**: `gpu_efficiency`, `gpu_underutilization`, `gpu_synchronization`, `cpu_transfer`
Ensures code is optimized for the H100 GPU with 95 GB of memory.
### Pattern-Based Checks
| Check | Message | Category |
|-------|---------|----------|
| `device = "cpu"` | Forcing CPU device - use cuda | `gpu_efficiency` |
| `for...to(device)` | Moving to device in loop - batch transfer | `gpu_efficiency` |
| `.cuda()...for` | GPU transfer in loop - vectorize first | `gpu_efficiency` |
| `batch_size = N` (N < 100) | Small batch size - use >=256 for H100 | `gpu_efficiency` |
| `BATCH_SIZE = N` (N < 100) | Small batch size constant - use >=256 for H100 | `gpu_efficiency` |
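The pattern-based checks above are plain regex scans over the source text. A minimal sketch of how two of them might work, assuming a simplified rule set (the exact pattern strings and the `H100_MIN_BATCH` constant are illustrative, not the checker's real internals):

```python
import re

# Assumed recommended minimum for H100; the doc's tables use 256.
H100_MIN_BATCH = 256

def check_gpu_patterns(source: str) -> list[str]:
    """Return human-readable findings for two GPU-efficiency anti-patterns."""
    findings = []
    # Forcing CPU device: device = "cpu" or device = 'cpu'
    if re.search(r'device\s*=\s*["\']cpu["\']', source):
        findings.append("Forcing CPU device - use cuda")
    # Small batch size assignments, lowercase or constant form
    for m in re.finditer(r'(?:batch_size|BATCH_SIZE)\s*=\s*(\d+)', source):
        if int(m.group(1)) < 100:
            findings.append(
                f"Small batch size {m.group(1)} - use >={H100_MIN_BATCH} for H100"
            )
    return findings
```

For example, `check_gpu_patterns('device = "cpu"')` reports the CPU-device finding, while `check_gpu_patterns("batch_size = 512")` returns an empty list.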
### AST-Based Checks (GPUUtilizationAnalyzer)
| Check | Description | Category |
|-------|-------------|----------|
| Small batch size assignment | Batch sizes < 128 flagged for H100 | `gpu_underutilization` |
| `.item()` in loop | Forces GPU sync, causes stalls | `gpu_synchronization` |
| `.cpu()` transfer | Unnecessary CPU transfer detected | `cpu_transfer` |
| `.to("cpu")` | Transfer to CPU detected | `cpu_transfer` |
| No batch size config | Training file without batch size configuration | `gpu_underutilization` |
| DataLoader missing `pin_memory` | Should use pin_memory=True for faster GPU transfer | `gpu_efficiency` |
---
## 2. Training Issues
**Category**: `training_issues`
Detects common mistakes in training loops and gradient handling.
### Pattern-Based Checks
| Check | Message | Category |
|-------|---------|----------|
| Multiple `.backward()` | Multiple backward passes - usually wrong | `training_issues` |
| `optimizer.step()` without `zero_grad` | step() without zero_grad nearby | `training_issues` |
| `loss.item()...backward()` | Getting item before backward - breaks graph | `training_issues` |
| `model.train()...model.eval()` | Mixing train/eval in same scope | `training_issues` |
| `torch.no_grad...backward` | Backward in no_grad context | `training_issues` |
| `dropout...eval()` | Dropout active during eval | `training_issues` |
| `DataLoader...num_workers=0` | Single-threaded data loading - use workers | `training_issues` |
| `for...in dataset` (no DataLoader) | Iterating dataset without DataLoader | `training_issues` |
### AST-Based Checks (MLFunctionAnalyzer)
| Check | Description | Category |
|-------|-------------|----------|
| Training loop without `zero_grad()` | Missing gradient clearing | `training_issues` |
| Training loop without `step()` | Missing optimizer weight update | `training_issues` |
| `optimizer.step()` without nearby `zero_grad()` | step() called without clearing gradients | `training_issues` |
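As a sketch of how an AST-based analyzer in the spirit of MLFunctionAnalyzer could flag a training loop that calls `.backward()` but never `.zero_grad()` (the heuristic below is a deliberate simplification, not the analyzer's actual logic):

```python
import ast

def missing_zero_grad(source: str) -> bool:
    """Heuristic: flag a for-loop that calls .backward() but never .zero_grad()."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.For):
            # Collect attribute-call names inside the loop body
            calls = {
                n.func.attr
                for n in ast.walk(node)
                if isinstance(n, ast.Call) and isinstance(n.func, ast.Attribute)
            }
            if "backward" in calls and "zero_grad" not in calls:
                return True
    return False
```

A loop containing `loss.backward()` and `opt.step()` but no `opt.zero_grad()` would be flagged; inserting the `zero_grad()` call clears the finding.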
---
## 3. Loss Function Checks
**Category**: `loss_issues`, `semantic_error`, `loss_routing`
Ensures loss functions compute real values, not placeholders.
### Pattern-Based Checks
| Check | Message | Category |
|-------|---------|----------|
| `loss.*return N.N` | Hardcoded loss value - compute from inputs | `loss_issues` |
| `return N.N #.*loss` | Hardcoded loss value - compute from inputs | `loss_issues` |
| `loss = N.N` | Hardcoded loss assignment | `loss_issues` |
| `MSELoss.*+.*CrossEntropy` | Mixing incompatible losses | `loss_issues` |
| `torch.log()` without epsilon | Log without stability epsilon | `loss_issues` |
| `torch.sqrt()` without epsilon | Sqrt without stability epsilon | `loss_issues` |
| `1 / torch` | Division without stability check | `loss_issues` |
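The stability checks flag expressions like `torch.log(x)` where `x` can reach zero. The conventional fix they point toward is an additive epsilon, shown here in plain Python terms (the `EPSILON` value is an illustrative choice, not a value the checker mandates):

```python
import math

EPSILON = 1e-8  # small constant keeping log/sqrt away from their singularities

def stable_log(x: float) -> float:
    """log(x + eps) stays finite as x -> 0, unlike bare log(x)."""
    return math.log(x + EPSILON)
```

`stable_log(0.0)` returns a large negative number instead of raising a domain error, which is the behavior these checks are steering code toward.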
### AST-Based Checks (MLFunctionAnalyzer, LossRoutingAnalyzer)
| Check | Description | Category |
|-------|-------------|----------|
| Loss function returns constant | Loss function returns a literal value | `semantic_error` |
| Loss function ignores inputs | Critical parameters like `input`, `target`, `pred` not used | `semantic_error` |
| Wrong loss routing | Loss type doesn't match specification for view_kind | `loss_routing` |
---
## 4. Evaluation Mode Checks
**Category**: `eval_issues`
Ensures models are in proper evaluation mode during inference.
### Pattern-Based Checks
| Check | Message | Category |
|-------|---------|----------|
| `@torch.no_grad` not on eval function | no_grad decorator not on eval function | `eval_issues` |
| `accuracy =...item() / len` | Computing metrics with .item() in loop | `eval_issues` |
| `torch.mean...dim=None` | Mean without dimension - specify dim | `eval_issues` |
### AST-Based Checks (EvalModeChecker)
| Check | Description | Category |
|-------|-------------|----------|
| Model inference without `.eval()` | Model called without eval() in scope | `eval_issues` |
**Smart Detection Features:**
- Recognizes training contexts (won't flag during training)
- Tracks `.eval()` calls within function scope
- Excludes non-ML generators (password generators, data factories, etc.)
- Recognizes methods that handle eval internally (e.g., `SentenceTransformer.encode()`)
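A toy version of the scope-tracking idea behind EvalModeChecker might look like this, assuming a single known model variable name and none of the smart-detection exclusions listed above:

```python
import ast

def flags_missing_eval(source: str, model_name: str = "model") -> bool:
    """Heuristic: model is called directly in a function that never calls .eval()."""
    tree = ast.parse(source)
    for fn in ast.walk(tree):
        if not isinstance(fn, ast.FunctionDef):
            continue
        called_directly = any(
            isinstance(n, ast.Call)
            and isinstance(n.func, ast.Name)
            and n.func.id == model_name
            for n in ast.walk(fn)
        )
        sets_eval = any(
            isinstance(n, ast.Call)
            and isinstance(n.func, ast.Attribute)
            and n.func.attr == "eval"
            for n in ast.walk(fn)
        )
        if called_directly and not sets_eval:
            return True
    return False
```

A `def infer(model, x): return model(x)` body is flagged; adding `model.eval()` inside the function satisfies the check.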
---
## 5. PyTorch In-Place Operation Detection
**Category**: `pytorch_inplace`
Detects in-place tensor operations that may break autograd during backward passes.
### Detection Rules (InPlaceOperationAnalyzer)
| Rule | Check | Description |
|------|-------|-------------|
| **Rule 1** | Augmented assignments | `+=`, `-=`, `*=`, `/=`, `//=`, `**=` on tensors |
| **Rule 2** | Underscore methods | `.add_()`, `.mul_()`, `.zero_()`, `.fill_()`, `.clamp_()`, etc. |
| **Rule 3** | Buffer modification in `forward()` | `self.buffer[idx] = value` or `self.cache = value` |
| **Rule 4** | `.to()` without `.clone()` | In loss functions, `.to()` may return a view |
**Smart Detection Features:**
- Ignores operations inside `torch.no_grad()` and `torch.inference_mode()` contexts
- Recognizes `.to().clone()` chains as safe pattern
- Distinguishes integer/float counters from tensors
- Tracks variable initializations for type inference
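Rule 1 plus the type-inference feature can be sketched together: track which names were initialized from tensor factories, then flag augmented assignments on those names only (the `TENSOR_HINTS` set is an assumed, abbreviated stand-in for the analyzer's real type inference):

```python
import ast

# Assumed subset of tensor-producing factory calls
TENSOR_HINTS = {"torch.zeros", "torch.ones", "torch.randn", "torch.tensor"}

def find_augmented_tensor_ops(source: str) -> list[int]:
    """Line numbers of +=, *=, etc. applied to names initialized from torch factories."""
    tree = ast.parse(source)
    tensor_vars = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Call):
            if ast.unparse(node.value.func) in TENSOR_HINTS:
                for tgt in node.targets:
                    if isinstance(tgt, ast.Name):
                        tensor_vars.add(tgt.id)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.AugAssign)
        and isinstance(node.target, ast.Name)
        and node.target.id in tensor_vars
    ]
```

This is how an integer counter's `count += 1` escapes the check while `x += 1` on a tensor does not.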
---
## 6. Performance & CPU Bottleneck Checks
**Category**: `performance`, `cpu_bottleneck`
Detects CPU-bound operations that bottleneck GPU training.
### AST-Based Checks (DataPipelineAnalyzer, PairResolverAnalyzer)
| Check | Description | Category |
|-------|-------------|----------|
| Sequential data processing | Class processes data in loops without parallelization | `cpu_bottleneck` |
| DataLoader with `num_workers=0` | Single-threaded data loading | `cpu_bottleneck` |
| DataLoader without `num_workers` | Missing num_workers specification | `cpu_bottleneck` |
| PairResolver sequential processing | PairResolver uses sequential loops without multiprocessing | `performance` |
### AST-Based Checks (TensorOperationAnalyzer)
| Check | Description | Category |
|-------|-------------|----------|
| `.item()` in loop | GPU synchronization in loop | `performance` |
| Building tensor in loop | Creating tensors in loop - use vectorized ops | `performance` |
---
## 7. File I/O Bottleneck Detection
**Category**: `io_bottleneck`, `hot_path_io`, `loop_io`
Detects file I/O operations in performance-critical paths.
### Detected Operations (FileIOAnalyzer)
- **Built-in**: `open()`, `read()`, `write()`, `close()`
- **Path**: `read_text()`, `write_text()`, `read_bytes()`, `write_bytes()`
- **JSON**: `json.dump()`, `json.load()` (not `dumps`/`loads`)
- **Pickle**: `pickle.dump()`, `pickle.load()`
- **PyTorch**: `torch.save()`, `torch.load()`, `save_checkpoint()`
- **NumPy**: `np.save()`, `np.load()`, `np.savetxt()`, `np.loadtxt()`
- **Pandas**: `to_csv()`, `read_csv()`, `to_hdf()`, `read_hdf()`
### Context-Aware Detection
| Context | Severity | Category |
|---------|----------|----------|
| I/O in loop within hot path | Critical | `io_bottleneck` |
| I/O directly in hot path | Severe | `hot_path_io` |
| I/O in any loop | Warning | `loop_io` |
**Hot Path Functions**: `train`, `training_step`, `train_epoch`, `fit`, `forward`, `__call__`, `process_batch`, `collate_fn`, `__getitem__`, `__iter__`, `__next__`
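The hot-path detection combines the function-name list above with an I/O call set. A reduced sketch, assuming a much smaller `IO_CALLS` set than the operations actually detected:

```python
import ast

HOT_PATH_FUNCS = {"train", "training_step", "train_epoch", "fit", "forward",
                  "__call__", "process_batch", "collate_fn", "__getitem__",
                  "__iter__", "__next__"}
# Assumed abbreviated stand-in for the full detected-operations list
IO_CALLS = {"open", "load", "save", "read", "write"}

def hot_path_io(source: str) -> list[str]:
    """Names of hot-path functions whose bodies perform I/O calls."""
    tree = ast.parse(source)
    hits = []
    for fn in ast.walk(tree):
        if isinstance(fn, ast.FunctionDef) and fn.name in HOT_PATH_FUNCS:
            for node in ast.walk(fn):
                if isinstance(node, ast.Call):
                    name = (node.func.id if isinstance(node.func, ast.Name)
                            else getattr(node.func, "attr", ""))
                    if name in IO_CALLS:
                        hits.append(fn.name)
                        break
    return hits
```

A `torch.save()` inside `forward()` is reported; the same call in a non-hot-path helper is not.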
---
## 8. Code Quality Checks
**Category**: `placeholders`, `debug_code`, `aliasing`, `backward_compatibility`, `duplicate_class`, `naming_collision`, `dead_code`, `unreachable_code`, `magic_number`
### Pattern-Based Checks (Placeholders & Debug)
| Check | Message | Category |
|-------|---------|----------|
| `raise NotImplementedError` | Placeholder implementation | `placeholders` |
| `pass # TODO` | TODO with pass statement | `placeholders` |
| `# FIXME` | FIXME comment - address before deployment | `placeholders` |
| `# HACK` | HACK comment - fix properly | `placeholders` |
| `# XXX` | XXX marker - needs attention | `placeholders` |
| `torch.randn...#...placeholder` | Random tensor marked as placeholder | `placeholders` |
| `return torch.randn` | Returning random tensor - likely placeholder | `placeholders` |
| `# TODO` | TODO comment - complete before deployment | `placeholders` |
| `# Placeholder` | Placeholder comment found | `placeholders` |
| `# Simplified` | Simplified/temporary implementation | `placeholders` |
| `breakpoint()` | Breakpoint left in code | `debug_code` |
| `import pdb` | Debugger import left in code | `debug_code` |
| `print(.*debug` | Debug print statement | `debug_code` |
### Pattern-Based Checks (Aliasing & Compatibility)
| Check | Message | Category |
|-------|---------|----------|
| `= # alias` | Variable aliasing detected | `aliasing` |
| `ALIAS =` | ALIAS constant detected | `aliasing` |
| `# for backward compat` | Backward compatibility - upgrade instead | `backward_compatibility` |
| `# maintain backward compat` | Backward compatibility - upgrade instead | `backward_compatibility` |
### AST-Based Checks (Cross-File Analysis)
| Check | Description | Category |
|-------|-------------|----------|
| Duplicate class definitions | Same class defined in multiple files | `duplicate_class` |
| Naming collisions | Class name aliased to different class | `naming_collision` |
| Dead code detection | Methods never called (production or test-only) | `dead_code` |
| Unreachable code | Code after return or always-false conditions | `unreachable_code` |
| Duplicate conditions | Same condition in if-elif chain | `unreachable_code` |
### Magic Number Detection (MagicNumberAnalyzer)
| Context | Description | Category |
|---------|-------------|----------|
| ML layer definitions | `Linear`, `Conv2d`, `Dropout`, etc. | `magic_number` |
| Optimizer parameters | `lr`, `weight_decay`, `momentum` | `magic_number` |
| Comparison thresholds | Probability/score comparisons | `magic_number` |
| Scaling factors | Multipliers > 10 | `magic_number` |
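The optimizer-parameter context from the table can be sketched as an AST walk over keyword arguments, assuming the three parameter names listed above are the whole rule:

```python
import ast

OPTIMIZER_PARAMS = {"lr", "weight_decay", "momentum"}

def magic_optimizer_literals(source: str) -> list[str]:
    """Optimizer keyword arguments passed as bare numeric literals."""
    tree = ast.parse(source)
    return [
        kw.arg
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        for kw in node.keywords
        if kw.arg in OPTIMIZER_PARAMS
        and isinstance(kw.value, ast.Constant)
        and isinstance(kw.value.value, (int, float))
    ]
```

`SGD(p, lr=0.01, momentum=0.9)` yields two findings, while `SGD(p, lr=LEARNING_RATE)` passes because the value is a named constant, not a literal.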
---
## 9. Import Validation
**Category**: `import_location`, `import_error`
### Import Location Checks (ImportLocationAnalyzer)
| Check | Description | Category |
|-------|-------------|----------|
| Import inside function | Imports should be at module level | `import_location` |
| Import inside loop | Imports inside for/while loops | `import_location` |
| Import inside if block | Conditional imports (except platform checks) | `import_location` |
| Import inside with block | Imports in context managers | `import_location` |
**Exceptions:**
- Platform/version checks (`sys.platform`, `sys.version_info`)
- Optional import patterns (`try/except ImportError`)
- Lazy import patterns (`if var is None: import`)
- Test files (function-level imports allowed)
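The first check plus the `try/except ImportError` exception can be sketched like this (only the one exception is modeled; platform checks, lazy-import patterns, and the test-file allowance are omitted):

```python
import ast

def function_level_imports(source: str) -> list[str]:
    """Modules imported inside functions, excluding try/except ImportError fallbacks."""
    tree = ast.parse(source)
    optional = set()
    # First pass: collect imports guarded by an ImportError handler
    for node in ast.walk(tree):
        if isinstance(node, ast.Try) and any(
            isinstance(h.type, ast.Name) and h.type.id == "ImportError"
            for h in node.handlers
        ):
            for sub in ast.walk(node):
                if isinstance(sub, ast.Import):
                    optional.update(a.name for a in sub.names)
    # Second pass: report unguarded imports inside function bodies
    offenders = []
    for fn in ast.walk(tree):
        if isinstance(fn, ast.FunctionDef):
            for node in ast.walk(fn):
                if isinstance(node, ast.Import):
                    offenders += [a.name for a in node.names if a.name not in optional]
    return offenders
```

A bare `import json` inside a function is reported, but the same import wrapped in `try/except ImportError` is treated as an optional-dependency pattern.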
### Import Resolution Checks (ImportValidator)
| Check | Description | Category |
|-------|-------------|----------|
| Cannot import name | Name not found in module | `import_error` |
| Test file importing prod constants | Test-specific constants from prod module | `import_error` |
| Production file importing test module | Prod code importing from test modules | `import_error` |
---
## 10. Test Quality Checks
**Category**: `test_issues`, `test_quality`
### Pattern-Based Checks
| Check | Message | Category |
|-------|---------|----------|
| `from unittest.mock` | Mock forbidden - use real models on H100 | `test_issues` |
| `@patch(` | Patch decorator forbidden - use real implementations | `test_issues` |
| `MagicMock(` | MagicMock forbidden - use real objects | `test_issues` |
| `torch.randn(` | Random tensors in test - use real data | `test_issues` |
| `np.random.randn` | Unseeded random data in test | `test_issues` |
| `assert True` | Meaningless assertion | `test_issues` |
| `except...pass` | Swallowing exceptions in test | `test_issues` |
### AST-Based Checks (TestQualityAnalyzer)
| Check | Description | Category |
|-------|-------------|----------|
| Test without assertions | `test_*` function has no assert/pytest.raises/unittest.assert | `test_quality` |
| Assert True | `assert True` is meaningless | `test_quality` |
---
## 11. Fixture Analysis
**Category**: `unused_fixture`, `duplicate_setup`, `fixture_antipattern`, `forbidden_direct_creation`, `missing_fixture_imports`, `undefined_fixture`
### Cross-File Fixture Checks (FixtureAnalyzer)
| Check | Description | Category |
|-------|-------------|----------|
| Unused fixtures | Fixtures defined but never used | `unused_fixture` |
| Duplicate setup code | Similar setup code in multiple tests | `duplicate_setup` |
| Manual tempfile usage | Use pytest's `tmp_path` fixture instead | `fixture_antipattern` |
| Manual patch usage | Consider fixtures for repeated mocking | `fixture_antipattern` |
| Manual database connections | Use fixture with proper cleanup | `fixture_antipattern` |
| Manual client initialization | Use fixture for consistent setup | `fixture_antipattern` |
| try/finally cleanup | Use fixture with yield for cleanup | `fixture_antipattern` |
### Forbidden Direct Creation
| Class | Suggestion | Category |
|-------|------------|----------|
| `MeasurementSystem()` | Use untrained_model or fresh_model fixture | `forbidden_direct_creation` |
| `UnlabeledTrainer()` | Use trainer_factory or fresh_trainer fixture | `forbidden_direct_creation` |
| `ModelTrainer()` | Use trainer fixtures instead | `forbidden_direct_creation` |
| `NeuralNetwork()` | Use model fixtures instead | `forbidden_direct_creation` |
| `DeepLearningModel()` | Use model fixtures instead | `forbidden_direct_creation` |
### Fixture Import Checks
| Check | Description | Category |
|-------|-------------|----------|
| Missing fixture imports | Fixtures defined but not imported in conftest.py | `missing_fixture_imports` |
| Undefined fixtures | Fixtures used but not defined anywhere | `undefined_fixture` |
---
## 12. Constants Enforcement
**Category**: `constants_enforcement`
Ensures magic numbers are defined in designated constants files.
### Rules (ConstantsEnforcer)
| Rule | Description |
|------|-------------|
| Allowed files | `constants.py`, `test_constants.py` only |
| Universally allowed | -1, 0, 1, 2, True, False, None |
| Allowed contexts | `range()`, `enumerate()`, `len()`, array indexing, slicing |
| Requires constant | All other numeric literals |
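The allowed-files and universally-allowed-values rules can be sketched directly; the allowed-contexts rule (`range()`, `len()`, indexing, slicing) is omitted here to keep the sketch small:

```python
import ast

ALLOWED_VALUES = {-1, 0, 1, 2, True, False, None}

def magic_literals(source: str, filename: str = "model.py") -> list[object]:
    """Numeric literals outside the constants files that need a named constant."""
    if filename in ("constants.py", "test_constants.py"):
        return []  # constants files may contain any literal
    tree = ast.parse(source)
    return [
        node.value
        for node in ast.walk(tree)
        if isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
        and node.value not in ALLOWED_VALUES
    ]
```

`x = 0.5` in a model file is reported, while `x = 1` and any literal in `constants.py` pass.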
### Error Messages
| Context | Example Hint |
|---------|--------------|
| Floats 0-1 | `DROPOUT_RATE`, `LEARNING_RATE`, `PROBABILITY_THRESHOLD` |
| Large floats | `MAX_ITERATIONS`, `BUFFER_SIZE`, `TIMEOUT_MS` |
| Small floats | `EPSILON`, `THRESHOLD`, `SCALING_FACTOR` |
| Large integers | `BATCH_SIZE`, `NUM_EPOCHS`, `HIDDEN_DIM` |
| Negative integers | `ERROR_CODE`, `INVALID_INDEX`, `PENALTY` |
| Other integers | `MIN_THRESHOLD`, `MAX_RETRIES`, `NUM_LAYERS` |
---
## 13. External Tool Integration
### Ruff Code Quality (RuffChecker)
**Category**: `code_quality`
| Check | Description |
|-------|-------------|
| Ruff linting | Runs `ruff check .` on project |
| Linting errors | Reports count and suggests `ruff check . --fix` |
**Note**: Ruff is optional. If not installed, preflight passes this check.
---
## Skip Mechanism
All checks support skip comments for legitimate exceptions:
```python
# preflight-skip: reason for skipping (min 8 chars)
code_to_skip()
```
### File-Level Skips
```python
# preflight: skip-magic-numbers (grandfathered code)
```
---
## Configuration
Checks can be enabled/disabled via `preflight.json`:
```json
{
  "project_type": "neural_network",
  "source_dirs": ["src"],
  "min_batch_size": 256,
  "ml_checks": {
    "check_tensor_operations": true,
    "check_data_loaders": true,
    "detect_unreachable": true,
    "detect_magic_numbers": true,
    "detect_duplicate_classes": true,
    "detect_naming_collisions": true,
    "detect_dead_code": true,
    "detect_unused_fixtures": true,
    "detect_duplicate_setup": true,
    "detect_fixture_factories": true,
    "detect_fixture_antipatterns": true,
    "detect_forbidden_direct_creation": true,
    "detect_missing_fixture_imports": true,
    "suggest_fixtures": false
  }
}
```
---
## Category Summary
| Category | Count | Severity |
|----------|-------|----------|
| `placeholders` | 10 patterns | Critical |
| `debug_code` | 3 patterns | Critical |
| `loss_issues` | 7+ patterns | High |
| `training_issues` | 10+ patterns | High |
| `gpu_efficiency` | 6+ patterns | Medium |
| `eval_issues` | 4+ patterns | Medium |
| `pytorch_inplace` | 4 rules | High |
| `performance` | 4+ checks | Medium |
| `io_bottleneck` | Context-aware | Medium |
| `test_issues` | 7 patterns | Medium |
| `code_quality` | Multiple | Low-Medium |
| `import_error` | 3 checks | Medium |
| `fixture_*` | 6+ checks | Low-Medium |
| `constants_enforcement` | 1 analyzer | Low |
---
*This document is auto-generated from source code analysis. For implementation details, see the analyzer source files in `src/preflight/analyzers/`.*