Current Testing Status¶
This document describes the current state of test coverage in mcp-scan.
Executive Summary¶
| Metric | Value | Target |
|---|---|---|
| Test files | 18 | - |
| Total tests | ~150 | 200+ |
| Estimated coverage | ~70% | 80%+ |
| Benchmarks | 8 | 15+ |
Coverage by Package¶
High Coverage (>80%)¶
| Package | Test File | Tests | Status |
|---|---|---|---|
| `internal/pattern` | `engine_test.go` | 50+ | Excellent |
| `internal/callgraph` | `graph_test.go` | 15 | Good |
| `internal/ml` | `classifier_test.go` | 12 | Good |
| `internal/imports` | `resolver_test.go` | 12 | Good |
| `internal/typeinfo` | `types_test.go` | 8 | Good |
Medium Coverage (50-80%)¶
| Package | Test File | Tests | Gaps |
|---|---|---|---|
| `internal/parser` | `parser_test.go` | 15 | Missing Go parser |
| `internal/taint` | `engine_test.go` | 12 | Partial interprocedural |
| `pkg/scanner` | `scanner_test.go` | 8 | More integration |
| `internal/surface` | `surface_test.go` | 6 | More frameworks |
| `internal/catalog` | `catalog_test.go` | 6 | Validate completeness |
Low Coverage (<50%)¶
| Package | Test File | Tests | Needs |
|---|---|---|---|
| `internal/reporter` | `reporter_test.go` | 4 | SARIF validation |
| `internal/baseline` | `baseline_test.go` | 3 | Diff testing |
| `internal/msss` | `msss_test.go` | 2 | Scoring validation |
| `internal/config` | `config_test.go` | 3 | Edge cases |
Analysis by Component¶
1. Parser (internal/parser/)¶
Status: Partial
What is tested:
- Basic Python parsing
- Basic TypeScript parsing
- Function extraction
- MCP decorator detection
- Async functions
- Imports
- Classes and methods
- Typed parameters

What is missing:
- Go parser (not implemented)
- JavaScript parser (partial)
- Class inheritance
- Complex closures and lambdas
- Advanced type annotations
- Error recovery
Priority: High - Go parser needed
2. Pattern Engine (internal/pattern/)¶
Status: Excellent
What is tested:
| Detector | Tests | Status |
|---|---|---|
| PromptInjectionDetector | 8 | Complete |
| UnicodeDetector | 6 | Complete |
| HardcodedSecretDetector | 10 | Complete |
| DirectShellDetector | 8 | Complete |
| DangerousFunctionDetector | 6 | Complete |
| SQLConcatDetector | 8 | Complete |
| SecretLoggingDetector | 6 | Complete |
| WeakJWTDetector | 6 | Complete |
| OAuthStateDetector | 4 | Good |
| ToolShadowingDetector | 6 | Complete |
| UntrustedDependencyDetector | 6 | Complete |
| PathTraversalPatternDetector | 6 | Complete |
| InsecureCookieDetector | 6 | Complete |
| UnvalidatedURLDetector | 4 | Good |
What is missing:
- Detector combination tests
- Performance tests under load
- More edge cases
Priority: Low - well covered
3. Taint Engine (internal/taint/)¶
Status: Partial
What is tested:
- Direct source -> sink flow
- Propagation by assignment
- Propagation by concatenation
- Function calls (basic)
- Sanitizers

What is missing:
- Complete interprocedural analysis
- Flows through callbacks
- Flows through closures
- Propagation in data structures
- Context sensitivity
- Taint in external function returns
Priority: High - core analysis
4. Call Graph (internal/callgraph/)¶
Status: Good
What is tested:
- Graph creation and manipulation
- Adding/removing nodes and edges
- Finding callees/callers
- Reachability (BFS)
- Cycle detection
- Functions by file
- Statistics

What is missing:
- Callbacks and higher-order functions
- Virtual calls (class methods)
- Dynamic resolution
Priority: Medium
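To illustrate the reachability and cycle-detection cases listed above, here is a minimal table-driven sketch against a toy adjacency-list graph. The `graph` type and the `reachable` and `hasCycle` helpers are hypothetical stand-ins for illustration and are not the actual `internal/callgraph` API.

```go
package main

import "fmt"

// graph is a hypothetical adjacency list: caller -> callees.
type graph map[string][]string

// reachable reports whether target can be reached from start using BFS.
func reachable(g graph, start, target string) bool {
	visited := map[string]bool{start: true}
	queue := []string{start}
	for len(queue) > 0 {
		node := queue[0]
		queue = queue[1:]
		if node == target {
			return true
		}
		for _, callee := range g[node] {
			if !visited[callee] {
				visited[callee] = true
				queue = append(queue, callee)
			}
		}
	}
	return false
}

// hasCycle reports whether the graph contains a directed cycle,
// using a depth-first search with three node colors.
func hasCycle(g graph) bool {
	const (
		white = iota // unvisited
		gray         // on the current DFS path
		black        // fully explored
	)
	color := map[string]int{}
	var visit func(n string) bool
	visit = func(n string) bool {
		color[n] = gray
		for _, m := range g[n] {
			if color[m] == gray {
				return true // back edge: m is an ancestor on the current path
			}
			if color[m] == white && visit(m) {
				return true
			}
		}
		color[n] = black
		return false
	}
	for n := range g {
		if color[n] == white && visit(n) {
			return true
		}
	}
	return false
}

func main() {
	g := graph{
		"handler":  {"validate", "run"},
		"run":      {"exec"},
		"validate": {},
	}
	cases := []struct {
		from, to string
		want     bool
	}{
		{"handler", "exec", true},
		{"validate", "exec", false},
		{"exec", "handler", false},
	}
	for _, c := range cases {
		got := reachable(g, c.from, c.to)
		fmt.Printf("reachable(%s, %s) = %v (want %v)\n", c.from, c.to, got, c.want)
	}
	fmt.Println("cycle:", hasCycle(g))
}
```

The same table shape could be extended with callback and higher-order cases once the graph models them.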
5. ML Classifier (internal/ml/)¶
Status: Good
What is tested:
- Rule-based classifier
- Weighted classifier
- Ensemble classifier
- Loading weights from JSON
- Confidence levels
- JSON serialization
- Performance benchmarks

What is missing:
- Tests with large datasets
- Precision/recall validation
- Robustness tests (adversarial)
- Cross-validation
Priority: Medium
6. Scanner (pkg/scanner/)¶
Status: Partial
What is tested:
- Creation with options
- Single file scanning
- Directory scanning
- Exclusion patterns
- MCP surface detection
- Finding generation

What is missing:
- Complete E2E tests
- Deep vs fast mode
- Integration with all detectors
- Performance with large projects
- Multiple mixed languages
Priority: High
7. Reporter (internal/reporter/)¶
Status: Low
What is tested:
- Basic JSON generation
- Basic SARIF generation

What is missing:
- SARIF schema validation
- Evidence bundle
- Additional output formats
- Pretty printing
- Streaming output
Priority: Medium
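Until full schema validation exists, a structural smoke check can catch gross regressions in SARIF output. The sketch below is not schema validation: it only verifies a few properties that SARIF 2.1.0 requires (top-level `version`, a `runs` array, and `tool.driver.name` on each run). The `checkSARIF` helper and `sarifLog` type are hypothetical names, and treating a missing or null `runs` as an error is a simplifying assumption.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// sarifLog mirrors only the few SARIF 2.1.0 properties checked here.
type sarifLog struct {
	Version string `json:"version"`
	Runs    []struct {
		Tool struct {
			Driver struct {
				Name string `json:"name"`
			} `json:"driver"`
		} `json:"tool"`
	} `json:"runs"`
}

// checkSARIF is a hypothetical helper: a structural smoke check,
// not a replacement for validating against the published schema.
func checkSARIF(data []byte) error {
	var log sarifLog
	if err := json.Unmarshal(data, &log); err != nil {
		return fmt.Errorf("invalid JSON: %w", err)
	}
	if log.Version != "2.1.0" {
		return fmt.Errorf("unexpected version %q", log.Version)
	}
	if log.Runs == nil {
		// Assumption: this sketch rejects a missing or null runs property.
		return errors.New("missing runs array")
	}
	for i, run := range log.Runs {
		if run.Tool.Driver.Name == "" {
			return fmt.Errorf("runs[%d]: missing tool.driver.name", i)
		}
	}
	return nil
}

func main() {
	// Made-up report used only to exercise the check.
	report := []byte(`{"version":"2.1.0","runs":[{"tool":{"driver":{"name":"mcp-scan"}},"results":[]}]}`)
	if err := checkSARIF(report); err != nil {
		fmt.Println("SARIF check failed:", err)
		return
	}
	fmt.Println("SARIF check passed")
}
```

A test in `reporter_test.go` could feed the reporter's real output through a check like this before a proper schema validator is wired in.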
Fixtures¶
Status by Class¶
| Class | Vulnerable | Benign | Python | TS | Go | Status |
|---|---|---|---|---|---|---|
| A (RCE) | 4 | 2 | Yes | Yes | No | Good |
| B (Path) | 3 | 1 | Yes | Yes | No | Good |
| C (SSRF) | 2 | 1 | Yes | No | No | Partial |
| D (SQLi) | 4 | 2 | Yes | Yes | No | Good |
| E (Secrets) | 5 | 1 | Yes | Yes | No | Good |
| F (Auth) | 4 | 2 | Yes | Yes | No | Good |
| G (Poisoning) | 4 | 1 | Yes | No | No | Good |
| H (Decl/Behav) | 2 | 1 | Yes | No | No | Partial |
| I (Multi-tool) | 2 | 0 | Yes | No | No | Minimal |
| J (Memory) | 1 | 0 | Yes | No | No | Minimal |
| K (Task) | 1 | 0 | Yes | No | No | Minimal |
| L (Lifecycle) | 1 | 0 | Yes | No | No | Minimal |
| M (Network) | 1 | 0 | Yes | No | No | Minimal |
| N (Supply) | 2 | 1 | Yes | No | No | Partial |
Needs¶
- TypeScript: Add TS fixtures for classes C, G-N
- Go: Add Go fixtures for all classes
- Benign: Add more benign fixtures to validate FP rate
- Edge cases: Edge cases for each class
DVMCP¶
Detection Status¶
| Challenge | Expected | Detected | Status |
|---|---|---|---|
| 1 (Prompt Inj) | 1-2 | 2 | OK |
| 2 (Poisoning) | 1-2 | 1 | OK |
| 3 (Permissions) | 1 | 0 | MISSING |
| 4 (Rug Pull) | 1 | 0 | MISSING |
| 5 (Shadowing) | 1-2 | 1 | OK |
| 6 (Indirect) | 1-2 | 1 | PARTIAL |
| 7 (Token) | 2-3 | 2 | OK |
| 8 (Code Exec) | 2-3 | 2 | OK |
| 9 (Backdoor) | 2-3 | 1 | PARTIAL |
| 10 (Multi) | 5+ | 3 | PARTIAL |
Detection rate: ~65%
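The document does not state how this rate is computed. One scoring scheme that reproduces the figure from the table is OK = 1, PARTIAL = 0.5, MISSING = 0, averaged over the ten challenges. The weights in the sketch below are that assumption, not a documented formula.

```go
package main

import "fmt"

// detectionRate averages per-challenge scores and returns a percentage.
// The weights are an assumption chosen to match the reported ~65%.
func detectionRate(statuses []string) float64 {
	weights := map[string]float64{"OK": 1.0, "PARTIAL": 0.5, "MISSING": 0.0}
	if len(statuses) == 0 {
		return 0
	}
	total := 0.0
	for _, s := range statuses {
		total += weights[s]
	}
	return total / float64(len(statuses)) * 100
}

func main() {
	// Status column of the table above, challenges 1 through 10.
	statuses := []string{
		"OK", "OK", "MISSING", "MISSING", "OK",
		"PARTIAL", "OK", "OK", "PARTIAL", "PARTIAL",
	}
	fmt.Printf("detection rate: %.0f%%\n", detectionRate(statuses))
}
```

With five OK, three PARTIAL, and two MISSING entries, the sum is 5 + 1.5 = 6.5 out of 10.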
Benchmarks¶
Existing¶
| Benchmark | Package | ns/op | Memory |
|---|---|---|---|
| RuleBasedClassifier | ml | ~500 | ~1KB |
| FeatureExtraction | ml | ~200 | ~512B |
Needed¶
| Benchmark | Package | Reason |
|---|---|---|
| Parser_Python | parser | Baseline parsing |
| Parser_TypeScript | parser | Baseline parsing |
| TaintEngine_Simple | taint | Taint performance |
| TaintEngine_Complex | taint | Scaling |
| Scanner_SmallProject | scanner | E2E baseline |
| Scanner_MediumProject | scanner | E2E scaling |
| Scanner_LargeProject | scanner | Limits |
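The parser benchmarks listed above could follow the standard `testing.B` shape. In the sketch below, `parsePython` is a hypothetical placeholder that only counts `def ` occurrences so there is something to measure; it is not the real parser entry point. The benchmark is driven through `testing.Benchmark` so the example runs as a standalone program.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// parsePython is a hypothetical placeholder for the real parser.
// It counts "def " occurrences so the sketch has work to measure.
func parsePython(src string) int {
	return strings.Count(src, "def ")
}

// benchmarkParserPython has the shape a BenchmarkParser_Python function
// would take inside a _test.go file.
func benchmarkParserPython(b *testing.B) {
	src := strings.Repeat("def tool(x):\n    return x\n", 100)
	b.ReportAllocs()
	b.ResetTimer() // exclude input construction from the measurement
	for i := 0; i < b.N; i++ {
		if parsePython(src) != 100 {
			b.Fatal("unexpected function count")
		}
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	result := testing.Benchmark(benchmarkParserPython)
	fmt.Println(result.String(), result.MemString())
}
```

Inside the repository, the function would instead live in a `_test.go` file as `BenchmarkParser_Python` and run with `go test -bench=. -benchmem ./internal/parser/`.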
Improvement Plan¶
Short Term (Current sprint)¶
- Increase taint coverage:
  - Add interprocedural tests
  - Callback and closure tests
- Complete TypeScript fixtures:
  - Classes C, G, H, I
- Scanner E2E:
  - Complete integration tests
  - Validate all detectors
Medium Term (Next month)¶
- Go parser and fixtures:
  - Implement Go parser
  - Go fixtures for main classes
- Complete reporter:
  - SARIF schema validation
  - Evidence bundle testing
- Complete benchmarks:
  - Parser benchmarks
  - Scanner benchmarks
Long Term (Quarter)¶
- DVMCP 90% coverage:
  - Detect challenges 3, 4
  - Improve multi-vector detection
- ML validation:
  - Validation dataset
  - Precision/recall metrics
- Fuzzing:
  - Parser fuzzing
  - Detector fuzzing
Quality Metrics¶
Current¶
| Metric | Value | Target |
|---|---|---|
| Line coverage | ~70% | 80% |
| Tests per package | 2-50+ | 15+ |
| CI time | ~2min | <3min |
| Flaky tests | 0 | 0 |
Tracking¶
# Generate coverage report
go test -coverprofile=coverage.out ./...
go tool cover -func=coverage.out | tail -1
# Count tests
go test -v ./... 2>&1 | grep -c "=== RUN"
# Check for flaky tests (-count=1 bypasses the test cache so each run re-executes)
for i in {1..10}; do go test -count=1 ./... || echo "FLAKY"; done
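The coverage command above can be turned into an automated gate against the 80% target. The sketch below parses the `total:` line printed by `go tool cover -func` and compares it to the target. `parseTotalCoverage` is a hypothetical helper, and the sample input uses made-up values in the shape that command prints.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseTotalCoverage extracts the percentage from the "total:" line of
// `go tool cover -func` output. Hypothetical helper for illustration.
func parseTotalCoverage(output string) (float64, error) {
	scanner := bufio.NewScanner(strings.NewReader(output))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) >= 2 && fields[0] == "total:" {
			// The percentage is the last whitespace-separated field.
			pct := strings.TrimSuffix(fields[len(fields)-1], "%")
			return strconv.ParseFloat(pct, 64)
		}
	}
	return 0, fmt.Errorf("no total line found")
}

func main() {
	// Sample text shaped like `go tool cover -func` output; values are made up.
	sample := "example.com/pkg/a.go:10:\tRun\t\t75.0%\ntotal:\t\t\t(statements)\t70.0%\n"
	pct, err := parseTotalCoverage(sample)
	if err != nil {
		panic(err)
	}
	const target = 80.0 // line coverage target from the Quality Metrics table
	fmt.Printf("coverage %.1f%%, target %.0f%%, met: %v\n", pct, target, pct >= target)
}
```

In CI, the sample string would be replaced by the real command output, and the program would exit non-zero when the target is not met.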
Contributing¶
To improve test coverage:
- Identify gaps listed in this document
- Prioritize according to the per-component priorities above
- Create an issue describing the tests to add
- Implement the tests following existing patterns
- Update this document
Issue Template¶
## Add tests for [component]
### Context
[Description of identified gap]
### Tests to add
- [ ] Test 1: description
- [ ] Test 2: description
- [ ] Test 3: description
### Needed fixtures
- [ ] Vulnerable fixture
- [ ] Benign fixture
### Acceptance criteria
- Component coverage >= 80%
- All tests pass
- Documentation updated
Last updated: January 2026