Current Testing Status

This document describes the current state of test coverage in mcp-scan.


Executive Summary

| Metric | Value | Target |
| --- | --- | --- |
| Test files | 18 | - |
| Total tests | ~150 | 200+ |
| Estimated coverage | ~70% | 80%+ |
| Benchmarks | 8 | 15+ |

Coverage by Package

High Coverage (>80%)

| Package | Test File | Tests | Status |
| --- | --- | --- | --- |
| internal/pattern | engine_test.go | 50+ | Excellent |
| internal/callgraph | graph_test.go | 15 | Good |
| internal/ml | classifier_test.go | 12 | Good |
| internal/imports | resolver_test.go | 12 | Good |
| internal/typeinfo | types_test.go | 8 | Good |

Medium Coverage (50-80%)

| Package | Test File | Tests | Gaps |
| --- | --- | --- | --- |
| internal/parser | parser_test.go | 15 | Missing Go parser |
| internal/taint | engine_test.go | 12 | Partial interprocedural |
| pkg/scanner | scanner_test.go | 8 | More integration |
| internal/surface | surface_test.go | 6 | More frameworks |
| internal/catalog | catalog_test.go | 6 | Validate completeness |

Low Coverage (<50%)

| Package | Test File | Tests | Needs |
| --- | --- | --- | --- |
| internal/reporter | reporter_test.go | 4 | SARIF validation |
| internal/baseline | baseline_test.go | 3 | Diff testing |
| internal/msss | msss_test.go | 2 | Scoring validation |
| internal/config | config_test.go | 3 | Edge cases |

Analysis by Component

1. Parser (internal/parser/)

Status: Partial

What is tested:

- Basic Python parsing
- Basic TypeScript parsing
- Function extraction
- MCP decorator detection
- Async functions
- Imports
- Classes and methods
- Typed parameters

What is missing:

- Go parser (not implemented)
- JavaScript parser (partial)
- Class inheritance
- Complex closures and lambdas
- Advanced type annotations
- Error recovery

Priority: High - Go parser needed


2. Pattern Engine (internal/pattern/)

Status: Excellent

What is tested:

| Detector | Tests | Status |
| --- | --- | --- |
| PromptInjectionDetector | 8 | Complete |
| UnicodeDetector | 6 | Complete |
| HardcodedSecretDetector | 10 | Complete |
| DirectShellDetector | 8 | Complete |
| DangerousFunctionDetector | 6 | Complete |
| SQLConcatDetector | 8 | Complete |
| SecretLoggingDetector | 6 | Complete |
| WeakJWTDetector | 6 | Complete |
| OAuthStateDetector | 4 | Good |
| ToolShadowingDetector | 6 | Complete |
| UntrustedDependencyDetector | 6 | Complete |
| PathTraversalPatternDetector | 6 | Complete |
| InsecureCookieDetector | 6 | Complete |
| UnvalidatedURLDetector | 4 | Good |

What is missing:

- Detector combination tests
- Performance tests under load
- More edge cases

Priority: Low - well covered


3. Taint Engine (internal/taint/)

Status: Partial

What is tested:

- Direct source -> sink flow
- Propagation by assignment
- Propagation by concatenation
- Function calls (basic)
- Sanitizers

What is missing:

- Complete interprocedural analysis
- Flows through callbacks
- Flows through closures
- Propagation in data structures
- Context sensitivity
- Taint in external function returns

Priority: High - core analysis
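To make the tested flows concrete, here is a minimal sketch of source -> sink propagation through assignment, concatenation, and a sanitizer. It is a toy model, not the mcp-scan taint engine: the `Stmt` type, its `Kind` values, and `findTaintedSinks` are hypothetical names invented for illustration, and the analysis is strictly intraprocedural, which is exactly the limitation the missing items above describe.

```go
package main

// Stmt is a hypothetical, simplified statement: Dst receives data from Srcs.
// It is NOT the mcp-scan IR; it exists only for this sketch.
type Stmt struct {
	Kind string   // "assign" (also models concatenation), "sanitize", or "sink"
	Dst  string   // variable written (unused for "sink")
	Srcs []string // variables read
}

// findTaintedSinks walks the statements in order and returns the index of
// every sink that reads a tainted variable. sources lists the variables that
// are tainted on entry (for example, tool arguments).
func findTaintedSinks(sources []string, prog []Stmt) []int {
	tainted := map[string]bool{}
	for _, s := range sources {
		tainted[s] = true
	}
	var hits []int
	for i, st := range prog {
		anyTainted := false
		for _, src := range st.Srcs {
			if tainted[src] {
				anyTainted = true
			}
		}
		switch st.Kind {
		case "assign":
			// Assignment and concatenation propagate taint from any operand.
			tainted[st.Dst] = anyTainted
		case "sanitize":
			// The sanitizer's result is treated as clean regardless of input.
			tainted[st.Dst] = false
		case "sink":
			if anyTainted {
				hits = append(hits, i)
			}
		}
	}
	return hits
}
```

A table-driven test could feed one program per flow type (direct, assignment, concatenation, sanitized) and compare the returned sink indices. Callbacks, closures, and calls into other functions have no representation in this model, so covering them would need a richer statement type.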


4. Call Graph (internal/callgraph/)

Status: Good

What is tested:

- Graph creation and manipulation
- Adding/removing nodes and edges
- Finding callees/callers
- Reachability (BFS)
- Cycle detection
- Functions by file
- Statistics

What is missing:

- Callbacks and higher-order functions
- Virtual calls (class methods)
- Dynamic resolution

Priority: Medium
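The reachability and cycle-detection checks listed above can be illustrated with a small sketch. The `Graph` type and both functions below are hypothetical stand-ins written for this example; the real types in internal/callgraph may look quite different.

```go
package main

// Graph is a hypothetical adjacency-list call graph: caller -> callees.
type Graph map[string][]string

// Reachable reports whether target can be reached from start, using BFS.
// A node is considered reachable from itself.
func Reachable(g Graph, start, target string) bool {
	seen := map[string]bool{start: true}
	queue := []string{start}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		if cur == target {
			return true
		}
		for _, next := range g[cur] {
			if !seen[next] {
				seen[next] = true
				queue = append(queue, next)
			}
		}
	}
	return false
}

// HasCycle reports whether some call chain returns to a function that is
// still on the current DFS path (direct or mutual recursion).
func HasCycle(g Graph) bool {
	const (
		unvisited = iota
		onPath
		done
	)
	state := map[string]int{}
	var visit func(n string) bool
	visit = func(n string) bool {
		state[n] = onPath
		for _, next := range g[n] {
			if state[next] == onPath {
				return true
			}
			if state[next] == unvisited && visit(next) {
				return true
			}
		}
		state[n] = done
		return false
	}
	for n := range g {
		if state[n] == unvisited && visit(n) {
			return true
		}
	}
	return false
}
```

Edges in this model are static name-to-name calls. Callbacks, virtual calls, and dynamically resolved targets would need extra edges that a plain adjacency list cannot infer, which is why they appear under what is missing.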


5. ML Classifier (internal/ml/)

Status: Good

What is tested:

- Rule-based classifier
- Weighted classifier
- Ensemble classifier
- Loading weights from JSON
- Confidence levels
- JSON serialization
- Performance benchmarks

What is missing:

- Tests with large datasets
- Precision/recall validation
- Robustness tests (adversarial)
- Cross-validation

Priority: Medium
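As a sketch of what precision/recall validation could compute, the helper below derives both metrics from labeled predictions. The `Sample` type and `PrecisionRecall` function are hypothetical; they assume a binary vulnerable/benign label and do not reflect the classifier's actual output type.

```go
package main

// Sample pairs a ground-truth label with a classifier prediction.
// true means "vulnerable". Hypothetical type for illustration only.
type Sample struct {
	Actual    bool
	Predicted bool
}

// PrecisionRecall returns TP/(TP+FP) and TP/(TP+FN).
// A metric whose denominator is zero is reported as 0.
func PrecisionRecall(samples []Sample) (precision, recall float64) {
	var tp, fp, fn float64
	for _, s := range samples {
		switch {
		case s.Predicted && s.Actual:
			tp++
		case s.Predicted && !s.Actual:
			fp++
		case !s.Predicted && s.Actual:
			fn++
		}
	}
	if tp+fp > 0 {
		precision = tp / (tp + fp)
	}
	if tp+fn > 0 {
		recall = tp / (tp + fn)
	}
	return precision, recall
}
```

A validation test could run the classifier over a labeled dataset, collect one `Sample` per item, and fail if either metric drops below an agreed threshold.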


6. Scanner (pkg/scanner/)

Status: Partial

What is tested:

- Creation with options
- Single file scanning
- Directory scanning
- Exclusion patterns
- MCP surface detection
- Finding generation

What is missing:

- Complete E2E tests
- Deep vs fast mode
- Integration with all detectors
- Performance with large projects
- Multiple mixed languages

Priority: High


7. Reporter (internal/reporter/)

Status: Low

What is tested:

- Basic JSON generation
- Basic SARIF generation

What is missing:

- SARIF schema validation
- Evidence bundle
- Additional output formats
- Pretty printing
- Streaming output

Priority: Medium
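Short of full schema validation, a structural smoke check on the SARIF output is a possible first step. The sketch below is an assumption about how such a check might look, not existing reporter code. It verifies only the `version` field and `tool.driver.name` on each run, and it additionally expects at least one run, which is an assumption about scanner output rather than a schema rule.

```go
package main

import (
	"encoding/json"
	"errors"
)

// sarifLog models only the fields this check inspects, not the full schema.
type sarifLog struct {
	Version string `json:"version"`
	Runs    []struct {
		Tool struct {
			Driver struct {
				Name string `json:"name"`
			} `json:"driver"`
		} `json:"tool"`
	} `json:"runs"`
}

// CheckSARIF performs a minimal structural check on a SARIF 2.1.0 document.
// It is a smoke test, not a replacement for JSON Schema validation.
func CheckSARIF(data []byte) error {
	var doc sarifLog
	if err := json.Unmarshal(data, &doc); err != nil {
		return err
	}
	if doc.Version != "2.1.0" {
		return errors.New("sarif: version must be 2.1.0")
	}
	// Assumption of this sketch: a scan always emits at least one run.
	if len(doc.Runs) == 0 {
		return errors.New("sarif: expected at least one run")
	}
	for _, run := range doc.Runs {
		if run.Tool.Driver.Name == "" {
			return errors.New("sarif: run is missing tool.driver.name")
		}
	}
	return nil
}
```

A reporter test could pass the generated SARIF bytes to `CheckSARIF` and fail on any error. Validating against the published SARIF JSON Schema would catch far more, but requires a schema validation library.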


Fixtures

Status by Class

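| Class | Vulnerable | Benign | Python | TS | Go | Status |
| --- | --- | --- | --- | --- | --- | --- |
| A (RCE) | 4 | 2 | Yes | Yes | No | Good |
| B (Path) | 3 | 1 | Yes | Yes | No | Good |
| C (SSRF) | 2 | 1 | Yes | No | No | Partial |
| D (SQLi) | 4 | 2 | Yes | Yes | No | Good |
| E (Secrets) | 5 | 1 | Yes | Yes | No | Good |
| F (Auth) | 4 | 2 | Yes | Yes | No | Good |
| G (Poisoning) | 4 | 1 | Yes | No | No | Good |
| H (Decl/Behav) | 2 | 1 | Yes | No | No | Partial |
| I (Multi-tool) | 2 | 0 | Yes | No | No | Minimal |
| J (Memory) | 1 | 0 | Yes | No | No | Minimal |
| K (Task) | 1 | 0 | Yes | No | No | Minimal |
| L (Lifecycle) | 1 | 0 | Yes | No | No | Minimal |
| M (Network) | 1 | 0 | Yes | No | No | Minimal |
| N (Supply) | 2 | 1 | Yes | No | No | Partial |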

Needs

  1. TypeScript: Add TS fixtures for classes C, G-N
  2. Go: Add Go fixtures for all classes
  3. Benign: Add more benign fixtures to validate FP rate
  4. Edge cases: Edge cases for each class
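One way to use vulnerable and benign fixtures together is a single walker that expects at least one finding for every vulnerable file and none for every benign file. The sketch below assumes a hypothetical directory layout with `vulnerable/` and `benign/` subdirectories. `CheckFixtures`, `toyScan`, and `sampleFixtures` are invented for this example, and `toyScan` is only a stand-in for the real scanner.

```go
package main

import (
	"io/fs"
	"strings"
	"testing/fstest"
)

// CheckFixtures applies scan to every file in fsys and returns one message
// per violated expectation: vulnerable fixtures must produce a finding,
// benign fixtures must not (a finding there is a false positive).
func CheckFixtures(fsys fs.FS, scan func(src []byte) int) ([]string, error) {
	var failures []string
	err := fs.WalkDir(fsys, ".", func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return err
		}
		src, err := fs.ReadFile(fsys, path)
		if err != nil {
			return err
		}
		n := scan(src)
		switch {
		case strings.Contains(path, "vulnerable/") && n == 0:
			failures = append(failures, path+": expected a finding, got none")
		case strings.Contains(path, "benign/") && n > 0:
			failures = append(failures, path+": unexpected finding (false positive)")
		}
		return nil
	})
	return failures, err
}

// toyScan is a placeholder detector: it counts os.system( calls.
func toyScan(src []byte) int {
	return strings.Count(string(src), "os.system(")
}

// sampleFixtures builds an in-memory fixture tree with a hypothetical layout.
func sampleFixtures() fs.FS {
	return fstest.MapFS{
		"class_a/vulnerable/shell.py": {Data: []byte(`os.system(cmd)`)},
		"class_a/benign/shell.py":     {Data: []byte(`subprocess.run(["ls"])`)},
	}
}
```

Because the walker takes an `fs.FS`, a real test could pass `os.DirFS` pointing at the fixture directory instead of the in-memory tree. Classes with zero benign fixtures (I through M in the table above) would pass trivially on the benign side, which is why adding benign fixtures matters for measuring the false-positive rate.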

DVMCP

Detection Status

| Challenge | Expected | Detected | Status |
| --- | --- | --- | --- |
| 1 (Prompt Inj) | 1-2 | 2 | OK |
| 2 (Poisoning) | 1-2 | 1 | OK |
| 3 (Permissions) | 1 | 0 | MISSING |
| 4 (Rug Pull) | 1 | 0 | MISSING |
| 5 (Shadowing) | 1-2 | 1 | OK |
| 6 (Indirect) | 1-2 | 1 | PARTIAL |
| 7 (Token) | 2-3 | 2 | OK |
| 8 (Code Exec) | 2-3 | 2 | OK |
| 9 (Backdoor) | 2-3 | 1 | PARTIAL |
| 10 (Multi) | 5+ | 3 | PARTIAL |

Detection rate: ~65%


Benchmarks

Existing

| Benchmark | Package | ns/op | Memory |
| --- | --- | --- | --- |
| RuleBasedClassifier | ml | ~500 | ~1KB |
| FeatureExtraction | ml | ~200 | ~512B |

Needed

| Benchmark | Package | Reason |
| --- | --- | --- |
| Parser_Python | parser | Baseline parsing |
| Parser_TypeScript | parser | Baseline parsing |
| TaintEngine_Simple | taint | Taint performance |
| TaintEngine_Complex | taint | Scaling |
| Scanner_SmallProject | scanner | E2E baseline |
| Scanner_MediumProject | scanner | E2E scaling |
| Scanner_LargeProject | scanner | Limits |
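A parser benchmark such as Parser_Python could follow the standard `testing.B` pattern sketched below. The `countDefs` function is a deliberately trivial stand-in, since the real parser API is not shown in this document. The input size and the `sink` variable are also choices made for this example.

```go
package main

import (
	"strings"
	"testing"
)

// countDefs is a placeholder for the real parser entry point: it counts
// lines that start a Python function definition with "def ".
func countDefs(src string) int {
	n := 0
	for _, line := range strings.Split(src, "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), "def ") {
			n++
		}
	}
	return n
}

// benchSrc is synthetic input: 1000 small Python functions.
var benchSrc = strings.Repeat("def tool(x):\n    return x\n\n", 1000)

// sink receives the result so the compiler cannot discard the call.
var sink int

// BenchmarkParser_Python measures one pass over benchSrc per iteration.
func BenchmarkParser_Python(b *testing.B) {
	b.ReportAllocs()                 // report allocations per operation
	b.SetBytes(int64(len(benchSrc))) // lets the tool report throughput
	for i := 0; i < b.N; i++ {
		sink = countDefs(benchSrc)
	}
}
```

Placed in a `_test.go` file, this would run with `go test -bench=Parser -benchmem` in the package directory. The scaling benchmarks in the table could reuse the same shape with inputs of increasing size.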

Improvement Plan

Short Term (Current sprint)

  1. Increase taint coverage:
     - Add interprocedural tests
     - Callback and closure tests
  2. Complete TypeScript fixtures:
     - Classes C, G, H, I
  3. Scanner E2E:
     - Complete integration tests
     - Validate all detectors

Medium Term (Next month)

  1. Go parser and fixtures:
     - Implement Go parser
     - Go fixtures for main classes
  2. Complete reporter:
     - SARIF schema validation
     - Evidence bundle testing
  3. Complete benchmarks:
     - Parser benchmarks
     - Scanner benchmarks

Long Term (Quarter)

  1. DVMCP 90% coverage:
     - Detect challenges 3, 4
     - Improve multi-vector detection
  2. ML validation:
     - Validation dataset
     - Precision/recall metrics
  3. Fuzzing:
     - Parser fuzzing
     - Detector fuzzing
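For the fuzzing item, Go's native fuzzing (available since Go 1.18) is one option. The sketch below fuzzes a hypothetical helper, `extractName`, which stands in for a real parser entry point. The property checked is simply that the function never panics and never returns an empty or malformed name when it reports success.

```go
package main

import (
	"strings"
	"testing"
)

// extractName is a hypothetical parser helper: for input starting with
// "def ", it returns the text between "def " and the first "(".
func extractName(src string) (string, bool) {
	if !strings.HasPrefix(src, "def ") {
		return "", false
	}
	rest := src[len("def "):]
	end := strings.IndexByte(rest, '(')
	if end <= 0 {
		return "", false
	}
	return rest[:end], true
}

// FuzzExtractName feeds arbitrary strings to extractName. Any panic fails
// the run automatically; the body adds two explicit invariants.
func FuzzExtractName(f *testing.F) {
	// Seed corpus: a normal case plus two degenerate inputs.
	f.Add("def tool(x):")
	f.Add("def (")
	f.Add("")
	f.Fuzz(func(t *testing.T, src string) {
		name, ok := extractName(src)
		if ok && name == "" {
			t.Errorf("ok with empty name for input %q", src)
		}
		if ok && strings.Contains(name, "(") {
			t.Errorf("name %q contains a parenthesis", name)
		}
	})
}
```

In a `_test.go` file this runs as a regular test over the seed corpus with `go test`, and as a fuzzer with `go test -fuzz=FuzzExtractName`. Detector fuzzing could use the same structure with the detector's input type.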

Quality Metrics

Current

| Metric | Value | Target |
| --- | --- | --- |
| Line coverage | ~70% | 80%+ |
| Tests per package | 2-50+ | 15+ |
| CI time | ~2min | <3min |
| Flaky tests | 0 | 0 |

Tracking

# Generate coverage report
go test -coverprofile=coverage.out ./...
go tool cover -func=coverage.out | tail -1

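# Count tests (the count includes subtests)
go test -v ./... 2>&1 | grep -c "=== RUN"

# Check for flaky tests (-count=1 bypasses the test result cache, so each pass really reruns)
for i in {1..10}; do go test -count=1 ./... || echo "FLAKY"; done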
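To turn the coverage command above into an automated gate, a small helper could parse the `total:` line of the `go tool cover -func` report and compare it with the 80% target. The sketch below is one possible approach. `TotalCoverage` is a hypothetical name, and the parsing assumes the total line ends with a percentage such as `71.5%`.

```go
package main

import (
	"errors"
	"strconv"
	"strings"
)

// TotalCoverage extracts the overall percentage from the text printed by
// `go tool cover -func=coverage.out`. It looks for the line whose first
// field is "total:" and parses the last field, for example "71.5%".
func TotalCoverage(report string) (float64, error) {
	for _, line := range strings.Split(report, "\n") {
		fields := strings.Fields(line)
		if len(fields) >= 2 && fields[0] == "total:" {
			pct := strings.TrimSuffix(fields[len(fields)-1], "%")
			return strconv.ParseFloat(pct, 64)
		}
	}
	return 0, errors.New("no total line found in coverage report")
}
```

A CI step could pipe the report into a program that calls `TotalCoverage` and exits non-zero when the value is below the target.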

Contributing

To improve test coverage:

  1. Identify gaps using this document
  2. Prioritize according to the priority listed for each component
  3. Create an issue describing the tests to add
  4. Implement the tests following existing patterns
  5. Update this document

Issue Template

## Add tests for [component]

### Context
[Description of identified gap]

### Tests to add
- [ ] Test 1: description
- [ ] Test 2: description
- [ ] Test 3: description

### Needed fixtures
- [ ] Vulnerable fixture
- [ ] Benign fixture

### Acceptance criteria
- Component coverage >= 80%
- All tests pass
- Documentation updated

Last updated: January 2026