# Detector Worker - Architecture & Workflow Documentation

## Table of Contents

1. [Architecture Overview](#architecture-overview)
2. [Module Structure](#module-structure)
3. [System Startup Flow](#system-startup-flow)
4. [Protocol Compliance (worker.md Implementation)](#protocol-compliance-workermd-implementation)
5. [WebSocket Communication Flow](#websocket-communication-flow)
6. [Detection Pipeline Flow](#detection-pipeline-flow)
7. [Data Storage Flow](#data-storage-flow)
8. [Error Handling & Recovery](#error-handling--recovery)
9. [Testing Architecture](#testing-architecture)
10. [Development Workflow](#development-workflow)

## Architecture Overview
|
|
|
|
The Detector Worker has been refactored from a monolithic 4,115-line application into a modular, maintainable system with clear separation of concerns. The new architecture follows modern software engineering principles including dependency injection, thread safety, and comprehensive testing.
|
|
|
|
### High-Level System Diagram
|
|
|
|
```mermaid
graph TB
    Client[WebSocket Client] --> WS[WebSocket Handler]
    WS --> MP[Message Processor]
    MP --> SM[Stream Manager]
    MP --> MM[Model Manager]
    MP --> PE[Pipeline Executor]

    SM --> FR[Frame Reader]
    FR --> YD[YOLO Detector]
    YD --> TM[Tracking Manager]
    TM --> SV[Stability Validator]

    PE --> AE[Action Executor]
    PE --> FM[Field Mapper]
    AE --> DM[Database Manager]
    AE --> RC[Redis Client]
    AE --> SC[Session Cache]

    subgraph "Singleton Managers"
        MSM[Model State Manager]
        SSM[Stream State Manager]
        SeSM[Session State Manager]
        CSM[Cache State Manager]
        CaSM[Camera State Manager]
        PSM[Pipeline State Manager]
    end

    subgraph "Storage Layer"
        PostgreSQL[(PostgreSQL)]
        Redis[(Redis)]
        LocalCache[Local Cache]
    end

    DM --> PostgreSQL
    RC --> Redis
    SC --> LocalCache
```
|
|
|
|
### Key Architectural Principles
|
|
|
|
1. **Modular Design**: Each module has a single responsibility
|
|
2. **Dependency Injection**: IoC container manages object dependencies
|
|
3. **Thread Safety**: Singleton managers with proper locking
|
|
4. **Async/Await**: Non-blocking I/O operations throughout
|
|
5. **Type Safety**: Comprehensive type hints and validation
|
|
6. **Error Resilience**: Proper exception handling and recovery
|
|
|
|
## Module Structure
|
|
|
|
### Core Modules (`detector_worker/core/`)
|
|
|
|
#### `config.py` - Configuration Management
|
|
```python
|
|
# Central configuration with multi-source loading
|
|
Configuration()
|
|
├── load_from_file(path) # JSON/YAML config files
|
|
├── load_from_env() # Environment variables
|
|
└── validate_config() # Configuration validation
|
|
|
|
ConfigurationProvider(ABC) # Abstract base for config sources
|
|
├── JSONConfigProvider
|
|
├── YAMLConfigProvider
|
|
└── EnvironmentConfigProvider
|
|
```
|
|
|
|
#### `dependency_injection.py` - IoC Container
|
|
```python
|
|
ServiceContainer()
|
|
├── register_singleton() # Single instance per container
|
|
├── register_transient() # New instance per request
|
|
├── register_scoped() # Instance per scope
|
|
└── resolve() # Dependency resolution with auto-injection
|
|
```
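
The container is what lets handlers receive a `StreamManager` or `ModelManager` without constructing them directly. The sketch below illustrates the intended usage; the method names follow the listing above, while the constructor auto-injection shown in `resolve()` is an assumption about the container's behaviour, not a verbatim excerpt.

```python
import inspect
from typing import Any, Callable, Dict, Type


class ServiceContainer:
    """Illustrative IoC container; the real one lives in core/dependency_injection.py."""

    def __init__(self) -> None:
        self._factories: Dict[Type, Callable[..., Any]] = {}
        self._singleton_types: set = set()
        self._instances: Dict[Type, Any] = {}

    def register_singleton(self, interface: Type, factory: Callable[..., Any]) -> None:
        # One shared instance per container, created lazily on first resolve().
        self._factories[interface] = factory
        self._singleton_types.add(interface)

    def register_transient(self, interface: Type, factory: Callable[..., Any]) -> None:
        # A fresh instance is built on every resolve() call.
        self._factories[interface] = factory

    def resolve(self, interface: Type) -> Any:
        if interface in self._instances:
            return self._instances[interface]
        factory = self._factories[interface]
        # Auto-inject constructor parameters whose type annotation is registered.
        kwargs = {
            name: self.resolve(param.annotation)
            for name, param in inspect.signature(factory).parameters.items()
            if param.annotation in self._factories
        }
        instance = factory(**kwargs)
        if interface in self._singleton_types:
            self._instances[interface] = instance
        return instance


class Config:
    def __init__(self) -> None:
        self.target_fps = 10


class StreamService:
    def __init__(self, config: Config) -> None:
        self.config = config


container = ServiceContainer()
container.register_singleton(Config, Config)
container.register_transient(StreamService, StreamService)
print(container.resolve(StreamService).config.target_fps)  # -> 10
```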
|
|
|
|
#### `singleton_managers.py` - Thread-Safe State Management
|
|
```python
|
|
# Six singleton managers replacing global dictionaries
|
|
ModelStateManager() # Model loading states
|
|
StreamStateManager() # Active stream connections
|
|
SessionStateManager() # Client session tracking
|
|
CacheStateManager() # Cache state management
|
|
CameraStateManager() # Camera connection states
|
|
PipelineStateManager() # Pipeline execution states
|
|
```
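
Each of these managers replaces a module-level dictionary with a lock-guarded singleton, so WebSocket, stream, and pipeline threads all observe one consistent state store. The snippet below is an illustrative reduction of that pattern; the `set_state`/`get_state` methods are placeholders rather than the managers' real API.

```python
import threading
from typing import Any, Dict, Optional


class StreamStateManager:
    """Illustrative thread-safe singleton; the real managers expose richer APIs."""

    _instance: Optional["StreamStateManager"] = None
    _instance_lock = threading.Lock()

    def __new__(cls) -> "StreamStateManager":
        # Double-checked locking: skip the lock on the common, already-created path.
        if cls._instance is None:
            with cls._instance_lock:
                if cls._instance is None:
                    instance = super().__new__(cls)
                    instance._state = {}  # type: Dict[str, Any]
                    instance._state_lock = threading.RLock()
                    cls._instance = instance
        return cls._instance

    def set_state(self, camera_id: str, value: Any) -> None:
        with self._state_lock:
            self._state[camera_id] = value

    def get_state(self, camera_id: str) -> Any:
        with self._state_lock:
            return self._state.get(camera_id)


# Every call site gets the same instance, replacing the old module-level globals.
assert StreamStateManager() is StreamStateManager()
```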
|
|
|
|
#### `exceptions.py` - Exception Hierarchy
|
|
```python
|
|
DetectorWorkerError # Base exception
|
|
├── ConfigurationError # Configuration issues
|
|
├── StreamError # Stream-related errors
|
|
├── ModelError # Model loading/inference errors
|
|
├── PipelineError # Pipeline execution errors
|
|
├── DatabaseError # Database operation errors
|
|
├── RedisError # Redis operation errors
|
|
└── MessageProcessingError # WebSocket message errors
|
|
```
|
|
|
|
### Communication Layer (`detector_worker/communication/`)
|
|
|
|
#### `websocket_handler.py` - WebSocket Management
|
|
```python
|
|
WebSocketHandler()
|
|
├── handle_websocket() # Main WebSocket connection handler
|
|
├── _heartbeat_loop() # Keep-alive mechanism
|
|
└── _cleanup_connections() # Connection cleanup
|
|
|
|
ConnectionManager()
|
|
├── add_connection() # Register new client
|
|
├── remove_connection() # Cleanup disconnected client
|
|
├── broadcast() # Send to all clients
|
|
└── broadcast_to_subscription() # Send to specific subscription
|
|
|
|
WebSocketConnection() # Per-client connection wrapper
|
|
├── accept() # Accept WebSocket connection
|
|
├── send_message() # Send JSON/text message
|
|
├── receive_message() # Receive and parse message
|
|
└── ping() # Send keep-alive ping
|
|
```
|
|
|
|
#### `message_processor.py` - Message Processing Pipeline
|
|
```python
|
|
MessageProcessor()
|
|
├── parse_message() # Message parsing and validation
|
|
├── _validate_set_subscription_list() # Validate declarative subscriptions
|
|
├── _validate_set_session() # Session management validation
|
|
├── _validate_patch_session() # Patch session validation
|
|
├── _validate_progression_stage() # Progression stage validation
|
|
└── _validate_patch_session_result() # Backend response validation
|
|
|
|
MessageType(Enum) # Protocol-compliant message types per worker.md
|
|
├── SET_SUBSCRIPTION_LIST # Primary declarative subscription command
|
|
├── REQUEST_STATE # System state requests
|
|
├── SET_SESSION_ID # Session association (supports null clearing)
|
|
├── PATCH_SESSION # Session modification requests
|
|
├── SET_PROGRESSION_STAGE # Real-time progression updates
|
|
├── PATCH_SESSION_RESULT # Backend responses to patch requests
|
|
├── IMAGE_DETECTION # Detection results (worker->backend)
|
|
└── STATE_REPORT # Heartbeat messages (worker->backend)
|
|
|
|
# Protocol-Compliant Payload Classes
|
|
SubscriptionObject() # Individual subscription per worker.md specification
|
|
├── subscription_identifier # Format: "displayId;cameraId"
|
|
├── rtsp_url # Required RTSP stream URL
|
|
├── model_url # Required fresh model URL (1-hour TTL)
|
|
├── model_id, model_name # Model identification
|
|
├── snapshot_url # Optional HTTP snapshot URL
|
|
├── snapshot_interval # Required if snapshot_url provided
|
|
└── crop_x1/y1/x2/y2 # Optional crop coordinates
|
|
|
|
SetSubscriptionListPayload() # Declarative subscription command payload
|
|
└── subscriptions: List[SubscriptionObject] # Complete desired state
|
|
|
|
SessionPayload() # Session management payload
|
|
├── display_identifier # Target display ID
|
|
├── session_id # Session ID (can be null for clearing)
|
|
└── data # Optional session patch data
|
|
|
|
ProgressionPayload() # Progression stage payload
|
|
├── display_identifier # Target display ID
|
|
└── progression_stage # Stage: welcome|car_fueling|car_waitpayment|null
|
|
```
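
The `subscription_identifier` carries both the display and the camera, which is what lets session commands (addressed to a display) fan out to the right camera streams. Below is a minimal sketch of the payload shape and the identifier split, with field names taken from the listing above; the dataclass itself is illustrative, not the module's exact definition.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SubscriptionObject:
    # Field names mirror the protocol payload described above.
    subscription_identifier: str          # "displayId;cameraId"
    rtsp_url: str
    model_url: str
    model_id: int
    model_name: str
    snapshot_url: Optional[str] = None
    snapshot_interval: Optional[int] = None

    def split_identifier(self) -> Tuple[str, str]:
        """Return (display_id, camera_id) from 'displayId;cameraId'."""
        display_id, _, camera_id = self.subscription_identifier.partition(";")
        return display_id, camera_id or display_id


sub = SubscriptionObject(
    subscription_identifier="display-001;cam-001",
    rtsp_url="rtsp://192.168.1.100/stream1",
    model_url="http://storage/models/vehicle-id.mpta?token=fresh-token",
    model_id=201,
    model_name="Vehicle Identification",
)
print(sub.split_identifier())  # ('display-001', 'cam-001')
```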
|
|
|
|
### Stream Management (`detector_worker/streams/`)
|
|
|
|
#### `stream_manager.py` - Stream Lifecycle Management
|
|
```python
|
|
StreamManager()
|
|
├── create_stream() # Create RTSP/HTTP stream
|
|
├── remove_stream() # Stop and cleanup stream
|
|
├── get_latest_frame() # Get current frame
|
|
├── reconnect_stream() # Handle connection failures
|
|
└── stop_all_streams() # Cleanup all streams
|
|
|
|
StreamConfig() # Stream configuration
|
|
├── stream_url # RTSP/HTTP URL
|
|
├── stream_type # "rtsp" or "http_snapshot"
|
|
├── target_fps # Target frame rate
|
|
└── reconnect_interval # Reconnection delay
|
|
|
|
StreamReader() # Individual stream handler
|
|
├── start() # Start frame capture
|
|
├── stop() # Stop and cleanup
|
|
├── get_latest_frame() # Get most recent frame
|
|
└── _reader_loop() # Main capture loop
|
|
```
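
A `StreamReader` holds only the most recent frame behind a lock, so `get_latest_frame()` returns immediately instead of queueing behind decoding. A compressed sketch of that pattern follows; the synthetic frame source stands in for the RTSP/HTTP capture performed by the real `_reader_loop()`, and the fixed 10 FPS pacing is an assumption.

```python
import threading
import time
from typing import Optional

import numpy as np


class StreamReader:
    """Sketch of the latest-frame pattern used by the stream reader loop."""

    def __init__(self) -> None:
        self._latest_frame: Optional[np.ndarray] = None
        self._frame_lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._reader_loop, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def get_latest_frame(self) -> Optional[np.ndarray]:
        with self._frame_lock:
            return None if self._latest_frame is None else self._latest_frame.copy()

    def _reader_loop(self) -> None:
        while not self._stop.is_set():
            # In the real reader this is cv2.VideoCapture.read() or an HTTP snapshot.
            frame = np.zeros((720, 1280, 3), dtype=np.uint8)
            with self._frame_lock:
                self._latest_frame = frame  # keep only the newest frame
            time.sleep(1 / 10)              # pace to the target FPS


reader = StreamReader()
reader.start()
time.sleep(0.3)
frame = reader.get_latest_frame()
print(None if frame is None else frame.shape)  # (720, 1280, 3)
reader.stop()
```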
|
|
|
|
#### `frame_reader.py` - Frame Capture Implementation
|
|
```python
|
|
RTSPReader() # RTSP stream handler
|
|
├── connect() # Establish RTSP connection
|
|
├── read_frame() # Capture single frame
|
|
└── handle_reconnection() # Connection recovery
|
|
|
|
HTTPSnapshotReader() # HTTP snapshot handler
|
|
├── fetch_snapshot() # HTTP GET request
|
|
├── decode_image() # Image decoding
|
|
└── schedule_next() # Schedule next capture
|
|
```
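
For snapshot cameras the capture path is an HTTP GET plus an image decode on a timer. A minimal sketch, assuming the `requests` package and OpenCV for decoding (the real `HTTPSnapshotReader` adds authentication, retries, and `schedule_next()` pacing):

```python
from typing import Optional

import cv2
import numpy as np
import requests


def fetch_snapshot(url: str, timeout: float = 5.0) -> Optional[np.ndarray]:
    """GET a JPEG/PNG snapshot and decode it into a BGR numpy frame."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
    except requests.RequestException:
        return None  # caller treats None as "camera offline" and retries later
    buffer = np.frombuffer(response.content, dtype=np.uint8)
    return cv2.imdecode(buffer, cv2.IMREAD_COLOR)  # None if payload was not an image


# frame = fetch_snapshot("http://camera.local/snapshot.jpg")
```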
|
|
|
|
### Detection System (`detector_worker/detection/`)
|
|
|
|
#### `yolo_detector.py` - Object Detection
|
|
```python
|
|
YOLODetector()
|
|
├── load_model() # Load YOLO model
|
|
├── detect() # Run inference
|
|
├── _preprocess_frame() # Input preprocessing
|
|
├── _postprocess_results() # Output processing
|
|
└── _filter_detections() # Confidence filtering
|
|
|
|
DetectionResult() # Detection output structure
|
|
├── class_name # Detected class
|
|
├── confidence # Detection confidence
|
|
├── bounding_box # Spatial coordinates
|
|
├── track_id # Tracking identifier
|
|
└── timestamp # Detection timestamp
|
|
```
|
|
|
|
#### `tracking_manager.py` - Multi-Object Tracking
|
|
```python
|
|
TrackingManager()
|
|
├── update_tracks() # Update tracker with detections
|
|
├── _associate_detections() # Data association
|
|
├── _create_new_tracks() # Initialize new tracks
|
|
├── _update_existing_tracks() # Update track states
|
|
└── _cleanup_lost_tracks() # Remove stale tracks
|
|
|
|
Track() # Individual object track
|
|
├── update() # Update with new detection
|
|
├── predict() # Predict next state
|
|
├── is_confirmed() # Track confirmation status
|
|
└── time_since_update() # Track age
|
|
```
|
|
|
|
#### `stability_validator.py` - Detection Validation
|
|
```python
|
|
StabilityValidator()
|
|
├── add_detection() # Add detection to history
|
|
├── is_detection_stable() # Check stability criteria
|
|
├── _calculate_stability() # Stability metrics
|
|
└── _cleanup_old_detections() # History management
|
|
```
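
Stability gating means the expensive pipeline only fires once the same tracked object has been seen with sufficient confidence across enough recent frames. Below is a reduced sketch of that check; the window size, hit count, and confidence threshold are illustrative defaults, not the validator's actual configuration.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class StabilityValidator:
    """Sketch: a track is 'stable' after enough confident hits in a sliding window."""

    def __init__(self, window: int = 10, min_hits: int = 6, min_confidence: float = 0.5) -> None:
        self.min_hits = min_hits
        self.min_confidence = min_confidence
        self._history: Dict[int, Deque[bool]] = defaultdict(lambda: deque(maxlen=window))

    def add_detection(self, track_id: int, confidence: float) -> None:
        # Record whether this frame's detection cleared the confidence bar.
        self._history[track_id].append(confidence >= self.min_confidence)

    def is_detection_stable(self, track_id: int) -> bool:
        hits = self._history.get(track_id)
        return hits is not None and sum(hits) >= self.min_hits


validator = StabilityValidator(window=5, min_hits=3)
for conf in (0.9, 0.8, 0.2, 0.85):
    validator.add_detection(track_id=7, confidence=conf)
print(validator.is_detection_stable(7))  # True: 3 of the last 4 frames were confident
```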
|
|
|
|
### Pipeline System (`detector_worker/pipeline/`)
|
|
|
|
#### `pipeline_executor.py` - ML Pipeline Orchestration
|
|
```python
|
|
PipelineExecutor()
|
|
├── execute_pipeline() # Main pipeline execution
|
|
├── _run_detection_stage() # Object detection phase
|
|
├── _run_classification_branches() # Parallel classification
|
|
├── _execute_actions() # Post-processing actions
|
|
└── _wait_for_branches() # Synchronization
|
|
|
|
PipelineContext() # Execution context
|
|
├── camera_id # Camera identifier
|
|
├── session_id # Session identifier
|
|
├── frame # Input frame
|
|
├── timestamp # Processing timestamp
|
|
└── intermediate_results # Shared results
|
|
```
|
|
|
|
#### `action_executor.py` - Action Processing
|
|
```python
|
|
ActionExecutor()
|
|
├── execute_action() # Execute single action
|
|
├── _redis_save_image() # Redis image storage
|
|
├── _postgresql_create() # Database record creation
|
|
├── _postgresql_update() # Database record update
|
|
└── _publish_message() # Message publishing
|
|
|
|
ActionType(Enum) # Supported action types
|
|
├── REDIS_SAVE_IMAGE
|
|
├── POSTGRESQL_CREATE
|
|
├── POSTGRESQL_UPDATE
|
|
└── PUBLISH_MESSAGE
|
|
```
|
|
|
|
#### `field_mapper.py` - Dynamic Field Resolution
|
|
```python
|
|
FieldMapper()
|
|
├── resolve_fields() # Resolve template fields
|
|
├── _substitute_variables() # Variable substitution
|
|
├── _resolve_branch_results() # Branch result mapping
|
|
└── _validate_mapping() # Mapping validation
|
|
```
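
Field templates such as `"{car_brand_cls_v1.brand}"` are resolved against branch results before the combined database update runs (see the pipeline flow later in this document). A sketch of that substitution, assuming branch results are keyed by model ID as in the examples below:

```python
import re
from typing import Any, Dict

_TEMPLATE = re.compile(r"\{([^.}]+)\.([^}]+)\}")


def resolve_fields(field_templates: Dict[str, str],
                   branch_results: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Replace "{model_id.key}" placeholders with values from branch results."""
    resolved: Dict[str, Any] = {}
    for column, template in field_templates.items():
        match = _TEMPLATE.fullmatch(template)
        if match:
            model_id, key = match.groups()
            resolved[column] = branch_results.get(model_id, {}).get(key)
        else:
            resolved[column] = template  # literal value, no substitution needed
    return resolved


fields = {
    "car_brand": "{car_brand_cls_v1.brand}",
    "car_body_type": "{car_bodytype_cls_v1.body_type}",
}
results = {
    "car_brand_cls_v1": {"brand": "Toyota", "confidence": 0.93},
    "car_bodytype_cls_v1": {"body_type": "Sedan", "confidence": 0.88},
}
print(resolve_fields(fields, results))
# {'car_brand': 'Toyota', 'car_body_type': 'Sedan'}
```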
|
|
|
|
### Storage Layer (`detector_worker/storage/`)
|
|
|
|
#### `database_manager.py` - PostgreSQL Operations
|
|
```python
|
|
DatabaseManager()
|
|
├── connect() # Database connection
|
|
├── create_record() # INSERT operations
|
|
├── update_record() # UPDATE operations
|
|
├── get_record() # SELECT operations
|
|
├── execute_query() # Raw SQL execution
|
|
└── _handle_connection_error() # Error recovery
|
|
|
|
DatabaseConfig() # Database configuration
|
|
├── host, port, database # Connection params
|
|
├── user, password # Authentication
|
|
└── connection_pool_size # Pool configuration
|
|
```
|
|
|
|
#### `redis_client.py` - Redis Operations & Image Storage
|
|
```python
|
|
RedisClient()
|
|
├── connect() # Redis connection
|
|
├── set/get/delete() # Basic operations
|
|
├── pipeline() # Batch operations
|
|
└── scan_keys() # Key scanning
|
|
|
|
RedisImageStorage() # Image-specific operations
|
|
├── store_image() # Store with compression
|
|
├── retrieve_image() # Retrieve and decode
|
|
├── delete_image() # Delete image
|
|
└── cleanup_expired() # Cleanup expired images
|
|
|
|
RedisPublisher/Subscriber() # Pub/Sub messaging
|
|
├── publish() # Publish message
|
|
├── subscribe() # Subscribe to channel
|
|
└── listen() # Message listening
|
|
```
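
The `redis_save_image` action crops the detected region, JPEG-encodes it, and stores it under a key with a TTL so downstream consumers can fetch it while the session is live. A minimal sketch, assuming the `redis` package and OpenCV; the key name and TTL shown are illustrative, not the production values.

```python
from typing import Tuple

import cv2
import numpy as np
import redis


def store_cropped_image(client: redis.Redis, key: str, frame: np.ndarray,
                        bbox: Tuple[int, int, int, int], expire_seconds: int = 600) -> bool:
    """Crop bbox=(x1, y1, x2, y2) from the frame, JPEG-encode it, then SET with expiry."""
    x1, y1, x2, y2 = bbox
    crop = frame[y1:y2, x1:x2]
    ok, encoded = cv2.imencode(".jpg", crop, [cv2.IMWRITE_JPEG_QUALITY, 90])
    if not ok:
        return False
    client.set(key, encoded.tobytes(), ex=expire_seconds)
    return True


# client = redis.Redis(host="localhost", port=6379)
# store_cropped_image(client, "inference:display-001;cam-001:frontal", frame, (100, 200, 300, 400))
```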
|
|
|
|
#### `session_cache.py` - High-Performance Caching
|
|
```python
|
|
SessionCacheManager() # Singleton cache manager
|
|
├── cache_detection() # Cache detection results
|
|
├── cache_pipeline_result() # Cache pipeline outputs
|
|
├── create_session() # Create session entry
|
|
├── update_session() # Update session data
|
|
└── cleanup_expired() # Cache maintenance
|
|
|
|
SessionCache() # LRU cache implementation
|
|
├── put/get/remove() # Basic cache operations
|
|
├── _evict_lru() # LRU eviction
|
|
├── _check_memory_limit() # Memory management
|
|
└── get_stats() # Cache statistics
|
|
```
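
The LRU behaviour is the key property: the cache stays bounded while always serving the most recently touched entries. A compact sketch using `OrderedDict` follows; eviction here is by entry count, whereas the real cache also enforces a memory limit via `_check_memory_limit()`.

```python
from collections import OrderedDict
from typing import Any, Optional


class SessionCache:
    """Sketch of an LRU cache bounded by entry count."""

    def __init__(self, max_entries: int = 1024) -> None:
        self.max_entries = max_entries
        self._entries: "OrderedDict[str, Any]" = OrderedDict()

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)         # newest entries live at the end
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used

    def get(self, key: str) -> Optional[Any]:
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)         # touching a key refreshes it
        return self._entries[key]


cache = SessionCache(max_entries=2)
cache.put("cam-001", {"class": "car"})
cache.put("cam-002", {"class": "truck"})
cache.get("cam-001")                            # refresh cam-001
cache.put("cam-003", {"class": "bus"})          # evicts cam-002
print(list(cache._entries))                     # ['cam-001', 'cam-003']
```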
|
|
|
|
### Model Management (`detector_worker/models/`)
|
|
|
|
#### `model_manager.py` - Model Loading & Caching
|
|
```python
|
|
ModelManager()
|
|
├── load_model() # Load and cache model
|
|
├── get_model() # Retrieve cached model
|
|
├── unload_model() # Remove from cache
|
|
├── cleanup_unused() # Cache maintenance
|
|
└── get_memory_usage() # Memory tracking
|
|
|
|
ModelCache() # Model cache implementation
|
|
├── put/get/remove() # Cache operations
|
|
├── _estimate_memory() # Memory estimation
|
|
├── _evict_unused() # Memory-based eviction
|
|
└── get_cache_stats() # Cache metrics
|
|
```
|
|
|
|
#### `pipeline_loader.py` - MPTA Pipeline Loading
|
|
```python
|
|
PipelineLoader()
|
|
├── load_pipeline() # Load MPTA file
|
|
├── _extract_archive() # ZIP extraction
|
|
├── _parse_config() # Configuration parsing
|
|
├── _validate_pipeline() # Pipeline validation
|
|
└── _load_models() # Load pipeline models
|
|
```
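
An `.mpta` bundle is a ZIP archive containing the pipeline configuration plus model weights. The skeleton below shows the extract-and-parse flow; the config file name (`pipeline.json`) and the required keys are assumptions for illustration, since the real schema is enforced in `_validate_pipeline()`.

```python
import json
import zipfile
from pathlib import Path
from typing import Any, Dict


def load_pipeline(mpta_path: str, extract_dir: str) -> Dict[str, Any]:
    """Extract an .mpta archive and parse its pipeline configuration."""
    target = Path(extract_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(mpta_path) as archive:
        archive.extractall(target)

    # Assumed config name and keys; the real loader validates these explicitly.
    config = json.loads((target / "pipeline.json").read_text())
    for required in ("modelId", "modelFile", "expectedClasses"):
        if required not in config:
            raise ValueError(f"Pipeline config missing required key: {required}")
    return config


# config = load_pipeline("vehicle-id.mpta", extract_dir="/tmp/mpta/vehicle-id")
```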
|
|
|
|
## System Startup Flow
|
|
|
|
### Application Initialization Sequence
|
|
|
|
```mermaid
sequenceDiagram
    participant Main as app.py
    participant Config as Configuration
    participant Container as ServiceContainer
    participant Managers as Singleton Managers
    participant FastAPI as FastAPI App

    Main->>Config: load_configuration()
    Config->>Config: load_from_file()
    Config->>Config: load_from_env()
    Config->>Config: validate_config()

    Main->>Container: ServiceContainer()
    Main->>Container: register_services()
    Container->>Managers: initialize_singletons()

    Main->>FastAPI: create_app()
    FastAPI->>FastAPI: setup_lifespan()
    FastAPI->>FastAPI: add_websocket_routes()
    FastAPI->>FastAPI: add_http_routes()

    Note over Main,FastAPI: Application Ready
    FastAPI->>Main: uvicorn.run()
```
|
|
|
|
### Detailed Startup Process
|
|
|
|
1. **Configuration Loading** (`app.py:15-25`)
|
|
```python
|
|
# Load configuration from multiple sources
|
|
config = Configuration()
|
|
config.load_from_file("config.json") # Primary config
|
|
config.load_from_env() # Environment overrides
|
|
config.validate_config() # Validation
|
|
```
|
|
|
|
2. **Dependency Injection Setup** (`app.py:27-45`)
|
|
```python
|
|
# Create and configure IoC container
|
|
container = ServiceContainer()
|
|
|
|
# Register core services
|
|
container.register_singleton(Configuration, lambda: config)
|
|
container.register_singleton(StreamManager, StreamManager)
|
|
container.register_singleton(ModelManager, ModelManager)
|
|
container.register_singleton(PipelineExecutor, PipelineExecutor)
|
|
```
|
|
|
|
3. **Singleton Manager Initialization** (`app.py:47-55`)
|
|
```python
|
|
# Initialize thread-safe singleton managers
|
|
model_state = ModelStateManager()
|
|
stream_state = StreamStateManager()
|
|
session_state = SessionStateManager()
|
|
# ... other managers
|
|
```
|
|
|
|
4. **FastAPI Application Creation** (`app.py:57-75`)
|
|
```python
|
|
# Create FastAPI app with lifespan management
|
|
@asynccontextmanager
|
|
async def lifespan(app: FastAPI):
|
|
# Startup logic
|
|
await initialize_services()
|
|
yield
|
|
# Shutdown logic
|
|
await cleanup_services()
|
|
|
|
app = FastAPI(lifespan=lifespan)
|
|
```
|
|
|
|
5. **Route Registration** (`app.py:77-85`)
|
|
```python
|
|
# WebSocket endpoint
|
|
@app.websocket("/")
|
|
async def websocket_endpoint(websocket: WebSocket):
|
|
ws_handler = container.resolve(WebSocketHandler)
|
|
await ws_handler.handle_connection(websocket)
|
|
|
|
# HTTP endpoints
|
|
@app.get("/camera/{camera_id}/image")
|
|
async def get_camera_image(camera_id: str):
|
|
return stream_manager.get_latest_frame(camera_id)
|
|
```
|
|
|
|
## Protocol Compliance (worker.md Implementation)
|
|
|
|
### Key Protocol Features Implemented
|
|
|
|
The WebSocket communication layer has been **fully updated** to comply with the worker.md protocol specification, replacing deprecated patterns with modern declarative subscription management.
|
|
|
|
#### ✅ **Protocol-Compliant Message Types**
|
|
|
|
| Message Type | Direction | Purpose | Status |
|--------------|-----------|---------|--------|
| `setSubscriptionList` | Backend→Worker | **Primary subscription command** - declarative management | ✅ **Implemented** |
| `setSessionId` | Backend→Worker | Associate session with display (supports null clearing) | ✅ **Implemented** |
| `setProgressionStage` | Backend→Worker | Real-time progression updates for context-aware processing | ✅ **Implemented** |
| `requestState` | Backend→Worker | Request immediate state report | ✅ **Implemented** |
| `patchSessionResult` | Backend→Worker | Response to worker's patch session request | ✅ **Implemented** |
| `stateReport` | Worker→Backend | **Heartbeat with performance metrics** (every 2 seconds) | ✅ **Implemented** |
| `imageDetection` | Worker→Backend | Real-time detection results with session context | ✅ **Implemented** |
| `patchSession` | Worker→Backend | Request modification to session data | ✅ **Implemented** |

#### ❌ **Deprecated Message Types Removed**
|
|
|
|
- `subscribe` - Replaced by declarative `setSubscriptionList`
|
|
- `unsubscribe` - Handled by empty `setSubscriptionList` array
|
|
|
|
#### 🔧 **Key Protocol Implementations**
|
|
|
|
##### 1. **Declarative Subscription Management**
|
|
```python
|
|
# setSubscriptionList provides complete desired state
|
|
{
|
|
"type": "setSubscriptionList",
|
|
"subscriptions": [
|
|
{
|
|
"subscriptionIdentifier": "display-001;cam-001",
|
|
"rtspUrl": "rtsp://192.168.1.100/stream1",
|
|
"modelUrl": "http://storage/models/vehicle-id.mpta?token=fresh-token",
|
|
"modelId": 201,
|
|
"modelName": "Vehicle Identification",
|
|
"cropX1": 100, "cropY1": 200, "cropX2": 300, "cropY2": 400
|
|
}
|
|
]
|
|
}
|
|
```
|
|
|
|
**Worker Reconciliation Logic:**
|
|
- **Add new subscriptions** not in current state
|
|
- **Remove obsolete subscriptions** not in desired state
|
|
- **Update existing subscriptions** with fresh model URLs (handles S3 expiration)
|
|
- **Single stream optimization** - share RTSP streams across multiple subscriptions
|
|
|
|
##### 2. **Protocol-Compliant State Reports**
|
|
```python
|
|
# stateReport with flat structure per worker.md specification
|
|
{
|
|
"type": "stateReport",
|
|
"cpuUsage": 75.5,
|
|
"memoryUsage": 40.2,
|
|
"gpuUsage": 60.0,
|
|
"gpuMemoryUsage": 25.1,
|
|
"cameraConnections": [
|
|
{
|
|
"subscriptionIdentifier": "display-001;cam-001",
|
|
"modelId": 101,
|
|
"modelName": "General Object Detection",
|
|
"online": true,
|
|
"cropX1": 100, "cropY1": 200, "cropX2": 300, "cropY2": 400
|
|
}
|
|
]
|
|
}
|
|
```
|
|
|
|
##### 3. **Session Context in Detection Results**
|
|
```python
|
|
# imageDetection with sessionId for proper session linking
|
|
{
|
|
"type": "imageDetection",
|
|
"subscriptionIdentifier": "display-001;cam-001",
|
|
"timestamp": "2025-09-12T10:00:00.000Z",
|
|
"sessionId": 12345, # Critical for CMS session tracking
|
|
"data": {
|
|
"detection": {
|
|
"carBrand": "Honda",
|
|
"carModel": "CR-V",
|
|
"licensePlateText": "ABC-123"
|
|
},
|
|
"modelId": 201,
|
|
"modelName": "Vehicle Identification"
|
|
}
|
|
}
|
|
```
|
|
|
|
#### 🚀 **Enhanced Features**
|
|
|
|
1. **State Recovery**: Complete subscription restoration on worker reconnection
|
|
2. **Fresh Model URLs**: Automatic S3 URL refresh via subscription updates
|
|
3. **Multi-Display Support**: Proper display identifier parsing and session management
|
|
4. **Load Balancing Ready**: Backend can distribute subscriptions across workers
|
|
5. **Session Continuity**: Session IDs maintained across worker disconnections
|
|
|
|
## WebSocket Communication Flow
|
|
|
|
### Client Connection Lifecycle (Protocol-Compliant)
|
|
|
|
```mermaid
sequenceDiagram
    participant Backend as CMS Backend
    participant WS as WebSocketHandler
    participant MP as MessageProcessor
    participant SM as StreamManager
    participant HB as Heartbeat Loop

    Backend->>WS: WebSocket Connection
    WS->>WS: handle_connection()
    WS->>HB: start_heartbeat_loop()
    WS->>Backend: Connection Accepted

    loop Heartbeat (every 2 seconds)
        HB->>Backend: stateReport {cpuUsage, memoryUsage, gpuUsage, cameraConnections[]}
    end

    loop Message Processing (Protocol Commands)
        Backend->>WS: JSON Message
        WS->>MP: parse_message()

        alt setSubscriptionList (Declarative)
            MP->>MP: validate_subscription_list()
            WS->>WS: reconcile_subscriptions()
            WS->>SM: add/remove/update_streams()
            SM->>SM: handle_stream_lifecycle()
            WS->>HB: update_camera_connections()

        else setSessionId
            MP->>MP: validate_set_session()
            WS->>WS: store_session_for_display()
            WS->>WS: apply_to_all_display_subscriptions()

        else setProgressionStage
            MP->>MP: validate_progression_stage()
            WS->>WS: store_stage_for_display()
            WS->>WS: apply_context_aware_processing()

        else requestState
            WS->>Backend: stateReport (immediate response)

        else patchSessionResult
            MP->>MP: validate_patch_result()
            WS->>WS: log_patch_response()
        end
    end

    Backend->>WS: Disconnect
    WS->>SM: cleanup_all_streams()
    WS->>HB: stop_heartbeat_loop()
```
|
|
|
|
### Message Processing Detail
|
|
|
|
#### 1. setSubscriptionList Flow (`websocket_handler.py:355-453`) - Protocol Compliant
|
|
|
|
```python
|
|
async def _handle_set_subscription_list(self, data: Dict[str, Any]) -> None:
|
|
"""Handle setSubscriptionList command - declarative subscription management"""
|
|
|
|
subscriptions = data.get("subscriptions", [])
|
|
|
|
# 1. Get current and desired subscription states
|
|
current_subscriptions = set(subscription_to_camera.keys())
|
|
desired_subscriptions = set()
|
|
subscription_configs = {}
|
|
|
|
for sub_config in subscriptions:
|
|
sub_id = sub_config.get("subscriptionIdentifier")
|
|
if sub_id:
|
|
desired_subscriptions.add(sub_id)
|
|
subscription_configs[sub_id] = sub_config
|
|
|
|
# Extract display ID for session management
|
|
parts = sub_id.split(";")
|
|
if len(parts) >= 2:
|
|
display_id = parts[0]
|
|
self.display_identifiers.add(display_id)
|
|
|
|
# 2. Calculate reconciliation changes
|
|
to_add = desired_subscriptions - current_subscriptions
|
|
to_remove = current_subscriptions - desired_subscriptions
|
|
to_update = desired_subscriptions & current_subscriptions
|
|
|
|
# 3. Remove obsolete subscriptions
|
|
for sub_id in to_remove:
|
|
camera_id = subscription_to_camera.get(sub_id)
|
|
if camera_id:
|
|
await self.stream_manager.stop_stream(camera_id)
|
|
self.model_manager.unload_models(camera_id)
|
|
subscription_to_camera.pop(sub_id, None)
|
|
self.session_cache.clear_session(camera_id)
|
|
|
|
# 4. Add new subscriptions
|
|
for sub_id in to_add:
|
|
await self._start_subscription(sub_id, subscription_configs[sub_id])
|
|
|
|
# 5. Update existing subscriptions (handle S3 URL refresh)
|
|
for sub_id in to_update:
|
|
current_config = subscription_to_camera.get(sub_id)
|
|
new_config = subscription_configs[sub_id]
|
|
|
|
# Restart if model URL changed (handles S3 expiration)
|
|
current_model_url = getattr(current_config, 'model_url', None)
|
|
new_model_url = new_config.get("modelUrl")
|
|
|
|
if current_model_url != new_model_url:
|
|
camera_id = subscription_to_camera.get(sub_id)
|
|
if camera_id:
|
|
await self.stream_manager.stop_stream(camera_id)
|
|
self.model_manager.unload_models(camera_id)
|
|
await self._start_subscription(sub_id, new_config)
|
|
|
|
async def _start_subscription(self, subscription_id: str, config: Dict[str, Any]) -> None:
|
|
"""Start individual subscription with protocol-compliant configuration"""
|
|
|
|
# Extract camera ID from subscription identifier
|
|
parts = subscription_id.split(";")
|
|
camera_id = parts[1] if len(parts) >= 2 else subscription_id
|
|
|
|
# Store subscription mapping
|
|
subscription_to_camera[subscription_id] = camera_id
|
|
|
|
# Start camera stream with full config
|
|
await self.stream_manager.start_stream(camera_id, config)
|
|
|
|
# Load ML model from fresh URL
|
|
model_id = config.get("modelId")
|
|
model_url = config.get("modelUrl")
|
|
if model_id and model_url:
|
|
await self.model_manager.load_model(camera_id, model_id, model_url)
|
|
```
|
|
|
|
#### 2. Detection Result Broadcasting (`websocket_handler.py:589-615`) - Protocol Compliant
|
|
|
|
```python
|
|
async def _send_detection_result(self, camera_id: str, stream_info: Dict[str, Any],
|
|
detection_result: DetectionResult) -> None:
|
|
"""Send detection result with protocol-compliant format"""
|
|
|
|
# Get session ID for this display (protocol requirement)
|
|
subscription_id = stream_info["subscriptionIdentifier"]
|
|
display_id = subscription_id.split(";")[0] if ";" in subscription_id else subscription_id
|
|
session_id = self.session_ids.get(display_id) # Can be None for null sessions
|
|
|
|
# Protocol-compliant imageDetection format per worker.md
|
|
detection_data = {
|
|
"type": "imageDetection",
|
|
"subscriptionIdentifier": subscription_id, # Required at root level
|
|
"timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
|
|
"sessionId": session_id, # Required for session linking
|
|
"data": {
|
|
"detection": detection_result.to_dict(), # Flat detection object
|
|
"modelId": stream_info["modelId"],
|
|
"modelName": stream_info["modelName"]
|
|
}
|
|
}
|
|
|
|
# Send to backend via WebSocket
|
|
await self.websocket.send_json(detection_data)
|
|
```
|
|
|
|
#### 3. State Report Broadcasting (`websocket_handler.py:201-209`) - Protocol Compliant
|
|
|
|
```python
|
|
async def _send_heartbeat(self) -> None:
|
|
"""Send protocol-compliant stateReport every 2 seconds"""
|
|
|
|
while self.connected:
|
|
# Get system metrics
|
|
metrics = get_system_metrics()
|
|
|
|
# Build cameraConnections array as required by protocol
|
|
camera_connections = []
|
|
with self.stream_manager.streams_lock:
|
|
for camera_id, stream_info in self.stream_manager.streams.items():
|
|
is_online = self.stream_manager.is_stream_active(camera_id)
|
|
|
|
connection_info = {
|
|
"subscriptionIdentifier": stream_info.get("subscriptionIdentifier", camera_id),
|
|
"modelId": stream_info.get("modelId", 0),
|
|
"modelName": stream_info.get("modelName", "Unknown Model"),
|
|
"online": is_online
|
|
}
|
|
|
|
# Add crop coordinates if available (optional per protocol)
|
|
for coord in ["cropX1", "cropY1", "cropX2", "cropY2"]:
|
|
if coord in stream_info:
|
|
connection_info[coord] = stream_info[coord]
|
|
|
|
camera_connections.append(connection_info)
|
|
|
|
# Protocol-compliant stateReport format (worker.md lines 169-189)
|
|
state_data = {
|
|
"type": "stateReport", # Message type
|
|
"cpuUsage": metrics.get("cpu_percent", 0), # CPU percentage
|
|
"memoryUsage": metrics.get("memory_percent", 0), # Memory percentage
|
|
"gpuUsage": metrics.get("gpu_percent", 0), # GPU percentage
|
|
"gpuMemoryUsage": metrics.get("gpu_memory_percent", 0), # GPU memory
|
|
"cameraConnections": camera_connections # Camera details array
|
|
}
|
|
|
|
await self.websocket.send_json(state_data)
|
|
await asyncio.sleep(2) # 2-second heartbeat interval per protocol
|
|
```
|
|
|
|
## Detection Pipeline Flow
|
|
|
|
### Complete Detection Workflow
|
|
|
|
```mermaid
flowchart TD
    A[Frame Captured] --> B[YOLO Detection]
    B --> C{Expected Classes Found?}
    C -->|No| D[Skip Processing]
    C -->|Yes| E[Multi-Object Tracking]
    E --> F[Stability Validation]
    F --> G{Stable Detection?}
    G -->|No| H[Continue Tracking]
    G -->|Yes| I[Create Database Record]
    I --> J[Execute Redis Actions]
    J --> K[Start Classification Branches]

    K --> L[Brand Classification]
    K --> M[Body Type Classification]
    K --> N[License Plate Recognition]

    L --> O[Wait for All Branches]
    M --> O
    N --> O

    O --> P[Field Mapping & Resolution]
    P --> Q[Database Update Combined]
    Q --> R[Broadcast Results]
    R --> S[Update Session Cache]
```
|
|
|
|
### Detailed Pipeline Execution (`pipeline_executor.py:85-250`)
|
|
|
|
#### 1. Detection Stage
|
|
```python
|
|
async def _run_detection_stage(self, pipeline_config: Dict,
|
|
context: PipelineContext) -> List[DetectionResult]:
|
|
"""Execute object detection stage"""
|
|
|
|
# 1. Load detection model
|
|
model = await self.model_manager.load_model(
|
|
ModelConfig.from_dict(pipeline_config)
|
|
)
|
|
|
|
# 2. Run YOLO inference
|
|
detector = YOLODetector()
|
|
raw_detections = detector.detect(
|
|
frame=context.frame,
|
|
confidence_threshold=pipeline_config["minConfidence"]
|
|
)
|
|
|
|
# 3. Filter expected classes
|
|
expected_classes = pipeline_config["expectedClasses"]
|
|
filtered_detections = [
|
|
det for det in raw_detections
|
|
if det.class_name in expected_classes
|
|
]
|
|
|
|
# 4. Update tracking
|
|
if len(filtered_detections) > 0:
|
|
tracking_manager = TrackingManager()
|
|
tracked_detections = tracking_manager.update_tracks(
|
|
filtered_detections, context.frame_id
|
|
)
|
|
|
|
# 5. Validate stability
|
|
stability_validator = StabilityValidator()
|
|
stable_detections = []
|
|
for det in tracked_detections:
|
|
if stability_validator.is_detection_stable(det):
|
|
stable_detections.append(det)
|
|
|
|
return stable_detections
|
|
|
|
return []
|
|
```
|
|
|
|
#### 2. Action Execution Stage
|
|
```python
|
|
async def _execute_actions(self, actions: List[Dict],
|
|
context: PipelineContext) -> Dict:
|
|
"""Execute pipeline actions"""
|
|
|
|
action_results = {}
|
|
action_executor = ActionExecutor()
|
|
|
|
for action in actions:
|
|
action_type = action["type"]
|
|
|
|
if action_type == "redis_save_image":
|
|
# Save cropped image to Redis
|
|
result = await action_executor.redis_save_image(
|
|
frame=context.frame,
|
|
region=action["region"], # e.g., "Frontal"
|
|
key_template=action["key"],
|
|
context=context,
|
|
expire_seconds=action.get("expire_seconds", 3600)
|
|
)
|
|
|
|
elif action_type == "postgresql_create_record":
|
|
# Create initial database record
|
|
result = await action_executor.postgresql_create(
|
|
table=action["table"],
|
|
fields=action["fields"],
|
|
context=context
|
|
)
|
|
|
|
action_results[action_type] = result
|
|
|
|
return action_results
|
|
```
|
|
|
|
#### 3. Classification Branch Execution
|
|
```python
|
|
async def _run_classification_branches(self, branches: List[Dict],
|
|
context: PipelineContext) -> Dict:
|
|
"""Execute parallel classification branches"""
|
|
|
|
if not branches:
|
|
return {}
|
|
|
|
# 1. Create parallel tasks for each branch
|
|
branch_tasks = []
|
|
for branch in branches:
|
|
if branch.get("parallel", False):
|
|
task = asyncio.create_task(
|
|
self._execute_branch(branch, context)
|
|
)
|
|
branch_tasks.append((branch["modelId"], task))
|
|
|
|
# 2. Wait for all branches to complete
|
|
branch_results = {}
|
|
for model_id, task in branch_tasks:
|
|
try:
|
|
result = await task
|
|
branch_results[model_id] = result
|
|
except Exception as e:
|
|
logging.error(f"Branch {model_id} failed: {e}")
|
|
branch_results[model_id] = {"error": str(e)}
|
|
|
|
return branch_results
|
|
|
|
async def _execute_branch(self, branch_config: Dict,
|
|
context: PipelineContext) -> Dict:
|
|
"""Execute single classification branch"""
|
|
|
|
# 1. Load classification model
|
|
model = await self.model_manager.load_model(
|
|
ModelConfig.from_dict(branch_config)
|
|
)
|
|
|
|
# 2. Prepare input (crop if specified)
|
|
input_frame = context.frame
|
|
if branch_config.get("crop", False):
|
|
crop_class = branch_config["cropClass"]
|
|
# Find detection of specified class and crop
|
|
for detection in context.detections:
|
|
if detection.class_name == crop_class:
|
|
bbox = detection.bounding_box
|
|
input_frame = context.frame[bbox.y1:bbox.y2, bbox.x1:bbox.x2]
|
|
break
|
|
|
|
# 3. Run classification
|
|
classifier = YOLODetector() # Can handle classification too
|
|
results = classifier.classify(
|
|
frame=input_frame,
|
|
confidence_threshold=branch_config["minConfidence"]
|
|
)
|
|
|
|
# 4. Format results based on model type
|
|
if "brand" in branch_config["modelId"]:
|
|
return {"brand": results.top_class, "confidence": results.confidence}
|
|
elif "bodytype" in branch_config["modelId"]:
|
|
return {"body_type": results.top_class, "confidence": results.confidence}
|
|
else:
|
|
return {"class": results.top_class, "confidence": results.confidence}
|
|
```
|
|
|
|
#### 4. Field Mapping & Database Update
|
|
```python
|
|
async def _execute_parallel_actions(self, actions: List[Dict],
|
|
context: PipelineContext,
|
|
branch_results: Dict) -> Dict:
|
|
"""Execute actions that depend on branch results"""
|
|
|
|
for action in actions:
|
|
if action["type"] == "postgresql_update_combined":
|
|
# 1. Wait for specified branches to complete
|
|
wait_for_branches = action.get("waitForBranches", [])
|
|
for branch_id in wait_for_branches:
|
|
if branch_id not in branch_results:
|
|
logging.warning(f"Branch {branch_id} not completed")
|
|
|
|
# 2. Resolve field mappings
|
|
field_mapper = FieldMapper()
|
|
resolved_fields = field_mapper.resolve_fields(
|
|
field_templates=action["fields"],
|
|
context=context,
|
|
branch_results=branch_results
|
|
)
|
|
|
|
# Example field resolution:
|
|
# "{car_brand_cls_v1.brand}" -> "Toyota"
|
|
# "{car_bodytype_cls_v1.body_type}" -> "Sedan"
|
|
|
|
# 3. Execute database update
|
|
database_manager = DatabaseManager()
|
|
await database_manager.update_record(
|
|
table=action["table"],
|
|
key_value=context.session_id,
|
|
key_field=action["key_field"],
|
|
update_data=resolved_fields
|
|
)
|
|
|
|
return {"status": "success", "fields_updated": resolved_fields}
|
|
|
|
return {}
|
|
```
|
|
|
|
## Data Storage Flow
|
|
|
|
### Database Operations Flow
|
|
|
|
```mermaid
sequenceDiagram
    participant PE as PipelineExecutor
    participant AE as ActionExecutor
    participant DM as DatabaseManager
    participant FM as FieldMapper
    participant DB as PostgreSQL

    PE->>AE: postgresql_create_record
    AE->>DM: create_record()
    DM->>DB: INSERT INTO car_frontal_info
    DB->>DM: session_id (UUID)
    DM->>AE: Record created

    Note over PE: Classification branches execute...

    PE->>FM: resolve_fields(templates, branch_results)
    FM->>FM: substitute variables
    FM->>PE: resolved_fields

    PE->>AE: postgresql_update_combined
    AE->>DM: update_record()
    DM->>DB: UPDATE car_frontal_info SET car_brand=?, car_body_type=?
    DB->>DM: Update successful
    DM->>AE: Record updated
```
|
|
|
|
### Redis Storage Operations
|
|
|
|
```mermaid
sequenceDiagram
    participant AE as ActionExecutor
    participant RC as RedisClient
    participant RIS as RedisImageStorage
    participant Redis as Redis Server

    AE->>RC: redis_save_image
    RC->>RIS: store_image()
    RIS->>RIS: crop_region_from_frame()
    RIS->>RIS: compress_image()
    RIS->>Redis: SET key encoded_image
    RIS->>Redis: EXPIRE key 600
    Redis->>RIS: OK
    RIS->>RC: Storage successful
    RC->>AE: Image saved
```
|
|
|
|
### Session Cache Operations
|
|
|
|
```mermaid
flowchart LR
    A[Detection Event] --> B[Cache Detection Result]
    B --> C[Create Session Entry]
    C --> D[Pipeline Processing]
    D --> E[Update Session with Branch Results]
    E --> F[Cache Pipeline Result]
    F --> G[Broadcast to Clients]

    subgraph "Cache Types"
        H[Detection Cache<br/>Latest detection per camera]
        I[Pipeline Cache<br/>Pipeline execution results]
        J[Session Cache<br/>Session tracking data]
    end

    B --> H
    F --> I
    C --> J
    E --> J
```
|
|
|
|
## Error Handling & Recovery
|
|
|
|
### Exception Hierarchy & Handling
|
|
|
|
```mermaid
classDiagram
    DetectorWorkerError <|-- ConfigurationError
    DetectorWorkerError <|-- StreamError
    DetectorWorkerError <|-- ModelError
    DetectorWorkerError <|-- PipelineError
    DetectorWorkerError <|-- DatabaseError
    DetectorWorkerError <|-- RedisError
    DetectorWorkerError <|-- MessageProcessingError

    StreamError <|-- ConnectionError
    StreamError <|-- StreamTimeoutError

    ModelError <|-- ModelLoadError
    ModelError <|-- ModelCacheError

    PipelineError <|-- ActionExecutionError
    PipelineError <|-- BranchExecutionError
```
|
|
|
|
### Error Recovery Strategies
|
|
|
|
#### 1. Stream Connection Recovery (`stream_manager.py:245-285`)
|
|
```python
|
|
async def _handle_stream_error(self, stream_id: str, error: Exception):
|
|
"""Handle stream errors with exponential backoff retry"""
|
|
|
|
stream_info = self.get_stream_info(stream_id)
|
|
if not stream_info:
|
|
return
|
|
|
|
# Increment error count
|
|
stream_info.error_count += 1
|
|
stream_info.update_status("error", error_message=str(error))
|
|
|
|
# Exponential backoff retry
|
|
max_retries = stream_info.config.max_retries
|
|
if max_retries == -1 or stream_info.error_count <= max_retries:
|
|
|
|
# Calculate backoff delay
|
|
base_delay = stream_info.config.reconnect_interval
|
|
backoff_delay = base_delay * (2 ** min(stream_info.error_count - 1, 6))
|
|
|
|
logging.warning(f"Stream {stream_id} error: {error}. "
|
|
f"Retrying in {backoff_delay} seconds...")
|
|
|
|
await asyncio.sleep(backoff_delay)
|
|
|
|
try:
|
|
await self.reconnect_stream(stream_id)
|
|
stream_info.error_count = 0 # Reset on success
|
|
stream_info.update_status("active")
|
|
|
|
except Exception as retry_error:
|
|
logging.error(f"Stream {stream_id} retry failed: {retry_error}")
|
|
await self._handle_stream_error(stream_id, retry_error)
|
|
else:
|
|
# Max retries exceeded
|
|
logging.error(f"Stream {stream_id} exceeded max retries. Marking as failed.")
|
|
stream_info.update_status("failed")
|
|
await self._notify_stream_failure(stream_id)
|
|
```
|
|
|
|
#### 2. Database Connection Recovery (`database_manager.py:185-220`)
|
|
```python
|
|
async def _execute_with_retry(self, operation: Callable, *args, **kwargs):
|
|
"""Execute database operation with connection retry"""
|
|
|
|
max_retries = 3
|
|
retry_delay = 1.0
|
|
|
|
for attempt in range(max_retries + 1):
|
|
try:
|
|
return await operation(*args, **kwargs)
|
|
|
|
except psycopg2.OperationalError as e:
|
|
if attempt == max_retries:
|
|
raise DatabaseError(f"Database operation failed after {max_retries} retries: {e}")
|
|
|
|
logging.warning(f"Database operation failed (attempt {attempt + 1}): {e}")
|
|
|
|
# Try to reconnect
|
|
try:
|
|
await self.disconnect()
|
|
await asyncio.sleep(retry_delay)
|
|
await self.connect()
|
|
retry_delay *= 2 # Exponential backoff
|
|
|
|
except Exception as reconnect_error:
|
|
logging.error(f"Database reconnection failed: {reconnect_error}")
|
|
|
|
except Exception as e:
|
|
# Non-recoverable error
|
|
raise DatabaseError(f"Database operation failed: {e}")
|
|
```
|
|
|
|
#### 3. Pipeline Error Isolation (`pipeline_executor.py:325-365`)
|
|
```python
|
|
async def _execute_branch_with_isolation(self, branch_config: Dict,
|
|
context: PipelineContext) -> Dict:
|
|
"""Execute branch with error isolation"""
|
|
|
|
branch_id = branch_config["modelId"]
|
|
|
|
try:
|
|
# Set timeout for branch execution
|
|
timeout = branch_config.get("timeout_seconds", 30)
|
|
|
|
result = await asyncio.wait_for(
|
|
self._execute_branch(branch_config, context),
|
|
timeout=timeout
|
|
)
|
|
|
|
return result
|
|
|
|
except asyncio.TimeoutError:
|
|
error_msg = f"Branch {branch_id} timed out after {timeout} seconds"
|
|
logging.error(error_msg)
|
|
return {"error": error_msg, "type": "timeout"}
|
|
|
|
except ModelError as e:
|
|
error_msg = f"Branch {branch_id} model error: {e}"
|
|
logging.error(error_msg)
|
|
return {"error": error_msg, "type": "model_error"}
|
|
|
|
except Exception as e:
|
|
error_msg = f"Branch {branch_id} unexpected error: {e}"
|
|
logging.error(error_msg, exc_info=True)
|
|
return {"error": error_msg, "type": "unexpected_error"}
|
|
|
|
async def _handle_partial_branch_failure(self, branch_results: Dict,
|
|
required_branches: List[str]) -> bool:
|
|
"""Determine if pipeline can continue with partial branch failures"""
|
|
|
|
successful_branches = [
|
|
branch_id for branch_id, result in branch_results.items()
|
|
if not isinstance(result, dict) or "error" not in result
|
|
]
|
|
|
|
failed_branches = [
|
|
branch_id for branch_id, result in branch_results.items()
|
|
if isinstance(result, dict) and "error" in result
|
|
]
|
|
|
|
if failed_branches:
|
|
logging.warning(f"Failed branches: {failed_branches}")
|
|
logging.info(f"Successful branches: {successful_branches}")
|
|
|
|
# Continue if at least one required branch succeeded
|
|
required_successful = any(
|
|
branch_id in successful_branches
|
|
for branch_id in required_branches
|
|
)
|
|
|
|
return required_successful
|
|
```
|
|
|
|
## Testing Architecture
|
|
|
|
### Test Structure Overview
|
|
|
|
```
tests/
├── unit/                          # Fast, isolated unit tests
│   ├── core/                      # Core module tests (config, DI, singletons)
│   ├── detection/                 # Detection system tests
│   ├── pipeline/                  # Pipeline execution tests
│   ├── streams/                   # Stream management tests
│   ├── communication/             # WebSocket & messaging tests
│   ├── storage/                   # Storage layer tests
│   └── models/                    # Model management tests
├── integration/                   # Multi-component integration tests
│   ├── test_complete_detection_workflow.py
│   ├── test_websocket_protocol.py
│   └── test_pipeline_integration.py
├── performance/                   # Performance benchmarks
│   ├── test_detection_performance.py
│   ├── test_websocket_performance.py
│   └── test_storage_performance.py
└── conftest.py                    # Shared fixtures and configuration
```
|
|
|
|
### Test Execution Flows
|
|
|
|
#### Unit Test Example (`tests/unit/detection/test_yolo_detector.py`)
|
|
```python
|
|
class TestYOLODetector:
|
|
"""Test YOLO detector functionality"""
|
|
|
|
def test_detection_basic_functionality(self, mock_frame):
|
|
"""Test basic detection pipeline"""
|
|
detector = YOLODetector()
|
|
|
|
with patch('torch.load') as mock_torch_load:
|
|
# Setup mock model
|
|
mock_model = Mock()
|
|
mock_result = self._create_mock_detection_result()
|
|
mock_model.return_value = mock_result
|
|
mock_torch_load.return_value = mock_model
|
|
|
|
# Execute detection
|
|
detections = detector.detect(mock_frame, confidence_threshold=0.5)
|
|
|
|
# Verify results
|
|
assert len(detections) == 2
|
|
assert detections[0].class_name == "car"
|
|
assert detections[0].confidence >= 0.5
|
|
assert isinstance(detections[0].bounding_box, BoundingBox)
|
|
```
|
|
|
|
#### Integration Test Example (`tests/integration/test_complete_detection_workflow.py`)
|
|
```python
|
|
@pytest.mark.asyncio
|
|
async def test_complete_rtsp_detection_workflow(self, temp_config_file,
|
|
sample_mpta_file, mock_frame):
|
|
"""Test complete workflow: RTSP stream -> detection -> classification -> database"""
|
|
|
|
# 1. Initialize all components
|
|
config = Configuration()
|
|
config.load_from_file(temp_config_file)
|
|
|
|
# 2. Mock external dependencies (Redis, DB, models)
|
|
with patch('cv2.VideoCapture') as mock_video_cap, \
|
|
patch('torch.load') as mock_torch_load, \
|
|
patch('psycopg2.connect') as mock_db_connect:
|
|
|
|
# Setup mocks...
|
|
|
|
# 3. Execute complete workflow
|
|
stream_manager = StreamManager()
|
|
pipeline_executor = PipelineExecutor()
|
|
|
|
# Create stream
|
|
stream_info = await stream_manager.create_stream(camera_id, config, sub_id)
|
|
|
|
# Run pipeline
|
|
result = await pipeline_executor.execute_pipeline(pipeline_config, context)
|
|
|
|
# 4. Verify end-to-end results
|
|
assert result["status"] == "completed"
|
|
assert "detections" in result
|
|
assert "classification_results" in result
|
|
|
|
# Verify database operations occurred
|
|
assert mock_db_cursor.execute.called
|
|
|
|
# Verify Redis operations occurred
|
|
assert mock_redis.set.called
|
|
```
|
|
|
|
#### Performance Test Example (`tests/performance/test_detection_performance.py`)
|
|
```python
|
|
def test_yolo_detection_speed(self, sample_frame, performance_config):
|
|
"""Benchmark YOLO detection speed"""
|
|
detector = YOLODetector()
|
|
|
|
# Warm up
|
|
for _ in range(5):
|
|
detector.detect(sample_frame, confidence_threshold=0.5)
|
|
|
|
# Benchmark
|
|
detection_times = []
|
|
for _ in range(100):
|
|
start_time = time.perf_counter()
|
|
detections = detector.detect(sample_frame, confidence_threshold=0.5)
|
|
end_time = time.perf_counter()
|
|
|
|
detection_times.append((end_time - start_time) * 1000)
|
|
|
|
# Analyze performance
|
|
avg_time = statistics.mean(detection_times)
|
|
theoretical_fps = 1000 / avg_time
|
|
|
|
# Performance assertions
|
|
assert avg_time < performance_config["max_detection_time_ms"]
|
|
assert theoretical_fps >= performance_config["target_fps"]
|
|
|
|
print(f"Average detection time: {avg_time:.2f} ms")
|
|
print(f"Theoretical FPS: {theoretical_fps:.1f}")
|
|
```
|
|
|
|
## Development Workflow
|
|
|
|
### Cross-Platform Setup
|
|
The Makefile automatically detects the correct Python command for your platform:
|
|
- **macOS/Linux**: Uses `python3` and `pip3` if available
|
|
- **Windows**: Falls back to `python` and `pip`
|
|
- **Automatic Detection**: No manual configuration needed
|
|
|
|
### Step-by-Step Project Setup
|
|
|
|
#### 1. Clone and Navigate to Project
|
|
```bash
|
|
git clone <repository-url>
|
|
cd python-detector-worker
|
|
```
|
|
|
|
#### 2. Install Dependencies
|
|
```bash
|
|
# Install production dependencies
|
|
make install
|
|
|
|
# Install development dependencies (includes testing tools)
|
|
make install-dev
|
|
|
|
# Check environment information
|
|
make env-info
|
|
```
|
|
|
|
#### 3. Configure Environment (Optional)
|
|
```bash
|
|
# Copy example configuration (if exists)
|
|
cp config.example.json config.json
|
|
|
|
# Set environment variables for development
|
|
export DETECTOR_WORKER_ENV=dev
|
|
export DETECTOR_WORKER_PORT=8001
|
|
```
|
|
|
|
#### 4. Run the Application
|
|
```bash
|
|
# For development (staging port 8001 with auto-reload)
|
|
make run-staging
|
|
|
|
# For production (port 8000)
|
|
make run-prod
|
|
|
|
# For debugging (verbose logging on staging port)
|
|
make run-debug
|
|
```
|
|
|
|
#### 5. Verify Installation
|
|
```bash
|
|
# Check system health
|
|
curl http://localhost:8001/health
|
|
|
|
# Run basic tests
|
|
make test-fast
|
|
|
|
# Check code quality
|
|
make lint
|
|
```
|
|
|
|
### Development Commands
|
|
|
|
#### Environment Management
|
|
```bash
|
|
make env-info # Show environment information
|
|
make check-deps # Verify dependency integrity
|
|
make update-deps # List outdated dependencies
|
|
make freeze # Generate requirements-frozen.txt
|
|
```
|
|
|
|
#### Code Development
|
|
```bash
|
|
# Code quality
|
|
make format # Format code with black & isort
|
|
make lint # Run code linting (flake8, mypy)
|
|
make quality # Run all quality checks
|
|
|
|
# Application execution
|
|
make run # Run production mode (port 8000) with reload
|
|
make run-staging # Run staging mode (port 8001) with reload
|
|
make run-prod # Run production mode (port 8000) without reload
|
|
make run-debug # Run debug mode (port 8001) with verbose logging
|
|
```
|
|
|
|
#### Testing Framework
|
|
```bash
|
|
# Test execution
|
|
make test # Run all tests with coverage
|
|
make test-unit # Run unit tests only
|
|
make test-integration # Run integration tests
|
|
make test-performance # Run performance benchmarks
|
|
make test-fast # Run fast tests only (skip slow markers)
|
|
make test-coverage # Generate detailed coverage report
|
|
make test-failed # Rerun only failed tests
|
|
|
|
# CI/CD testing
|
|
make ci-test # Run CI-optimized tests
|
|
make ci-quality # Run CI quality checks
|
|
```
|
|
|
|
#### Docker Operations
|
|
```bash
|
|
# Container management
|
|
make docker-build # Build Docker image
|
|
make docker-run # Run container (production port 8000)
|
|
make docker-run-staging # Run container (staging port 8001)
|
|
make docker-dev # Run development container with volume mounts
|
|
```
|
|
|
|
#### Utilities
|
|
```bash
|
|
# Maintenance
|
|
make clean # Clean build artifacts and cache
|
|
make monitor # Start system resource monitor
|
|
make profile # Run performance profiling
|
|
make version # Show application version
|
|
|
|
# Database utilities (when implemented)
|
|
make db-migrate # Run database migrations
|
|
make db-reset # Reset database to initial state
|
|
```
|
|
|
|
### Using the Test Runner
|
|
|
|
```bash
|
|
# Basic test execution
|
|
python scripts/run_tests.py --all # All tests with coverage
|
|
python scripts/run_tests.py --unit --verbose # Unit tests with verbose output
|
|
python scripts/run_tests.py --integration # Integration tests only
|
|
python scripts/run_tests.py --performance # Performance benchmarks
|
|
|
|
# Advanced options
|
|
python scripts/run_tests.py --fast # Fast tests only (no slow markers)
|
|
python scripts/run_tests.py --failed # Rerun only failed tests
|
|
python scripts/run_tests.py --specific "config" # Run tests matching pattern
|
|
python scripts/run_tests.py --coverage --open-browser # Generate and open coverage report
|
|
|
|
# Quality checks
|
|
python scripts/run_tests.py --quality # Run linting and formatting checks
|
|
```
|
|
|
|
### CI/CD Pipeline
|
|
|
|
The GitHub Actions workflow (`.github/workflows/ci.yml`) runs:
|
|
|
|
1. **Code Quality Checks**: flake8, mypy, black, isort
|
|
2. **Unit Tests**: Fast, isolated tests with coverage
|
|
3. **Integration Tests**: With Redis and PostgreSQL services
|
|
4. **Performance Tests**: On main branch pushes
|
|
5. **Security Scans**: safety and bandit
|
|
6. **Docker Build**: Verify containerization works
|
|
|
|
### Adding New Features
|
|
|
|
1. **Create Feature Branch**
|
|
```bash
|
|
git checkout -b feature/new-detection-algorithm
|
|
```
|
|
|
|
2. **Implement with TDD**
|
|
```bash
|
|
# 1. Write failing test
|
|
# tests/unit/detection/test_new_algorithm.py
|
|
|
|
# 2. Implement minimal code to pass
|
|
# detector_worker/detection/new_algorithm.py
|
|
|
|
# 3. Refactor and improve
|
|
# 4. Add integration tests if needed
|
|
```
|
|
|
|
3. **Run Quality Checks**
|
|
```bash
|
|
make format # Auto-format code
|
|
make lint # Check code quality
|
|
make test # Run all tests
|
|
```
|
|
|
|
4. **Create Pull Request**
|
|
- CI/CD pipeline runs automatically
|
|
- Coverage report posted as comment
|
|
- All checks must pass before merge
|
|
|
|
### Debugging & Monitoring
|
|
|
|
#### Logging Configuration
|
|
```python
|
|
# detector_worker/utils/logging.py
|
|
import logging
|
|
|
|
def setup_logging(level=logging.INFO):
|
|
"""Configure structured logging"""
|
|
|
|
formatter = logging.Formatter(
|
|
'%(asctime)s - %(name)s - %(levelname)s - [%(filename)s:%(lineno)d] - %(message)s'
|
|
)
|
|
|
|
# Console handler
|
|
console_handler = logging.StreamHandler()
|
|
console_handler.setFormatter(formatter)
|
|
|
|
# File handler
|
|
file_handler = logging.FileHandler('detector_worker.log')
|
|
file_handler.setFormatter(formatter)
|
|
|
|
# Root logger
|
|
logger = logging.getLogger()
|
|
logger.setLevel(level)
|
|
logger.addHandler(console_handler)
|
|
logger.addHandler(file_handler)
|
|
|
|
return logger
|
|
```
|
|
|
|
#### Performance Monitoring
|
|
```python
|
|
# detector_worker/utils/monitoring.py
|
|
import psutil
|
|
import time
|
|
from typing import Dict
|
|
|
|
class SystemMonitor:
|
|
"""System resource monitoring"""
|
|
|
|
def get_system_metrics(self) -> Dict:
|
|
"""Get current system metrics"""
|
|
|
|
return {
|
|
"cpu_percent": psutil.cpu_percent(),
|
|
"memory_percent": psutil.virtual_memory().percent,
|
|
"disk_percent": psutil.disk_usage('/').percent,
|
|
"network_io": psutil.net_io_counters()._asdict(),
|
|
"timestamp": time.time()
|
|
}
|
|
|
|
def get_process_metrics(self) -> Dict:
|
|
"""Get current process metrics"""
|
|
|
|
process = psutil.Process()
|
|
|
|
return {
|
|
"pid": process.pid,
|
|
"cpu_percent": process.cpu_percent(),
|
|
"memory_mb": process.memory_info().rss / 1024 / 1024,
|
|
"threads": process.num_threads(),
|
|
"open_files": len(process.open_files()),
|
|
"connections": len(process.connections()),
|
|
"timestamp": time.time()
|
|
}
|
|
```
|
|
|
|
This comprehensive architecture documentation provides a complete technical overview of the refactored system, enabling any engineer to quickly understand and contribute to the codebase. The modular design, clear separation of concerns, and extensive testing infrastructure ensure the system is maintainable, scalable, and reliable. |