new logic

2025-05-12 19:19:40 +07:00
13 changed files with 691 additions and 1817 deletions
--- a/.gitea/workflows/build.yml
+++ b/.gitea/workflows/build.yml
@@ -1,34 +0,0 @@
 name: Build Backend Application and Docker Image
 on:
  push:
    branches:
      - main
  workflow_dispatch:
 jobs:   
  build-docker:
    runs-on: ubuntu-latest
    permissions:
      packages: write
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Login to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: git.siwatsystem.com
          username: ${{ github.actor }}
          password: ${{ secrets.RUNNER_TOKEN }}
      - name: Build and push Docker image
        uses: docker/build-push-action@v4
        with:
          context: .
          file: ./Dockerfile
          push: true
          tags: git.siwatsystem.com/adsist-cms/worker:latest
--- a/.gitignore
+++ b/.gitignore
@@ -6,7 +6,4 @@ app.log
 __pycache__/
 .mptacache
-mptas
+mptas
 detector_worker.log
 .gitignore
 no_frame_debug.log
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,188 +0,0 @@
 # Python Detector Worker - CLAUDE.md
 ## Project Overview
 This is a FastAPI-based computer vision detection worker that processes video streams from RTSP/HTTP sources and runs YOLO-based machine learning pipelines for object detection and classification. The system is designed to work within a larger CMS (Content Management System) architecture.
 ## Architecture & Technology Stack
 - **Framework**: FastAPI with WebSocket support
 - **ML/CV**: PyTorch, Ultralytics YOLO, OpenCV
 - **Containerization**: Docker (Python 3.13-bookworm base)
 - **Data Storage**: Redis integration for action handling
 - **Communication**: WebSocket-based real-time protocol
 ## Core Components
 ### Main Application (`app.py`)
 - **FastAPI WebSocket server** for real-time communication
 - **Multi-camera stream management** with shared stream optimization
 - **HTTP REST endpoint** for image retrieval (`/camera/{camera_id}/image`)
 - **Threading-based frame readers** for RTSP streams and HTTP snapshots
 - **Model loading and inference** using MPTA (Machine Learning Pipeline Archive) format
 - **Session management** with display identifier mapping
 - **Resource monitoring** (CPU, memory, GPU usage via psutil)
 ### Pipeline System (`siwatsystem/pympta.py`)
 - **MPTA file handling** - ZIP archives containing model configurations
 - **Hierarchical pipeline execution** with detection → classification branching
 - **Redis action system** for image saving and message publishing
 - **Dynamic model loading** with GPU optimization
 - **Configurable trigger classes and confidence thresholds**
 ### Testing & Debugging
 - **Protocol test script** (`test_protocol.py`) for WebSocket communication validation
 - **Pipeline webcam utility** (`pipeline_webcam.py`) for local testing with visual output
 - **RTSP streaming debug tool** (`debug/rtsp_webcam.py`) using GStreamer
 ## Code Conventions & Patterns
 ### Logging
 - **Structured logging** using Python's logging module
 - **File + console output** to `detector_worker.log`
 - **Debug level separation** for detailed troubleshooting
 - **Context-aware messages** with camera IDs and model information
 ### Error Handling
 - **Graceful failure handling** with retry mechanisms (configurable max_retries)
 - **Thread-safe operations** using locks for streams and models
 - **WebSocket disconnect handling** with proper cleanup
 - **Model loading validation** with detailed error reporting
 ### Configuration
 - **JSON configuration** (`config.json`) for runtime parameters:
  - `poll_interval_ms`: Frame processing interval
  - `max_streams`: Concurrent stream limit
  - `target_fps`: Target frame rate
  - `reconnect_interval_sec`: Stream reconnection delay
  - `max_retries`: Maximum retry attempts (-1 for unlimited)
 ### Threading Model
 - **Frame reader threads** for each camera stream (RTSP/HTTP)
 - **Shared stream optimization** - multiple subscriptions can reuse the same camera stream
 - **Async WebSocket handling** with concurrent task management
 - **Thread-safe data structures** with proper locking mechanisms
 ## WebSocket Protocol
 ### Message Types
 - **subscribe**: Start camera stream with model pipeline
 - **unsubscribe**: Stop camera stream processing
 - **requestState**: Request current worker status
 - **setSessionId**: Associate display with session identifier
 - **patchSession**: Update session data
 - **stateReport**: Periodic heartbeat with system metrics
 - **imageDetection**: Detection results with timestamp and model info
 ### Subscription Format
 ```json
 {
  "type": "subscribe",
  "payload": {
    "subscriptionIdentifier": "display-001;cam-001",
    "rtspUrl": "rtsp://...",  // OR snapshotUrl
    "snapshotUrl": "http://...",
    "snapshotInterval": 5000,
    "modelUrl": "http://...model.mpta",
    "modelId": 101,
    "modelName": "Vehicle Detection",
    "cropX1": 100, "cropY1": 200,
    "cropX2": 300, "cropY2": 400
  }
 }
 ```
 ## Model Pipeline (MPTA) Format
 ### Structure
 - **ZIP archive** containing models and configuration
 - **pipeline.json** - Main configuration file
 - **Model files** - YOLO .pt files for detection/classification
 - **Redis configuration** - Optional for action execution
 ### Pipeline Flow
 1. **Detection stage** - YOLO object detection with bounding boxes
 2. **Trigger evaluation** - Check if detected class matches trigger conditions
 3. **Classification stage** - Crop detected region and run classification model
 4. **Action execution** - Redis operations (image saving, message publishing)
 ### Branch Configuration
 ```json
 {
  "modelId": "detector-v1",
  "modelFile": "detector.pt",
  "triggerClasses": ["car", "truck"],
  "minConfidence": 0.5,
  "branches": [{
    "modelId": "classifier-v1", 
    "modelFile": "classifier.pt",
    "crop": true,
    "triggerClasses": ["car"],
    "minConfidence": 0.3,
    "actions": [...]
  }]
 }
 ```
 ## Stream Management
 ### Shared Streams
 - Multiple subscriptions can share the same camera URL
 - Reference counting prevents premature stream termination
 - Automatic cleanup when last subscription ends
 ### Frame Processing
 - **Queue-based buffering** with single frame capacity (latest frame only)
 - **Configurable polling interval** based on target FPS
 - **Automatic reconnection** with exponential backoff
 ## Development & Testing
 ### Local Development
 ```bash
 # Install dependencies
 pip install -r requirements.txt
 # Run the worker
 python app.py
 # Test protocol compliance
 python test_protocol.py
 # Test pipeline with webcam
 python pipeline_webcam.py --mpta-file path/to/model.mpta --video 0
 ```
 ### Docker Deployment
 ```bash
 # Build container
 docker build -t detector-worker .
 # Run with volume mounts for models
 docker run -p 8000:8000 -v ./models:/app/models detector-worker
 ```
 ### Testing Commands
 - **Protocol testing**: `python test_protocol.py`
 - **Pipeline validation**: `python pipeline_webcam.py --mpta-file <path> --video 0`
 - **RTSP debugging**: `python debug/rtsp_webcam.py`
 ## Dependencies
 - **fastapi[standard]**: Web framework with WebSocket support
 - **uvicorn**: ASGI server
 - **torch, torchvision**: PyTorch for ML inference
 - **ultralytics**: YOLO implementation
 - **opencv-python**: Computer vision operations
 - **websockets**: WebSocket client/server
 - **redis**: Redis client for action execution
 ## Security Considerations
 - Model files are loaded from trusted sources only
 - Redis connections use authentication when configured
 - WebSocket connections handle disconnects gracefully
 - Resource usage is monitored to prevent DoS
 ## Performance Optimizations
 - GPU acceleration when CUDA is available
 - Shared camera streams reduce resource usage
 - Frame queue optimization (single latest frame)
 - Model caching across subscriptions
 - Trigger class filtering for faster inference
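
Note on the "Frame Processing" section of the removed CLAUDE.md: it describes queue-based buffering with single-frame capacity (latest frame only), and `frame_reader` in `app.py` does this by dropping the old frame before each `put`. The following is a minimal sketch of that idea, not code from this commit. It assumes one producer thread per buffer and a `queue.Queue(maxsize=1)`; the helper name `put_latest` is hypothetical.

```python
import queue


def put_latest(buffer: queue.Queue, frame) -> None:
    """Store `frame` in a single-slot queue, discarding any older frame.

    Assumes a single producer per buffer (hypothetical helper, not part
    of this commit). With several producers the final put could still
    raise queue.Full.
    """
    try:
        buffer.put_nowait(frame)
    except queue.Full:
        # Slot is occupied: drop the stale frame, then store the new one.
        try:
            buffer.get_nowait()
        except queue.Empty:
            # A consumer took the frame in the meantime; nothing to drop.
            pass
        buffer.put_nowait(frame)


# Usage: the consumer only ever sees the most recent frame.
frames = queue.Queue(maxsize=1)
put_latest(frames, "frame-1")
put_latest(frames, "frame-2")
latest = frames.get_nowait()
```

Using `put_nowait` and handling `queue.Full` avoids a blocking `put` if the buffer is bounded, at the cost of silently dropping frames the consumer has not read yet, which appears to be the intent of a latest-frame-only buffer.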
--- a/app.py
+++ b/app.py
@@ -5,7 +5,6 @@ import time
 import queue
 import torch
 import cv2
 import numpy as np
 import base64
 import logging
 import threading
@@ -14,9 +13,8 @@ import asyncio
 import psutil
 import zipfile
 from urllib.parse import urlparse
-from fastapi import FastAPI, WebSocket, HTTPException
+from fastapi import FastAPI, WebSocket
 from fastapi.websockets import WebSocketDisconnect
 from fastapi.responses import Response
 from websockets.exceptions import ConnectionClosedError
 from ultralytics import YOLO
@@ -29,12 +27,6 @@ app = FastAPI()
 # "models" now holds a nested dict: { camera_id: { modelId: model_tree } }
 models: Dict[str, Dict[str, Any]] = {}
 streams: Dict[str, Dict[str, Any]] = {}
 # Store session IDs per display
 session_ids: Dict[str, int] = {}
 # Track shared camera streams by camera URL
 camera_streams: Dict[str, Dict[str, Any]] = {}
 # Map subscriptions to their camera URL
 subscription_to_camera: Dict[str, str] = {}
 with open("config.json", "r") as f:
    config = json.load(f)
@@ -49,456 +41,145 @@ max_retries = config.get("max_retries", 3)
 # Configure logging
 logging.basicConfig(
-    level=logging.INFO,  # Set to INFO level for less verbose output
+    level=logging.DEBUG,
-    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
+    format="%(asctime)s [%(levelname)s] %(message)s",
    handlers=[
-        logging.FileHandler("detector_worker.log"),  # Write logs to a file
+        logging.FileHandler("app.log"),
-        logging.StreamHandler()  # Also output to console
+        logging.StreamHandler()
    ]
 )
 # Create a logger specifically for this application
 logger = logging.getLogger("detector_worker")
 logger.setLevel(logging.DEBUG)  # Set app-specific logger to DEBUG level
 # Ensure all other libraries (including root) use at least INFO level
 logging.getLogger().setLevel(logging.INFO)
 logger.info("Starting detector worker application")
 logger.info(f"Configuration: Target FPS: {TARGET_FPS}, Max streams: {max_streams}, Max retries: {max_retries}")
 # Ensure the models directory exists
 os.makedirs("models", exist_ok=True)
 logger.info("Ensured models directory exists")
 # Constants for heartbeat and timeouts
 HEARTBEAT_INTERVAL = 2  # seconds
 WORKER_TIMEOUT_MS = 10000
 logger.debug(f"Heartbeat interval set to {HEARTBEAT_INTERVAL} seconds")
 # Locks for thread-safe operations
 streams_lock = threading.Lock()
 models_lock = threading.Lock()
 logger.debug("Initialized thread locks")
 # Add helper to download mpta ZIP file from a remote URL
 def download_mpta(url: str, dest_path: str) -> str:
    try:
        logger.info(f"Starting download of model from {url} to {dest_path}")
        os.makedirs(os.path.dirname(dest_path), exist_ok=True)
        response = requests.get(url, stream=True)
        if response.status_code == 200:
            file_size = int(response.headers.get('content-length', 0))
            logger.info(f"Model file size: {file_size/1024/1024:.2f} MB")
            downloaded = 0
            with open(dest_path, "wb") as f:
                for chunk in response.iter_content(chunk_size=8192):
                    f.write(chunk)
-                    downloaded += len(chunk)
+            logging.info(f"Downloaded mpta file from {url} to {dest_path}")
                    if file_size > 0 and downloaded % (file_size // 10) < 8192:  # Log approximately every 10%
                        logger.debug(f"Download progress: {downloaded/file_size*100:.1f}%")
            logger.info(f"Successfully downloaded mpta file from {url} to {dest_path}")
            return dest_path
        else:
-            logger.error(f"Failed to download mpta file (status code {response.status_code}): {response.text}")
+            logging.error(f"Failed to download mpta file (status code {response.status_code})")
            return None
    except Exception as e:
-        logger.error(f"Exception downloading mpta file from {url}: {str(e)}", exc_info=True)
+        logging.error(f"Exception downloading mpta file from {url}: {e}")
        return None
 # Add helper to fetch snapshot image from HTTP/HTTPS URL
 def fetch_snapshot(url: str):
    try:
        response = requests.get(url, timeout=10)
        if response.status_code == 200:
            # Convert response content to numpy array
            nparr = np.frombuffer(response.content, np.uint8)
            # Decode image
            frame = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
            if frame is not None:
                logger.debug(f"Successfully fetched snapshot from {url}, shape: {frame.shape}")
                return frame
            else:
                logger.error(f"Failed to decode image from snapshot URL: {url}")
                return None
        else:
            logger.error(f"Failed to fetch snapshot (status code {response.status_code}): {url}")
            return None
    except Exception as e:
        logger.error(f"Exception fetching snapshot from {url}: {str(e)}")
        return None
 # Helper to get crop coordinates from stream
 def get_crop_coords(stream):
    return {
        "cropX1": stream.get("cropX1"),
        "cropY1": stream.get("cropY1"),
        "cropX2": stream.get("cropX2"),
        "cropY2": stream.get("cropY2")
    }
 ####################################################
 # REST API endpoint for image retrieval
 ####################################################
@app.get("/camera/{camera_id}/image")
 async def get_camera_image(camera_id: str):
    """
    Get the current frame from a camera as JPEG image
    """
    try:
        with streams_lock:
            if camera_id not in streams:
                logger.warning(f"Camera ID '{camera_id}' not found in streams. Current streams: {list(streams.keys())}")
                raise HTTPException(status_code=404, detail=f"Camera {camera_id} not found or not active")
            stream = streams[camera_id]
            buffer = stream["buffer"]
            logger.debug(f"Camera '{camera_id}' buffer size: {buffer.qsize()}, buffer empty: {buffer.empty()}")
            logger.debug(f"Buffer queue contents: {getattr(buffer, 'queue', None)}")
            if buffer.empty():
                logger.warning(f"No frame available for camera '{camera_id}'. Buffer is empty.")
                raise HTTPException(status_code=404, detail=f"No frame available for camera {camera_id}")
            # Get the latest frame (non-blocking)
            try:
                frame = buffer.queue[-1]  # Get the most recent frame without removing it
            except IndexError:
                logger.warning(f"Buffer queue is empty for camera '{camera_id}' when trying to access last frame.")
                raise HTTPException(status_code=404, detail=f"No frame available for camera {camera_id}")
        # Encode frame as JPEG
        success, buffer_img = cv2.imencode('.jpg', frame, [cv2.IMWRITE_JPEG_QUALITY, 85])
        if not success:
            raise HTTPException(status_code=500, detail="Failed to encode image as JPEG")
        # Return image as binary response
        return Response(content=buffer_img.tobytes(), media_type="image/jpeg")
    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Error retrieving image for camera {camera_id}: {str(e)}", exc_info=True)
        raise HTTPException(status_code=500, detail=f"Internal server error: {str(e)}")
 ####################################################
 # Detection and frame processing functions
 ####################################################
@app.websocket("/")
 async def detect(websocket: WebSocket):
-    logger.info("WebSocket connection accepted")
+    logging.info("WebSocket connection accepted")
    persistent_data_dict = {}
    async def handle_detection(camera_id, stream, frame, websocket, model_tree, persistent_data):
        try:
-            # Apply crop if specified
+            detection_result = run_pipeline(frame, model_tree)
            cropped_frame = frame
            if all(coord is not None for coord in [stream.get("cropX1"), stream.get("cropY1"), stream.get("cropX2"), stream.get("cropY2")]):
                cropX1, cropY1, cropX2, cropY2 = stream["cropX1"], stream["cropY1"], stream["cropX2"], stream["cropY2"]
                cropped_frame = frame[cropY1:cropY2, cropX1:cropX2]
                logger.debug(f"Applied crop coordinates ({cropX1}, {cropY1}, {cropX2}, {cropY2}) to frame for camera {camera_id}")
            logger.debug(f"Processing frame for camera {camera_id} with model {stream['modelId']}")
            start_time = time.time()
            detection_result = run_pipeline(cropped_frame, model_tree)
            process_time = (time.time() - start_time) * 1000
            logger.debug(f"Detection for camera {camera_id} completed in {process_time:.2f}ms")
            # Log the raw detection result for debugging
            logger.debug(f"Raw detection result for camera {camera_id}:\n{json.dumps(detection_result, indent=2, default=str)}")
            # Direct class result (no detections/classifications structure)
            if detection_result and isinstance(detection_result, dict) and "class" in detection_result and "confidence" in detection_result:
                highest_confidence_detection = {
                    "class": detection_result.get("class", "none"),
                    "confidence": detection_result.get("confidence", 1.0),
                    "box": [0, 0, 0, 0]  # Empty bounding box for classifications
                }
            # Handle case when no detections found or result is empty
            elif not detection_result or not detection_result.get("detections"):
                # Check if we have classification results
                if detection_result and detection_result.get("classifications"):
                    # Get the highest confidence classification
                    classifications = detection_result.get("classifications", [])
                    highest_confidence_class = max(classifications, key=lambda x: x.get("confidence", 0)) if classifications else None
                    if highest_confidence_class:
                        highest_confidence_detection = {
                            "class": highest_confidence_class.get("class", "none"),
                            "confidence": highest_confidence_class.get("confidence", 1.0),
                            "box": [0, 0, 0, 0]  # Empty bounding box for classifications
                        }
                    else:
                        highest_confidence_detection = {
                            "class": "none",
                            "confidence": 1.0,
                            "box": [0, 0, 0, 0]
                        }
                else:
                    highest_confidence_detection = {
                        "class": "none",
                        "confidence": 1.0,
                        "box": [0, 0, 0, 0]
                    }
            else:
                # Find detection with highest confidence
                detections = detection_result.get("detections", [])
                highest_confidence_detection = max(detections, key=lambda x: x.get("confidence", 0)) if detections else {
                    "class": "none",
                    "confidence": 1.0,
                    "box": [0, 0, 0, 0]
                }
            # Convert detection format to match protocol - flatten detection attributes
            detection_dict = {}
            # Handle different detection result formats
            if isinstance(highest_confidence_detection, dict):
                # Copy all fields from the detection result
                for key, value in highest_confidence_detection.items():
                    if key not in ["box", "id"]:  # Skip internal fields
                        detection_dict[key] = value
            # Extract display identifier for session ID lookup
            subscription_parts = stream["subscriptionIdentifier"].split(';')
            display_identifier = subscription_parts[0] if subscription_parts else None
            session_id = session_ids.get(display_identifier) if display_identifier else None
            detection_data = {
                "type": "imageDetection",
-                "subscriptionIdentifier": stream["subscriptionIdentifier"],
+                "cameraIdentifier": camera_id,
-                "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S.%fZ", time.gmtime()),
+                "timestamp": time.time(),
                "data": {
-                    "detection": detection_dict,
+                    "detection": detection_result if detection_result else None,
                    "modelId": stream["modelId"],
                    "modelName": stream["modelName"]
                }
            }
-            
+            logging.debug(f"Sending detection data for camera {camera_id}: {detection_data}")
            # Add session ID if available
            if session_id is not None:
                detection_data["sessionId"] = session_id
            if highest_confidence_detection["class"] != "none":
                logger.info(f"Camera {camera_id}: Detected {highest_confidence_detection['class']} with confidence {highest_confidence_detection['confidence']:.2f} using model {stream['modelName']}")
                # Log session ID if available
                subscription_parts = stream["subscriptionIdentifier"].split(';')
                display_identifier = subscription_parts[0] if subscription_parts else None
                session_id = session_ids.get(display_identifier) if display_identifier else None
                if session_id:
                    logger.debug(f"Detection associated with session ID: {session_id}")
            await websocket.send_json(detection_data)
            logger.debug(f"Sent detection data to client for camera {camera_id}")
            return persistent_data
        except Exception as e:
-            logger.error(f"Error in handle_detection for camera {camera_id}: {str(e)}", exc_info=True)
+            logging.error(f"Error in handle_detection for camera {camera_id}: {e}")
            return persistent_data
    def frame_reader(camera_id, cap, buffer, stop_event):
        retries = 0
        logger.info(f"Starting frame reader thread for camera {camera_id}")
        frame_count = 0
        last_log_time = time.time()
        try:
            # Log initial camera status and properties
            if cap.isOpened():
                width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
                height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
                fps = cap.get(cv2.CAP_PROP_FPS)
                logger.info(f"Camera {camera_id} opened successfully with resolution {width}x{height}, FPS: {fps}")
            else:
                logger.error(f"Camera {camera_id} failed to open initially")
            while not stop_event.is_set():
                try:
                    if not cap.isOpened():
                        logger.error(f"Camera {camera_id} is not open before trying to read")
                        # Attempt to reopen
                        cap = cv2.VideoCapture(streams[camera_id]["rtsp_url"])
                        time.sleep(reconnect_interval)
                        continue
                    logger.debug(f"Attempting to read frame from camera {camera_id}")
                    ret, frame = cap.read()
                    if not ret:
-                        logger.warning(f"Connection lost for camera: {camera_id}, retry {retries+1}/{max_retries}")
+                        logging.warning(f"Connection lost for camera: {camera_id}, retry {retries+1}/{max_retries}")
                        cap.release()
                        time.sleep(reconnect_interval)
                        retries += 1
                        if retries > max_retries and max_retries != -1:
-                            logger.error(f"Max retries reached for camera: {camera_id}, stopping frame reader")
+                            logging.error(f"Max retries reached for camera: {camera_id}")
                            break
                        # Re-open
                        logger.info(f"Attempting to reopen RTSP stream for camera: {camera_id}")
                        cap = cv2.VideoCapture(streams[camera_id]["rtsp_url"])
                        if not cap.isOpened():
-                            logger.error(f"Failed to reopen RTSP stream for camera: {camera_id}")
+                            logging.error(f"Failed to reopen RTSP stream for camera: {camera_id}")
                            continue
                        logger.info(f"Successfully reopened RTSP stream for camera: {camera_id}")
                        continue
                    # Successfully read a frame
                    frame_count += 1
                    current_time = time.time()
                    # Log frame stats every 5 seconds
                    if current_time - last_log_time > 5:
                        logger.info(f"Camera {camera_id}: Read {frame_count} frames in the last {current_time - last_log_time:.1f} seconds")
                        frame_count = 0
                        last_log_time = current_time
                    logger.debug(f"Successfully read frame from camera {camera_id}, shape: {frame.shape}")
                    retries = 0
                    # Overwrite old frame if buffer is full
                    if not buffer.empty():
                        try:
                            buffer.get_nowait()
                            logger.debug(f"[frame_reader] Removed old frame from buffer for camera {camera_id}")
                        except queue.Empty:
                            pass
                    buffer.put(frame)
                    logger.debug(f"[frame_reader] Added new frame to buffer for camera {camera_id}. Buffer size: {buffer.qsize()}")
                    # Short sleep to avoid CPU overuse
                    time.sleep(0.01)
                except cv2.error as e:
-                    logger.error(f"OpenCV error for camera {camera_id}: {e}", exc_info=True)
+                    logging.error(f"OpenCV error for camera {camera_id}: {e}")
                    cap.release()
                    time.sleep(reconnect_interval)
                    retries += 1
                    if retries > max_retries and max_retries != -1:
-                        logger.error(f"Max retries reached after OpenCV error for camera {camera_id}")
+                        logging.error(f"Max retries reached after OpenCV error for camera {camera_id}")
                        break
                    logger.info(f"Attempting to reopen RTSP stream after OpenCV error for camera: {camera_id}")
                    cap = cv2.VideoCapture(streams[camera_id]["rtsp_url"])
                    if not cap.isOpened():
-                        logger.error(f"Failed to reopen RTSP stream for camera {camera_id} after OpenCV error")
+                        logging.error(f"Failed to reopen RTSP stream for camera {camera_id} after OpenCV error")
                        continue
                    logger.info(f"Successfully reopened RTSP stream after OpenCV error for camera: {camera_id}")
                except Exception as e:
-                    logger.error(f"Unexpected error for camera {camera_id}: {str(e)}", exc_info=True)
+                    logging.error(f"Unexpected error for camera {camera_id}: {e}")
                    cap.release()
                    break
        except Exception as e:
-            logger.error(f"Error in frame_reader thread for camera {camera_id}: {str(e)}", exc_info=True)
+            logging.error(f"Error in frame_reader thread for camera {camera_id}: {e}")
        finally:
            logger.info(f"Frame reader thread for camera {camera_id} is exiting")
            if cap and cap.isOpened():
                cap.release()
    def snapshot_reader(camera_id, snapshot_url, snapshot_interval, buffer, stop_event):
        """Frame reader that fetches snapshots from HTTP/HTTPS URL at specified intervals"""
        retries = 0
        logger.info(f"Starting snapshot reader thread for camera {camera_id} from {snapshot_url}")
        frame_count = 0
        last_log_time = time.time()
        try:
            interval_seconds = snapshot_interval / 1000.0  # Convert milliseconds to seconds
            logger.info(f"Snapshot interval for camera {camera_id}: {interval_seconds}s")
            while not stop_event.is_set():
                try:
                    start_time = time.time()
                    frame = fetch_snapshot(snapshot_url)
                    if frame is None:
                        logger.warning(f"Failed to fetch snapshot for camera: {camera_id}, retry {retries+1}/{max_retries}")
                        retries += 1
                        if retries > max_retries and max_retries != -1:
                            logger.error(f"Max retries reached for snapshot camera: {camera_id}, stopping reader")
                            break
                        time.sleep(min(interval_seconds, reconnect_interval))
                        continue
                    # Successfully fetched a frame
                    frame_count += 1
                    current_time = time.time()
                    # Log frame stats every 5 seconds
                    if current_time - last_log_time > 5:
                        logger.info(f"Camera {camera_id}: Fetched {frame_count} snapshots in the last {current_time - last_log_time:.1f} seconds")
                        frame_count = 0
                        last_log_time = current_time
                    logger.debug(f"Successfully fetched snapshot from camera {camera_id}, shape: {frame.shape}")
                    retries = 0
                    # Overwrite old frame if buffer is full
                    if not buffer.empty():
                        try:
                            buffer.get_nowait()
                            logger.debug(f"[snapshot_reader] Removed old snapshot from buffer for camera {camera_id}")
                        except queue.Empty:
                            pass
                    buffer.put(frame)
                    logger.debug(f"[snapshot_reader] Added new snapshot to buffer for camera {camera_id}. Buffer size: {buffer.qsize()}")
                    # Wait for the specified interval
                    elapsed = time.time() - start_time
                    sleep_time = max(interval_seconds - elapsed, 0)
                    if sleep_time > 0:
                        time.sleep(sleep_time)
                except Exception as e:
                    logger.error(f"Unexpected error fetching snapshot for camera {camera_id}: {str(e)}", exc_info=True)
                    retries += 1
                    if retries > max_retries and max_retries != -1:
                        logger.error(f"Max retries reached after error for snapshot camera {camera_id}")
                        break
                    time.sleep(min(interval_seconds, reconnect_interval))
        except Exception as e:
            logger.error(f"Error in snapshot_reader thread for camera {camera_id}: {str(e)}", exc_info=True)
        finally:
            logger.info(f"Snapshot reader thread for camera {camera_id} is exiting")
    async def process_streams():
-        logger.info("Started processing streams")
+        logging.info("Started processing streams")
        try:
            while True:
                start_time = time.time()
                with streams_lock:
                    current_streams = list(streams.items())
                    if current_streams:
                        logger.debug(f"Processing {len(current_streams)} active streams")
                    else:
                        logger.debug("No active streams to process")
                for camera_id, stream in current_streams:
                    buffer = stream["buffer"]
-                    if buffer.empty():
-                        logger.debug(f"Frame buffer is empty for camera {camera_id}")
-                        continue
-                    
-                    logger.debug(f"Got frame from buffer for camera {camera_id}")
-                    frame = buffer.get()
-                    
-                    with models_lock:
-                        model_tree = models.get(camera_id, {}).get(stream["modelId"])
-                        if not model_tree:
-                            logger.warning(f"Model not found for camera {camera_id}, modelId {stream['modelId']}")
-                            continue
-                        logger.debug(f"Found model tree for camera {camera_id}, modelId {stream['modelId']}")
-                    key = (camera_id, stream["modelId"])
-                    persistent_data = persistent_data_dict.get(key, {})
-                    logger.debug(f"Starting detection for camera {camera_id} with modelId {stream['modelId']}")
-                    updated_persistent_data = await handle_detection(
-                        camera_id, stream, frame, websocket, model_tree, persistent_data
-                    )
-                    persistent_data_dict[key] = updated_persistent_data
+                    if not buffer.empty():
+                        frame = buffer.get()
+                        with models_lock:
+                            model_tree = models.get(camera_id, {}).get(stream["modelId"])
+                        key = (camera_id, stream["modelId"])
+                        persistent_data = persistent_data_dict.get(key, {})
+                        updated_persistent_data = await handle_detection(
+                            camera_id, stream, frame, websocket, model_tree, persistent_data
+                        )
+                        persistent_data_dict[key] = updated_persistent_data
                elapsed_time = (time.time() - start_time) * 1000  # ms
                sleep_time = max(poll_interval - elapsed_time, 0)
-                logger.debug(f"Frame processing cycle: {elapsed_time:.2f}ms, sleeping for: {sleep_time:.2f}ms")
+                logging.debug(f"Elapsed time: {elapsed_time}ms, sleeping for: {sleep_time}ms")
                await asyncio.sleep(sleep_time / 1000.0)
        except asyncio.CancelledError:
-            logger.info("Stream processing task cancelled")
+            logging.info("Stream processing task cancelled")
        except Exception as e:
-            logger.error(f"Error in process_streams: {str(e)}", exc_info=True)
+            logging.error(f"Error in process_streams: {e}")
    async def send_heartbeat():
        while True:
@ -506,19 +187,18 @@ async def detect(websocket: WebSocket):
                cpu_usage = psutil.cpu_percent()
                memory_usage = psutil.virtual_memory().percent
                if torch.cuda.is_available():
-                    gpu_usage = torch.cuda.utilization() if hasattr(torch.cuda, 'utilization') else None
+                    gpu_usage = torch.cuda.memory_allocated() / (1024 ** 2)  # MB
-                    gpu_memory_usage = torch.cuda.memory_reserved() / (1024 ** 2)
+                    gpu_memory_usage = torch.cuda.memory_reserved() / (1024 ** 2)  # MB
                else:
                    gpu_usage = None
                    gpu_memory_usage = None
                camera_connections = [
                    {
-                        "subscriptionIdentifier": stream["subscriptionIdentifier"],
+                        "cameraIdentifier": camera_id,
                        "modelId": stream["modelId"],
                        "modelName": stream["modelName"],
-                        "online": True,
-                        **{k: v for k, v in get_crop_coords(stream).items() if v is not None}
+                        "online": True
                    }
                    for camera_id, stream in streams.items()
                ]
@ -532,225 +212,104 @@ async def detect(websocket: WebSocket):
                    "cameraConnections": camera_connections
                }
                await websocket.send_text(json.dumps(state_report))
-                logger.debug(f"Sent stateReport as heartbeat: CPU {cpu_usage:.1f}%, Memory {memory_usage:.1f}%, {len(camera_connections)} active cameras")
+                logging.debug("Sent stateReport as heartbeat")
                await asyncio.sleep(HEARTBEAT_INTERVAL)
            except Exception as e:
-                logger.error(f"Error sending stateReport heartbeat: {e}")
+                logging.error(f"Error sending stateReport heartbeat: {e}")
                break
    async def on_message():
        while True:
            try:
                msg = await websocket.receive_text()
-                logger.debug(f"Received message: {msg}")
+                logging.debug(f"Received message: {msg}")
                data = json.loads(msg)
                msg_type = data.get("type")
                if msg_type == "subscribe":
                    payload = data.get("payload", {})
-                    subscriptionIdentifier = payload.get("subscriptionIdentifier")
+                    camera_id = payload.get("cameraIdentifier")
                    rtsp_url = payload.get("rtspUrl")
-                    snapshot_url = payload.get("snapshotUrl")
+                    model_url = payload.get("modelUrl")  # may be remote or local
                    snapshot_interval = payload.get("snapshotInterval")
                    model_url = payload.get("modelUrl")
                    modelId = payload.get("modelId")
                    modelName = payload.get("modelName")
                    cropX1 = payload.get("cropX1")
                    cropY1 = payload.get("cropY1")
                    cropX2 = payload.get("cropX2")
                    cropY2 = payload.get("cropY2")
                    # Extract camera_id from subscriptionIdentifier (format: displayIdentifier;cameraIdentifier)
                    parts = subscriptionIdentifier.split(';')
                    if len(parts) != 2:
                        logger.error(f"Invalid subscriptionIdentifier format: {subscriptionIdentifier}")
                        continue
                    display_identifier, camera_identifier = parts
                    camera_id = subscriptionIdentifier  # Use full subscriptionIdentifier as camera_id for mapping
                    if model_url:
                        with models_lock:
-                            if (camera_id not in models) or (modelId not in models[camera_id]):
+                            if camera_id not in models:
-                                logger.info(f"Loading model from {model_url} for camera {camera_id}, modelId {modelId}")
+                                models[camera_id] = {}
-                                extraction_dir = os.path.join("models", camera_identifier, str(modelId))
+                            if modelId not in models[camera_id]:
                                logging.info(f"Loading model from {model_url}")
                                extraction_dir = os.path.join("models", camera_id, str(modelId))
                                os.makedirs(extraction_dir, exist_ok=True)
                                # If model_url is remote, download it first.
                                parsed = urlparse(model_url)
                                if parsed.scheme in ("http", "https"):
-                                    logger.info(f"Downloading remote .mpta file from {model_url}")
+                                    local_mpta = os.path.join(extraction_dir, os.path.basename(parsed.path))
                                    filename = os.path.basename(parsed.path) or f"model_{modelId}.mpta"
                                    local_mpta = os.path.join(extraction_dir, filename)
                                    logger.debug(f"Download destination: {local_mpta}")
                                    local_path = download_mpta(model_url, local_mpta)
                                    if not local_path:
-                                        logger.error(f"Failed to download the remote .mpta file from {model_url}")
+                                        logging.error("Failed to download the remote mpta file.")
                                        error_response = {
                                            "type": "error",
                                            "subscriptionIdentifier": subscriptionIdentifier,
                                            "error": f"Failed to download model from {model_url}"
                                        }
                                        await websocket.send_json(error_response)
                                        continue
                                    model_tree = load_pipeline_from_zip(local_path, extraction_dir)
                                else:
                                    logger.info(f"Loading local .mpta file from {model_url}")
                                    # Check if file exists before attempting to load
                                    if not os.path.exists(model_url):
                                        logger.error(f"Local .mpta file not found: {model_url}")
                                        logger.debug(f"Current working directory: {os.getcwd()}")
                                        error_response = {
                                            "type": "error",
                                            "subscriptionIdentifier": subscriptionIdentifier,
                                            "error": f"Model file not found: {model_url}"
                                        }
                                        await websocket.send_json(error_response)
                                        continue
                                    model_tree = load_pipeline_from_zip(model_url, extraction_dir)
                                if model_tree is None:
-                                    logger.error(f"Failed to load model {modelId} from .mpta file for camera {camera_id}")
+                                    logging.error("Failed to load model from mpta file.")
                                    error_response = {
                                        "type": "error",
                                        "subscriptionIdentifier": subscriptionIdentifier,
                                        "error": f"Failed to load model {modelId}"
                                    }
                                    await websocket.send_json(error_response)
                                    continue
                                if camera_id not in models:
                                    models[camera_id] = {}
                                models[camera_id][modelId] = model_tree
-                                logger.info(f"Successfully loaded model {modelId} for camera {camera_id}")
+                                logging.info(f"Loaded model {modelId} for camera {camera_id}")
-                                logger.debug(f"Model extraction directory: {extraction_dir}")
+
-                    if camera_id and (rtsp_url or snapshot_url):
+                    if camera_id and rtsp_url:
                        with streams_lock:
                            # Determine camera URL for shared stream management
                            camera_url = snapshot_url if snapshot_url else rtsp_url
                            if camera_id not in streams and len(streams) < max_streams:
-                                # Check if we already have a stream for this camera URL
+                                cap = cv2.VideoCapture(rtsp_url)
-                                shared_stream = camera_streams.get(camera_url)
+                                if not cap.isOpened():
-                                
+                                    logging.error(f"Failed to open RTSP stream for camera {camera_id}")
-                                if shared_stream:
+                                    continue
-                                    # Reuse existing stream
+                                buffer = queue.Queue(maxsize=1)
-                                    logger.info(f"Reusing existing stream for camera URL: {camera_url}")
+                                stop_event = threading.Event()
-                                    buffer = shared_stream["buffer"]
+                                thread = threading.Thread(target=frame_reader, args=(camera_id, cap, buffer, stop_event))
-                                    stop_event = shared_stream["stop_event"]
+                                thread.daemon = True
-                                    thread = shared_stream["thread"]
+                                thread.start()
-                                    mode = shared_stream["mode"]
+                                streams[camera_id] = {
-                                    
+                                    "cap": cap,
                                    # Increment reference count
                                    shared_stream["ref_count"] = shared_stream.get("ref_count", 0) + 1
                                else:
                                    # Create new stream
                                    buffer = queue.Queue(maxsize=1)
                                    stop_event = threading.Event()
                                    if snapshot_url and snapshot_interval:
                                        logger.info(f"Creating new snapshot stream for camera {camera_id}: {snapshot_url}")
                                        thread = threading.Thread(target=snapshot_reader, args=(camera_identifier, snapshot_url, snapshot_interval, buffer, stop_event))
                                        thread.daemon = True
                                        thread.start()
                                        mode = "snapshot"
                                        # Store shared stream info
                                        shared_stream = {
                                            "buffer": buffer,
                                            "thread": thread,
                                            "stop_event": stop_event,
                                            "mode": mode,
                                            "url": snapshot_url,
                                            "snapshot_interval": snapshot_interval,
                                            "ref_count": 1
                                        }
                                        camera_streams[camera_url] = shared_stream
                                    elif rtsp_url:
                                        logger.info(f"Creating new RTSP stream for camera {camera_id}: {rtsp_url}")
                                        cap = cv2.VideoCapture(rtsp_url)
                                        if not cap.isOpened():
                                            logger.error(f"Failed to open RTSP stream for camera {camera_id}")
                                            continue
                                        thread = threading.Thread(target=frame_reader, args=(camera_identifier, cap, buffer, stop_event))
                                        thread.daemon = True
                                        thread.start()
                                        mode = "rtsp"
                                        # Store shared stream info
                                        shared_stream = {
                                            "buffer": buffer,
                                            "thread": thread,
                                            "stop_event": stop_event,
                                            "mode": mode,
                                            "url": rtsp_url,
                                            "cap": cap,
                                            "ref_count": 1
                                        }
                                        camera_streams[camera_url] = shared_stream
                                    else:
                                        logger.error(f"No valid URL provided for camera {camera_id}")
                                        continue
                                # Create stream info for this subscription
                                stream_info = {
                                    "buffer": buffer,
                                    "thread": thread,
                                    "rtsp_url": rtsp_url,
                                    "stop_event": stop_event,
                                    "modelId": modelId,
-                                    "modelName": modelName,
+                                    "modelName": modelName
                                    "subscriptionIdentifier": subscriptionIdentifier,
                                    "cropX1": cropX1,
                                    "cropY1": cropY1,
                                    "cropX2": cropX2,
                                    "cropY2": cropY2,
                                    "mode": mode,
                                    "camera_url": camera_url
                                }
-                                
+                                logging.info(f"Subscribed to camera {camera_id} with modelId {modelId}, modelName {modelName}, URL {rtsp_url}")
                                if mode == "snapshot":
                                    stream_info["snapshot_url"] = snapshot_url
                                    stream_info["snapshot_interval"] = snapshot_interval
                                elif mode == "rtsp":
                                    stream_info["rtsp_url"] = rtsp_url
                                    stream_info["cap"] = shared_stream["cap"]
                                streams[camera_id] = stream_info
                                subscription_to_camera[camera_id] = camera_url
                            elif camera_id and camera_id in streams:
-                                # If already subscribed, unsubscribe first
+                                # If already subscribed, unsubscribe
-                                logger.info(f"Resubscribing to camera {camera_id}")
+                                stream = streams.pop(camera_id)
-                                # Note: Keep models in memory for reuse across subscriptions
+                                stream["cap"].release()
                                logging.info(f"Unsubscribed from camera {camera_id}")
                                with models_lock:
                                    if camera_id in models and modelId in models[camera_id]:
                                        del models[camera_id][modelId]
                                        if not models[camera_id]:
                                            del models[camera_id]
                elif msg_type == "unsubscribe":
                    payload = data.get("payload", {})
-                    subscriptionIdentifier = payload.get("subscriptionIdentifier")
-                    camera_id = subscriptionIdentifier
+                    camera_id = payload.get("cameraIdentifier")
+                    logging.debug(f"Unsubscribing from camera {camera_id}")
                    with streams_lock:
                        if camera_id and camera_id in streams:
                            stream = streams.pop(camera_id)
-                            camera_url = subscription_to_camera.pop(camera_id, None)
-                            
-                            if camera_url and camera_url in camera_streams:
-                                shared_stream = camera_streams[camera_url]
-                                shared_stream["ref_count"] -= 1
-                                
-                                # If no more references, stop the shared stream
-                                if shared_stream["ref_count"] <= 0:
-                                    logger.info(f"Stopping shared stream for camera URL: {camera_url}")
-                                    shared_stream["stop_event"].set()
-                                    shared_stream["thread"].join()
-                                    if "cap" in shared_stream:
-                                        shared_stream["cap"].release()
-                                    del camera_streams[camera_url]
-                                else:
-                                    logger.info(f"Shared stream for {camera_url} still has {shared_stream['ref_count']} references")
-                            logger.info(f"Unsubscribed from camera {camera_id}")
-                            # Note: Keep models in memory for potential reuse
+                            stream["stop_event"].set()
+                            stream["thread"].join()
+                            stream["cap"].release()
+                            logging.info(f"Unsubscribed from camera {camera_id}")
+                            with models_lock:
+                                if camera_id in models:
+                                    del models[camera_id]
                elif msg_type == "requestState":
                    cpu_usage = psutil.cpu_percent()
                    memory_usage = psutil.virtual_memory().percent
                    if torch.cuda.is_available():
-                        gpu_usage = torch.cuda.utilization() if hasattr(torch.cuda, 'utilization') else None
+                        gpu_usage = torch.cuda.memory_allocated() / (1024 ** 2)
                        gpu_memory_usage = torch.cuda.memory_reserved() / (1024 ** 2)
                    else:
                        gpu_usage = None
@ -758,11 +317,10 @@ async def detect(websocket: WebSocket):
                    camera_connections = [
                        {
-                            "subscriptionIdentifier": stream["subscriptionIdentifier"],
+                            "cameraIdentifier": camera_id,
                            "modelId": stream["modelId"],
                            "modelName": stream["modelName"],
-                            "online": True,
-                            **{k: v for k, v in get_crop_coords(stream).items() if v is not None}
+                            "online": True
                        }
                        for camera_id, stream in streams.items()
                    ]
@ -776,47 +334,17 @@ async def detect(websocket: WebSocket):
                        "cameraConnections": camera_connections
                    }
                    await websocket.send_text(json.dumps(state_report))
-                elif msg_type == "setSessionId":
-                    payload = data.get("payload", {})
-                    display_identifier = payload.get("displayIdentifier")
-                    session_id = payload.get("sessionId")
-                    if display_identifier:
-                        # Store session ID for this display
-                        if session_id is None:
-                            session_ids.pop(display_identifier, None)
-                            logger.info(f"Cleared session ID for display {display_identifier}")
-                        else:
-                            session_ids[display_identifier] = session_id
-                            logger.info(f"Set session ID {session_id} for display {display_identifier}")
-                elif msg_type == "patchSession":
-                    session_id = data.get("sessionId")
-                    patch_data = data.get("data", {})
-                    # For now, just acknowledge the patch - actual implementation depends on backend requirements
-                    response = {
-                        "type": "patchSessionResult",
-                        "payload": {
-                            "sessionId": session_id,
-                            "success": True,
-                            "message": "Session patch acknowledged"
-                        }
-                    }
-                    await websocket.send_json(response)
-                    logger.info(f"Acknowledged patch for session {session_id}")
                else:
-                    logger.error(f"Unknown message type: {msg_type}")
+                    logging.error(f"Unknown message type: {msg_type}")
            except json.JSONDecodeError:
-                logger.error("Received invalid JSON message")
+                logging.error("Received invalid JSON message")
            except (WebSocketDisconnect, ConnectionClosedError) as e:
-                logger.warning(f"WebSocket disconnected: {e}")
+                logging.warning(f"WebSocket disconnected: {e}")
                break
            except Exception as e:
-                logger.error(f"Error handling message: {e}")
+                logging.error(f"Error handling message: {e}")
                break
    try:
        await websocket.accept()
        stream_task = asyncio.create_task(process_streams())
@ -824,28 +352,22 @@ async def detect(websocket: WebSocket):
        message_task = asyncio.create_task(on_message())
        await asyncio.gather(heartbeat_task, message_task)
    except Exception as e:
-        logger.error(f"Error in detect websocket: {e}")
+        logging.error(f"Error in detect websocket: {e}")
    finally:
        stream_task.cancel()
        await stream_task
        with streams_lock:
-            # Clean up shared camera streams
-            for camera_url, shared_stream in camera_streams.items():
-                shared_stream["stop_event"].set()
-                shared_stream["thread"].join()
-                if "cap" in shared_stream:
-                    shared_stream["cap"].release()
-                while not shared_stream["buffer"].empty():
+            for camera_id, stream in streams.items():
+                stream["stop_event"].set()
+                stream["thread"].join()
+                stream["cap"].release()
+                while not stream["buffer"].empty():
                    try:
-                        shared_stream["buffer"].get_nowait()
+                        stream["buffer"].get_nowait()
                    except queue.Empty:
                        pass
-                logger.info(f"Released shared camera stream for {camera_url}")
+                logging.info(f"Released camera {camera_id} and cleaned up resources")
            streams.clear()
-            camera_streams.clear()
-            subscription_to_camera.clear()
        with models_lock:
            models.clear()
-        session_ids.clear()
-        logger.info("WebSocket connection closed")
+        logging.info("WebSocket connection closed")
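Note on the frame buffer used in this change: each camera gets a `queue.Queue(maxsize=1)`, and the reader thread discards any stale frame before putting a new one, so the consumer only ever sees the most recent frame. A minimal standalone sketch of that pattern follows. The names `put_latest` and `producer` are hypothetical and not part of this commit, integers stand in for frames, and it assumes a single producer per buffer (as with one reader thread per camera).

```python
import queue
import threading


def put_latest(buffer: queue.Queue, item) -> None:
    """Keep only the newest item in a size-1 buffer (single producer assumed)."""
    try:
        buffer.get_nowait()  # discard the stale item, if there is one
    except queue.Empty:
        pass  # buffer was already empty (or the consumer just took the item)
    # With one producer the slot is now free, so this put() does not block.
    buffer.put(item)


def producer(buffer: queue.Queue, stop_event: threading.Event, n_frames: int) -> None:
    """Stand-in for a frame reader: pushes integers instead of video frames."""
    for frame_number in range(n_frames):
        if stop_event.is_set():
            break
        put_latest(buffer, frame_number)


buffer = queue.Queue(maxsize=1)
stop_event = threading.Event()
thread = threading.Thread(target=producer, args=(buffer, stop_event, 100), daemon=True)
thread.start()
thread.join()

# No consumer ran, so only the last "frame" is left in the buffer.
latest = buffer.get_nowait()
print(latest)
```

The trade-off is that intermediate frames are dropped on purpose: detection latency stays bounded, but the consumer does not see every frame.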
--- a/app_single.py
+++ b/app_single.py
@ -0,0 +1,366 @@
 from typing import List
 from fastapi import FastAPI, WebSocket
 from fastapi.websockets import WebSocketDisconnect
 from websockets.exceptions import ConnectionClosedError
 from ultralytics import YOLO
 import torch
 import cv2
 import base64
 import numpy as np
 import json
 import logging
 import threading
 import queue
 import os
 import requests
 from urllib.parse import urlparse
 import asyncio
 import psutil
 app = FastAPI()
 models = {}
 with open("config.json", "r") as f:
    config = json.load(f)
 poll_interval = config.get("poll_interval_ms", 100)
 reconnect_interval = config.get("reconnect_interval_sec", 5) 
 TARGET_FPS = config.get("target_fps", 10)
 poll_interval = 1000 / TARGET_FPS
 max_streams = config.get("max_streams", 5)
 max_retries = config.get("max_retries", 3)
 # Configure logging before the first log call: a logging.info() issued earlier
 # installs a default root handler and turns this basicConfig() into a no-op
 logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s [%(levelname)s] %(message)s",
    handlers=[
        logging.FileHandler("app.log"),
        logging.StreamHandler()
    ]
 )
 logging.info(f"Poll interval: {poll_interval}ms")
 # Ensure the models directory exists
 os.makedirs("models", exist_ok=True)
 # Add constants for heartbeat
 HEARTBEAT_INTERVAL = 2  # seconds
 WORKER_TIMEOUT_MS = 10000
 # Add a lock for thread-safe operations on shared resources
 streams_lock = threading.Lock()
 models_lock = threading.Lock()
@app.websocket("/")
 async def detect(websocket: WebSocket):
    import asyncio
    import time
    logging.info("WebSocket connection accepted")
    streams = {}
    # This function is user-modifiable
    # Save data you want to persist across frames in the persistent_data dictionary
    async def handle_detection(camera_id, stream, frame, websocket, model: YOLO, persistent_data):
        try:
            highest_conf_box = None
            max_conf = -1
            for r in model.track(frame, stream=False, persist=True):
                for box in r.boxes:
                    box_cpu = box.cpu()
                    conf = float(box_cpu.conf[0])
                    if conf > max_conf and hasattr(box, "id") and box.id is not None:
                        max_conf = conf
                        highest_conf_box = {
                            "class": model.names[int(box_cpu.cls[0])],
                            "confidence": conf,
                            "id": box.id.item(),
                        }
            # Broadcast to all subscribers of this URL
            detection_data = {
                "type": "imageDetection",
                "cameraIdentifier": camera_id,
                "timestamp": time.time(),
                "data": {
                    "detections": highest_conf_box if highest_conf_box else None,
                    "modelId": stream['modelId'],
                    "modelName": stream['modelName']
                }
            }
            logging.debug(f"Sending detection data for camera {camera_id}: {detection_data}")
            await websocket.send_json(detection_data)
            return persistent_data
        except Exception as e:
            logging.error(f"Error in handle_detection for camera {camera_id}: {e}")
            return persistent_data
    def frame_reader(camera_id, cap, buffer, stop_event):
        import time
        retries = 0
        try:
            while not stop_event.is_set():
                try:
                    ret, frame = cap.read()
                    if not ret:
                        logging.warning(f"Connection lost for camera: {camera_id}, retry {retries+1}/{max_retries}")
                        cap.release()
                        time.sleep(reconnect_interval)
                        retries += 1
                        if retries > max_retries and max_retries != -1:
                            logging.error(f"Max retries reached for camera: {camera_id}")
                            break
                        # Re-open the VideoCapture
                        cap = cv2.VideoCapture(streams[camera_id]['rtsp_url'])
                        if not cap.isOpened():
                            logging.error(f"Failed to reopen RTSP stream for camera: {camera_id}")
                            continue
                        continue
                    retries = 0  # Reset on success
                    if not buffer.empty():
                        try:
                            buffer.get_nowait()  # Discard the old frame
                        except queue.Empty:
                            pass
                    buffer.put(frame)
                except cv2.error as e:
                    logging.error(f"OpenCV error for camera {camera_id}: {e}")
                    cap.release()
                    time.sleep(reconnect_interval)
                    retries += 1
                    if retries > max_retries and max_retries != -1:
                        logging.error(f"Max retries reached after OpenCV error for camera: {camera_id}")
                        break
                    # Re-open the VideoCapture
                    cap = cv2.VideoCapture(streams[camera_id]['rtsp_url'])
                    if not cap.isOpened():
                        logging.error(f"Failed to reopen RTSP stream for camera {camera_id} after OpenCV error")
                        continue
                except Exception as e:
                    logging.error(f"Unexpected error for camera {camera_id}: {e}")
                    cap.release()
                    break
        except Exception as e:
            logging.error(f"Error in frame_reader thread for camera {camera_id}: {e}")
    async def process_streams():
        global models
        logging.info("Started processing streams")
        persistent_data_dict = {} 
        try:
            while True:
                start_time = time.time()
                # Round-robin processing
                with streams_lock:
                    current_streams = list(streams.items())
                for camera_id, stream in current_streams:
                    buffer = stream['buffer']
                    if not buffer.empty():
                        frame = buffer.get()
                        with models_lock:
                            model = models.get(camera_id, {}).get(stream['modelId'])
                        key = (camera_id, stream['modelId'])
                        persistent_data = persistent_data_dict.get(key, {})
                        updated_persistent_data = await handle_detection(camera_id, stream, frame, websocket, model, persistent_data)
                        persistent_data_dict[key] = updated_persistent_data
                elapsed_time = (time.time() - start_time) * 1000  # in ms
                sleep_time = max(poll_interval - elapsed_time, 0)
                logging.debug(f"Elapsed time: {elapsed_time}ms, sleeping for: {sleep_time}ms")
                await asyncio.sleep(sleep_time / 1000.0)
        except asyncio.CancelledError:
            logging.info("Stream processing task cancelled")
        except Exception as e:
            logging.error(f"Error in process_streams: {e}")
    async def send_heartbeat():
        while True:
            try:
                cpu_usage = psutil.cpu_percent()
                memory_usage = psutil.virtual_memory().percent
                if torch.cuda.is_available():
                    gpu_usage = torch.cuda.memory_allocated() / (1024 ** 2)  # Convert to MB
                    gpu_memory_usage = torch.cuda.memory_reserved() / (1024 ** 2)  # Convert to MB
                else:
                    gpu_usage = None
                    gpu_memory_usage = None
                camera_connections = [
                    {
                        "cameraIdentifier": camera_id,
                        "modelId": stream['modelId'],
                        "modelName": stream['modelName'],
                        "online": True
                    }
                    for camera_id, stream in streams.items()
                ]
                state_report = {
                    "type": "stateReport",
                    "cpuUsage": cpu_usage,
                    "memoryUsage": memory_usage,
                    "gpuUsage": gpu_usage,
                    "gpuMemoryUsage": gpu_memory_usage,
                    "cameraConnections": camera_connections
                }
                await websocket.send_text(json.dumps(state_report))
                logging.debug("Sent stateReport as heartbeat")
                await asyncio.sleep(HEARTBEAT_INTERVAL)
            except Exception as e:
                logging.error(f"Error sending stateReport heartbeat: {e}")
                break
    async def on_message():
        global models
        while True:
            try:
                msg = await websocket.receive_text()
                logging.debug(f"Received message: {msg}")
                print(f"Received message: {msg}")
                data = json.loads(msg)
                msg_type = data.get("type")
                if msg_type == "subscribe":
                    payload = data.get("payload", {})
                    camera_id = payload.get("cameraIdentifier")
                    rtsp_url = payload.get("rtspUrl")
                    model_url = payload.get("modelUrl")
                    modelId = payload.get("modelId")
                    modelName = payload.get("modelName")
                    if model_url:
                        with models_lock:
                            if camera_id not in models:
                                models[camera_id] = {}
                            if modelId not in models[camera_id]:
                                print(f"Downloading model from {model_url}")
                                parsed_url = urlparse(model_url)
                                filename = os.path.basename(parsed_url.path)    
                                model_filename = os.path.join("models", filename)
                                # Download the model
                                response = requests.get(model_url, stream=True)
                                if response.status_code == 200:
                                    with open(model_filename, 'wb') as f:
                                        for chunk in response.iter_content(chunk_size=8192):
                                            f.write(chunk)
                                    logging.info(f"Downloaded model from {model_url} to {model_filename}")
                                    model = YOLO(model_filename)
                                    if torch.cuda.is_available():
                                        model.to('cuda')
                                    models[camera_id][modelId] = model
                                    logging.info(f"Loaded model {modelId} for camera {camera_id}")
                                else:
                                    logging.error(f"Failed to download model from {model_url}")
                                    continue
                    if camera_id and rtsp_url:
                        with streams_lock:
                            if camera_id not in streams and len(streams) < max_streams:
                                cap = cv2.VideoCapture(rtsp_url)
                                if not cap.isOpened():
                                    logging.error(f"Failed to open RTSP stream for camera {camera_id}")
                                    continue
                                buffer = queue.Queue(maxsize=1)
                                stop_event = threading.Event()
                                thread = threading.Thread(target=frame_reader, args=(camera_id, cap, buffer, stop_event))
                                thread.daemon = True
                                thread.start()
                                streams[camera_id] = {
                                    'cap': cap,
                                    'buffer': buffer,
                                    'thread': thread,
                                    'rtsp_url': rtsp_url,
                                    'stop_event': stop_event,
                                    'modelId': modelId,
                                    'modelName': modelName
                                }
                                logging.info(f"Subscribed to camera {camera_id} with modelId {modelId}, modelName {modelName} and URL {rtsp_url}")
                            elif camera_id and camera_id in streams:
                                stream = streams.pop(camera_id)
                                # Stop the reader thread before releasing the capture it is reading from
                                stream['stop_event'].set()
                                stream['thread'].join()
                                stream['cap'].release()
                                logging.info(f"Unsubscribed from camera {camera_id}")
                                if camera_id in models and modelId in models[camera_id]:
                                    del models[camera_id][modelId]
                                    if not models[camera_id]:
                                        del models[camera_id]
                elif msg_type == "unsubscribe":
                    payload = data.get("payload", {})
                    camera_id = payload.get("cameraIdentifier")
                    logging.debug(f"Unsubscribing from camera {camera_id}")
                    with streams_lock:
                        if camera_id and camera_id in streams:
                            stream = streams.pop(camera_id)
                            stream['stop_event'].set()
                            stream['thread'].join()
                            stream['cap'].release()
                            logging.info(f"Unsubscribed from camera {camera_id}")
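                            # Use the model ID stored with the stream; `modelId` is only assigned in the subscribe branch
                            model_id = stream['modelId']
                            if camera_id in models and model_id in models[camera_id]:
                                del models[camera_id][model_id]
                                if not models[camera_id]:
                                    del models[camera_id]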
                elif msg_type == "requestState":
                    # Handle state request
                    cpu_usage = psutil.cpu_percent()
                    memory_usage = psutil.virtual_memory().percent
                    if torch.cuda.is_available():
                        gpu_usage = torch.cuda.memory_allocated() / (1024 ** 2)  # Convert to MB
                        gpu_memory_usage = torch.cuda.memory_reserved() / (1024 ** 2)  # Convert to MB
                    else:
                        gpu_usage = None
                        gpu_memory_usage = None
                    camera_connections = [
                        {
                            "cameraIdentifier": camera_id,
                            "modelId": stream['modelId'],
                            "modelName": stream['modelName'],
                            "online": True
                        }
                        for camera_id, stream in streams.items()
                    ]
                    state_report = {
                        "type": "stateReport",
                        "cpuUsage": cpu_usage,
                        "memoryUsage": memory_usage,
                        "gpuUsage": gpu_usage,
                        "gpuMemoryUsage": gpu_memory_usage,
                        "cameraConnections": camera_connections
                    }
                    await websocket.send_text(json.dumps(state_report))
                else:
                    logging.error(f"Unknown message type: {msg_type}")
            except json.JSONDecodeError:
                logging.error("Received invalid JSON message")
            except (WebSocketDisconnect, ConnectionClosedError) as e:
                logging.warning(f"WebSocket disconnected: {e}")
                break 
            except Exception as e:
                logging.error(f"Error handling message: {e}")
                break
    try:
        await websocket.accept()
        task = asyncio.create_task(process_streams())
        heartbeat_task = asyncio.create_task(send_heartbeat())
        message_task = asyncio.create_task(on_message())
        await asyncio.gather(heartbeat_task, message_task)
    except Exception as e:
        logging.error(f"Error in detect websocket: {e}")
    finally:
        task.cancel()
        await task
        with streams_lock:
            for camera_id, stream in streams.items():
                stream['stop_event'].set()
                stream['thread'].join()
                stream['cap'].release()
                stream['buffer'].queue.clear()
                logging.info(f"Released camera {camera_id} and cleaned up resources")
            streams.clear()
        with models_lock:
            models.clear()
        logging.info("WebSocket connection closed")
--- a/debug.py
+++ b/debug.py
@ -0,0 +1,143 @@
 import argparse
 import os
 import cv2
 import time
 import logging
 import shutil
 import threading  # added threading
 import yaml  # for silencing YOLO
 from siwatsystem.pympta import load_pipeline_from_zip, run_pipeline
 # Configure logging
 logging.basicConfig(level=logging.INFO, format="%(asctime)s [%(levelname)s] %(message)s")
 # Silence YOLO logging
 os.environ["YOLO_VERBOSE"] = "False"
 for logger_name in ["ultralytics", "ultralytics.hub", "ultralytics.yolo.utils"]:
    logging.getLogger(logger_name).setLevel(logging.WARNING)
 # Global variables for frame sharing
 global_frame = None
 global_ret = False
 capture_running = False
 def video_capture_loop(cap):
    global global_frame, global_ret, capture_running
    while capture_running:
        global_ret, global_frame = cap.read()
        time.sleep(0.01)  # slight delay to reduce CPU usage
 def clear_cache(cache_dir: str):
    if os.path.exists(cache_dir):
        shutil.rmtree(cache_dir)
 def log_pipeline_flow(frame, model_tree, level=0):
    """
    Wrapper around run_pipeline that logs the model flow and detection results.
    Returns the same output as the original run_pipeline function.
    """
    indent = "  " * level
    model_id = model_tree.get("modelId", "unknown")
    logging.info(f"{indent}→ Running model: {model_id}")
    detection, bbox = run_pipeline(frame, model_tree, return_bbox=True)
    if detection:
        confidence = detection.get("confidence", 0) * 100
        class_name = detection.get("class", "unknown")
        object_id = detection.get("id", "N/A")
        logging.info(f"{indent}✓ Detected: {class_name} (ID: {object_id}, confidence: {confidence:.1f}%)")
        # Check if any branches were triggered
        triggered = False
        for branch in model_tree.get("branches", []):
            trigger_classes = branch.get("triggerClasses", [])
            min_conf = branch.get("minConfidence", 0)
            if class_name in trigger_classes and detection.get("confidence", 0) >= min_conf:
                triggered = True
                if branch.get("crop", False) and bbox:
                    x1, y1, x2, y2 = bbox
                    cropped_frame = frame[y1:y2, x1:x2]
                    logging.info(f"{indent}  ⌊ Triggering branch with cropped region {x1},{y1} to {x2},{y2}")
                    branch_result = log_pipeline_flow(cropped_frame, branch, level + 1)
                else:
                    logging.info(f"{indent}  ⌊ Triggering branch with full frame")
                    branch_result = log_pipeline_flow(frame, branch, level + 1)
                if branch_result[0]:  # If branch detection successful, return it
                    return branch_result
        if not triggered and model_tree.get("branches"):
            logging.info(f"{indent}  ⌊ No branches triggered")
    else:
        logging.info(f"{indent}✗ No detection for {model_id}")
    return detection, bbox
 def main(mpta_file: str, video_source: str):
    global capture_running
    CACHE_DIR = os.path.join(".", ".mptacache")
    clear_cache(CACHE_DIR)
    logging.info(f"Loading pipeline from local file: {mpta_file}")
    model_tree = load_pipeline_from_zip(mpta_file, CACHE_DIR)
    if model_tree is None:
        logging.error("Failed to load pipeline.")
        return
    cap = cv2.VideoCapture(video_source)
    if not cap.isOpened():
        logging.error(f"Cannot open video source {video_source}")
        return
    # Start video capture in a separate thread
    capture_running = True
    capture_thread = threading.Thread(target=video_capture_loop, args=(cap,))
    capture_thread.start()
    logging.info("Press 'q' to exit.")
    try:
        while True:
            # Use the global frame and ret updated by the thread
             if not global_ret or global_frame is None:
                 time.sleep(0.01)  # avoid a busy-wait while no frame is available
                 continue  # wait until a frame is available
            frame = global_frame.copy()  # local copy to work with
            # Replace run_pipeline with our logging version
            detection, bbox = log_pipeline_flow(frame, model_tree)
             # Stop if "toyota" is detected
            if detection and detection.get("class", "").lower() == "toyota":
                logging.info("Detected 'toyota'. Stopping pipeline.")
                break
            if bbox:
                x1, y1, x2, y2 = bbox
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                label = detection["class"] if detection else "Detection"
                cv2.putText(frame, label, (x1, y1 - 10),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.9, (36, 255, 12), 2)
            cv2.imshow("Pipeline Webcam", frame)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
    finally:
        # Stop capture thread and cleanup
        capture_running = False
        capture_thread.join()
        cap.release()
        cv2.destroyAllWindows()
        clear_cache(CACHE_DIR)
        logging.info("Cleaned up .mptacache directory on shutdown.")
 if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Run pipeline webcam utility.")
    parser.add_argument("--mpta-file", type=str, required=True, help="Path to the local pipeline mpta (ZIP) file.")
    parser.add_argument("--video", type=str, default="0", help="Video source (default webcam index 0).")
    args = parser.parse_args()
    video_source = int(args.video) if args.video.isdigit() else args.video
    main(args.mpta_file, video_source)
--- a/demoa.mpta
+++ b/demoa.mpta
--- a/pipeline.log
+++ b/pipeline.log
@ -0,0 +1,23 @@
 2025-05-12 18:10:04,590 [INFO] Loading pipeline from local file: demoa.mpta
 2025-05-12 18:10:04,610 [INFO] Copied local .mpta file from demoa.mpta to .\.mptacache\pipeline.mpta
 2025-05-12 18:10:04,901 [INFO] Extracted .mpta file to .\.mptacache
 2025-05-12 18:10:04,905 [INFO] Loading model for node DetectionDraft from .\.mptacache\demoa\DetectionDraft.pt
 2025-05-12 18:10:05,083 [INFO] Loading model for node ClassificationDraft from .\.mptacache\demoa\ClassificationDraft.pt
 2025-05-12 18:10:08,035 [INFO] Press 'q' to exit.
 2025-05-12 18:10:12,217 [INFO] Cleaned up .mptacache directory on shutdown.
 2025-05-12 18:13:08,465 [INFO] Loading pipeline from local file: demoa.mpta
 2025-05-12 18:13:08,512 [INFO] Copied local .mpta file from demoa.mpta to .\.mptacache\pipeline.mpta
 2025-05-12 18:13:08,769 [INFO] Extracted .mpta file to .\.mptacache
 2025-05-12 18:13:08,773 [INFO] Loading model for node DetectionDraft from .\.mptacache\demoa\DetectionDraft.pt
 2025-05-12 18:13:09,083 [INFO] Loading model for node ClassificationDraft from .\.mptacache\demoa\ClassificationDraft.pt
 2025-05-12 18:13:12,187 [INFO] Press 'q' to exit.
 2025-05-12 18:13:14,146 [INFO] → Running model: DetectionDraft
 2025-05-12 18:13:17,119 [INFO] Cleaned up .mptacache directory on shutdown.
 2025-05-12 18:14:25,665 [INFO] Loading pipeline from local file: demoa.mpta
 2025-05-12 18:14:25,687 [INFO] Copied local .mpta file from demoa.mpta to .\.mptacache\pipeline.mpta
 2025-05-12 18:14:25,953 [INFO] Extracted .mpta file to .\.mptacache
 2025-05-12 18:14:25,957 [INFO] Loading model for node DetectionDraft from .\.mptacache\demoa\DetectionDraft.pt
 2025-05-12 18:14:26,138 [INFO] Loading model for node ClassificationDraft from .\.mptacache\demoa\ClassificationDraft.pt
 2025-05-12 18:14:29,171 [INFO] Press 'q' to exit.
 2025-05-12 18:14:30,146 [INFO] → Running model: DetectionDraft
 2025-05-12 18:14:32,080 [INFO] Cleaned up .mptacache directory on shutdown.
--- a/pympta.md
+++ b/pympta.md
@ -1,204 +0,0 @@
 # pympta: Modular Pipeline Task Executor
 `pympta` is a Python module designed to load and execute modular, multi-stage AI pipelines defined in a special package format (`.mpta`). It is primarily used within the detector worker to run complex computer vision tasks where the output of one model can trigger a subsequent model on a specific region of interest.
 ## Core Concepts
 ### 1. MPTA Package (`.mpta`)
 An `.mpta` file is a standard `.zip` archive with a different extension. It bundles all the necessary components for a pipeline to run.
 A typical `.mpta` file has the following structure:
 ```
 my_pipeline.mpta/
 ├── pipeline.json
 ├── model1.pt
 ├── model2.pt
 └── ...
 ```
 - **`pipeline.json`**: (Required) The manifest file that defines the structure of the pipeline, the models to use, and the logic connecting them.
 - **Model Files (`.pt`, etc.)**: The actual pre-trained model files (e.g., PyTorch, ONNX). The pipeline currently uses `ultralytics.YOLO` models.
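Because an `.mpta` file is an ordinary ZIP archive, its manifest can be inspected with the standard library alone. The snippet below is a minimal sketch, not part of `pympta`: `read_manifest` is a hypothetical helper, and it assumes `pipeline.json` sits either at the archive root or inside a single top-level folder.

```python
import json
import zipfile


def read_manifest(mpta_path: str) -> dict:
    """Return the parsed pipeline.json from an .mpta (ZIP) archive.

    Hypothetical helper for illustration; pympta itself may locate the
    manifest differently.
    """
    with zipfile.ZipFile(mpta_path) as archive:
        # Accept pipeline.json at the root or nested inside a folder.
        candidates = [
            name for name in archive.namelist()
            if name.rsplit("/", 1)[-1] == "pipeline.json"
        ]
        if not candidates:
            raise FileNotFoundError("pipeline.json not found in archive")
        with archive.open(candidates[0]) as manifest:
            return json.load(manifest)
```

Calling `read_manifest("my_pipeline.mpta")` returns the top-level object described under the `pipeline.json` specification, which is a quick way to check a package before loading any models.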
 ### 2. Pipeline Structure
 A pipeline is a tree-like structure of "nodes," defined in `pipeline.json`.
 - **Root Node**: The entry point of the pipeline. It processes the initial, full-frame image.
 - **Branch Nodes**: Child nodes that are triggered by specific detection results from their parent. For example, a root node might detect a "vehicle," which then triggers a branch node to detect a "license plate" within the vehicle's bounding box.
 This modular structure allows for creating complex and efficient inference logic, avoiding the need to run every model on every frame.
 ## `pipeline.json` Specification
 This file defines the entire pipeline logic. The root object contains a `pipeline` key for the pipeline definition and an optional `redis` key for Redis configuration.
 ### Top-Level Object Structure
 | Key        | Type   | Required | Description                                             |
 | ---------- | ------ | -------- | ------------------------------------------------------- |
 | `pipeline` | Object | Yes      | The root node object of the pipeline.                   |
 | `redis`    | Object | No       | Configuration for connecting to a Redis server.         |
 ### Redis Configuration (`redis`)
 | Key        | Type   | Required | Description                                             |
 | ---------- | ------ | -------- | ------------------------------------------------------- |
 | `host`     | String | Yes      | The hostname or IP address of the Redis server.         |
 | `port`     | Number | Yes      | The port number of the Redis server.                    |
 | `password` | String | No       | The password for Redis authentication.                  |
 | `db`       | Number | No       | The Redis database number to use. Defaults to `0`.      |
 ### Node Object Structure
 | Key                 | Type          | Required | Description                                                                                                                            |
 | ------------------- | ------------- | -------- | -------------------------------------------------------------------------------------------------------------------------------------- |
 | `modelId`           | String        | Yes      | A unique identifier for this model node (e.g., "vehicle-detector").                                                                    |
 | `modelFile`         | String        | Yes      | The path to the model file within the `.mpta` archive (e.g., "yolov8n.pt").                                                            |
 | `minConfidence`     | Float         | Yes      | The minimum confidence score (0.0 to 1.0) required for a detection to be considered valid and potentially trigger a branch.              |
 | `triggerClasses`    | Array<String> | Yes      | A list of class names that, when detected by the parent, can trigger this node. For the root node, this lists all classes of interest. |
 | `crop`              | Boolean       | No       | If `true`, the image is cropped to the parent's detection bounding box before being passed to this node's model. Defaults to `false`.    |
 | `branches`          | Array<Node>   | No       | A list of child node objects that can be triggered by this node's detections.                                                          |
 | `actions`           | Array<Action> | No       | A list of actions to execute upon a successful detection in this node.                                                                 |
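The interaction between `triggerClasses`, `minConfidence`, and `branches` can be sketched as follows. This is an illustration only: `triggered_branches` is a hypothetical helper, and it assumes a branch fires when the parent's detected class is listed in the branch's `triggerClasses` and the detection confidence is at least the branch's `minConfidence`.

```python
def triggered_branches(node: dict, detection: dict) -> list:
    """Return the child nodes whose trigger conditions match a parent detection.

    Assumed rule (not confirmed against pympta internals):
    class in triggerClasses AND confidence >= minConfidence.
    """
    matches = []
    for branch in node.get("branches", []):
        class_ok = detection.get("class") in branch.get("triggerClasses", [])
        conf_ok = detection.get("confidence", 0.0) >= branch.get("minConfidence", 0.0)
        if class_ok and conf_ok:
            matches.append(branch)
    return matches


# Example node tree using the keys from the table above.
root = {
    "modelId": "vehicle-detector",
    "branches": [
        {"modelId": "plate-detector", "triggerClasses": ["car"], "minConfidence": 0.5, "crop": True},
        {"modelId": "cargo-classifier", "triggerClasses": ["truck"], "minConfidence": 0.7},
    ],
}
hits = triggered_branches(root, {"class": "car", "confidence": 0.8, "id": 1})
```

Under these assumptions, only the `plate-detector` branch matches the example detection; with `crop` set to `true`, that branch would then receive the parent's bounding-box region rather than the full frame.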
 ### Action Object Structure
 Actions allow the pipeline to interact with Redis. They are executed sequentially for a given detection.
 #### Action Context & Dynamic Keys
 All actions have access to a dynamic context for formatting keys and messages. The context is created for each detection event and includes:
 - All key-value pairs from the detection result (e.g., `class`, `confidence`, `id`).
 - `{timestamp_ms}`: The current Unix timestamp in milliseconds.
 - `{uuid}`: A unique identifier (UUID4) for the detection event.
 - `{image_key}`: If a `redis_save_image` action has already been executed for this event, this placeholder will be replaced with the key where the image was stored.
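The placeholder mechanism can be approximated with a few lines of Python. This is a sketch under stated assumptions, not the module's actual implementation: `build_context` and `render` are hypothetical names, and the substitution deliberately replaces only `{name}` tokens found in the context so that messages containing literal JSON braces pass through unchanged.

```python
import re
import time
import uuid


def build_context(detection: dict, image_key: str = "") -> dict:
    """Assemble the per-event context described above (hypothetical helper)."""
    context = dict(detection)  # class, confidence, id, ...
    context["timestamp_ms"] = int(time.time() * 1000)
    context["uuid"] = str(uuid.uuid4())
    if image_key:
        # Only available once a redis_save_image action has run.
        context["image_key"] = image_key
    return context


def render(template: str, context: dict) -> str:
    """Replace known {name} placeholders; leave everything else untouched."""
    def substitute(match):
        name = match.group(1)
        return str(context[name]) if name in context else match.group(0)

    return re.sub(r"\{(\w+)\}", substitute, template)


ctx = build_context({"class": "car", "confidence": 0.95, "id": 1})
key = render("detections:{class}:{timestamp_ms}:{uuid}", ctx)
```

A plain `str.format` call would raise on a message template that itself contains JSON braces, which is why this sketch uses a targeted regular expression instead.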
 #### `redis_save_image`
 Saves the current image frame (or cropped sub-image) to a Redis key.
 | Key              | Type   | Required | Description                                                                                             |
 | ---------------- | ------ | -------- | ------------------------------------------------------------------------------------------------------- |
 | `type`           | String | Yes      | Must be `"redis_save_image"`.                                                                           |
 | `key`            | String | Yes      | The Redis key to save the image to. Can contain any of the dynamic placeholders.                        |
 | `expire_seconds` | Number | No       | If provided, sets an expiration time (in seconds) for the Redis key.                                    |
 #### `redis_publish`
 Publishes a message to a Redis channel.
 | Key       | Type   | Required | Description                                                                                             |
 | --------- | ------ | -------- | ------------------------------------------------------------------------------------------------------- |
 | `type`    | String | Yes      | Must be `"redis_publish"`.                                                                              |
 | `channel` | String | Yes      | The Redis channel to publish the message to.                                                            |
 | `message` | String | Yes      | The message to publish. Can contain any of the dynamic placeholders, including `{image_key}`.           |
 ### Example `pipeline.json` with Redis
 This example demonstrates a pipeline that detects vehicles, saves a uniquely named image of each detection that expires in one hour, and then publishes a notification with the image key.
 ```json
 {
  "redis": {
    "host": "redis.local",
    "port": 6379,
    "password": "your-super-secret-password"
  },
  "pipeline": {
    "modelId": "vehicle-detector",
    "modelFile": "vehicle_model.pt",
    "minConfidence": 0.6,
    "triggerClasses": ["car", "truck"],
    "actions": [
      {
        "type": "redis_save_image",
        "key": "detections:{class}:{timestamp_ms}:{uuid}",
        "expire_seconds": 3600
      },
      {
        "type": "redis_publish",
        "channel": "vehicle_events",
        "message": "{\"event\":\"new_detection\",\"class\":\"{class}\",\"confidence\":{confidence},\"image_key\":\"{image_key}\"}"
      }
    ],
    "branches": []
  }
 }
 ```
 ## API Reference
 The `pympta` module exposes two main functions.
 ### `load_pipeline_from_zip(zip_source: str, target_dir: str) -> dict`
 Loads, extracts, and parses an `.mpta` file to build a pipeline tree in memory. It also establishes a Redis connection if configured in `pipeline.json`.
 - **Parameters:**
  - `zip_source` (str): The file path to the local `.mpta` zip archive.
  - `target_dir` (str): A directory path where the archive's contents will be extracted.
 - **Returns:**
  - A dictionary representing the root node of the pipeline, ready to be used with `run_pipeline`. Returns `None` if loading fails.
 ### `run_pipeline(frame, node: dict, return_bbox: bool = False)`
 Executes the inference pipeline on a single image frame.
 - **Parameters:**
  - `frame`: The input image frame (e.g., a NumPy array from OpenCV).
  - `node` (dict): The pipeline node to execute (typically the root node returned by `load_pipeline_from_zip`).
  - `return_bbox` (bool): If `True`, the function returns a tuple `(detection, bounding_box)`. Otherwise, it returns only the `detection`.
 - **Returns:**
  - The final detection result from the last executed node in the chain. A detection is a dictionary like `{'class': 'car', 'confidence': 0.95, 'id': 1}`. If no detection meets the criteria, it returns `None` (or `(None, None)` if `return_bbox` is `True`).
 ## Usage Example
 This snippet, inspired by `pipeline_webcam.py`, shows how to use `pympta` to load a pipeline and process an image from a webcam.
 ```python
 import cv2
 from siwatsystem.pympta import load_pipeline_from_zip, run_pipeline
 # 1. Define paths
 MPTA_FILE = "path/to/your/pipeline.mpta"
 CACHE_DIR = ".mptacache"
 # 2. Load the pipeline from the .mpta file
 # This reads pipeline.json and loads the YOLO models into memory.
 model_tree = load_pipeline_from_zip(MPTA_FILE, CACHE_DIR)
 if not model_tree:
    print("Failed to load pipeline.")
    exit()
 # 3. Open a video source
 cap = cv2.VideoCapture(0)
 while True:
    ret, frame = cap.read()
    if not ret:
        break
    # 4. Run the pipeline on the current frame
    # The function will handle the entire logic tree (e.g., find a car, then find its license plate).
    detection_result, bounding_box = run_pipeline(frame, model_tree, return_bbox=True)
    # 5. Display the results
    if detection_result:
        print(f"Detected: {detection_result['class']} with confidence {detection_result['confidence']:.2f}")
        if bounding_box:
            x1, y1, x2, y2 = bounding_box
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(frame, detection_result['class'], (x1, y1 - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.9, (36, 255, 12), 2)
    cv2.imshow("Pipeline Output", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
 cap.release()
 cv2.destroyAllWindows()
 ```
--- a/requirements.txt
+++ b/requirements.txt
@ -5,5 +5,4 @@ torchvision
 ultralytics
 opencv-python
 websockets
-fastapi[standard]
+fastapi[standard]
 redis
--- a/siwatsystem/pympta.py
+++ b/siwatsystem/pympta.py
@ -3,228 +3,69 @@ import json
 import logging
 import torch
 import cv2
 import requests
 import zipfile
 import shutil
 import traceback
 import redis
 import time
 import uuid
 from ultralytics import YOLO
 from urllib.parse import urlparse
-# Create a logger specifically for this module
+def load_pipeline_node(node_config: dict, mpta_dir: str) -> dict:
 logger = logging.getLogger("detector_worker.pympta")
 def load_pipeline_node(node_config: dict, mpta_dir: str, redis_client) -> dict:
    # Recursively load a model node from configuration.
    model_path = os.path.join(mpta_dir, node_config["modelFile"])
    if not os.path.exists(model_path):
-        logger.error(f"Model file {model_path} not found. Current directory: {os.getcwd()}")
+        logging.error(f"Model file {model_path} not found.")
        logger.error(f"Directory content: {os.listdir(os.path.dirname(model_path))}")
        raise FileNotFoundError(f"Model file {model_path} not found.")
-    logger.info(f"Loading model for node {node_config['modelId']} from {model_path}")
+    logging.info(f"Loading model {node_config['modelId']} from {model_path}")
    model = YOLO(model_path)
    if torch.cuda.is_available():
        logger.info(f"CUDA available. Moving model {node_config['modelId']} to GPU")
        model.to("cuda")
    else:
        logger.info(f"CUDA not available. Using CPU for model {node_config['modelId']}")
-    # Prepare trigger class indices for optimization
+    # map triggerClasses names → indices for YOLO
-    trigger_classes = node_config.get("triggerClasses", [])
+    names = model.names  # idx -> class name
-    trigger_class_indices = None
+    trigger_names = node_config.get("triggerClasses", [])
-    if trigger_classes and hasattr(model, "names"):
+    trigger_inds = [i for i, nm in names.items() if nm in trigger_names]
        # Convert class names to indices for the model
        trigger_class_indices = [i for i, name in model.names.items() 
                                if name in trigger_classes]
        logger.debug(f"Converted trigger classes to indices: {trigger_class_indices}")
-    node = {
+    return {
        "modelId": node_config["modelId"],
        "modelFile": node_config["modelFile"],
-        "triggerClasses": trigger_classes,
+        "triggerClasses": trigger_names,
-        "triggerClassIndices": trigger_class_indices,
+        "triggerClassIndices": trigger_inds,
        "crop": node_config.get("crop", False),
-        "minConfidence": node_config.get("minConfidence", None),
+        "minConfidence": node_config.get("minConfidence", 0.0),
        "actions": node_config.get("actions", []),
        "model": model,
-        "branches": [],
+        "branches": [
-        "redis_client": redis_client
+            load_pipeline_node(child, mpta_dir)
            for child in node_config.get("branches", [])
        ]
    }
    logger.debug(f"Configured node {node_config['modelId']} with trigger classes: {node['triggerClasses']}")
    for child in node_config.get("branches", []):
        logger.debug(f"Loading branch for parent node {node_config['modelId']}")
        node["branches"].append(load_pipeline_node(child, mpta_dir, redis_client))
    return node
 def load_pipeline_from_zip(zip_source: str, target_dir: str) -> dict:
    logger.info(f"Attempting to load pipeline from {zip_source} to {target_dir}")
    os.makedirs(target_dir, exist_ok=True)
    zip_path = os.path.join(target_dir, "pipeline.mpta")
    # Parse the source; only local files are supported here.
    parsed = urlparse(zip_source)
    if parsed.scheme in ("", "file"):
-        local_path = parsed.path if parsed.scheme == "file" else zip_source
+        local = parsed.path if parsed.scheme == "file" else zip_source
-        logger.debug(f"Checking if local file exists: {local_path}")
+        if not os.path.exists(local):
-        if os.path.exists(local_path):
+            logging.error(f"Local file {local} does not exist.")
            try:
                shutil.copy(local_path, zip_path)
                logger.info(f"Copied local .mpta file from {local_path} to {zip_path}")
            except Exception as e:
                logger.error(f"Failed to copy local .mpta file from {local_path}: {str(e)}", exc_info=True)
                return None
        else:
            logger.error(f"Local file {local_path} does not exist. Current directory: {os.getcwd()}")
            # List all subdirectories of models directory to help debugging
            if os.path.exists("models"):
                logger.error(f"Content of models directory: {os.listdir('models')}")
                for root, dirs, files in os.walk("models"):
                    logger.error(f"Directory {root} contains subdirs: {dirs} and files: {files}")
            else:
                logger.error("The models directory doesn't exist")
            return None
        shutil.copy(local, zip_path)
    else:
-        logger.error(f"HTTP download functionality has been moved. Use a local file path here. Received: {zip_source}")
+        logging.error("HTTP download not supported; use local file.")
        return None
-    try:
+    with zipfile.ZipFile(zip_path, "r") as z:
-        if not os.path.exists(zip_path):
+        z.extractall(target_dir)
-            logger.error(f"Zip file not found at expected location: {zip_path}")
+    os.remove(zip_path)
            return None
        logger.debug(f"Extracting .mpta file from {zip_path} to {target_dir}")
        # Extract contents and track the directories created
        extracted_dirs = []
        with zipfile.ZipFile(zip_path, "r") as zip_ref:
            file_list = zip_ref.namelist()
            logger.debug(f"Files in .mpta archive: {file_list}")
            # Extract and track the top-level directories
            for file_path in file_list:
                parts = file_path.split('/')
                if len(parts) > 1:
                    top_dir = parts[0]
                    if top_dir and top_dir not in extracted_dirs:
                        extracted_dirs.append(top_dir)
            # Now extract the files
            zip_ref.extractall(target_dir)
        logger.info(f"Successfully extracted .mpta file to {target_dir}")
        logger.debug(f"Extracted directories: {extracted_dirs}")
        # Check what was actually created after extraction
        actual_dirs = [d for d in os.listdir(target_dir) if os.path.isdir(os.path.join(target_dir, d))]
        logger.debug(f"Actual directories created: {actual_dirs}")
    except zipfile.BadZipFile as e:
        logger.error(f"Bad zip file {zip_path}: {str(e)}", exc_info=True)
        return None
    except Exception as e:
        logger.error(f"Failed to extract .mpta file {zip_path}: {str(e)}", exc_info=True)
        return None
    finally:
        if os.path.exists(zip_path):
            os.remove(zip_path)
            logger.debug(f"Removed temporary zip file: {zip_path}")
-    # Use the first extracted directory if it exists, otherwise use the expected name
+    base = os.path.splitext(os.path.basename(zip_source))[0]
-    pipeline_name = os.path.basename(zip_source)
+    mpta_dir = os.path.join(target_dir, base)
-    pipeline_name = os.path.splitext(pipeline_name)[0]
+    cfg = os.path.join(mpta_dir, "pipeline.json")
-    
+    if not os.path.exists(cfg):
-    # Find the directory with pipeline.json
+        logging.error("pipeline.json not found in archive.")
    mpta_dir = None
    # First try the expected directory name
    expected_dir = os.path.join(target_dir, pipeline_name)
    if os.path.exists(expected_dir) and os.path.exists(os.path.join(expected_dir, "pipeline.json")):
        mpta_dir = expected_dir
        logger.debug(f"Found pipeline.json in the expected directory: {mpta_dir}")
    else:
        # Look through all subdirectories for pipeline.json
        for subdir in actual_dirs:
            potential_dir = os.path.join(target_dir, subdir)
            if os.path.exists(os.path.join(potential_dir, "pipeline.json")):
                mpta_dir = potential_dir
                logger.info(f"Found pipeline.json in directory: {mpta_dir} (different from expected: {expected_dir})")
                break
    if not mpta_dir:
        logger.error(f"Could not find pipeline.json in any extracted directory. Directory content: {os.listdir(target_dir)}")
        return None
    pipeline_json_path = os.path.join(mpta_dir, "pipeline.json")
    if not os.path.exists(pipeline_json_path):
        logger.error(f"pipeline.json not found in the .mpta file. Files in directory: {os.listdir(mpta_dir)}")
        return None
-    try:
+    with open(cfg) as f:
-        with open(pipeline_json_path, "r") as f:
+        pipeline_config = json.load(f)
-            pipeline_config = json.load(f)
+    return load_pipeline_node(pipeline_config["pipeline"], mpta_dir)
        logger.info(f"Successfully loaded pipeline configuration from {pipeline_json_path}")
        logger.debug(f"Pipeline config: {json.dumps(pipeline_config, indent=2)}")
        # Establish Redis connection if configured
        redis_client = None
        if "redis" in pipeline_config:
            redis_config = pipeline_config["redis"]
            try:
                redis_client = redis.Redis(
                    host=redis_config["host"],
                    port=redis_config["port"],
                    password=redis_config.get("password"),
                    db=redis_config.get("db", 0),
                    decode_responses=True
                )
                redis_client.ping()
                logger.info(f"Successfully connected to Redis at {redis_config['host']}:{redis_config['port']}")
            except redis.exceptions.ConnectionError as e:
                logger.error(f"Failed to connect to Redis: {e}")
                redis_client = None
        return load_pipeline_node(pipeline_config["pipeline"], mpta_dir, redis_client)
    except json.JSONDecodeError as e:
        logger.error(f"Error parsing pipeline.json: {str(e)}", exc_info=True)
        return None
    except KeyError as e:
        logger.error(f"Missing key in pipeline.json: {str(e)}", exc_info=True)
        return None
    except Exception as e:
        logger.error(f"Error loading pipeline.json: {str(e)}", exc_info=True)
        return None
 def execute_actions(node, frame, detection_result):
    if not node["redis_client"] or not node["actions"]:
        return
    # Create a dynamic context for this detection event
    action_context = {
        **detection_result,
        "timestamp_ms": int(time.time() * 1000),
        "uuid": str(uuid.uuid4()),
    }
    for action in node["actions"]:
        try:
            if action["type"] == "redis_save_image":
                key = action["key"].format(**action_context)
                _, buffer = cv2.imencode('.jpg', frame)
                expire_seconds = action.get("expire_seconds")
                if expire_seconds:
                    node["redis_client"].setex(key, expire_seconds, buffer.tobytes())
                    logger.info(f"Saved image to Redis with key: {key} (expires in {expire_seconds}s)")
                else:
                    node["redis_client"].set(key, buffer.tobytes())
                    logger.info(f"Saved image to Redis with key: {key}")
                # Add the generated key to the context for subsequent actions
                action_context["image_key"] = key
            elif action["type"] == "redis_publish":
                channel = action["channel"]
                message = action["message"].format(**action_context)
                node["redis_client"].publish(channel, message)
                logger.info(f"Published message to Redis channel '{channel}': {message}")
        except Exception as e:
            logger.error(f"Error executing action {action['type']}: {e}")
 def run_pipeline(frame, node: dict, return_bbox: bool=False):
    """
@ -241,6 +82,26 @@ def run_pipeline(frame, node: dict, return_bbox: bool=False):
        task = getattr(node["model"], "task", None)
        # ─── Classification stage ───────────────────────────────────
        # if task == "classify":
        #     results = node["model"].predict(frame, stream=False)
        #     dets = []
        #     for r in results:
        #         probs = r.probs
        #         if probs is not None:
        #             # sort descending
        #             idxs = probs.argsort(descending=True)
        #             for cid in idxs:
        #                 dets.append({
        #                     "class": node["model"].names[int(cid)],
        #                     "confidence": float(probs[int(cid)]),
        #                     "id": None
        #                 })
        #     if not dets:
        #         return (None, None) if return_bbox else None
        #     best = dets[0]
        #     return (best, None) if return_bbox else best
        if task == "classify":
            # run the classifier and grab its top-1 directly via the Probs API
            results = node["model"].predict(frame, stream=False)
@ -263,7 +124,6 @@ def run_pipeline(frame, node: dict, return_bbox: bool=False):
                "confidence": top1_conf,
                "id": None
            }
            execute_actions(node, frame, det)
            return (det, None) if return_bbox else det
@ -312,11 +172,9 @@ def run_pipeline(frame, node: dict, return_bbox: bool=False):
                det2, _ = run_pipeline(sub, br, return_bbox=True)
                if det2:
                    # return classification result + original bbox
                    execute_actions(br, sub, det2)
                    return (det2, best_box) if return_bbox else det2
        # ─── No branch matched → return this detection ─────────────
        execute_actions(node, frame, best_det)
        return (best_det, best_box) if return_bbox else best_det
    except Exception as e:
--- a/test_protocol.py
+++ b/test_protocol.py
@ -1,125 +0,0 @@
 #!/usr/bin/env python3
 """
 Test script to verify the worker implementation follows the protocol
 """
 import json
 import asyncio
 import websockets
 import time
 async def test_protocol():
    """Test the worker protocol implementation"""
    uri = "ws://localhost:8000"
    try:
        async with websockets.connect(uri) as websocket:
            print("✓ Connected to worker")
            # Test 1: Check if we receive heartbeat (stateReport)
            print("\n1. Testing heartbeat...")
            try:
                message = await asyncio.wait_for(websocket.recv(), timeout=5)
                data = json.loads(message)
                if data.get("type") == "stateReport":
                    print("✓ Received stateReport heartbeat")
                    print(f"  - CPU Usage: {data.get('cpuUsage', 'N/A')}%")
                    print(f"  - Memory Usage: {data.get('memoryUsage', 'N/A')}%")
                    print(f"  - Camera Connections: {len(data.get('cameraConnections', []))}")
                else:
                    print(f"✗ Expected stateReport, got {data.get('type')}")
            except asyncio.TimeoutError:
                print("✗ No heartbeat received within 5 seconds")
            # Test 2: Request state
            print("\n2. Testing requestState...")
            await websocket.send(json.dumps({"type": "requestState"}))
            try:
                message = await asyncio.wait_for(websocket.recv(), timeout=5)
                data = json.loads(message)
                if data.get("type") == "stateReport":
                    print("✓ Received stateReport response")
                else:
                    print(f"✗ Expected stateReport, got {data.get('type')}")
            except asyncio.TimeoutError:
                print("✗ No response to requestState within 5 seconds")
            # Test 3: Set session ID
            print("\n3. Testing setSessionId...")
            session_message = {
                "type": "setSessionId",
                "payload": {
                    "displayIdentifier": "display-001",
                    "sessionId": 12345
                }
            }
            await websocket.send(json.dumps(session_message))
            print("✓ Sent setSessionId message")
            # Test 4: Test patchSession
            print("\n4. Testing patchSession...")
            patch_message = {
                "type": "patchSession",
                "sessionId": 12345,
                "data": {
                    "currentCar": {
                        "carModel": "Civic",
                        "carBrand": "Honda"
                    }
                }
            }
            await websocket.send(json.dumps(patch_message))
            # Wait for patchSessionResult
            try:
                message = await asyncio.wait_for(websocket.recv(), timeout=5)
                data = json.loads(message)
                if data.get("type") == "patchSessionResult":
                    print("✓ Received patchSessionResult")
                    print(f"  - Success: {data.get('payload', {}).get('success')}")
                    print(f"  - Message: {data.get('payload', {}).get('message')}")
                else:
                    print(f"✗ Expected patchSessionResult, got {data.get('type')}")
            except asyncio.TimeoutError:
                print("✗ No patchSessionResult received within 5 seconds")
            # Test 5: Test subscribe message format (without actual camera)
            print("\n5. Testing subscribe message format...")
            subscribe_message = {
                "type": "subscribe",
                "payload": {
                    "subscriptionIdentifier": "display-001;cam-001",
                    "snapshotUrl": "http://example.com/snapshot.jpg",
                    "snapshotInterval": 5000,
                    "modelUrl": "http://example.com/model.mpta",
                    "modelName": "Test Model",
                    "modelId": 101,
                    "cropX1": 100,
                    "cropY1": 200,
                    "cropX2": 300,
                    "cropY2": 400
                }
            }
            await websocket.send(json.dumps(subscribe_message))
            print("✓ Sent subscribe message (will fail without actual camera/model)")
            # Listen for a few more messages to catch any errors
            print("\n6. Listening for additional messages...")
            for i in range(3):
                try:
                    message = await asyncio.wait_for(websocket.recv(), timeout=2)
                    data = json.loads(message)
                    msg_type = data.get("type")
                    print(f"  - Received {msg_type}")
                    if msg_type == "error":
                        print(f"    Error: {data.get('error')}")
                except asyncio.TimeoutError:
                    break
            print("\n✓ Protocol test completed successfully!")
    except Exception as e:
        print(f"✗ Connection failed: {e}")
        print("Make sure the worker is running on localhost:8000")
 if __name__ == "__main__":
    asyncio.run(test_protocol())
--- a/worker.md
+++ b/worker.md
@ -1,483 +0,0 @@
 # Worker Communication Protocol
 This document outlines the WebSocket-based communication protocol between the CMS backend and a detector worker. As a worker developer, your primary responsibility is to implement a WebSocket server that adheres to this protocol.
 ## 1. Connection
 The worker must run a WebSocket server, preferably on port `8000`. The backend system, which is managed by a container orchestration service, will automatically discover and establish a WebSocket connection to your worker.
 Upon a successful connection from the backend, you should begin sending `stateReport` messages as heartbeats.
 ## 2. Communication Overview
 Communication is bidirectional and asynchronous. All messages are JSON objects with a `type` field that indicates the message's purpose, and an optional `payload` field containing the data.
 - **Worker -> Backend:** You will send messages to the backend to report status, forward detection events, or request changes to session data.
 - **Backend -> Worker:** The backend will send commands to you to manage camera subscriptions.
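Because every message carries a `type` field, a worker can route incoming text frames through a small lookup table. The sketch below is only an illustration of that idea: `HANDLERS`, `handles`, `on_request_state`, and `dispatch` are hypothetical names, and the reply is a stub rather than a complete `stateReport`.

```python
import json

# Hypothetical registry: maps a message "type" to the function that handles it.
HANDLERS = {}


def handles(msg_type):
    """Register the decorated function as the handler for one message type."""
    def register(func):
        HANDLERS[msg_type] = func
        return func
    return register


@handles("requestState")
def on_request_state(payload):
    # Stub reply; a real worker would fill in usage figures and connections.
    return {"type": "stateReport", "cameraConnections": []}


def dispatch(raw: str):
    """Decode one JSON message and route it by its `type` field."""
    message = json.loads(raw)
    handler = HANDLERS.get(message.get("type"))
    if handler is None:
        return None  # unknown type: ignore it instead of crashing the loop
    # `payload` is optional in the protocol, so it may be None here.
    return handler(message.get("payload"))
```

Returning `None` for unknown types is a design choice made for this sketch; a worker could equally log the message or reply with an error.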
 ## 3. Dynamic Configuration via MPTA File
 To enable modularity and dynamic configuration, the backend will send you a URL to a `.mpta` file when it issues a `subscribe` command. This file is a renamed `.zip` archive that contains everything your worker needs to perform its task.
 **Your worker is responsible for:**
 1. Fetching this file from the provided URL.
 2. Extracting its contents.
 3. Interpreting the contents to configure its internal pipeline.
 **The contents of the `.mpta` file are entirely up to the user who configures the model in the CMS.** This allows for maximum flexibility. For example, the archive could contain:
 - AI/ML Models: Pre-trained models for libraries like TensorFlow, PyTorch, or ONNX.
 - Configuration Files: A `config.json` or `pipeline.yaml` that defines a sequence of operations, specifies model paths, or sets detection thresholds.
 - Scripts: Custom Python scripts for pre-processing or post-processing.
 - API Integration Details: A JSON file with endpoint information and credentials for interacting with third-party detection services.
 Essentially, the `.mpta` file is a self-contained package that tells your worker *how* to process the video stream for a given subscription.
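The fetch and extract steps listed above can be sketched with the standard library alone. This is a minimal illustration, not the worker's actual loader: `fetch_mpta` and `extract_mpta` are hypothetical names, and the path check is an added precaution that assumes archives may come from untrusted users. Interpreting the extracted contents is left out because the archive layout is up to whoever configures the model.

```python
import os
import shutil
import urllib.request
import zipfile


def fetch_mpta(url: str, dest_path: str) -> str:
    """Download the archive at `url` to `dest_path` and return that path."""
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path


def extract_mpta(archive_path: str, target_dir: str) -> list:
    """Extract a .mpta (renamed zip) archive and return the entry names."""
    os.makedirs(target_dir, exist_ok=True)
    root = os.path.realpath(target_dir)
    with zipfile.ZipFile(archive_path, "r") as archive:
        names = archive.namelist()
        for name in names:
            # Refuse entries that would resolve to a path outside target_dir.
            resolved = os.path.realpath(os.path.join(root, name))
            if os.path.commonpath([root, resolved]) != root:
                raise ValueError(f"Unsafe path in archive: {name}")
        archive.extractall(root)
    return names
```

A caller would typically run `fetch_mpta(model_url, tmp_path)` followed by `extract_mpta(tmp_path, model_dir)` when a `subscribe` command arrives, then inspect the returned names to decide how to configure its pipeline.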
 ## 4. Messages from Worker to Backend
 These are the messages your worker is expected to send to the backend.
 ### 4.1. State Report (Heartbeat)
 This message is crucial for the backend to monitor your worker's health and status, including GPU usage.
 - **Type:** `stateReport`
 - **When to Send:** Periodically (e.g., every 2 seconds) after a connection is established.
 **Payload:**
 ```json
 {
  "type": "stateReport",
  "cpuUsage": 75.5,
  "memoryUsage": 40.2,
  "gpuUsage": 60.0,
  "gpuMemoryUsage": 25.1,
  "cameraConnections": [
    {
      "subscriptionIdentifier": "display-001;cam-001",
      "modelId": 101,
      "modelName": "General Object Detection",
      "online": true,
      "cropX1": 100,
      "cropY1": 200,
      "cropX2": 300,
      "cropY2": 400
    }
  ]
 }
 ```
 > **Note:**
 >
 > - `cropX1`, `cropY1`, `cropX2`, `cropY2` (optional, integer) should be included in each camera connection to indicate the crop coordinates for that subscription.
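As an illustration of the payload shape above, the following sketch assembles a `stateReport` from values the worker has already measured. The helper names `camera_connection` and `build_state_report` are hypothetical, and how the usage percentages are collected is deliberately left out.

```python
def camera_connection(subscription_id, model_id, model_name, online, crop=None):
    """Build one cameraConnections entry; crop is (x1, y1, x2, y2) or None."""
    entry = {
        "subscriptionIdentifier": subscription_id,
        "modelId": model_id,
        "modelName": model_name,
        "online": online,
    }
    if crop is not None:
        # Crop fields are optional, so they are added only when configured.
        x1, y1, x2, y2 = crop
        entry.update(cropX1=x1, cropY1=y1, cropX2=x2, cropY2=y2)
    return entry


def build_state_report(cpu, memory, gpu, gpu_memory, connections):
    """Assemble a stateReport message from already-measured percentages."""
    return {
        "type": "stateReport",
        "cpuUsage": cpu,
        "memoryUsage": memory,
        "gpuUsage": gpu,
        "gpuMemoryUsage": gpu_memory,
        "cameraConnections": list(connections),
    }
```

The resulting dictionary can be serialized with `json.dumps` and sent on the WebSocket at whatever heartbeat interval the worker uses.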
 ### 4.2. Image Detection
 Sent when the worker detects a relevant object. The `detection` object should be flat and contain key-value pairs corresponding to the detected attributes.
 - **Type:** `imageDetection`
 **Payload Example:**
 ```json
 {
  "type": "imageDetection",
  "subscriptionIdentifier": "display-001;cam-001",
  "timestamp": "2025-07-14T12:34:56.789Z",
  "data": {
    "detection": {
      "carModel": "Civic",
      "carBrand": "Honda",
      "carYear": 2023,
      "bodyType": "Sedan",
      "licensePlateText": "ABCD1234",
      "licensePlateConfidence": 0.95
    },
    "modelId": 101,
    "modelName": "US-LPR-and-Vehicle-ID"
  }
 }
 ```
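A worker might build this message as sketched below. The function names are hypothetical, and rejecting nested values is one possible reading of the requirement that the `detection` object be flat. The timestamp format mirrors the example payload (UTC, millisecond precision, `Z` suffix).

```python
from datetime import datetime, timezone


def utc_timestamp(now=None):
    """Format a time as ISO 8601 UTC with milliseconds and a trailing Z."""
    now = now or datetime.now(timezone.utc)
    text = now.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    return text.replace("+00:00", "Z")


def build_image_detection(subscription_id, detection, model_id, model_name, now=None):
    """Assemble an imageDetection message from a flat detection mapping."""
    for key, value in detection.items():
        # Assumed interpretation of "flat": no nested objects or arrays.
        if isinstance(value, (dict, list)):
            raise ValueError(f"detection[{key!r}] must not be nested")
    return {
        "type": "imageDetection",
        "subscriptionIdentifier": subscription_id,
        "timestamp": utc_timestamp(now),
        "data": {
            "detection": dict(detection),
            "modelId": model_id,
            "modelName": model_name,
        },
    }
```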
 ### 4.3. Patch Session
 > **Note:** Patch messages are only used when the worker can't keep up and needs to retroactively send detections. Normally, detections should be sent in real-time using `imageDetection` messages. Use `patchSession` only to update session data after the fact.
 Allows the worker to request a modification to an active session's data. The `data` payload must be a partial object of the `DisplayPersistentData` structure.
 - **Type:** `patchSession`
 **Payload Example:**
 ```json
 {
  "type": "patchSession",
  "sessionId": 12345,
  "data": {
    "currentCar": {
        "carModel": "Civic",
        "carBrand": "Honda",
        "licensePlateText": "ABCD1234"
    }
  }
 }
 ```
 The backend will respond with a `patchSessionResult` command.
 #### `DisplayPersistentData` Structure
 The `data` object in the `patchSession` message is merged with the existing `DisplayPersistentData` on the backend. Here is its structure:
 ```typescript
 interface DisplayPersistentData {
    progressionStage: "welcome" | "car_fueling" | "car_waitpayment" | "car_postpayment" | null;
    qrCode: string | null;
    adsPlayback: {
        playlistSlotOrder: number; // The 'order' of the current slot
        adsId: number | null;
        adsUrl: string | null;
    } | null;
    currentCar: {
        carModel?: string;
        carBrand?: string;
        carYear?: number;
        bodyType?: string;
        licensePlateText?: string;
        licensePlateType?: string;
    } | null;
    fuelPump: { /* FuelPumpData structure */ } | null;
    weatherData: { /* WeatherResponse structure */ } | null;
    sessionId: number | null;
 }
 ```
 #### Patching Behavior
 - The patch is a **deep merge**.
 - **`undefined`** values are ignored.
 - **`null`** values will set the corresponding field to `null`.
 - Nested objects are merged recursively.
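The merge rules above can be mirrored on the worker side to predict what a patch will do. The sketch below is an illustration of those rules in Python, not the backend's implementation. It assumes that an absent key plays the role of `undefined` (JSON cannot carry `undefined`) and that Python's `None` corresponds to `null`.

```python
import copy


def deep_merge(current: dict, patch: dict) -> dict:
    """Return a new dict with `patch` merged recursively into `current`."""
    result = copy.deepcopy(current)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are objects: merge them field by field.
            result[key] = deep_merge(result[key], value)
        else:
            # Scalars, None (null), and objects replacing non-objects overwrite.
            result[key] = copy.deepcopy(value)
    return result


# Keys missing from the patch are left untouched; None clears a field.
before = {"currentCar": {"carModel": "Civic", "carBrand": "Honda"}, "qrCode": "abc"}
after = deep_merge(before, {"currentCar": {"licensePlateText": "ABCD1234"}, "qrCode": None})
```

In this example `after["currentCar"]` keeps `carModel` and `carBrand` and gains `licensePlateText`, while `after["qrCode"]` becomes `None`.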
 ## 5. Commands from Backend to Worker
 These are the commands your worker will receive from the backend.
 ### 5.1. Subscribe to Camera
 Instructs the worker to process a camera's RTSP stream using the configuration from the specified `.mpta` file.
 - **Type:** `subscribe`
 **Payload:**
 ```json
 {
  "type": "subscribe",
  "payload": {
    "subscriptionIdentifier": "display-001;cam-002",
    "rtspUrl": "rtsp://user:pass@host:port/stream",
    "snapshotUrl": "http://go2rtc/snapshot/1",
    "snapshotInterval": 5000,
    "modelUrl": "http://storage/models/us-lpr.mpta",
    "modelName": "US-LPR-and-Vehicle-ID",
    "modelId": 102,
    "cropX1": 100,
    "cropY1": 200,
    "cropX2": 300,
    "cropY2": 400
  }
 }
 ```
 > **Note:**
 >
 > - `cropX1`, `cropY1`, `cropX2`, `cropY2` (optional, integer) specify the crop coordinates for the camera stream. These values are configured per display and passed in the subscription payload. If not provided, the worker should process the full frame.
 >
 > **Important:**
 > If multiple displays are bound to the same camera, your worker must ensure that only **one stream** is opened per camera. When you receive multiple subscriptions for the same camera (with different `subscriptionIdentifier` values), you should:
 >
 > - Open the RTSP stream **once** for that camera if using RTSP.
 > - Capture each frame or snapshot only once per cycle.
 > - Reuse the same captured image for all display subscriptions that share the camera, processing and routing detection results separately for each display as needed.
 >
 > This avoids unnecessary load and bandwidth usage, and ensures consistent detection results and snapshots across all displays sharing the same camera.
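One way to honor the one-stream-per-camera rule is to group subscriptions by camera before capturing. The sketch below assumes the part of `subscriptionIdentifier` after the `;` identifies the camera. `group_by_camera`, `run_cycle`, and the `capture` and `process` callables are hypothetical stand-ins for the worker's real capture and inference code.

```python
from collections import defaultdict


def group_by_camera(subscription_ids):
    """Map each camera identifier to the subscriptions that share it."""
    groups = defaultdict(list)
    for sub_id in subscription_ids:
        _display, _sep, camera = sub_id.partition(";")
        groups[camera].append(sub_id)
    return dict(groups)


def run_cycle(subscription_ids, capture, process):
    """Capture once per camera, then process that frame per subscription."""
    results = {}
    for camera, subs in group_by_camera(subscription_ids).items():
        frame = capture(camera)  # called exactly once per camera per cycle
        for sub_id in subs:
            # Each display gets its own result from the shared frame.
            results[sub_id] = process(sub_id, frame)
    return results
```

With subscriptions `display-001;cam-001` and `display-002;cam-001`, `capture` runs once for `cam-001` and `process` runs twice, once per display.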
 ### 5.2. Unsubscribe from Camera
 Instructs the worker to stop processing a camera's stream.
 - **Type:** `unsubscribe`
 **Payload:**
 ```json
 {
  "type": "unsubscribe",
  "payload": {
    "subscriptionIdentifier": "display-001;cam-002"
  }
 }
 ```
 ### 5.3. Request State
 Direct request for the worker's current state. Respond with a `stateReport` message.
 - **Type:** `requestState`
 **Payload:**
 ```json
 {
  "type": "requestState"
 }
 ```
 ### 5.4. Patch Session Result
 Backend's response to a `patchSession` message.
 - **Type:** `patchSessionResult`
 **Payload:**
 ```json
 {
  "type": "patchSessionResult",
  "payload": {
    "sessionId": 12345,
    "success": true,
    "message": "Session updated successfully."
  }
 }
 ```
 ### 5.5. Set Session ID
 Allows the backend to instruct the worker to associate a session ID with all subscriptions of a display. This is useful for linking detection events to a specific session. The session ID can be `null` to indicate no active session.
 - **Type:** `setSessionId`
 **Payload:**
 ```json
 {
  "type": "setSessionId",
  "payload": {
    "displayIdentifier": "display-001",
    "sessionId": 12345
  }
 }
 ```
 Or to clear the session:
 ```json
 {
  "type": "setSessionId",
  "payload": {
    "displayIdentifier": "display-001",
    "sessionId": null
  }
 }
 ```
 > **Note:**
 >
 > - The worker should store the session ID for every subscription that belongs to the given display and use it in subsequent detection or patch messages as appropriate. If `sessionId` is `null`, the worker should treat those subscriptions as having no active session.
 ## Subscription Identifier Format
 The `subscriptionIdentifier` used in all messages is constructed as:
 ```
 displayIdentifier;cameraIdentifier
 ```
 This uniquely identifies a camera subscription for a specific display.
 ### Session ID Association
 When the backend sends a `setSessionId` command, it will only provide the `displayIdentifier` (not the full `subscriptionIdentifier`).
 **Worker Responsibility:**
 - The worker must match the `displayIdentifier` to all active subscriptions for that display (i.e., all `subscriptionIdentifier` values that start with `displayIdentifier;`).
 - The worker should set or clear the session ID for all matching subscriptions.
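The matching rule can be sketched as a prefix test on `displayIdentifier;`. The function name and the `sessions` dictionary are hypothetical. Including the `;` in the prefix keeps `display-001` from accidentally matching a display such as `display-0010`.

```python
def apply_session_id(sessions, subscription_ids, display_identifier, session_id):
    """Set (or clear, when session_id is None) the session for one display.

    `sessions` maps subscriptionIdentifier -> session ID and is updated in
    place; the list of matched subscriptions is returned.
    """
    prefix = display_identifier + ";"
    matched = [sub for sub in subscription_ids if sub.startswith(prefix)]
    for sub in matched:
        sessions[sub] = session_id
    return matched
```

Calling it again with `session_id=None` clears the association for the same set of subscriptions.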
 ## 6. Example Communication Log
 This section shows a typical sequence of messages between the backend and the worker. Patch messages are not included, as they are only used when the worker cannot keep up.
 > **Note:** Unsubscribe is triggered when a user removes a camera or when the node is too heavily loaded and needs rebalancing.
 1.  **Connection Established** & **Heartbeat**
    *   **Worker -> Backend**
    ```json
    {
      "type": "stateReport",
      "cpuUsage": 70.2,
      "memoryUsage": 38.1,
      "gpuUsage": 55.0,
      "gpuMemoryUsage": 20.0,
      "cameraConnections": []
    }
    ```
 2.  **Backend Subscribes Camera**
    *   **Backend -> Worker**
    ```json
    {
      "type": "subscribe",
      "payload": {
        "subscriptionIdentifier": "display-001;entry-cam-01",
        "rtspUrl": "rtsp://192.168.1.100/stream1",
        "modelUrl": "http://storage/models/vehicle-id.mpta",
        "modelName": "Vehicle Identification",
        "modelId": 201
      }
    }
    ```
 3.  **Worker Acknowledges in Heartbeat**
    *   **Worker -> Backend**
    ```json
    {
      "type": "stateReport",
      "cpuUsage": 72.5,
      "memoryUsage": 39.0,
      "gpuUsage": 57.0,
      "gpuMemoryUsage": 21.0,
      "cameraConnections": [
        {
          "subscriptionIdentifier": "display-001;entry-cam-01",
          "modelId": 201,
          "modelName": "Vehicle Identification",
          "online": true
        }
      ]
    }
    ```
 4.  **Worker Detects a Car**
    *   **Worker -> Backend**
    ```json
    {
      "type": "imageDetection",
      "subscriptionIdentifier": "display-001;entry-cam-01",
      "timestamp": "2025-07-15T10:00:00.000Z",
      "data": {
        "detection": {
          "carBrand": "Honda",
          "carModel": "CR-V",
          "bodyType": "SUV",
          "licensePlateText": "GEMINI-AI",
          "licensePlateConfidence": 0.98
        },
        "modelId": 201,
        "modelName": "Vehicle Identification"
      }
    }
    ```
    *   **Worker -> Backend**
    ```json
    {
      "type": "imageDetection",
      "subscriptionIdentifier": "display-001;entry-cam-01",
      "timestamp": "2025-07-15T10:00:01.000Z",
      "data": {
        "detection": {
          "carBrand": "Toyota",
          "carModel": "Corolla",
          "bodyType": "Sedan",
          "licensePlateText": "CMS-1234",
          "licensePlateConfidence": 0.97
        },
        "modelId": 201,
        "modelName": "Vehicle Identification"
      }
    }
    ```
    *   **Worker -> Backend**
    ```json
    {
      "type": "imageDetection",
      "subscriptionIdentifier": "display-001;entry-cam-01",
      "timestamp": "2025-07-15T10:00:02.000Z",
      "data": {
        "detection": {
          "carBrand": "Ford",
          "carModel": "Focus",
          "bodyType": "Hatchback",
          "licensePlateText": "CMS-5678",
          "licensePlateConfidence": 0.96
        },
        "modelId": 201,
        "modelName": "Vehicle Identification"
      }
    }
    ```
 5.  **Backend Unsubscribes Camera**
    *   **Backend -> Worker**
    ```json
    {
      "type": "unsubscribe",
      "payload": {
        "subscriptionIdentifier": "display-001;entry-cam-01"
      }
    }
    ```
 6.  **Worker Acknowledges Unsubscription**
    *   **Worker -> Backend**
    ```json
    {
      "type": "stateReport",
      "cpuUsage": 68.0,
      "memoryUsage": 37.0,
      "gpuUsage": 50.0,
      "gpuMemoryUsage": 18.0,
      "cameraConnections": []
    }
    ```
 ## 7. HTTP API: Image Retrieval
 In addition to the WebSocket protocol, the worker exposes an HTTP endpoint for retrieving the latest image frame from a camera.
 ### Endpoint
 ```
 GET /camera/{camera_id}/image
 ```
 - **`camera_id`**: The full `subscriptionIdentifier` (e.g., `display-001;cam-001`).
 ### Response
 - **Success (200):** Returns the latest JPEG image from the camera stream.
    - `Content-Type: image/jpeg`
    - Binary JPEG data.
 - **Error (404):** If the camera is not found or no frame is available.
    - JSON error response.
 - **Error (500):** Internal server error.
 ### Example Request
 ```
 GET /camera/display-001;cam-001/image
 ```
 ### Example Response
 - **Headers:**
    ```
    Content-Type: image/jpeg
    ```
 - **Body:** Binary JPEG image.
 ### Notes
 - The endpoint returns the most recent frame available for the specified camera subscription.
 - If multiple displays share the same camera, each subscription has its own buffer; the endpoint uses the buffer for the given `camera_id`.
 - This API is useful for debugging, monitoring, or integrating with external systems that require direct image access.
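The response rules above can be captured in a framework-independent helper. This is a sketch under stated assumptions: `image_response` and `frame_buffers` (a mapping from `subscriptionIdentifier` to the latest JPEG bytes) are hypothetical, and the shape of the JSON error body is invented for illustration because the document does not specify it.

```python
import json


def image_response(frame_buffers, camera_id):
    """Return (status, headers, body) for GET /camera/{camera_id}/image."""
    jpeg = frame_buffers.get(camera_id)
    if jpeg is None:
        # Unknown camera or no frame captured yet: both map to 404.
        body = json.dumps({"error": f"No frame available for {camera_id}"})
        return 404, {"Content-Type": "application/json"}, body.encode("utf-8")
    return 200, {"Content-Type": "image/jpeg"}, jpeg
```

A web framework route would call this helper and translate the tuple into its own response object, mapping any unexpected exception to a 500.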