diff --git a/claude.md b/claude.md
index 75f833d..8547b53 100644
--- a/claude.md
+++ b/claude.md
@@ -122,7 +122,7 @@ jpeg_data = encoder.encode(nv_image, "jpeg", encode_params)
 
 ## Performance Metrics
 
-### VRAM Usage (Python Process)
+### VRAM Usage (at 720p)
 
 | Streams | Total VRAM | Overhead | Per Stream | Marginal Cost |
 |---------|-----------|----------|------------|---------------|
@@ -134,24 +134,10 @@ jpeg_data = encoder.encode(nv_image, "jpeg", encode_params)
 
 **Result:** Perfect linear scaling at ~60 MB per stream
 
-### Capacity Estimates
-
-With 60 MB per stream + 216 MB baseline:
-
-- **16GB GPU**: ~269 cameras (conservative: ~250)
-- **24GB GPU**: ~407 cameras (conservative: ~380)
-- **48GB GPU**: ~815 cameras (conservative: ~780)
-- **For 1000 streams**: ~60GB VRAM required
-
-### Throughput
-
-- **Frame Rate**: 7-7.5 FPS per stream @ 720p
-- **JPEG Encoding**: 1-2ms per frame
-- **Connection Time**: ~15s for stream stabilization
-
 ## Project Structure
 
 ```
+python-rtsp-worker/
 ├── app.py                   # FastAPI application
 ├── services/
@@ -166,23 +152,11 @@ python-rtsp-worker/
 ├── requirements.txt         # Python dependencies
 ├── .env                     # Camera URLs (gitignored)
 ├── .env.example             # Template for camera URLs
+└── .gitignore
 ```
 
-## Dependencies
-
-```
-fastapi                    # Web framework
-uvicorn[standard]          # ASGI server
-torch                      # GPU tensor operations
-PyNvVideoCodec             # NVDEC hardware decoding
-av                         # FFmpeg/RTSP client
-cuda-python                # CUDA driver bindings
-nvidia-nvimgcodec-cu12     # nvJPEG encoding
-python-dotenv              # Environment variables
-```
-
 ## Configuration
 
 ### Environment Variables (.env)
@@ -319,48 +293,6 @@ python test_jpeg_encode.py
 **Cause**: Likely CUDA context cleanup order issues
 **Workaround**: Functionality works correctly; cleanup errors can be ignored
 
-## Technical Decisions
-
-### Why PyNvVideoCodec?
-- Direct access to NVDEC hardware decoder
-- Minimal overhead compared to FFmpeg/torchaudio
-- Returns GPU tensors via DLPack
-- Better control over decode sessions
-
-### Why Shared CUDA Context?
-- Reduces VRAM from ~200MB to ~60MB per stream (70% savings)
-- Enables 1000-stream target on 60GB GPU
-- Minimal complexity overhead with singleton pattern
-
-### Why nvImageCodec?
-- GPU-native JPEG encoding (nvJPEG)
-- Zero-copy with PyTorch via `__cuda_array_interface__`
-- 1-2ms encoding time per 720p frame
-- Keeps data on GPU until final compression
-
-### Why Thread-Safe Ring Buffer?
-- Decouples decoding from inference pipeline
-- Prevents frame drops during processing spikes
-- Allows async frame access
-- Configurable buffer size per stream
-
-## Future Considerations
-
-### Hardware Decode Session Limits
-- NVIDIA GPUs typically support 5-30 concurrent decode sessions
-- May need multiple GPUs for 1000 streams
-- Test with actual hardware to verify limits
-
-### Scaling Beyond 1000 Streams
-- Multi-GPU support with context per GPU
-- Load balancing across GPUs
-- Network bandwidth considerations
-
-### TensorRT Integration
-- Next step: Integrate with TensorRT inference pipeline
-- GPU frames → TensorRT → Results
-- Keep entire pipeline on GPU
-
 ## References
 
 - [PyNvVideoCodec Documentation](https://developer.nvidia.com/pynvvideocodec)