Troubleshooting
This guide provides comprehensive troubleshooting techniques for diagnosing and resolving issues with Stream. Use these tools and methods to identify problems with camera connections, performance issues, and missing detection events.
Heartbeat Server Monitoring
The Heartbeat server provides real-time status information for automated monitoring and troubleshooting.
Key Characteristics
- Update frequency: Every 10 seconds with timestamp of last update per camera
- Optimal polling interval: 10-11 seconds (more frequent polling provides no additional benefit)
- Data scope: Current status only, no historical tracking or time window averaging
- Primary use: Liveness checks for automated services like Kubernetes
Camera Status Indicators
Starting Status
- Initial state: Cameras begin with "starting" status and "health": -1
- Normal transition: Status updates to "running" with current health value when frame processing begins
- Stuck in starting: If camera remains in "starting" status for more than 20-30 seconds, indicates connection issues:
  - Network interface is down or lagging
  - Camera is blocked by firewall
  - Connection established but no frames received
GPU and Jetson versions may remain in "starting" state longer due to initial model compilation time.
Stopped Status
- Triggers: Connection dropped and cannot be recovered, or initial connection failure
- Behavior: Stream continues periodic reconnection attempts
- Recovery: Status and health return to "running" when connection is restored
Making Time-Based Decisions
While the Heartbeat server provides only current status, you can implement time-based monitoring by:
- Polling periodically and storing results in a file or database
- Analyzing patterns across multiple updates
- Setting thresholds for consecutive problematic states
Example decision logic:
- Camera stuck in "starting" for 2+ Heartbeat updates = connection issue
- Camera shows "stopped" status = connection dropped, monitor for recovery
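The decision logic above can be sketched as a small tracker that counts consecutive "starting" observations per camera. This is an illustrative sketch only: the `CameraWatch` class, the verdict strings, and the snapshot shape (camera id mapped to a dict with "status" and "health") are assumptions, not the documented Heartbeat payload, so adapt the parsing to what your Heartbeat server actually returns.

```python
from collections import defaultdict

STUCK_AFTER_UPDATES = 2  # "2+ Heartbeat updates" from the decision logic above


class CameraWatch:
    """Tracks consecutive "starting" observations per camera across polls."""

    def __init__(self, stuck_after=STUCK_AFTER_UPDATES):
        self.stuck_after = stuck_after
        self.starting_streak = defaultdict(int)

    def observe(self, snapshot):
        """Record one Heartbeat poll and return a verdict per camera.

        `snapshot` maps camera id -> {"status": ..., "health": ...}.
        This shape is an assumption, not the documented payload.
        """
        verdicts = {}
        for camera_id, info in snapshot.items():
            status = info.get("status")
            if status == "starting":
                self.starting_streak[camera_id] += 1
                stuck = self.starting_streak[camera_id] >= self.stuck_after
                verdicts[camera_id] = "connection issue" if stuck else "waiting"
                continue
            # Any other status ends the "starting" streak.
            self.starting_streak[camera_id] = 0
            if status == "stopped":
                verdicts[camera_id] = "connection dropped"
            elif status == "running":
                verdicts[camera_id] = "ok"
            else:
                verdicts[camera_id] = "unknown"
        return verdicts
```

Call `observe` once per Heartbeat update (roughly every 10 seconds); polling faster would count the same update twice and trigger the "connection issue" verdict too early.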
Log-based Debugging
Enable debug logging for comprehensive troubleshooting by adding -e LOGGING=10 to your Docker run command or Docker Compose configuration.
Most error messages are documented in the error messages guide. Key areas for missing vehicle diagnostics include Health Score monitoring, RTSP error analysis, and webhook troubleshooting.
Health Score Analysis
Health Score represents the percentage of received frames successfully processed by Stream.
Monitoring and Interpretation
- Reporting frequency: Every 5 seconds
- Calculation window: Amortized over 20-30 seconds
- Acceptable range: 75% or higher
- Warning threshold: Below 75% indicates performance issues
Performance Drop Indicators
When Health Score drops significantly, Stream logs additional system information:
Drop in performance. {'disk_usage_percent': ..., 'loadavg': ..., 'cpu_count': ..., 'cpu_percent': ..., 'mem_total_mb': ..., 'mem_percent': ...}
This early warning appears before the drop is reflected in the reported Health Score due to amortization.
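Because this warning arrives before the reported Health Score changes, it can be useful to extract its metrics from the log automatically. The sketch below assumes the dictionary is printed as a Python literal at the end of the line, as in the format shown above; the function name and the sample values in the usage note are hypothetical.

```python
import ast

MARKER = "Drop in performance. "


def parse_performance_drop(line):
    """Return the metrics dict from a 'Drop in performance.' line, or None."""
    _, found, payload = line.partition(MARKER)
    if not found:
        return None
    try:
        # literal_eval only accepts Python literals, so it never runs code.
        metrics = ast.literal_eval(payload.strip())
    except (ValueError, SyntaxError, TypeError):
        return None
    return metrics if isinstance(metrics, dict) else None
```

For example, a line ending in `Drop in performance. {'cpu_count': 4, 'cpu_percent': 97.5}` would yield a dict whose `cpu_percent` entry can be compared against your own alerting threshold.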
Diagnostic Patterns
System-wide drops (affecting multiple cameras):
- Cause: Hardware bottleneck
- Solution: Upgrade hardware or reduce processing load
Camera-specific drops (affecting individual cameras):
- Cause: Network issues (buffering, frame loss, corrupted frames)
- Diagnosis: Isolate problematic cameras and test individually
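The two patterns above can be told apart with a simple check over the latest Health Score of each camera. This is a rough sketch, not Stream's own logic: the function name is hypothetical, and treating "more than one camera below 75%" as system-wide is a simplifying assumption you may want to replace with a fraction of all cameras.

```python
HEALTH_THRESHOLD = 75  # acceptable range is 75% or higher


def classify_drop(health_by_camera, threshold=HEALTH_THRESHOLD):
    """Return (pattern, low_cameras) for a mapping of camera id -> health."""
    low = sorted(
        camera_id
        for camera_id, health in health_by_camera.items()
        # Negative health (-1) means the camera is still starting; skip it.
        if 0 <= health < threshold
    )
    if not low:
        return "healthy", low
    if len(low) > 1:
        return "system-wide", low
    return "camera-specific", low
```

A "system-wide" result points toward a hardware bottleneck, while a "camera-specific" result names the single camera to isolate and test individually.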
Connection Testing Methods
Test with problematic cameras only:
- Configure Stream to process only the affected cameras
- Monitor if the Health Score pattern persists
- Persistent issues indicate unstable camera-to-Stream connection
VLC verification:
# Test RTSP stream stability from the Stream machine
vlc rtsp://camera_url
Watch for artifacts, jittering, or freezing during extended viewing.
FFmpeg dump for headless systems:
If the machine running Stream does not have a graphical interface (i.e., it's running in headless mode), you won’t be able to use visual tools like VLC to check the camera stream.
In such cases, an alternative is to use ffmpeg to record the RTSP stream into a video file. This allows you to inspect the stream quality later on any device.
ffmpeg -i rtsp://camera_url -c copy output_file.mp4
This command connects to the camera stream and saves the data directly to output_file.mp4 without re-encoding (-c copy), making it easy to review the footage for freezing, artifacts, or connection issues.
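For unattended captures, the same command can be wrapped in a script that stops recording after a fixed time. The sketch below is one possible approach: the function names are hypothetical, and the `-t` duration limit is an addition that the original command does not include.

```python
import subprocess


def build_dump_command(rtsp_url, output_path, seconds=60):
    """Build the ffmpeg argument list for a time-limited stream copy."""
    return [
        "ffmpeg",
        "-i", rtsp_url,
        "-c", "copy",        # copy the stream without re-encoding
        "-t", str(seconds),  # assumption: stop after this many seconds
        output_path,
    ]


def dump_stream(rtsp_url, output_path, seconds=60):
    """Run ffmpeg and return its exit code; requires ffmpeg on PATH."""
    command = build_dump_command(rtsp_url, output_path, seconds)
    return subprocess.run(command, check=False).returncode
```

A non-zero exit code from `dump_stream` is itself a useful signal that the connection could not be established or was interrupted.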
RTSP Error Messages
Debug logging (-e LOGGING=10) reveals RTSP error messages normally suppressed during operation.
Typical Error Format
[hevc @ 0xFFFFFFF] error message details
Here hevc may be replaced with other codec or protocol names such as rtsp or rtmp.
Normal vs. Problematic Levels
- Normal: ~5 errors every 10-15 minutes per camera
- Problematic: Frequent errors (averaged per camera) indicating connection issues
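To compare a log against the normal level above, you can count lines that match the typical error format and scale the count to a 15-minute window per camera. This is a minimal sketch under stated assumptions: the function names are hypothetical, the regular expression is derived only from the format shown in this guide, and you must supply the log's time span and camera count yourself.

```python
import re
from collections import Counter

# Matches lines such as "[hevc @ 0xFFFFFFF] error message details".
ERROR_LINE = re.compile(r"\[(\w+) @ 0x[0-9A-Fa-f]+\]")


def count_errors_by_source(log_lines):
    """Count matching error lines, grouped by the bracketed name."""
    counts = Counter()
    for line in log_lines:
        match = ERROR_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def errors_per_15_minutes(error_count, window_seconds, camera_count=1):
    """Scale an error count to a per-camera rate over 15 minutes."""
    return error_count / camera_count * 900 / window_seconds
```

A result well above roughly 5 per camera suggests moving on to the diagnostic actions in the next section.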
Diagnostic Actions
When experiencing frequent RTSP errors:
- Visual verification: Check camera functionality using VLC or similar player
- Connection testing: Test from the same machine running Stream
- Network analysis: Review network stability and bandwidth
Webhook Troubleshooting
Webhook delivery issues can cause missing vehicle data even when detection is working correctly.
Error Message Format
Webhook: Some error [Error details]: detailed_error_information
These messages are prominently displayed and easy to identify in logs.
Common Webhook Issues
- Network connectivity: Webhook target unreachable
- Server errors: Target server returning error responses
- Authentication: Invalid credentials or authentication failures
- Payload issues: Malformed data or unsupported content types
Debugging Steps
- Test webhook endpoint independently
- Verify network connectivity from Stream machine
- Check authentication credentials and methods
- Review payload format and content expectations
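The first two steps can be combined into a small probe run from the Stream machine. This sketch is illustrative only: `probe_webhook` is a hypothetical helper, and the JSON body it sends is a placeholder that does not mimic Stream's real webhook payload, so it verifies reachability and authentication rather than payload handling.

```python
import json
import urllib.error
import urllib.request


def probe_webhook(url, payload, timeout=5.0, headers=None):
    """POST a JSON payload and return (ok, detail) describing the outcome."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Content-Type", "application/json")
    for name, value in (headers or {}).items():
        request.add_header(name, value)  # e.g. an Authorization header
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 401 or 500).
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # No answer at all: DNS failure, refused connection, or timeout.
        return False, f"unreachable: {exc}"
```

An "unreachable" result points to network connectivity, while an HTTP 4xx or 5xx result points to authentication or the target server itself.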
Video Quality Diagnostics
When hardware capacity allows, video output features provide visual confirmation of Stream's processing behavior.
Enabling Video Diagnostics
Configure these parameters for troubleshooting:
- video_format: Save video clips of detections
- video_overlay: Add visual overlays to clips
Benefits and Limitations
Benefits:
- Visual confirmation of detection quality
- Frame-by-frame analysis capability
- Overlay information for debugging
Limitations:
- Resource intensive: Requires additional hardware capacity
- Not scalable: Suitable only for per-camera troubleshooting
- Storage requirements: Generates significant disk usage
Use Cases
- Quality verification: Confirm video input quality matches expectations
- Detection validation: Verify Stream is processing expected areas
- Performance analysis: Identify frame drops or processing delays
Video diagnostics significantly increase CPU, memory, and storage usage. Use only when dedicated troubleshooting capacity is available or when running isolated camera tests.
Best Practices
- Enable debug logging (-e LOGGING=10) when troubleshooting
- Monitor Health Score trends over time, not individual readings
- Test camera connections independently using VLC or FFmpeg
- Isolate problematic cameras for focused diagnosis
- Use Heartbeat server for automated monitoring and alerting
- Document patterns observed during troubleshooting for future reference