Quick Start
This guide walks you through running RTSM and making your first semantic query.
1. Start RTSM
RTSM expects an RGB-D stream with poses via ZeroMQ. Start the main service:
This launches:
| Service | Address |
|---|---|
| REST API | http://localhost:8000 |
| WebSocket (visualization) | ws://localhost:8081 |
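As a quick sanity check that both services came up, the sketch below tries to open a TCP connection to each address in the table. It uses only the Python standard library; the helper name `port_open` is illustrative and not part of RTSM, and a successful connection only shows that something is listening on the port, not that it is RTSM.

```python
import socket

# Hosts and ports taken from the service table above.
SERVICES = {
    "REST API": ("localhost", 8000),
    "WebSocket (visualization)": ("localhost", 8081),
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    for name, (host, port) in SERVICES.items():
        state = "listening" if port_open(host, port) else "not reachable"
        print(f"{name} on {host}:{port}: {state}")
```

If either port shows as not reachable, the service has probably not finished starting or is bound to a different address (see Configuration).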
2. Verify It's Running
You should see system stats including frame count, object count, and memory usage.
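If you prefer to check from a script, a minimal sketch along these lines fetches the stats as JSON from the REST API. The `/stats` path is a placeholder assumption, not a confirmed RTSM endpoint; look up the real path in the REST API Reference. The field names in the response are likewise not assumed here, so the script simply prints whatever comes back.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8000"  # REST API address from step 1
STATS_PATH = "/stats"  # ASSUMPTION: hypothetical path, check the REST API Reference


def fetch_json(url: str, timeout: float = 2.0):
    """GET a URL and decode the body as JSON; return None if anything fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # URLError/OSError: server unreachable or HTTP error; ValueError: body is not JSON.
        print(f"request to {url} failed: {exc}")
        return None


if __name__ == "__main__":
    stats = fetch_json(BASE_URL + STATS_PATH)
    if stats is not None:
        print(json.dumps(stats, indent=2))
```

Returning `None` on failure keeps the script usable as a readiness probe: you can call it in a loop until the service answers.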
3. List Detected Objects
Once frames are streaming, objects will appear in memory:
Response:
4. Semantic Search
Query the object memory in natural language:
Response:
    {
      "query": "red mug",
      "results": [
        {
          "id": "b7d4e2",
          "label": "mug",
          "xyz": [0.8, 0.2, 1.5],
          "score": 0.82
        }
      ]
    }
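To use a search response like the one above in your own code, the following sketch builds a query URL and picks the highest-scoring result. The `/search` path and the `q` parameter name are assumptions made for illustration; the actual endpoint is documented in the REST API Reference. The `min_score` cutoff of 0.5 is an arbitrary example value, not an RTSM default. Only the response shape (`query`, `results`, `id`, `label`, `xyz`, `score`) comes from this guide.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # REST API address from step 1
SEARCH_PATH = "/search"  # ASSUMPTION: hypothetical path
QUERY_PARAM = "q"  # ASSUMPTION: hypothetical parameter name


def search_url(query: str) -> str:
    """Build a URL-encoded search request for a natural-language query."""
    return f"{BASE_URL}{SEARCH_PATH}?{urlencode({QUERY_PARAM: query})}"


def best_match(response: dict, min_score: float = 0.5):
    """Return the highest-scoring result at or above min_score, or None."""
    hits = [r for r in response.get("results", []) if r["score"] >= min_score]
    return max(hits, key=lambda r: r["score"], default=None)


# The sample response shown in this guide.
SAMPLE = json.loads("""
{
  "query": "red mug",
  "results": [
    {"id": "b7d4e2", "label": "mug", "xyz": [0.8, 0.2, 1.5], "score": 0.82}
  ]
}
""")

if __name__ == "__main__":
    print(search_url("red mug"))
    hit = best_match(SAMPLE)
    if hit is not None:
        x, y, z = hit["xyz"]
        print(f"{hit['label']} ({hit['id']}) at x={x}, y={y}, z={z}, score={hit['score']}")
```

Filtering on `score` before acting on a result is a cheap guard against weak matches; tune the cutoff against your own scenes.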
5. View in 3D (Optional)
Open the visualization demo in your browser:
This shows a Three.js point cloud with detected objects overlaid.
Next Steps
- Configuration — Tune thresholds and endpoints
- REST API Reference — Full API documentation
- RTAB-Map Setup — Connect your SLAM system