MCP (Model Context Protocol)

RTSM exposes its spatial memory as MCP tools, so any AI agent can ask "where is the coffee mug?" without writing custom REST integration code. MCP is the standard protocol for connecting AI models to external tools, supported by Claude, Cursor, LangGraph, CrewAI, and others.

Two Transport Options

| Transport | Use case | How it works |
|-----------|----------|--------------|
| SSE (embedded) | Agent connects over HTTP | MCP server mounted on RTSM's API server at /mcp/sse |
| stdio (standalone) | Agent launches RTSM MCP as a subprocess | rtsm-mcp binary, communicates via stdin/stdout |

Quick Start: SSE (embedded)

1. Enable in config

# config/rtsm.yaml
mcp:
  enable: true

2. Start RTSM

pip install "rtsm[gpu]"
python -m rtsm --replay recordings/session1

The MCP endpoint is now live at http://localhost:8002/mcp/sse.

3. Connect your agent

Claude Code — add to your MCP settings:

{
  "mcpServers": {
    "rtsm": {
      "url": "http://localhost:8002/mcp/sse"
    }
  }
}

Cursor — add to .cursor/mcp.json:

{
  "mcpServers": {
    "rtsm": {
      "url": "http://localhost:8002/mcp/sse"
    }
  }
}

Python (any agent framework):

from mcp import ClientSession
from mcp.client.sse import sse_client

async with sse_client("http://localhost:8002/mcp/sse") as (read, write):
    async with ClientSession(read, write) as session:
        await session.initialize()

        # Query spatial memory
        result = await session.call_tool(
            "rtsm.semantic_query",
            arguments={"query": "coffee mug", "top_k": 3}
        )
        print(result.content[0].text)

Quick Start: stdio (standalone)

The stdio transport is useful when your agent framework launches MCP servers as subprocesses.

1. Install

pip install "rtsm[gpu]"

2. Configure your agent

Claude Code — add to MCP settings:

{
  "mcpServers": {
    "rtsm": {
      "command": "rtsm-mcp",
      "env": {
        "RTSM_API_URL": "http://localhost:8002"
      }
    }
  }
}

The rtsm-mcp command connects to a running RTSM instance via its REST API and exposes the same 6 tools over stdio.

Available Tools

| Tool | Description | Example use |
|------|-------------|-------------|
| rtsm.semantic_query | Search by natural language | "where is the coffee mug?" |
| rtsm.spatial_query | Find objects near a 3D point | Objects within 2m of [1.0, 0.5, 0.8] |
| rtsm.relational_query | Find objects near a named object | "what is next to the laptop?" |
| rtsm.list_objects | List all tracked objects | Get full scene inventory |
| rtsm.get_object | Get details for one object | Full info by object ID |
| rtsm.status | System health and stats | Pipeline status, object counts |

Tool Parameters

rtsm.semantic_query

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| query | string | required | Natural language search query |
| top_k | int | 5 | Max results to return |
| threshold | float | 0.2 | Min cosine similarity (0-1) |
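The threshold parameter keeps only matches whose cosine similarity to the query embedding clears the cutoff. A minimal sketch of that filtering logic — the cosine function below is the standard textbook definition, and filter_results is an illustrative stand-in, not RTSM's internal code:

```python
import math

def cosine_similarity(a, b):
    """Standard cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def filter_results(scored, threshold=0.2, top_k=5):
    """Drop results below the threshold, sort best-first, cap at top_k."""
    kept = [r for r in scored if r["score"] >= threshold]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return kept[:top_k]

# Toy scores: only objects at or above the 0.2 default survive
results = filter_results(
    [{"id": "obj_1", "score": 0.85}, {"id": "obj_2", "score": 0.1}]
)
```

Raising threshold trades recall for precision; with the 0.2 default, weak matches like obj_2 above are dropped before top_k is applied.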

rtsm.spatial_query

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| x | float | required | X coordinate (meters) |
| y | float | required | Y coordinate (meters) |
| z | float | required | Z coordinate (meters) |
| radius_m | float | 1.0 | Search radius in meters |
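These parameters describe a Euclidean radius filter around the query point. A hypothetical client-side equivalent of that check (within_radius is illustrative, not RTSM's implementation):

```python
import math

def within_radius(center, xyz_world, radius_m=1.0):
    """True if an object's world position is within radius_m of the query point."""
    dx, dy, dz = (c - p for c, p in zip(center, xyz_world))
    return math.sqrt(dx * dx + dy * dy + dz * dz) <= radius_m

# An object 0.5 m away along x is inside the default 1.0 m radius
inside = within_radius((1.0, 0.5, 0.8), (1.5, 0.5, 0.8))  # True
```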

rtsm.relational_query

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| query | string | required | Name of reference object |
| radius_m | float | 1.0 | Search radius in meters |

rtsm.get_object

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| object_id | string | required | Object ID |
| include_vectors | bool | false | Include embedding vectors |

Example: Agent Asks "Where is the mug?"

Agent → rtsm.semantic_query(query="mug", top_k=1)

RTSM → {
  "query": "mug",
  "robot_pose": {
    "xyz": [0.12, 0.05, 0.31],
    "quaternion_xyzw": [0.0, 0.0, 0.0, 1.0],
    "timestamp": 1712345678.5
  },
  "results": [
    {
      "id": "obj_231",
      "score": 0.85,
      "label_hint": "mug",
      "confirmed": true,
      "xyz_world": [2.31, -0.15, 1.18]
    }
  ]
}

The agent gets both the mug's position AND the robot's current position in one atomic response. It can immediately compute heading and distance to navigate there — no second API call needed.

All search responses (semantic_query, spatial_query, relational_query) include robot_pose. The status tool also returns it. RTSM stores but does not compute pose — it's a passthrough from the sensor (ARKit, RTABMap, etc.).
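The heading-and-distance computation mentioned above is plain trigonometry on the returned coordinates. A sketch using the values from the example response (the identity quaternion in that response means the robot faces +x, so the yaw below is also the relative heading):

```python
import math

# Values taken from the example rtsm.semantic_query response
robot_xyz = [0.12, 0.05, 0.31]   # robot_pose.xyz
mug_xyz = [2.31, -0.15, 1.18]    # results[0].xyz_world

dx = mug_xyz[0] - robot_xyz[0]
dy = mug_xyz[1] - robot_xyz[1]
dz = mug_xyz[2] - robot_xyz[2]

distance_m = math.sqrt(dx * dx + dy * dy + dz * dz)  # straight-line distance
heading_rad = math.atan2(dy, dx)                     # yaw toward the mug, world frame

print(f"distance: {distance_m:.2f} m, heading: {math.degrees(heading_rad):.1f} deg")
# distance: 2.36 m, heading: -5.2 deg
```

With a non-identity quaternion, the agent would additionally subtract the robot's current yaw from heading_rad to get the turn angle.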