MCP performance and rate limits

Loomi Connect MCP applies rate limits and caches responses to protect service stability. This page covers performance considerations, how rate limits and caching behave, and how your client should respond.

Performance

Dashboard-level APIs: Tools call the same APIs that power the Bloomreach UI, not dedicated backend endpoints. Responses can be slower or less consistent than those of a production API.
Analytics latency: Ad-hoc analytics computations can take 10 to 30 seconds on large datasets.
Cross-project calls: Cross-project overview tools make one API call per project, so they can be slow for organizations with many projects.
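
Given the latencies listed above, a client timeout shorter than about 30 seconds may cut off analytics calls that would otherwise succeed. The following is a minimal sketch of a generous per-call timeout, not part of any Loomi Connect SDK. The `call_tool` coroutine and the 60-second budget are assumptions chosen for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Assumed budget: comfortably above the 10 to 30 second range quoted above.
ANALYTICS_TIMEOUT_SECONDS = 60.0


async def call_with_timeout(
    call_tool: Callable[[], Awaitable[Any]],  # hypothetical: wraps one MCP tool call
    timeout: float = ANALYTICS_TIMEOUT_SECONDS,
) -> Any:
    """Await one tool call, raising TimeoutError if it exceeds the budget."""
    try:
        return await asyncio.wait_for(call_tool(), timeout=timeout)
    except asyncio.TimeoutError:
        # Re-raise as the built-in TimeoutError with a readable message.
        raise TimeoutError(f"tool call exceeded {timeout:.0f}s") from None
```

For cross-project tools, a larger budget may be appropriate, since the number of underlying API calls grows with the number of projects.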

Rate limits

Rate limits prevent burst traffic from overwhelming downstream systems.

A single request can be subject to multiple limits. All applicable limits must pass before the tool runs.

If a request exceeds a limit, the MCP returns an error with retry guidance. Back off and retry once the suggested wait time has passed.

Warning
When a tool call is rate limited, the server returns the following error message:
Too many requests: rate limit reached for key '<key>' (<limit>). Retry after ~<N> second(s).
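
A client can read the suggested wait time out of this message. The sketch below assumes the message keeps the wording of the template above; the regular expression and the function name `retry_after_seconds` are illustrative, not part of the server's contract.

```python
import re
from typing import Optional

# Pattern derived from the message template above. The exact wording is an
# assumption and may change between server versions.
_RETRY_AFTER = re.compile(r"Retry after ~?(\d+(?:\.\d+)?) second")


def retry_after_seconds(message: str) -> Optional[float]:
    """Return the suggested wait in seconds, or None if no hint is present."""
    match = _RETRY_AFTER.search(message)
    return float(match.group(1)) if match else None
```

Returning `None` when the pattern does not match lets the caller fall back to its own default delay instead of failing on an unexpected message format.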

Downstream project APIs impose their own limits as well. Handle throttling errors even after a request passes MCP-level checks.

Handle rate limit errors

Apply retry and backoff logic in your client. Exponential backoff, which doubles the wait after each failed attempt, is a common pattern.
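
One way to sketch exponential backoff is shown here. The `RateLimited` exception, the default delays, and the attempt cap are assumptions for illustration; adapt them to however your MCP client surfaces rate limit errors.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical client-side error raised when a tool call is throttled."""

    def __init__(self, message: str, retry_after: Optional[float] = None):
        super().__init__(message)
        self.retry_after = retry_after  # server-suggested wait, if known


def call_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimited, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            backoff = min(max_delay, base_delay * 2 ** attempt)
            # Never wait less than the server's suggested time, if one was given.
            wait = max(backoff, err.retry_after or 0.0)
            # Add up to 10% jitter so parallel clients do not retry in lockstep.
            sleep(wait + random.uniform(0, wait * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the retry schedule can be tested without real delays. Capping the number of attempts keeps a persistently throttled client from retrying indefinitely.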

Caching

Loomi Connect MCP caches responses to reduce load on the Engagement API and improve latency. Expect some reads to come from cache. If your workflow depends on near-real-time data, account for a short delay between when a change is written and when it becomes visible in MCP reads.
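
If a workflow must confirm that a write is visible before continuing, one option is to re-read at a conservative interval until the new value appears. This is a minimal sketch: the `read` and `is_fresh` callables, the 30-second timeout, and the 2-second interval are all assumptions, since actual cache lifetimes vary by resource type.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until_visible(
    read: Callable[[], T],            # hypothetical: performs one MCP read
    is_fresh: Callable[[T], bool],    # hypothetical: True once the change shows up
    timeout: float = 30.0,            # assumed overall budget in seconds
    interval: float = 2.0,            # assumed pause between reads in seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Re-read until `is_fresh` accepts the value or `timeout` elapses."""
    deadline = clock() + timeout
    while True:
        value = read()
        if is_fresh(value):
            return value
        # Stop if another pause would push us past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError("change not visible before the deadline")
        sleep(interval)
```

Each re-read counts as a request, so keep the interval long enough that polling does not itself trigger rate limits.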

Cache lifetimes

Cache lifetimes vary by resource type:

Resources that change frequently are cached for only a few seconds.
More stable metadata is cached for longer.