
MarcoPolo

Governed conversational access to distributed enterprise data — query across sources in plain English without dismantling the governance model that keeps those sources safe.

Read-only: execution enforcement
5+ sources: supported data types
RBAC: workspace isolation
Full audit: every query and refresh

What MarcoPolo does

Enterprise data is fragmented by necessity: structured tables in relational databases, unstructured documents in NoSQL stores, flat files in object storage, spreadsheets in shared drives. Bringing these together for analysis normally requires either a data engineer and a week of pipeline work, or a governance compromise that exposes more than it should.

MarcoPolo's answer: govern the access, not just the query. Natural language questions are translated into validated query plans. Read-only execution is enforced at the engine level, not just in the user interface. DuckDB stitches cross-source results in memory within configurable row and memory limits. Every answer is attributable to the sources it came from.

Dashboard creation follows the same rules. When a user pins a query result as a dashboard and refreshes it a week later, the refresh runs through identical RBAC policies and datasource allowlists as the original query — not a cached permissive shortcut.

Core capabilities

Multi-source natural language query

Query PostgreSQL, MongoDB, S3-compatible object storage, JSON files, and Excel spreadsheets — from a single natural language question, without writing SQL or building connectors.

Read-only execution enforcement

Query plans are validated before execution. No writes, no schema changes, no privilege escalation. Read-only is enforced at the engine level, not left to trust.
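As an illustration of what engine-level enforcement can look like, here is a minimal sketch that uses SQLite's statement authorizer from the Python standard library as a stand-in engine. MarcoPolo's own mechanism is not described on this page, so the engine choice, the `orders` table, and the helper names `open_read_only` and `is_rejected` are all assumptions made for the example.

```python
import sqlite3

# Action codes a plain SELECT needs: the statement itself and column reads.
READ_ONLY_ACTIONS = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ}


def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Allow reads; deny everything else (writes, DDL, PRAGMA, ATTACH)."""
    if action in READ_ONLY_ACTIONS:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


def open_read_only(seed_sql: str) -> sqlite3.Connection:
    """Build a demo database, then lock the connection to read-only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(seed_sql)  # seeding happens before the lock is installed
    conn.set_authorizer(read_only_authorizer)
    return conn


# Hypothetical demo data, not a real MarcoPolo datasource.
conn = open_read_only(
    "CREATE TABLE orders (id INTEGER, total REAL);"
    "INSERT INTO orders VALUES (1, 9.5), (2, 20.0);"
)

rows = conn.execute("SELECT id, total FROM orders ORDER BY id").fetchall()


def is_rejected(sql: str) -> bool:
    """Return True if the engine itself refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False
```

In this sketch the refusal comes from the engine when the statement is compiled, so a write is rejected even if a caller bypasses whatever checks sit in the user interface.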

DuckDB-powered cross-source joins

Results from multiple sources are joined in memory by DuckDB within strict row count and memory limits. No permanent intermediate tables; no data movement outside the defined execution boundary.

Persistent governed dashboards

Pin any query result as a persistent dashboard. Dashboard refreshes re-run the underlying query through the same RBAC and allowlist policies as the original — no governance bypass at refresh time.
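One way to picture refresh-time governance is a dashboard that stores only the query definition and re-authorizes on every refresh. The sketch below is illustrative only: `ALLOWLIST`, `PolicyError`, `pin`, `refresh`, and the datasource names are hypothetical and do not come from MarcoPolo's API.

```python
from dataclasses import dataclass


class PolicyError(Exception):
    """Raised when a user requests a datasource outside their allowlist."""


# Hypothetical live policy store: user -> datasources currently allowed.
ALLOWLIST = {"ana": {"postgres.sales", "s3.exports"}}


@dataclass(frozen=True)
class Dashboard:
    owner: str
    question: str
    sources: tuple[str, ...]  # only the query definition is pinned, not a result


def authorize(user: str, sources: tuple[str, ...]) -> None:
    """Check every requested source against the current allowlist."""
    allowed = ALLOWLIST.get(user, set())
    denied = sorted(s for s in sources if s not in allowed)
    if denied:
        raise PolicyError(f"{user} may not read: {', '.join(denied)}")


def run_query(user: str, question: str, sources: tuple[str, ...]) -> str:
    """Single entry point shared by ad hoc queries and dashboard refreshes."""
    authorize(user, sources)
    return f"answer to {question!r} from {', '.join(sources)}"


def pin(user: str, question: str, sources: tuple[str, ...]) -> Dashboard:
    run_query(user, question, sources)  # the original query must pass policy
    return Dashboard(user, question, tuple(sources))


def refresh(dashboard: Dashboard) -> str:
    # No cached result and no stored permission: policy is evaluated again.
    return run_query(dashboard.owner, dashboard.question, dashboard.sources)
```

Because `refresh` goes through the same `run_query` path as the original question, removing a source from the owner's allowlist after pinning makes the next refresh fail instead of serving stale, over-permissive data.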

Workspace isolation & RBAC

Team-scoped workspaces with role-based access control. Each user sees only the datasources on their allowlist. Cross-workspace access is not possible without explicit permission grants.
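The allowlist rule can be sketched as a small lookup: members see what their role permits inside their own workspace, and outsiders see only what has been explicitly granted. The `User` and `Workspace` types, the role names, and the datasource names below are assumptions for illustration, not MarcoPolo's data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    name: str
    workspace: str
    role: str


@dataclass
class Workspace:
    name: str
    # role -> datasources that role may query inside this workspace
    allowlists: dict[str, set[str]] = field(default_factory=dict)
    # explicit cross-workspace grants: outside user name -> datasources
    grants: dict[str, set[str]] = field(default_factory=dict)


def permitted_sources(user: User, ws: Workspace) -> set[str]:
    """Return the datasources in `ws` that `user` may query."""
    if user.workspace == ws.name:
        return set(ws.allowlists.get(user.role, set()))
    # Users from other workspaces get nothing unless explicitly granted.
    return set(ws.grants.get(user.name, set()))
```

The default for an outside user is the empty set, which mirrors the rule that cross-workspace access requires an explicit grant.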

Full query audit trail

Every query and dashboard refresh is logged — who asked, what sources were accessed, what rows were returned, and which policies were applied. Audit records are protected from modification.
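This page does not say how audit records are protected from modification. One common technique, shown here purely as an assumption, is a hash chain in which each record's hash covers the previous record's hash, so any later edit is detectable. The `AuditLog` class and its field names are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(prev_hash: str, entry: dict) -> str:
    """Hash an entry together with the hash of the record before it."""
    payload = json.dumps(entry, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + payload).hexdigest()


@dataclass
class AuditLog:
    records: list = field(default_factory=list)

    def append(self, user: str, sources: list, rows_returned: int, policies: list) -> None:
        entry = {
            "user": user,
            "sources": sources,
            "rows_returned": rows_returned,
            "policies": policies,
        }
        prev = self.records[-1]["hash"] if self.records else GENESIS
        self.records.append({"entry": entry, "hash": _digest(prev, entry)})

    def verify(self) -> bool:
        """Recompute the chain; False means some record was altered."""
        prev = GENESIS
        for record in self.records:
            if record["hash"] != _digest(prev, record["entry"]):
                return False
            prev = record["hash"]
        return True
```

A chain like this makes tampering evident rather than impossible; a production system would typically pair it with write-once storage or restricted write permissions.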

How it works

  1. Natural language query intake

    The user submits a question in plain English from within their permitted workspace. MarcoPolo identifies the relevant datasources based on the question semantics and the user's allowlist.

  2. Query plan generation and validation

    A structured query plan is generated for each relevant datasource. Plans are validated against a schema of permitted operations before any execution begins — read-only operations only, bounded by row and column limits.

  3. Per-source execution

    Validated plans execute against each permitted datasource. Results are returned as bounded datasets: rows beyond the limit are truncated, and users are told whenever a limit was applied, so nothing is dropped silently.

  4. Cross-source stitching with DuckDB

    Where the query spans multiple sources, DuckDB joins the result sets in memory within the configured execution limits. No intermediate data is written to persistent storage during this step.

  5. Answer delivery and audit log write

    The final answer is presented in natural language with source attribution. The complete execution — query plan, sources accessed, rows returned, user identity, policies applied — is written to the audit log.
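Steps 2 and 3 above can be sketched as a plan validator followed by a bounded fetch that reports truncation. This is a minimal sketch under stated assumptions: the `QueryPlan` fields, the set of permitted operations, and the `MAX_ROWS` and `MAX_COLUMNS` values are invented for the example and are not MarcoPolo's actual schema or limits.

```python
from dataclasses import dataclass
from itertools import islice

# Hypothetical policy values; real limits are configurable per deployment.
PERMITTED_OPERATIONS = {"select", "find", "read_object"}
MAX_ROWS = 1000
MAX_COLUMNS = 50


@dataclass(frozen=True)
class QueryPlan:
    source: str
    operation: str
    columns: tuple[str, ...]
    row_limit: int


@dataclass(frozen=True)
class BoundedResult:
    rows: list
    truncated: bool  # surfaced to the user so truncation is never silent


def validate(plan: QueryPlan, allowlist: set) -> list:
    """Return a list of violations; an empty list means the plan may run."""
    errors = []
    if plan.source not in allowlist:
        errors.append("source not on allowlist")
    if plan.operation not in PERMITTED_OPERATIONS:
        errors.append("operation not permitted")
    if not 0 < plan.row_limit <= MAX_ROWS:
        errors.append("row limit out of bounds")
    if len(plan.columns) > MAX_COLUMNS:
        errors.append("too many columns")
    return errors


def bound(rows, limit: int) -> BoundedResult:
    """Take at most `limit` rows and record whether more were available."""
    head = list(islice(rows, limit + 1))  # read one extra row to detect overflow
    return BoundedResult(rows=head[:limit], truncated=len(head) > limit)
```

Reading one row past the limit lets the caller tell a result that exactly fills the limit apart from one that was cut off, without scanning the whole source.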

Supported data sources

Source type            Details
PostgreSQL             Standard relational queries; column and row-level permissions honoured
MongoDB                Document queries with field projection; collection allowlists enforced
S3-compatible storage  AWS S3, Cloudflare R2, MinIO; object and prefix allowlists
JSON files             Structured and newline-delimited JSON; path-based access controls
Excel / CSV            Workbooks and flat files; sheet and column allowlists

Get the MarcoPolo white paper

Query architecture, governance model, execution limits, RBAC design, and deployment guide — available on request.
