Python Reference

Sync surface

- rawctx.search()
- rawctx.info()
- rawctx.snapshot_download()
- rawctx.download()
- rawctx.load()
- rawctx.to_prompt()

These helpers open a short-lived client for you and are the fastest way to wire rawctx into scripts.

Client surface

- RawctxClient
- AsyncRawctxClient
- search()
- info()
- load()
- diff()

Use clients when you need a single configured registry, token, or timeout across multiple calls in a worker or notebook session.

Diff surface

- semantic_diff()
- prompt_diff()
- diff_artifacts()
- client.diff()

Diff helpers return JSON-shaped reports. They compare artifacts only and do not query a warehouse or source system.

Reference

Top-level API surface

These calls cover public package review, local snapshot handoff, and artifact-only diffing.

import rawctx

# Public package review: search the registry, then inspect one package.
search_result = rawctx.search("stripe subscriptions", sort="recent")
info_result = rawctx.info("@pasar6987/stripe-subscriptions")

# Local snapshot handoff: fetch a snapshot and load the normalized model.
snapshot_dir = rawctx.snapshot_download("@pasar6987/stripe-subscriptions")
model = rawctx.load("@pasar6987/stripe-subscriptions")

# Prompt-ready context for two datasets, capped by a size target.
prompt = rawctx.to_prompt(
    "@pasar6987/stripe-subscriptions",
    datasets=["subscriptions", "invoices"],
    max_tokens=2000,
)

# Artifact-only diffing between two package paths.
semantic = rawctx.semantic_diff("./pkg-v1", "./pkg-v2")
prompt_report = rawctx.prompt_diff("./pkg-v1", "./pkg-v2", max_tokens=2000)
combined = rawctx.diff_artifacts("./pkg-v1", "./pkg-v2")

Reference

LoadedPackage fields

load() normalizes OSI and native MetricFlow packages into the same runtime shape so your code does not branch on package lane.

LoadedPackage(
    package="@pasar6987/stripe-subscriptions",
    version="1.0.0",
    model_paths=["models/subscriptions.yml", "models/invoices.yml"],
    snapshot_dir=Path("..."),
    format_name="metricflow",
    datasets=["subscriptions", "invoices"],
    measures=[Measure(name="mrr", dataset="subscriptions", ...)],
    metrics=[Metric(name="net_revenue", dataset="subscriptions", ...)],
    dimensions=[Dimension(name="subscription_status", dataset="subscriptions", ...)],
    relationships=[Relationship(name="invoice_to_subscription", ...)],
    models=[SemanticModel(name="subscriptions", path="models/subscriptions.yml", ...)],
)
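
Because every package arrives in the same shape, downstream code can walk the fields without checking the package format. The sketch below illustrates that with stand-in dataclasses that mirror only a few of the fields shown above. The stand-in classes and the `measures_by_dataset` helper are assumptions for illustration and are not part of rawctx.

```python
from collections import defaultdict
from dataclasses import dataclass, field


# Stand-in types mirroring a subset of the fields listed above.
# The real rawctx classes may carry more attributes.
@dataclass
class Measure:
    name: str
    dataset: str


@dataclass
class LoadedPackage:
    package: str
    version: str
    datasets: list[str]
    measures: list[Measure] = field(default_factory=list)


def measures_by_dataset(pkg: LoadedPackage) -> dict[str, list[str]]:
    """Group measure names under the dataset they belong to (hypothetical helper)."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for measure in pkg.measures:
        grouped[measure.dataset].append(measure.name)
    # Report every declared dataset, including those without measures.
    return {name: grouped.get(name, []) for name in pkg.datasets}


pkg = LoadedPackage(
    package="@pasar6987/stripe-subscriptions",
    version="1.0.0",
    datasets=["subscriptions", "invoices"],
    measures=[Measure(name="mrr", dataset="subscriptions")],
)
index = measures_by_dataset(pkg)
```

With a real package, the same loop would presumably run over the object returned by load(), assuming its attributes match the field names listed above.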

to_prompt() call flow

Top-level helpers open a short-lived client, then generate prompt-ready context from the resolved package snapshot.
The same normalized semantic objects used by load() are used for prompt generation.
Package metadata, dataset filters, and prompt budget settings shape the rendered output.

datasets filtering

Passing None includes every loaded dataset.
String and list inputs are accepted; order is preserved and duplicates are ignored.
Unknown names fail with UsageError: Unknown dataset(s).
Subset prompts keep metrics for selected datasets and relationships touching selected datasets.
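
The filtering rules above can be sketched in a few lines of plain Python. This is an illustration of the documented behavior, not rawctx's implementation: `select_datasets` is a hypothetical helper, the `UsageError` class is a local stand-in, and the exact wording after "Unknown dataset(s)" is an assumption.

```python
class UsageError(Exception):
    """Local stand-in for rawctx's UsageError so the sketch runs on its own."""


def select_datasets(requested, available):
    """Resolve a datasets argument against the loaded dataset names."""
    if requested is None:
        # None includes every loaded dataset.
        return list(available)
    # Accept a single string or any iterable of names.
    names = [requested] if isinstance(requested, str) else list(requested)
    # dict.fromkeys keeps first-seen order and drops duplicates.
    selected = list(dict.fromkeys(names))
    unknown = [name for name in selected if name not in available]
    if unknown:
        # Message suffix is assumed; only the prefix appears in the docs.
        raise UsageError(f"Unknown dataset(s): {', '.join(unknown)}")
    return selected


loaded = ["subscriptions", "invoices"]
everything = select_datasets(None, loaded)
single = select_datasets("invoices", loaded)
deduped = select_datasets(["invoices", "subscriptions", "invoices"], loaded)
```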

Format-specific enrichment

OSI packages and native MetricFlow packages are normalized before prompt rendering.
The prompt can include useful semantic labels, keys, time defaults, metrics, and relationships when present in the package.
Both lanes render through the same public SDK surface.

Prompt shape

`to_prompt()` output sections

The section order is stable so agents and prompt diff reports can compare context predictably.

Domain: {domain} ({package_name})

Models:
- semantic model summaries for the selected datasets

Datasets:
- dataset labels, source, keys, grain, time defaults, measures, metrics, dimensions

Metrics:
- metric names, types, and dependencies when available

Relationships:
- relevant joins for the selected datasets
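
To see why a stable section order helps comparison, the sketch below renders a simplified prompt in a fixed order and diffs two versions with the standard library's `difflib`. The `render` and `changed_lines` helpers, the omission of the Domain header, and the `churn_rate` metric are assumptions for illustration. rawctx's own renderer and prompt diff reports may work differently.

```python
import difflib

# Fixed order taken from the section list above (Domain header omitted).
SECTION_ORDER = ["Models", "Datasets", "Metrics", "Relationships"]


def render(sections: dict[str, list[str]]) -> str:
    """Render sections in a fixed order, regardless of dict insertion order."""
    lines = []
    for title in SECTION_ORDER:
        lines.append(f"{title}:")
        lines.extend(f"- {entry}" for entry in sections.get(title, []))
        lines.append("")
    return "\n".join(lines)


def changed_lines(old: str, new: str) -> list[str]:
    """Return only the added and removed lines between two prompts."""
    diff = list(
        difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="", n=0)
    )
    # Skip the two file-header lines, then drop the "@@" hunk headers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


v1 = render({"Metrics": ["net_revenue"], "Datasets": ["subscriptions"]})
v2 = render({"Datasets": ["subscriptions"], "Metrics": ["net_revenue", "churn_rate"]})
changes = changed_lines(v1, v2)
```

Because both prompts are laid out in the same order, the diff contains only the added metric line rather than noise from reordered sections.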

Prompt budget

How `max_tokens` is applied

The cap is a practical prompt-size target. rawctx prioritizes the semantic objects most useful for agents and compacts detail when the requested budget is tight.

- max_tokens is a practical size target, not a model-specific tokenizer guarantee
- rawctx keeps high-signal semantic context first and compacts lower-priority detail when needed
- use return_context=True to inspect selected objects, estimated size, render hash, omissions, and warnings
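
A size target of this kind can be pictured as a greedy pass over prioritized entries. The sketch below is one possible reading, not rawctx's algorithm: the four-characters-per-token estimate, the numeric priorities, and the `fit_to_budget` helper are all assumptions, and it drops low-priority entries outright instead of compacting them.

```python
import math


def estimate_tokens(text: str) -> int:
    # Rough heuristic (about four characters per token), not a real tokenizer.
    return math.ceil(len(text) / 4)


def fit_to_budget(entries, max_tokens):
    """Keep entries in priority order until the estimated budget is used up.

    entries: list of (priority, text); a lower number means higher priority.
    """
    kept, omitted, used = [], [], 0
    for _priority, text in sorted(entries, key=lambda entry: entry[0]):
        cost = estimate_tokens(text)
        if used + cost <= max_tokens:
            kept.append(text)
            used += cost
        else:
            # Record what was left out so callers can inspect omissions.
            omitted.append(text)
    return {"kept": kept, "omitted": omitted, "estimated_tokens": used}


entries = [
    (2, "dimension subscription_status: current state of the subscription"),
    (0, "metric net_revenue"),
    (1, "measure mrr"),
]
report = fit_to_budget(entries, max_tokens=10)
```

The returned dictionary loosely mirrors the kind of information the last bullet describes (selected objects, estimated size, omissions), though the real return_context payload is not specified here.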

search() return shape

items: package entries that mirror rawctx Hub search results.
meta: pagination plus engine details such as query expansion mode when available.

info() return shape

package: package-level facts such as scope, origin, source, license, and README.
versions: published version records, including checksum and summary metadata when exposed.
selected_version: populated when the package ref pins an exact version.
model_paths and model_paths_version: quick access to declared model files for the selected or latest version.

Failure conditions

UsageError: invalid refs, missing datasets, invalid paging, or unsupported invocation.
AuthRequiredError: claim, publish, favorites, or private access without a token.
RegistryError: remote request or OAuth flow failures.
ValidationError: manifest, OSI, or MetricFlow file validation problems.
OfflineCacheMissError: offline reads without a cached archive or cached package metadata.
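
One way to act on these categories is to translate each exception into a short hint for the caller. The sketch below uses local stand-in classes named after the exceptions above so it runs without rawctx installed. The `describe_failure` helper, the hint wording, and the choice to treat only RegistryError as retryable are assumptions, not documented rawctx behavior.

```python
# Local stand-ins named after the documented exceptions.
class UsageError(Exception): ...
class AuthRequiredError(Exception): ...
class RegistryError(Exception): ...
class ValidationError(Exception): ...
class OfflineCacheMissError(Exception): ...


# Hints paraphrase the failure conditions listed above.
HINTS = {
    "UsageError": "check the package ref, dataset names, and paging arguments",
    "AuthRequiredError": "supply a token for this operation",
    "RegistryError": "remote request or OAuth flow failed",
    "ValidationError": "fix the manifest, OSI, or MetricFlow files",
    "OfflineCacheMissError": "no cached archive or metadata for offline use",
}

# Assumption: only remote failures are worth retrying automatically.
RETRYABLE = {"RegistryError"}


def describe_failure(exc: Exception) -> tuple[str, bool]:
    """Return (hint, retryable) for an exception, matched by class name."""
    name = type(exc).__name__
    return HINTS.get(name, "unexpected error"), name in RETRYABLE


hint, retry = describe_failure(RegistryError("timeout"))
```

Matching by class name keeps the sketch independent of where the real exception classes are imported from. In application code, catching the actual rawctx exception types directly would be the more conventional choice.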

Related docs

Use the failure guide when you need shell or CI handling

Python exceptions and CLI failures line up, but the handling surface is different. Use the error guide for shell exit behavior, auth gating, and offline-cache failure notes.

