Python reference

Know the exact objects and helper functions you get back

This page stays API-shaped: public functions, clients, payload keys, and the object fields that matter in notebooks, services, and agents.

Sync surface
rawctx.search(), rawctx.info(), rawctx.snapshot_download(), rawctx.download(), rawctx.load(), rawctx.to_prompt()

These helpers open a short-lived client for you and are the fastest way to wire rawctx into scripts.

Client surface
RawctxClient, AsyncRawctxClient, search(), info(), load(), diff()

Use clients when you need a single configured registry, token, or timeout across multiple calls in a worker or notebook session.

Diff surface
semantic_diff(), prompt_diff(), diff_artifacts(), client.diff()

Diff helpers return JSON-shaped reports. They compare artifacts only and do not query a warehouse or source system.

Reference

Top-level API surface

These calls cover public package review, local snapshot handoff, and artifact-only diffing.

import rawctx

# Review public packages on the Hub.
search_result = rawctx.search("stripe subscriptions", sort="recent")
info_result = rawctx.info("@pasar6987/stripe-subscriptions")

# Hand a package snapshot to local code, then load it into the normalized runtime shape.
snapshot_dir = rawctx.snapshot_download("@pasar6987/stripe-subscriptions")
model = rawctx.load("@pasar6987/stripe-subscriptions")

# Render prompt-ready context for two datasets within a size target.
prompt = rawctx.to_prompt(
    "@pasar6987/stripe-subscriptions",
    datasets=["subscriptions", "invoices"],
    max_tokens=2000,
)

# Compare two local artifacts; no warehouse or source system is queried.
semantic = rawctx.semantic_diff("./pkg-v1", "./pkg-v2")
prompt_report = rawctx.prompt_diff("./pkg-v1", "./pkg-v2", max_tokens=2000)
combined = rawctx.diff_artifacts("./pkg-v1", "./pkg-v2")

Reference

LoadedPackage fields

load() normalizes OSI and native MetricFlow packages into the same runtime shape so your code does not branch on package lane.

LoadedPackage(
    package="@pasar6987/stripe-subscriptions",
    version="1.0.0",
    model_paths=["models/subscriptions.yml", "models/invoices.yml"],
    snapshot_dir=Path("..."),
    format_name="metricflow",
    datasets=["subscriptions", "invoices"],
    measures=[Measure(name="mrr", dataset="subscriptions", ...)],
    metrics=[Metric(name="net_revenue", dataset="subscriptions", ...)],
    dimensions=[Dimension(name="subscription_status", dataset="subscriptions", ...)],
    relationships=[Relationship(name="invoice_to_subscription", ...)],
    models=[SemanticModel(name="subscriptions", path="models/subscriptions.yml", ...)],
)
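
Because load() returns one runtime shape for both lanes, downstream code can index the semantic objects without checking format_name. The following is a minimal sketch of that idea. It uses hypothetical stand-in dataclasses that copy only the name and dataset fields shown above; the real rawctx classes carry more fields and are not imported here, and the "amount_due" measure is invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


# Stand-ins for illustration only: they mirror the `name` and `dataset`
# fields from the LoadedPackage example, not the real rawctx classes.
@dataclass(frozen=True)
class Measure:
    name: str
    dataset: str


@dataclass(frozen=True)
class Metric:
    name: str
    dataset: str


def names_by_dataset(objects):
    """Group semantic object names under the dataset they belong to."""
    grouped = defaultdict(list)
    for obj in objects:
        grouped[obj.dataset].append(obj.name)
    return dict(grouped)


# "amount_due" is a made-up measure name used only for this sketch.
measures = [Measure("mrr", "subscriptions"), Measure("amount_due", "invoices")]
metrics = [Metric("net_revenue", "subscriptions")]

measure_index = names_by_dataset(measures)
metric_index = names_by_dataset(metrics)
```

The same helper works for measures, metrics, and dimensions because each carries a dataset field in the normalized shape.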

to_prompt() call flow
  • Top-level helpers open a short-lived client, then generate prompt-ready context from the resolved package snapshot.
  • The same normalized semantic objects used by load() are used for prompt generation.
  • Package metadata, dataset filters, and prompt budget settings shape the rendered output.

datasets filtering
  • None includes every loaded dataset.
  • String and list inputs are accepted; order is preserved and duplicates are ignored.
  • Unknown names fail with UsageError: Unknown dataset(s).
  • Subset prompts keep metrics for selected datasets and relationships touching selected datasets.
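
The filtering rules above can be sketched as a small normalizer. This is an illustration of the documented behavior, not rawctx's implementation: normalize_datasets is a hypothetical helper, UsageError is a local stand-in for the rawctx error of the same name, and listing the offending names after "Unknown dataset(s)" is an assumption about the message format.

```python
class UsageError(Exception):
    """Local stand-in for the rawctx error of the same name."""


def normalize_datasets(requested, loaded):
    """Resolve a `datasets` argument against the loaded dataset names."""
    if requested is None:
        return list(loaded)  # None selects every loaded dataset
    if isinstance(requested, str):
        requested = [requested]  # a bare string means one dataset
    # dict.fromkeys drops duplicates while keeping first-seen order.
    selected = list(dict.fromkeys(requested))
    unknown = [name for name in selected if name not in loaded]
    if unknown:
        # Message suffix is an assumption; the page only shows the prefix.
        raise UsageError(f"Unknown dataset(s): {', '.join(unknown)}")
    return selected


loaded = ["subscriptions", "invoices"]
everything = normalize_datasets(None, loaded)
single = normalize_datasets("invoices", loaded)
deduped = normalize_datasets(["invoices", "subscriptions", "invoices"], loaded)
```

Note that the selected order follows the caller's input, not the order in which the package declares its datasets.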

Format-specific enrichment
  • OSI packages and native MetricFlow packages are normalized before prompt rendering.
  • The prompt can include useful semantic labels, keys, time defaults, metrics, and relationships when present in the package.
  • Both lanes render through the same public SDK surface.

Prompt shape

to_prompt() output sections

The section order is stable so agents and prompt diff reports can compare context predictably.

Domain: {domain} ({package_name})

Models:
- semantic model summaries for the selected datasets

Datasets:
- dataset labels, source, keys, grain, time defaults, measures, metrics, dimensions

Metrics:
- metric names, types, and dependencies when available

Relationships:
- relevant joins for the selected datasets
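
Since the section order is stable, a consumer can split rendered output on the section headers. The sketch below assumes the headers appear on their own lines exactly as shown above; split_sections is a hypothetical helper, and the sample text (including the "billing" domain) is hand-written in the documented layout rather than real to_prompt() output.

```python
SECTION_HEADERS = ("Models:", "Datasets:", "Metrics:", "Relationships:")


def split_sections(prompt_text):
    """Map each section name (header without the colon) to its body lines."""
    sections = {}
    current = None
    for line in prompt_text.splitlines():
        stripped = line.strip()
        if stripped in SECTION_HEADERS:
            current = stripped[:-1]
            sections[current] = []
        elif stripped.startswith("Domain:") and current is None:
            # The Domain line is a single header line, not a multi-line section.
            sections["Domain"] = [stripped[len("Domain:"):].strip()]
        elif current is not None and stripped:
            sections[current].append(stripped)
    return sections


# Hand-written sample in the documented layout, not real to_prompt() output.
sample = """Domain: billing (@pasar6987/stripe-subscriptions)

Models:
- subscriptions

Datasets:
- subscriptions: grain subscription_id

Metrics:
- net_revenue

Relationships:
- invoice_to_subscription
"""

sections = split_sections(sample)
```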

Prompt budget

How max_tokens is applied

The cap is a practical prompt-size target. rawctx prioritizes the semantic objects most useful for agents and compacts detail when the requested budget is tight.

- max_tokens is a practical size target, not a model-specific tokenizer guarantee
- rawctx keeps high-signal semantic context first and compacts lower-priority detail when needed
- use return_context=True to inspect selected objects, estimated size, render hash, omissions, and warnings
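
To make the "size target, not a tokenizer guarantee" point concrete, here is a rough sketch of budget fitting. Everything in it is an assumption for illustration: the four-characters-per-token estimate, the numeric priorities, and the helper names are hypothetical, and the sketch drops whole blocks where rawctx is described as compacting detail.

```python
def estimate_tokens(text):
    """Rough size estimate: about four characters per token (assumption)."""
    return max(1, len(text) // 4)


def fit_to_budget(blocks, max_tokens):
    """Keep blocks in priority order while they fit the estimated budget.

    `blocks` is a list of (priority, label, text); lower priority numbers
    are considered first. Returns (kept_labels, omitted_labels, used).
    """
    kept, omitted, used = [], [], 0
    for _, label, text in sorted(blocks, key=lambda block: block[0]):
        cost = estimate_tokens(text)
        if used + cost <= max_tokens:
            kept.append(label)
            used += cost
        else:
            omitted.append(label)
    return kept, omitted, used


blocks = [
    (0, "metrics", "m" * 40),        # 10 estimated tokens
    (1, "relationships", "r" * 40),  # 10 estimated tokens
    (2, "descriptions", "d" * 80),   # 20 estimated tokens
]
kept, omitted, used = fit_to_budget(blocks, max_tokens=25)
```

Because the estimate is heuristic, the rendered text may tokenize to a different count under a specific model's tokenizer, which is why the cap is best treated as a target.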

search() return shape
  • items: package entries that mirror rawctx Hub search results.
  • meta: pagination plus engine details such as query expansion mode when available.

info() return shape
  • package: package-level facts such as scope, origin, source, license, and README.
  • versions: published version records, including checksum and summary metadata when exposed.
  • selected_version: populated when the package ref pins an exact version.
  • model_paths and model_paths_version: quick access to declared model files for the selected or latest version.

Failure conditions
  • UsageError: invalid refs, missing datasets, invalid paging, or unsupported invocation.
  • AuthRequiredError: claim, publish, favorites, or private access without a token.
  • RegistryError: remote request or OAuth flow failures.
  • ValidationError: manifest, OSI, or MetricFlow file validation problems.
  • OfflineCacheMissError: offline reads without a cached archive or cached package metadata.
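
One way to act on these failure classes in a service or notebook is to map each one to a caller-facing hint. The sketch below defines local stand-in exceptions named after the errors listed above; the real classes, their import path, and any shared base class are not shown on this page, so treat those as unknowns. describe_failure and run_guarded are hypothetical helpers.

```python
# Stand-ins named after the documented errors; not the real rawctx classes.
class UsageError(Exception):
    pass


class AuthRequiredError(Exception):
    pass


class RegistryError(Exception):
    pass


class ValidationError(Exception):
    pass


class OfflineCacheMissError(Exception):
    pass


HINTS = {
    UsageError: "fix the call: check refs, dataset names, and paging",
    AuthRequiredError: "provide a token before retrying",
    RegistryError: "remote request failed; retry or check the registry",
    ValidationError: "fix the manifest or model files",
    OfflineCacheMissError: "go online once to populate the cache",
}


def describe_failure(exc):
    """Return a short hint for a known failure class, else a fallback."""
    for error_type, hint in HINTS.items():
        if isinstance(exc, error_type):
            return hint
    return "unexpected failure"


def run_guarded(action):
    """Run `action` and return (result, hint); hint is None on success."""
    try:
        return action(), None
    except tuple(HINTS) as exc:
        return None, describe_failure(exc)


def offline_read():
    # Simulates an offline read with no cached archive.
    raise OfflineCacheMissError("no cached archive")


result, hint = run_guarded(offline_read)
ok_result, ok_hint = run_guarded(lambda: "loaded")
```

Exceptions outside the listed classes are deliberately not caught, so unrelated bugs still surface.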

Related docs

Use the failure guide when you need shell or CI handling

Python exceptions and CLI failures line up, but the handling surface is different. Use the error guide for shell exit behavior, auth gating, and offline-cache failure notes.