
Dataset-attached evaluation

Same data. Same question. Different semantic context.

These four examples use static synthetic CSVs from the second `to-prompt` evaluation. Copy a question and its CSV files into your own model session, then compare the raw answer with an answer grounded by the linked rawctx package.

Result summary

Semantic context mattered most when the dataset had competing metric columns or lifecycle facts.

The examples were selected by impact score, oracle-quality delta, and the size of the semantic change. This page keeps the public result summary, reusable questions, and CSVs, and omits operational run logistics.

Top examples

Four packages with the clearest reusable test cases.

Experiment results

What changed after adding `to-prompt` context.

Strong shift

GA4 Data API

Signal: Impact 3/4 (material entity and scope shift)
Quality: Oracle quality 7 -> 9
Before: The raw answer grouped first-user campaign performance through firstUserSource and firstUserCampaign, then calculated session-source performance separately.
After: The grounded answer used User.newVsReturning for the first-user segment and kept Session.sessionSource, Session.sessionCampaignName, and Ecommerce.ecommercePurchases in their session and purchase scopes.

Strong shift

Shopify Orders

Signal: Impact 3/4 (material metric and numeric shift)
Quality: Oracle quality 6 -> 9
Before: The raw answer used totalPriceAmount for GMV and AOV, producing larger GMV figures such as web 165 and retail 290.
After: The grounded answer followed the package metric definition and used currentTotalPriceAmount, producing GMV of web 150, retail 280, and mobile_app 120.

Directional shift

HubSpot Marketing

Signal: Impact 3/4 in the material-shift case
Quality: Oracle quality 8 -> 10
Before: The raw answer was unstable: it sometimes used the precomputed email_*_rate fields and sometimes recalculated rates from raw counts and emails_delivered.
After: The grounded answer consistently used email_open_rate, email_click_rate, email_bounce_rate, email_unsubscribe_rate, and email_spam_report_rate.

Directional shift

Salesforce Revenue Usage

Signal: Impact 3/4 in the material-shift case
Quality: Oracle quality 8 -> 10
Before: The raw answer sometimes added commitment 1,000 and overage 240 together, presenting 1,240 as the billing amount.
After: The grounded answer treated UsageRatableSummary.TotalAmount = 240 as the final overage charge and kept the commitment policy as a separate lifecycle caveat.

Strong shift

GA4 Data API

Context changed first-user analysis from decoy source/campaign fields toward User.newVsReturning and session-scoped attribution.

@pasar6987/ga4-data-api@1.0.1

Question

Using the attached synthetic GA4 dataset, compare first-user segments with session source/campaign performance, then calculate purchase conversions and purchase revenue.

User.csv

userId,newVsReturning,firstSessionDate,firstUserSource,firstUserCampaign
U1,new,2026-04-01,newsletter,spring_launch
U2,returning,2025-12-15,google,always_on
U3,new,2026-04-02,partner,co_sale
U4,returning,2025-11-20,newsletter,retention

Session.csv

sessionId,userId,sessionSource,sessionMedium,sessionCampaignName,sessionManualSource
S1,U1,google,cpc,spring_sale,google
S2,U1,email,newsletter,spring_sale,email
S3,U2,google,organic,(not set),google
S4,U3,referral,partner,co_sale,partner
S5,U4,email,newsletter,retention,email
S6,U2,google,cpc,spring_sale,google

Event.csv

eventName,sessionId,isKeyEvent,conversions,sessionKeyEventRate,userKeyEventRate
purchase_s1,S1,true,1,1,1
view_s2,S2,false,0,0,0
purchase_s3,S3,true,1,1,1
add_to_cart_s4,S4,false,0,0,0
purchase_s5,S5,true,1,1,1
view_s6,S6,false,0,0,0

Ecommerce.csv

transactionId,eventName,purchaseRevenue,ecommercePurchases,transactions
T1,purchase_s1,120,1,1
T2,purchase_s3,80,1,1
T3,purchase_s5,200,1,1
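
The difference between the two readings can be reproduced from the four CSVs. The sketch below is illustrative only: it trims each file to the columns it needs, and it assumes eventName can serve as the join key between Ecommerce.csv and Event.csv, which holds in this synthetic data because every event name is unique. The helper names (`read_rows`, `purchases_by`) are hypothetical and not part of the rawctx package.

```python
import csv
import io
from collections import defaultdict

# Columns trimmed to the ones this sketch needs; rows match the CSVs above.
USER_CSV = """userId,newVsReturning,firstUserSource
U1,new,newsletter
U2,returning,google
U3,new,partner
U4,returning,newsletter
"""
SESSION_CSV = """sessionId,userId,sessionSource,sessionCampaignName
S1,U1,google,spring_sale
S2,U1,email,spring_sale
S3,U2,google,(not set)
S4,U3,referral,co_sale
S5,U4,email,retention
S6,U2,google,spring_sale
"""
EVENT_CSV = """eventName,sessionId
purchase_s1,S1
view_s2,S2
purchase_s3,S3
add_to_cart_s4,S4
purchase_s5,S5
view_s6,S6
"""
ECOMMERCE_CSV = """transactionId,eventName,purchaseRevenue,ecommercePurchases
T1,purchase_s1,120,1
T2,purchase_s3,80,1
T3,purchase_s5,200,1
"""


def read_rows(text):
    """Parse an inline CSV string into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


users = {row["userId"]: row for row in read_rows(USER_CSV)}
sessions = {row["sessionId"]: row for row in read_rows(SESSION_CSV)}
# Assumption: eventName is unique per event in this synthetic data,
# so it can act as the Ecommerce -> Event join key.
event_to_session = {row["eventName"]: row["sessionId"] for row in read_rows(EVENT_CSV)}


def purchases_by(key_fn):
    """Sum purchases and revenue per group; key_fn(user, session) names the group."""
    totals = defaultdict(lambda: {"purchases": 0, "revenue": 0})
    for tx in read_rows(ECOMMERCE_CSV):
        session = sessions[event_to_session[tx["eventName"]]]
        user = users[session["userId"]]
        group = totals[key_fn(user, session)]
        group["purchases"] += int(tx["ecommercePurchases"])
        group["revenue"] += int(tx["purchaseRevenue"])
    return dict(totals)


# Raw-style reading: first-user segment taken from the decoy source field.
by_decoy_source = purchases_by(lambda user, session: user["firstUserSource"])
# Grounded-style reading: first-user segment from newVsReturning,
# attribution kept at session scope.
by_segment = purchases_by(lambda user, session: user["newVsReturning"])
by_session = purchases_by(
    lambda user, session: (session["sessionSource"], session["sessionCampaignName"])
)

print(by_decoy_source)
print(by_segment)
print(by_session)
```

Groups with no purchases (for example the partner source) do not appear in the output, because the loop starts from Ecommerce.csv rows.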

Strong shift

Shopify Orders

Context changed GMV and AOV from totalPriceAmount to the semantic metric definition based on currentTotalPriceAmount.

@pasar6987/shopify-orders@0.1.0

Question

Using the attached Shopify orders, fulfillments, and abandoned checkouts dataset, calculate GMV, AOV, refund rate, cart abandonment rate, and fulfillment time by sales channel.

orders.csv

id,customerId,createdAt,sourceName,currentTotalPriceAmount,totalRefundedAmount,totalPriceAmount,displayFinancialStatus,displayFulfillmentStatus,discountCode
O1,C1,2026-04-01T10:00:00,web,100,0,105,PAID,FULFILLED,SPRING
O2,C2,2026-04-01T11:00:00,web,50,10,60,PARTIALLY_REFUNDED,FULFILLED,
O3,C3,2026-04-02T09:00:00,retail,200,0,200,PAID,FULFILLED,VIP
O4,C4,2026-04-02T12:00:00,retail,80,80,90,REFUNDED,CANCELLED,
O5,C5,2026-04-03T13:00:00,mobile_app,120,0,120,PAID,FULFILLED,APP

fulfillments.csv

fulfillmentId,orderId,createdAt
F1,O1,2026-04-02T10:00:00
F2,O2,2026-04-03T11:00:00
F3,O3,2026-04-02T21:00:00
F4,O5,2026-04-05T13:00:00

abandonedCheckouts.csv

checkoutId,customerId,createdAt,sourceName
AC1,C1,2026-04-01T09:30:00,web
AC2,C6,2026-04-01T14:30:00,web
AC3,C4,2026-04-02T10:00:00,retail
AC4,C7,2026-04-03T16:00:00,mobile_app
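
The GMV gap between the two amount columns can be checked directly against orders.csv. This is a minimal sketch, not the package's metric definition: it assumes GMV is the per-channel sum of the chosen amount column over all five orders, and that AOV is that sum divided by the order count. The function name `gmv_and_aov` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# orders.csv trimmed to the columns this sketch needs; rows match the CSV above.
ORDERS_CSV = """id,sourceName,currentTotalPriceAmount,totalPriceAmount
O1,web,100,105
O2,web,50,60
O3,retail,200,200
O4,retail,80,90
O5,mobile_app,120,120
"""


def gmv_and_aov(amount_column):
    """Return {channel: {"gmv": sum, "aov": sum / order count}} for one amount column."""
    gmv = defaultdict(int)
    order_count = defaultdict(int)
    for order in csv.DictReader(io.StringIO(ORDERS_CSV)):
        channel = order["sourceName"]
        gmv[channel] += int(order[amount_column])
        order_count[channel] += 1
    # Assumption: AOV is a simple per-order mean of the same amount column.
    return {ch: {"gmv": gmv[ch], "aov": gmv[ch] / order_count[ch]} for ch in gmv}


raw_style = gmv_and_aov("totalPriceAmount")
grounded_style = gmv_and_aov("currentTotalPriceAmount")

for channel in raw_style:
    print(channel, raw_style[channel], grounded_style[channel])
```

With these rows, only web and retail differ between the two columns; mobile_app has identical values in both.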

Directional shift

HubSpot Marketing

Context stabilized the answer on precomputed email_*_rate metrics instead of recalculating from raw event counts.

@pasar6987/hubspot-marketing@1.0.0

Question

Using the attached HubSpot EmailMarketing dataset, compare open, click, bounce, unsubscribe, and spam-report rates by A/B variant and recipient list.

EmailMarketing.csv

email_campaign_id,email_campaign_name,email_campaign_type,hs_email_ab_variant,hs_email_list_id,hs_email_list_name,emails_delivered,email_opens,email_clicks,email_bounces,email_unsubscribes,email_spam_reports,email_open_rate,email_click_rate,email_bounce_rate,email_unsubscribe_rate,email_spam_report_rate
E1,Spring Promo,A/B test,A,L_enterprise,Enterprise Leads,1000,450,90,20,10,2,0.46,0.095,0.020,0.010,0.002
E2,Spring Promo,A/B test,B,L_enterprise,Enterprise Leads,1000,400,110,10,15,5,0.405,0.112,0.010,0.015,0.005
E3,Spring Promo,A/B test,A,L_smb,SMB Leads,500,150,30,30,8,1,0.310,0.062,0.060,0.016,0.002
E4,Spring Promo,A/B test,B,L_smb,SMB Leads,500,180,45,25,5,0,0.365,0.090,0.050,0.010,0.000
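
The instability matters because the precomputed rates in this CSV do not always equal count / emails_delivered. The sketch below lists where the two disagree. It assumes emails_delivered is the denominator a recalculation would use, as in the raw answers described for this example; the `METRICS` mapping and variable names are illustrative, and the CSV is trimmed to the count and rate columns.

```python
import csv
import io
import math

# EmailMarketing.csv trimmed to counts and rates; values match the CSV above.
EMAIL_CSV = """email_campaign_id,emails_delivered,email_opens,email_clicks,email_bounces,email_unsubscribes,email_spam_reports,email_open_rate,email_click_rate,email_bounce_rate,email_unsubscribe_rate,email_spam_report_rate
E1,1000,450,90,20,10,2,0.46,0.095,0.020,0.010,0.002
E2,1000,400,110,10,15,5,0.405,0.112,0.010,0.015,0.005
E3,500,150,30,30,8,1,0.310,0.062,0.060,0.016,0.002
E4,500,180,45,25,5,0,0.365,0.090,0.050,0.010,0.000
"""

# metric name -> (raw count column, precomputed rate column)
METRICS = {
    "open": ("email_opens", "email_open_rate"),
    "click": ("email_clicks", "email_click_rate"),
    "bounce": ("email_bounces", "email_bounce_rate"),
    "unsubscribe": ("email_unsubscribes", "email_unsubscribe_rate"),
    "spam_report": ("email_spam_reports", "email_spam_report_rate"),
}

# (campaign id, metric) -> (precomputed rate, recalculated rate)
mismatches = {}
for row in csv.DictReader(io.StringIO(EMAIL_CSV)):
    delivered = int(row["emails_delivered"])
    for metric, (count_col, rate_col) in METRICS.items():
        recalculated = int(row[count_col]) / delivered
        precomputed = float(row[rate_col])
        if not math.isclose(recalculated, precomputed, abs_tol=1e-9):
            mismatches[(row["email_campaign_id"], metric)] = (precomputed, recalculated)

for key, (precomputed, recalculated) in mismatches.items():
    print(key, "precomputed", precomputed, "recalculated", recalculated)
```

In this data the disagreements are confined to open and click rates; bounce, unsubscribe, and spam-report rates match the recalculated values.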

Directional shift

Salesforce Revenue Usage

Context separated commitment policy from the final ratable overage charge and grounded the billing flow in Salesforce usage objects.

@pasar6987/sf-revenue-usage@1.0.1

Question

Using the attached Salesforce Revenue Usage dataset, model account A1's usage-based billing flow through entitlement, bucket, usage summary, overage, commitment, and ratable charge, then calculate the amount.

UsageResource.csv

Id,Name,Code,Category,Status
UR1,API Calls,API_CALLS,transaction,active

TransactionUsageEntitlement.csv

Id,AccountId,AssetId,UsageResourceId,EntitlementQuantity,TransactionQuantity,ValidityPeriodTerm,ValidityPeriodUnit,ChargeForOverage,DrawdownOrder
TUE1,A1,AS1,UR1,100000,100000,1,month,true,1

UsageEntitlementAccount.csv

Id,AccountId,ProductId,BillingPeriodStartDateTime,BillingPeriodEndDateTime,BillDayOfMonth
UEA1,A1,P_API,2026-04-01T00:00:00,2026-04-30T23:59:59,30

UsageEntitlementBucket.csv

Id,ParentId,TransactionUsageEntitlementId,UsageResourceId,BucketBalance,ConsumedEntitlement,TotalAsOfBalance,TotalConsumedEntitlement,CompletedRollovers
UEB1,UEA1,TUE1,UR1,20000,80000,20000,80000,0

UsageSummary.csv

Id,AccountId,UsageEntitlementAccountId,UsageEntitlementBucketId,UsageResourceId,StartDateTime,EndDateTime,ConsumptionUnits,DebitedUnits,OverageUnits,RatableSummaryId,Status
US1,A1,UEA1,UEB1,UR1,2026-04-01T00:00:00,2026-04-30T23:59:59,112000,100000,12000,RS1,rated

UsageBillingPeriodItem.csv

Id,AccountId,UsageEntitlementAccountId,UsageEntitlementBucketId,UsageResourceId,TotalUsedQuantity,OverageQuantity,Status
UBI1,A1,UEA1,UEB1,UR1,112000,12000,rated

UsageRatableSummary.csv

Id,AccountId,UsageEntitlementAccountId,UsageEntitlementBucketId,UsageResourceId,OverageQuantity,TierQuantity,NetUnitRate,TotalAmount,Status
RS1,A1,UEA1,UEB1,UR1,12000,12000,0.02,240,complete

UsageCommitmentPolicy.csv

Id,Name,CommitmentRate
UCP1,Monthly API Commitment,1000

UsageOveragePolicy.csv

Id,Name,OverageChargeable
UOP1,Charge API Overage,true
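
The arithmetic behind the two answers is small enough to check by hand. The sketch below copies the relevant values from rows US1, RS1, and UCP1 and is only an illustration: it assumes the overage charge is OverageQuantity times NetUnitRate, and it follows the grounded reading in keeping the commitment rate out of the overage amount. The dictionary and variable names are hypothetical.

```python
from decimal import Decimal

# Values copied from the CSV rows above (US1, RS1, UCP1).
usage_summary = {"ConsumptionUnits": 112000, "DebitedUnits": 100000, "OverageUnits": 12000}
ratable_summary = {
    "OverageQuantity": 12000,
    "NetUnitRate": Decimal("0.02"),
    "TotalAmount": Decimal("240"),
}
commitment_policy = {"CommitmentRate": Decimal("1000")}

# Consumption beyond the debited units should equal the recorded overage.
computed_overage_units = usage_summary["ConsumptionUnits"] - usage_summary["DebitedUnits"]
overage_units_consistent = computed_overage_units == usage_summary["OverageUnits"]

# Assumption: the ratable charge is overage quantity times the net unit rate.
overage_charge = ratable_summary["OverageQuantity"] * ratable_summary["NetUnitRate"]
charge_matches_total = overage_charge == ratable_summary["TotalAmount"]

# The unstable raw answer: commitment and overage added into one figure.
naive_total = commitment_policy["CommitmentRate"] + overage_charge

print("overage units consistent:", overage_units_consistent)
print("overage charge:", overage_charge, "matches TotalAmount:", charge_matches_total)
print("commitment plus overage (raw-style sum):", naive_total)
```

Decimal is used so the 0.02 rate multiplies exactly; the CSVs themselves do not say how the commitment rate is billed, which is why it stays a separate caveat here.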