Macropay
Solution
For teams selling model inference.

Monetize every token you serve.

Put the OpenAI-compatible Macropay proxy in front of your models and every request is metered, priced and margin-tracked as it streams. Bill input and output tokens per model, net your provider cost, and ship a tax-compliant invoice — all on one ledger.

OpenAI-compatible proxy · Per-token, per-model · Live margin
proxy.macropay.ai / v1 / chat
drop-in
    # point your SDK at the proxy
    base_url = "https://proxy.macropay.ai/v1"
    api_key  = "mk_live_…"

    # tokens metered + billed as they stream
    client.chat.completions.create(
        model="gpt-4o",
        customer="cus_8Xa2",
    )
Billing inference for
Sonnet Labs · Vectorly · Parser · Northwind AI · Glyph · Relay
The problem

Inference is a cost until you bill it.

Generic billing tools weren't built for per-token economics, streaming responses, or netting model cost against revenue. Macropay was.

Meter as it streams

The proxy counts input and output tokens in real time — no post-hoc log parsing, no nightly reconciliation.

Margin per model

Attach your provider cost and see gross margin per model and per customer the instant the request completes.
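
As a rough illustration of the arithmetic behind per-model margin, the sketch below nets provider cost against billed revenue. The function name and the sample figures are hypothetical; this is not Macropay's API or real customer data.

```python
from decimal import Decimal

def gross_margin_pct(revenue: Decimal, provider_cost: Decimal) -> Decimal:
    """Gross margin as a percentage of revenue, rounded to one decimal place."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    margin = (revenue - provider_cost) / revenue * Decimal(100)
    return margin.quantize(Decimal("0.1"))

# Hypothetical example: $10.00 billed to the customer, $4.00 paid to the provider.
print(gross_margin_pct(Decimal("10.00"), Decimal("4.00")))  # 60.0
```

Using `Decimal` rather than floats keeps currency math exact, which matters once many small per-request amounts are summed.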

Tax-compliant invoices

Every invoice carries the right VAT/GST line for the buyer — we're the merchant of record, so you never file.

PRICING MODELS

Price inference your way.

Per-token, per-request, per-image or per-second — with included allowances, tiered overage, prepaid credits and per-customer rates. Mark up the model, pass through at cost, or bundle into a plan.

  • Separate input/output token prices per model
  • Prepaid credit packs that draw down on use
  • Volume tiers that step the rate down automatically
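
To make the tier and credit mechanics concrete, here is a minimal sketch. It assumes graduated tiers (each slice of usage is charged at its own tier's rate); the tier boundaries, prices, and function names are invented for illustration and do not come from Macropay.

```python
from decimal import Decimal

# Hypothetical graduated tiers: (upper bound in tokens, price per 1M tokens).
# None marks the open-ended top tier.
TIERS = [
    (10_000_000, Decimal("3.00")),
    (100_000_000, Decimal("2.50")),
    (None, Decimal("2.00")),
]

def tiered_charge(tokens: int) -> Decimal:
    """Charge each slice of usage at the rate of the tier it falls into."""
    total = Decimal(0)
    lower = 0
    for upper, price_per_million in TIERS:
        if tokens <= lower:
            break
        slice_end = tokens if upper is None else min(tokens, upper)
        total += Decimal(slice_end - lower) * price_per_million / Decimal(1_000_000)
        if upper is None:
            break
        lower = upper
    return total

def draw_down(credit_balance: Decimal, charge: Decimal) -> tuple[Decimal, Decimal]:
    """Apply a charge against prepaid credits; return (new balance, amount still owed)."""
    used = min(credit_balance, charge)
    return credit_balance - used, charge - used

# 15M tokens: the first 10M at $3.00 per 1M, the next 5M at $2.50 per 1M.
charge = tiered_charge(15_000_000)
balance, owed = draw_down(Decimal("40.00"), charge)
```

With these made-up numbers the charge comes to $42.50, the $40.00 credit pack is exhausted, and $2.50 remains to be invoiced.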
See platform pricing
gpt-4o · in              $2.50 / 1M
gpt-4o · out             $10.00 / 1M
claude-opus · blended    $18.00 / 1M
self-hosted · gpu-min    $0.012 / min
Blended margin           61.8%
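
For a sense of how separate input and output prices turn into a per-request charge, the sketch below applies the gpt-4o rates from the sample rate card above, reading "1M" as one million tokens. The `RATES` table layout, the function name, and the token counts are hypothetical.

```python
from decimal import Decimal

# Per-1M-token prices copied from the sample rate card above.
RATES = {
    "gpt-4o": {"input": Decimal("2.50"), "output": Decimal("10.00")},
}

def request_charge(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Price one request from its input and output token counts."""
    rate = RATES[model]
    per_million = Decimal(1_000_000)
    return (input_tokens * rate["input"] + output_tokens * rate["output"]) / per_million

# 1,200 input tokens and 800 output tokens on gpt-4o.
print(request_charge("gpt-4o", 1_200, 800))  # 0.011
```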
Why Macropay

From a single call to global revenue.

The whole path — proxy, meter, price, margin, tax, payout — on one stack.

<800ms · Request → billable, p99
Any · Model or provider
60+ · Tax jurisdictions
4.5% + $0.50 · all-in
Questions

Inference, answered.

Do I have to change my code?
Barely. Point your existing OpenAI-compatible SDK at the Macropay proxy base URL and pass a customer id. Tokens are metered and billed as they stream — no other changes required.
Can I bill models I host myself?
Yes. Self-hosted and third-party models are metered the same way — per token, per request, or per GPU-minute — and you can attach your true cost for margin.
How do I handle streaming responses?
The proxy counts output tokens as the stream completes, so streamed responses are billed accurately without you parsing logs afterward.
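
How Macropay's proxy counts tokens internally is not described here. One way a pass-through layer could capture usage without parsing logs is sketched below. It assumes the upstream emits a final chunk carrying a `usage` block, as OpenAI's API does when `stream_options={"include_usage": True}` is set. `meter_stream` and the simulated chunks are hypothetical.

```python
from typing import Iterable, Iterator

def meter_stream(chunks: Iterable[dict], usage_out: dict) -> Iterator[dict]:
    """Yield chunks unchanged and copy the usage block into usage_out when it arrives."""
    for chunk in chunks:
        usage = chunk.get("usage")
        if usage:
            usage_out.update(usage)
        yield chunk

# Simulated stream in the OpenAI chat-completions chunk shape (made-up values).
fake_stream = [
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo"}}]},
    {"choices": [], "usage": {"prompt_tokens": 9, "completion_tokens": 2, "total_tokens": 11}},
]

recorded: dict = {}
text = "".join(
    chunk["choices"][0]["delta"].get("content", "")
    for chunk in meter_stream(fake_stream, recorded)
    if chunk["choices"]
)
```

The caller still receives every chunk as it arrives; the token totals become available in `recorded` once the stream has been fully consumed.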
Is the tax really handled?
Yes. Macropay is the merchant of record, so VAT/GST/sales tax is calculated, collected, filed and remitted for you.

Turn inference into revenue.

Point your SDK at the proxy and start billing tokens today. Flat 4.5% + $0.50, all-in.
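
As a quick worked example of the flat fee, the sketch below assumes the 4.5% + $0.50 applies once per charge and rounds to the nearest cent. Both assumptions, and the function name, are illustrative rather than confirmed by this page.

```python
from decimal import Decimal, ROUND_HALF_UP

FEE_RATE = Decimal("0.045")   # 4.5%
FEE_FIXED = Decimal("0.50")   # $0.50

def platform_fee(amount: Decimal) -> Decimal:
    """Fee on one charge, assuming the flat rate applies per transaction."""
    fee = amount * FEE_RATE + FEE_FIXED
    return fee.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# A $100.00 charge: 4.5% is $4.50, plus the fixed $0.50.
print(platform_fee(Decimal("100.00")))  # 5.00
```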