Solution

For teams selling model inference.

Monetize every
token you serve.

Put the OpenAI-compatible Macropay proxy in front of your models and every request is metered, priced and margin-tracked as it streams. Bill input and output tokens per model, net your provider cost, and ship a tax-compliant invoice — all on one ledger.

Start billing inference · How metering works

OpenAI-compatible proxy · Per-token, per-model · Live margin

proxy.macropay.ai / v1 / chat

drop-in

# point your SDK at the proxy
base_url = "https://proxy.macropay.ai/v1"
api_key  = "mk_live_…"

# tokens metered + billed as they stream
client.chat.completions.create(
    model="gpt-4o",
    customer="cus_8Xa2",
)

The problem

Inference is a
cost until you bill it.

Generic billing tools weren't built for per-token economics, streaming responses, or netting model cost against revenue. Macropay was.

Meter as it streams

The proxy counts input and output tokens in real time — no post-hoc log parsing, no nightly reconciliation.
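As a rough illustration of metering in the stream path, here is a minimal sketch of a pass-through wrapper that updates a running count as each chunk goes by. It is not Macropay's implementation: `StreamMeter`, `metered_stream` and the whitespace-based `count_tokens` are hypothetical names, and a real meter would use the model's own tokenizer or the provider's reported usage.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class StreamMeter:
    """Running token counts for one streamed response."""
    input_tokens: int = 0
    output_tokens: int = 0


def count_tokens(text: str) -> int:
    # Placeholder tokenizer (assumption): whitespace-separated words.
    # A production meter would use the model's actual tokenizer.
    return len(text.split())


def metered_stream(prompt: str, chunks: Iterable[str], meter: StreamMeter) -> Iterator[str]:
    """Yield chunks unchanged while updating the meter as each one passes."""
    meter.input_tokens = count_tokens(prompt)
    for chunk in chunks:
        meter.output_tokens += count_tokens(chunk)
        yield chunk


meter = StreamMeter()
chunks = ["A token ", "is a unit ", "of text."]
reply = "".join(metered_stream("What is a token?", chunks, meter))
print(meter.input_tokens, meter.output_tokens)
```

Because the count is updated inside the generator, the totals are final the moment the last chunk is delivered, with no separate log-parsing pass.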

Margin per model

Attach your provider cost and see gross margin per model and per customer the instant the request completes.
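The margin arithmetic itself is simple: gross margin is revenue minus provider cost, divided by revenue. The sketch below works it through per model and blended across models. The model names and dollar figures are made up for illustration, and `gross_margin` is a hypothetical helper, not a Macropay API.

```python
from decimal import Decimal


def gross_margin(revenue: Decimal, provider_cost: Decimal) -> Decimal:
    """Gross margin as a fraction of revenue: (revenue - cost) / revenue."""
    if revenue == 0:
        return Decimal("0")
    return (revenue - provider_cost) / revenue


# Hypothetical per-model totals for one customer, in USD.
usage = {
    "model-a": {"revenue": Decimal("10.00"), "cost": Decimal("4.00")},
    "model-b": {"revenue": Decimal("5.00"), "cost": Decimal("3.50")},
}

# Margin per model.
margins = {name: gross_margin(u["revenue"], u["cost"]) for name, u in usage.items()}

# Blended margin is computed on the totals, not by averaging per-model margins.
total_revenue = sum(u["revenue"] for u in usage.values())
total_cost = sum(u["cost"] for u in usage.values())
blended = gross_margin(total_revenue, total_cost)

print(margins, blended)
```

Note that the blended figure is weighted by revenue, which is why it is derived from the summed totals.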

Tax-compliant invoices

Every invoice carries the right VAT/GST line for the buyer — we're the merchant of record, so you never file.

PRICING MODELS

Price inference your way.

Per-token, per-request, per-image or per-second — with included allowances, tiered overage, prepaid credits and per-customer rates. Mark up the model, pass through at cost, or bundle into a plan.

Separate input/output token prices per model
Prepaid credit packs that draw down on use
Volume tiers that step the rate down automatically
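To make two of these options concrete, the sketch below combines volume tiers with a prepaid credit balance. Everything in it is an assumption for illustration: the tier boundaries and rates are invented, the tiers are treated as graduated (each rate applies only to the tokens inside its tier), and `tiered_charge` and `draw_down` are hypothetical helpers rather than Macropay functions.

```python
from decimal import Decimal

# Hypothetical tiers: (tokens in tier, or None for unlimited; USD per 1M tokens).
TIERS = [
    (1_000_000, Decimal("10.00")),
    (9_000_000, Decimal("8.00")),
    (None, Decimal("6.00")),
]
MILLION = Decimal(1_000_000)


def tiered_charge(tokens: int) -> Decimal:
    """Graduated pricing: each tier's rate applies only to tokens inside it."""
    remaining, total = tokens, Decimal("0")
    for size, rate in TIERS:
        in_tier = remaining if size is None else min(remaining, size)
        total += Decimal(in_tier) / MILLION * rate
        remaining -= in_tier
        if remaining == 0:
            break
    return total


def draw_down(credit_balance: Decimal, charge: Decimal) -> tuple[Decimal, Decimal]:
    """Apply prepaid credit first; return (new_balance, amount_still_owed)."""
    used = min(credit_balance, charge)
    return credit_balance - used, charge - used


# 3M tokens: the first 1M at $10.00, the next 2M at $8.00.
charge = tiered_charge(3_000_000)
balance, owed = draw_down(Decimal("20.00"), charge)
print(charge, balance, owed)
```

Here a $20.00 credit pack covers most of the $26.00 charge and the remaining $6.00 is billed as overage.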

See platform pricing

gpt-4o · in: $2.50 / 1M

gpt-4o · out: $10.00 / 1M

claude-opus · blended: $18.00 / 1M

self-hosted · gpu-min: $0.012 / min

Blended margin: 61.8%
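As a worked example of separate input and output pricing, the sketch below prices a single request using the gpt-4o rates from the example rate card above. The token counts are invented, and `request_price` is a hypothetical helper, not part of any Macropay SDK.

```python
from decimal import Decimal

# USD per 1M tokens, taken from the example rate card above.
RATES = {"gpt-4o": {"in": Decimal("2.50"), "out": Decimal("10.00")}}
MILLION = Decimal(1_000_000)


def request_price(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Price one request from its input and output token counts."""
    rate = RATES[model]
    total = Decimal(input_tokens) * rate["in"] + Decimal(output_tokens) * rate["out"]
    return total / MILLION


# Hypothetical request: 1,200 input tokens and 350 output tokens.
price = request_price("gpt-4o", 1_200, 350)
print(price)
```

The input side contributes $0.0030 and the output side $0.0035, so output tokens dominate the price even though there are far fewer of them.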

Why Macropay

From a single call
to global revenue.

The whole path — proxy, meter, price, margin, tax, payout — on one stack.

<800ms request → billable, p99

Any model or provider

60+ tax jurisdictions

4.5% + $0.50 all-in
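For a sense of what the flat fee means in practice, here is a small sketch of the arithmetic. It assumes the 4.5% + $0.50 applies once per charged transaction and that the result is rounded half-up to the cent; neither detail is stated on this page, and `platform_fee` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

FEE_RATE = Decimal("0.045")   # 4.5%
FEE_FIXED = Decimal("0.50")   # $0.50, assumed to apply per transaction


def platform_fee(amount: Decimal) -> Decimal:
    """Fee on one transaction, rounded to the cent (rounding mode is an assumption)."""
    fee = amount * FEE_RATE + FEE_FIXED
    return fee.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


invoice = Decimal("100.00")
fee = platform_fee(invoice)
net = invoice - fee
print(fee, net)
```

On a $100.00 invoice this comes to a $5.00 fee and $95.00 net under the stated assumptions.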

Questions

Inference, answered.

Do I have to change my code?

Barely. Point your existing OpenAI-compatible SDK at the Macropay proxy base URL and pass a customer id. Tokens are metered and billed as they stream — no other changes required.

Can I bill models I host myself?

Yes. Self-hosted and third-party models are metered the same way — per token, per request, or per GPU-minute — and you can attach your true cost for margin.

How do I handle streaming responses?

The proxy counts output tokens as the stream completes, so streamed responses are billed accurately without you parsing logs afterward.

Is the tax really handled?

Yes — Macropay is the merchant of record, so VAT/GST/sales tax is calculated, collected, filed and remitted for you. More on MoR →

Turn inference
into revenue.

Point your SDK at the proxy and start billing tokens today. Flat 4.5% + $0.50, all-in.

Start free · See usage billing
