Efficiently Tracking AI Usage with Vercel AI SDK and Stripe V2 API
Unlock high-throughput AI usage tracking by combining Vercel AI SDK with Stripe’s V2 Meter EventStream API, enabling seamless token consumption metering and billing for modern AI applications at scale.
In modern AI application development, accurately tracking API usage has become a critical component. This is especially true when building systems that need to measure token consumption and bill users accordingly.
Stripe’s usage-based billing APIs provide a convenient solution for this challenge, with their recently released V2 API offering significant improvements. In this article, I’ll show you how to implement high-throughput AI usage tracking by combining the Stripe V2 API with Vercel AI SDK.
Evolution of Stripe’s Usage Metering APIs and Their Differences
As of February 21, 2025, Stripe’s Agent Toolkit SDK for usage metering still uses the v1 Meter Events API.

While the v1 Meter Event API can handle up to 1,000 requests per second, the Meter EventStream API in V2 dramatically increases this limit to 10,000 requests per second. This is a huge advantage for services where many users are simultaneously interacting with AI models.
Prerequisites Before Implementation
Before using the V2 API, you’ll need:
- A Stripe account
- Meters created for your project
- A basic application structure using Vercel AI SDK
You’ll also need to set up the following environment variables:
- STRIPE_SECRET_API_KEY: Your Stripe secret API key
- STRIPE_CUSTOMER_ID: The Stripe customer ID you want to meter usage for
- STRIPE_METER_NAME_INPUT: The meter name for input tokens
- STRIPE_METER_NAME_OUTPUT: The meter name for output tokens
- CLAUDE_API_KEY: Your Anthropic API key (or whichever AI provider you’re using)
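A missing variable is easier to diagnose at startup than in the middle of a billing call. Here is a minimal sketch of a validation step; the helper name loadEnv and its types are hypothetical and only illustrate the idea. It takes the environment as a parameter so it can be exercised without touching the real process environment.

```typescript
// The environment variables this article relies on.
const REQUIRED_ENV_VARS = [
  "STRIPE_SECRET_API_KEY",
  "STRIPE_CUSTOMER_ID",
  "STRIPE_METER_NAME_INPUT",
  "STRIPE_METER_NAME_OUTPUT",
  "CLAUDE_API_KEY",
] as const;

type EnvVarName = (typeof REQUIRED_ENV_VARS)[number];
type AppEnv = Record<EnvVarName, string>;

// Hypothetical helper: returns the validated variables, or throws an error
// that lists every missing (or empty) name at once.
function loadEnv(source: Record<string, string | undefined>): AppEnv {
  const missing = REQUIRED_ENV_VARS.filter((name) => !source[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const env = {} as AppEnv;
  for (const name of REQUIRED_ENV_VARS) {
    env[name] = source[name] as string;
  }
  return env;
}
```

In a Node.js application you would typically call loadEnv(process.env) once during startup and pass the result to the rest of your code.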
Implementation Differences Between Meter Events and Meter EventStream
Beyond the rate limit differences, a key aspect of the V2 API is that the Meter EventStream API authenticates with short-lived sessions rather than directly with your secret key. You create a session with a 15-minute validity period, then use the authentication token from that session to call the Stripe API until it expires. This means you’ll need a simple way to keep the token between requests, such as cookies.
Here’s the minimal setup. Note that you need to initialize the Stripe class twice: the instance that calls the Meter EventStream API must use the authentication_token obtained from v2.billing.meterEventSession:
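import Stripe from 'stripe';
// First client: uses the secret key, only to create the session
const meterEventSession = await new Stripe(config.secretKey, {
  apiVersion: '2025-01-27.acacia'
}).v2.billing.meterEventSession.create();
// Second client: uses the session token for Meter EventStream calls
const stripe = new Stripe(meterEventSession.authentication_token);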
In practice, you’d want to keep track of the session via meterEventSession.id or meterEventSession.expires_at, using cookies or other state management, so you can reuse it while it is valid and recreate it once it expires.
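One way to reuse a session is a small in-memory cache keyed on expires_at. The sketch below is an illustration, not part of Stripe’s SDK: createTokenProvider, SessionFactory, and the refreshMarginMs parameter are hypothetical names, and it assumes expires_at is a timestamp string that Date.parse can read. The session factory is injected so the caching logic stays independent of the Stripe client.

```typescript
// Only the fields this sketch relies on. `expires_at` is assumed to be a
// timestamp string that Date.parse understands (for example ISO 8601).
type MeterEventSession = {
  authentication_token: string;
  expires_at: string;
};

type SessionFactory = () => Promise<MeterEventSession>;

// Hypothetical helper: returns a function that yields a usable token and
// creates a new session only when the cached one is missing or is within
// `refreshMarginMs` of expiring.
function createTokenProvider(
  createSession: SessionFactory,
  refreshMarginMs: number = 60000,
  now: () => number = () => Date.now(),
): () => Promise<string> {
  let cached: MeterEventSession | undefined;
  let pending: Promise<MeterEventSession> | undefined;

  return async () => {
    if (cached && Date.parse(cached.expires_at) - now() > refreshMarginMs) {
      return cached.authentication_token;
    }
    // Share one in-flight request so that concurrent callers do not each
    // create their own session.
    if (!pending) {
      pending = createSession();
    }
    try {
      const session = await pending;
      cached = session;
      return session.authentication_token;
    } finally {
      pending = undefined;
    }
  };
}
```

With the Stripe client from the snippet above, the factory would be something like () => new Stripe(secretKey).v2.billing.meterEventSession.create(). Note that an in-memory cache only lives as long as the process, so serverless deployments may still need external state.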
Building a Custom Metering Middleware
The V2 API isn’t supported by the Stripe Agent Toolkit SDK yet, so we’ll build our own implementation based on the SDK’s source code.
Here’s a usage-based billing middleware customized for Vercel AI SDK:
import Stripe from 'stripe';
import type { LanguageModelV1Middleware, LanguageModelV1StreamPart } from 'ai';

type StripeUsageBasedBillingMiddlewareConfig = {
  secretKey: string
  billing?: {
    type?: 'token';
    customer: string;
    meters: {
      input?: string;
      output?: string;
    };
  };
};

export const createStripeUsageBasedBillingMiddleware = (config: StripeUsageBasedBillingMiddlewareConfig): LanguageModelV1Middleware => {
  // Sends the token counts of one model call to the configured meters
  const bill = async ({
    promptTokens,
    completionTokens,
  }: {
    promptTokens: number;
    completionTokens: number;
  }) => {
    // Create a short-lived session, then a second client that uses its token
    const meterEventSession = await new Stripe(config.secretKey, {
      apiVersion: '2025-01-27.acacia'
    }).v2.billing.meterEventSession.create();
    const stripe = new Stripe(meterEventSession.authentication_token)
    if (config.billing) {
      if (config.billing.meters.input) {
        await stripe.v2.billing.meterEventStream.create({
          events: [{
            event_name: config.billing.meters.input,
            payload: {
              stripe_customer_id: config.billing.customer,
              value: promptTokens.toString(),
            }
          }]
        });
      }
      if (config.billing.meters.output) {
        await stripe.v2.billing.meterEventStream.create({
          events: [{
            event_name: config.billing.meters.output,
            payload: {
              stripe_customer_id: config.billing.customer,
              value: completionTokens.toString(),
            }
          }]
        });
      }
    }
  };

  return {
    // Non-streaming calls: bill once the full result is available
    wrapGenerate: async ({doGenerate}) => {
      const result = await doGenerate();
      if (config.billing) {
        await bill(result.usage);
      }
      return result;
    },
    // Streaming calls: bill when the final 'finish' chunk arrives
    wrapStream: async ({doStream}) => {
      const {stream, ...rest} = await doStream();
      const transformStream = new TransformStream<
        LanguageModelV1StreamPart,
        LanguageModelV1StreamPart
      >({
        async transform(chunk, controller) {
          if (chunk.type === 'finish') {
            if (config.billing) {
              await bill(chunk.usage);
            }
          }
          controller.enqueue(chunk);
        },
      });
      return {
        stream: stream.pipeThrough(transformStream),
        ...rest,
      };
    },
  };
}
This code measures both input (prompt) tokens and output (completion) tokens from the AI model and records them using the Stripe Meter EventStream API.
All you need to do is pass this to the middleware parameter of wrapLanguageModel():
import { wrapLanguageModel } from 'ai';
import { createAnthropic } from '@ai-sdk/anthropic';

// The constants below hold the environment variables listed in the prerequisites
const myStripeToolKit = createStripeUsageBasedBillingMiddleware({
  secretKey: STRIPE_SECRET_API_KEY,
  billing: {
    customer: STRIPE_CUSTOMER_ID,
    meters: {
      input: STRIPE_METER_NAME_INPUT,
      output: STRIPE_METER_NAME_OUTPUT
    }
  }
})
const model = wrapLanguageModel({
  model: createAnthropic({
    apiKey: CLAUDE_API_KEY
  })('claude-3-5-sonnet-20241022'),
  middleware: [myStripeToolKit],
})
One important thing to note: Meter EventStream API request logs don’t appear in the Stripe workbench request logs. This is an intentional design decision based on the assumption that these APIs will handle large volumes of requests.
Understanding the Custom Metering Middleware Code
Let’s break down the key components of our middleware implementation:
1. Configuration Object Structure
type StripeUsageBasedBillingMiddlewareConfig = {
  secretKey: string
  billing?: {
    type?: 'token';
    customer: string;
    meters: {
      input?: string;
      output?: string;
    };
  };
};
In this type definition:
- secretKey: The Stripe secret key needed for authentication
- billing: An optional object containing billing settings
  - type: Currently only supports ‘token’ (for future expandability)
  - customer: The Stripe customer ID to track usage for
  - meters: An object specifying meter names for input and output
2. Billing Function
const bill = async ({
  promptTokens,
  completionTokens,
}: {
  promptTokens: number;
  completionTokens: number;
}) => {
  const meterEventSession = await new Stripe(config.secretKey, {
    apiVersion: '2025-01-27.acacia'
  }).v2.billing.meterEventSession.create();
  const stripe = new Stripe(meterEventSession.authentication_token)
  if (config.billing) {
    if (config.billing.meters.input) {
      await stripe.v2.billing.meterEventStream.create({
        events: [{
          event_name: config.billing.meters.input,
          payload: {
            stripe_customer_id: config.billing.customer,
            value: promptTokens.toString(),
          }
        }]
      });
    }
    // Similar handling for output tokens...
  }
};
This bill function:
- Creates a new Meter EventSession to get an authentication token
- Initializes a new Stripe instance using that token
- Sends events to the respective meters for input and output tokens, if configured
An important detail here is that the value is converted to a string. The meter event payload expects its values, including numeric ones, as strings rather than numbers.
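The event-building part of bill can be pulled out into a pure function, which makes the string conversion and the "only if configured" rule easy to check in isolation. This is a sketch under assumptions: buildTokenEvents and its types are hypothetical names, and skipping non-finite token counts is a defensive choice of this example (in case a provider does not report usage), not something the Stripe API requires.

```typescript
type BillingConfig = {
  customer: string;
  meters: { input?: string; output?: string };
};

type TokenUsage = { promptTokens: number; completionTokens: number };

type MeterEvent = {
  event_name: string;
  payload: { stripe_customer_id: string; value: string };
};

// Hypothetical helper: turns one usage report into zero, one, or two meter
// events, with the token count serialized as a string.
function buildTokenEvents(billing: BillingConfig, usage: TokenUsage): MeterEvent[] {
  const events: MeterEvent[] = [];
  const add = (eventName: string | undefined, tokens: number): void => {
    // Skip meters that are not configured and counts such as NaN.
    if (!eventName || !isFinite(tokens)) return;
    events.push({
      event_name: eventName,
      payload: {
        stripe_customer_id: billing.customer,
        value: tokens.toString(),
      },
    });
  };
  add(billing.meters.input, usage.promptTokens);
  add(billing.meters.output, usage.completionTokens);
  return events;
}
```

The returned array has the same shape as the events field passed to stripe.v2.billing.meterEventStream.create() in the bill function.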
3. Middleware Object Structure
return {
  wrapGenerate: async ({doGenerate}) => {
    const result = await doGenerate();
    if (config.billing) {
      await bill(result.usage);
    }
    return result;
  },
  wrapStream: async ({doStream}) => {
    const {stream, ...rest} = await doStream();
    const transformStream = new TransformStream<
      LanguageModelV1StreamPart,
      LanguageModelV1StreamPart
    >({
      async transform(chunk, controller) {
        if (chunk.type === 'finish') {
          if (config.billing) {
            await bill(chunk.usage);
          }
        }
        controller.enqueue(chunk);
      },
    });
    return {
      stream: stream.pipeThrough(transformStream),
      ...rest,
    };
  },
};
Vercel AI SDK middleware has two important functions:
wrapGenerate
This function is called when using the AI model in non-streaming mode. After the response is fully generated, it performs billing using the result object that includes usage information.
wrapStream
This function is called when using the AI model in streaming mode. It uses a TransformStream to transform the response stream, passing all chunks through unchanged and performing billing only when it reaches the end of the stream (chunk.type === 'finish').
This approach allows the same billing logic to work in both streaming and non-streaming modes. In streaming mode particularly, this ensures we can bill based on the final usage after all tokens have been generated, without disrupting the user experience.
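To see the streaming pattern in isolation, here is a self-contained sketch that does not depend on the AI SDK. StreamPart is a simplified stand-in for LanguageModelV1StreamPart, and billOnFinish is a hypothetical name; the handler plays the role of the bill function. It assumes a runtime with the standard Web Streams API (for example Node.js 18 or later).

```typescript
// Simplified stand-in for the SDK's stream part type.
type Usage = { promptTokens: number; completionTokens: number };
type StreamPart =
  | { type: "text-delta"; textDelta: string }
  | { type: "finish"; usage: Usage };

// Hypothetical helper: forwards every chunk unchanged and calls `handler`
// once, with the final usage, when the 'finish' chunk arrives.
function billOnFinish(
  handler: (usage: Usage) => Promise<void>,
): TransformStream<StreamPart, StreamPart> {
  return new TransformStream<StreamPart, StreamPart>({
    async transform(chunk, controller) {
      if (chunk.type === "finish") {
        // The finish chunk is held back until the handler resolves.
        await handler(chunk.usage);
      }
      controller.enqueue(chunk);
    },
  });
}
```

Because the handler is awaited before the finish chunk is forwarded, text chunks reach the user immediately and only the very last chunk waits for billing. If the handler throws, the stream errors, so you may want to catch and log inside the handler.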
Real-World Use Cases
This implementation is particularly effective in the following scenarios:
1. SaaS AI Assistant Services
For SaaS services that offer different usage plans to users and bill based on token consumption, high-speed and reliable usage tracking is essential. The V2 API’s high throughput really shines in scenarios where many users are simultaneously using AI features.
2. Enterprise AI Solutions
For customized AI solutions targeting large enterprises, you often need to accurately track usage by department or use case. Using Stripe’s V2 API allows you to efficiently process large volumes of API calls while maintaining detailed usage statistics.
3. Multi-tenant AI Platforms
For platforms providing AI services to multiple companies or organizations, precise usage tracking per tenant is crucial. The V2 API’s high rate limits ensure you can scale your service while maintaining accurate usage-based billing.
Troubleshooting and Important Considerations
Authentication Token Expiration
The Meter EventStream API’s authentication tokens have a 15-minute validity period. Without proper handling of token expiration, you may encounter “authentication errors.” Consider implementing logic to refresh tokens slightly before they expire (e.g., when 5 minutes remain).
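Even with proactive refresh, a request can still arrive with a token that has just expired. One possible safety net, sketched below, is to recreate the token and retry exactly once when a call fails with an authentication error. Everything here is hypothetical: sendWithTokenRefresh, the getToken(forceRefresh) signature, and the isAuthError predicate are names invented for this example, so the logic does not depend on any particular SDK.

```typescript
// Hypothetical helper: runs `send` with the current token. If it fails with
// what `isAuthError` classifies as an authentication failure, it requests a
// fresh token and tries one more time. Other errors are rethrown untouched.
async function sendWithTokenRefresh<T>(
  getToken: (forceRefresh: boolean) => Promise<string>,
  send: (token: string) => Promise<T>,
  isAuthError: (error: unknown) => boolean,
): Promise<T> {
  const token = await getToken(false);
  try {
    return await send(token);
  } catch (error) {
    if (!isAuthError(error)) {
      throw error;
    }
    // Single retry with a newly created token; a second failure propagates.
    const freshToken = await getToken(true);
    return send(freshToken);
  }
}
```

With stripe-node, isAuthError could be implemented as a check for Stripe.errors.StripeAuthenticationError, and send would construct a client from the token and call meterEventStream.create(); treat both as assumptions to verify against your SDK version.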
Error Handling
In high-traffic environments, temporary network errors may occur. It’s recommended to implement retry logic:
async function sendMeterEvent(stripe, eventData, retries = 3) {
  try {
    return await stripe.v2.billing.meterEventStream.create(eventData);
  } catch (error) {
    // stripe-node reports network failures with type 'StripeConnectionError'
    if (retries > 0 && error.type === 'StripeConnectionError') {
      // Wait a bit and retry on connection errors
      await new Promise(resolve => setTimeout(resolve, 500));
      return sendMeterEvent(stripe, eventData, retries - 1);
    }
    console.error('Failed to send meter event:', error);
    // Additional error handling like sending to error logging service
    throw error;
  }
}
Batch Processing Optimization
When sending multiple events, batch processing is more efficient than sending individual events:
await stripe.v2.billing.meterEventStream.create({
  events: [
    {
      event_name: config.billing.meters.input,
      payload: { ... }
    },
    {
      event_name: config.billing.meters.output,
      payload: { ... }
    }
    // Up to 100 events can be sent at once
  ]
});
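If you buffer events and end up with more than 100, they need to be split across several requests. A minimal sketch of that splitting step follows; chunkEvents is a hypothetical helper name, and the limit of 100 comes from the comment in the snippet above.

```typescript
// Hypothetical helper: splits a list into consecutive groups of at most
// `size` items, preserving order.
function chunkEvents<T>(items: readonly T[], size: number = 100): T[][] {
  if (size < 1 || Math.floor(size) !== size) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}
```

Each group can then be sent as the events field of one stripe.v2.billing.meterEventStream.create() call, for example in a for...of loop over chunkEvents(bufferedEvents).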
Conclusion and Advanced Applications
I’ve covered implementing AI usage tracking with Stripe’s V2 API, which enables sending usage data at 10x the throughput of the previous API, making it possible to operate large-scale AI services.
Building on what we’ve discussed, here are some potential next steps:
- Analytics Integration: Connect Stripe usage data with data warehouses like BigQuery or Redshift for detailed usage analysis
- Real-time Dashboards: Build admin dashboards to visualize AI usage in real-time
- Prepaid Model Implementation: Combine with a prepaid token model, deducting from token balance as usage occurs
- Multi-model Management: Create integrated systems to manage multiple AI providers (OpenAI, Anthropic, Google, etc.) and track usage and costs across models
By leveraging Stripe V2 API’s high throughput, you can manage billing for larger and more complex AI systems. This implementation is particularly valuable for services with high concurrent connections or business models requiring precise, token-level billing.
Finally, since Stripe’s V2 API is relatively new, it may continue to evolve. I recommend regularly checking Stripe’s documentation to incorporate the latest features and improvements.