Efficiently Tracking AI Usage with Vercel AI SDK and Stripe V2 API

Unlock high-throughput AI usage tracking by combining Vercel AI SDK with Stripe’s V2 Meter EventStream API, enabling seamless token consumption metering and billing for modern AI applications at scale.

In modern AI application development, accurately tracking API usage has become a critical component. This is especially true when building systems that need to measure token consumption and bill users accordingly.

Stripe’s usage-based billing APIs provide a convenient solution for this challenge, with their recently released V2 API offering significant improvements. In this article, I’ll show you how to implement high-throughput AI usage tracking by combining the Stripe V2 API with Vercel AI SDK.

Evolution of Stripe’s Usage Metering APIs and Their Differences

As of February 21, 2025, Stripe’s Agent Toolkit SDK for usage metering still uses the v1 Meter Events API.

While the v1 Meter Events API can handle up to 1,000 requests per second, the V2 Meter EventStream API raises this limit tenfold, to 10,000 requests per second. This is a huge advantage for services where many users are interacting with AI models simultaneously.

Prerequisites Before Implementation

Before using the V2 API, you’ll need:

A Stripe account

Meters created for your project

A basic application structure using Vercel AI SDK

You’ll also need to set up the following environment variables:

STRIPE_SECRET_API_KEY: Your Stripe secret API key

STRIPE_CUSTOMER_ID: The Stripe customer ID you want to meter usage for

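STRIPE_METER_NAME_INPUT: The event name of the meter for input tokens (it is sent as event_name, so use the meter's event name rather than its display name)

STRIPE_METER_NAME_OUTPUT: The event name of the meter for output tokens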

CLAUDE_API_KEY: Your Anthropic API key (or whichever AI provider you’re using)
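A minimal sketch of validating these variables at startup, so a missing value fails fast instead of surfacing later as a Stripe error. The loadBillingEnv helper and the BillingEnv type are hypothetical names of my own, not part of Stripe's SDK or the Vercel AI SDK. The environment is passed in as a parameter, so the function is easy to test.

```typescript
// Hypothetical helper: the names below are illustrative, not from any SDK.
const REQUIRED_VARS = [
  'STRIPE_SECRET_API_KEY',
  'STRIPE_CUSTOMER_ID',
  'STRIPE_METER_NAME_INPUT',
  'STRIPE_METER_NAME_OUTPUT',
  'CLAUDE_API_KEY',
] as const;

type BillingEnv = Record<(typeof REQUIRED_VARS)[number], string>;

function loadBillingEnv(env: Record<string, string | undefined>): BillingEnv {
  // Treat unset, empty, and whitespace-only values as missing.
  const missing = REQUIRED_VARS.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  // Copy only the variables we need, ignoring everything else in `env`.
  const result = {} as BillingEnv;
  for (const name of REQUIRED_VARS) {
    result[name] = env[name] as string;
  }
  return result;
}
```

In a Node.js app you would typically call it once as loadBillingEnv(process.env) and pass the result to the code that follows.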

Implementation Differences Between Meter Events and Meter EventStream

Beyond the rate limits, the key difference in V2 is authentication: the Meter EventStream API requires stateless authentication sessions. You create a session that is valid for 15 minutes, then use the authentication token from that session to call the Stripe API until it expires. Since creating a new session for every event adds an extra round trip, you'll want a simple way to keep the token between calls, such as cookies or a server-side cache.

Here’s the minimal setup. Note that you need to initialize the Stripe class twice, and the Stripe class that calls the Meter EventStream API must use the authentication_token obtained from v2.billing.meterEventSession:

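import Stripe from 'stripe';
// 1. Use the secret key to create a short-lived meter event session.
const meterEventSession = await new Stripe(config.secretKey, {
  apiVersion: '2025-01-27.acacia'
}).v2.billing.meterEventSession.create();
// 2. Use the session's token (not the secret key) for Meter EventStream calls.
const stripe = new Stripe(meterEventSession.authentication_token);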

In practice, you’d persist the session with cookies or other state management, using meterEventSession.id to identify it and meterEventSession.expires_at to decide whether to reuse it or create a new one.
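Here is one way session reuse could look, as a sketch rather than a definitive implementation. It keeps the session in memory and creates a new one only when the current one is missing or about to expire. The createSessionCache factory, the MeterSession interface, and the one-minute default margin are my own assumptions. I also assume expires_at arrives either as an ISO 8601 string or as Unix seconds, so check what your SDK version actually returns.

```typescript
// Fields this sketch reads from a meter event session.
// Assumption: expires_at is an ISO 8601 string or a Unix timestamp in seconds.
interface MeterSession {
  id: string;
  authentication_token: string;
  expires_at: string | number;
}

function expiresAtMs(session: MeterSession): number {
  return typeof session.expires_at === 'number'
    ? session.expires_at * 1000
    : Date.parse(session.expires_at);
}

// True when there is no session, it expires within `marginMs`,
// or its expiry cannot be parsed (NaN fails the comparison).
function needsRefresh(session: MeterSession | null, nowMs: number, marginMs: number): boolean {
  return session === null || !(expiresAtMs(session) - nowMs > marginMs);
}

// Hypothetical factory: `createSession` would wrap
// new Stripe(secretKey).v2.billing.meterEventSession.create().
function createSessionCache(
  createSession: () => Promise<MeterSession>,
  marginMs = 60_000,
  now: () => number = Date.now,
) {
  let current: MeterSession | null = null;
  let pending: Promise<MeterSession> | null = null;

  return function getSession(): Promise<MeterSession> {
    if (!needsRefresh(current, now(), marginMs)) {
      return Promise.resolve(current as MeterSession);
    }
    // Share one in-flight request so concurrent callers do not each create a session.
    if (!pending) {
      pending = createSession()
        .then((session) => {
          current = session;
          return session;
        })
        .finally(() => {
          pending = null;
        });
    }
    return pending;
  };
}
```

An in-memory cache like this only helps within a single long-lived process. On serverless platforms, where instances come and go, a shared store would be needed to get the same effect.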

Building a Custom Metering Middleware

The V2 API isn’t supported by Stripe’s Agent Toolkit SDK yet, so we’ll build our own implementation based on the SDK’s source code.

Here’s a usage-based billing middleware customized for Vercel AI SDK:

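import Stripe from 'stripe';
import type { LanguageModelV1Middleware, LanguageModelV1StreamPart } from 'ai';

type StripeUsageBasedBillingMiddlewareConfig = {
  secretKey: string;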
  billing?: {
    type?: 'token';
    customer: string;
    meters: {
      input?: string;
      output?: string;
    };
  };
};


export const createStripeUsageBasedBillingMiddleware = (config: StripeUsageBasedBillingMiddlewareConfig): LanguageModelV1Middleware => {
  const bill = async ({
    promptTokens,
    completionTokens,
  }: {
    promptTokens: number;
    completionTokens: number;
  }) => {
    const meterEventSession = await new Stripe(config.secretKey, {
      apiVersion: '2025-01-27.acacia'
    }).v2.billing.meterEventSession.create();
    const stripe = new Stripe(meterEventSession.authentication_token)
    
    if (config.billing) {
      if (config.billing.meters.input) {
        await stripe.v2.billing.meterEventStream.create({
          events: [{
            event_name: config.billing.meters.input,
            payload: {
              stripe_customer_id: config.billing.customer,
              value: promptTokens.toString(),
            }
          }]
        });
      }
      if (config.billing.meters.output) {
        await stripe.v2.billing.meterEventStream.create({
          events: [{
            event_name: config.billing.meters.output,
            payload: {
              stripe_customer_id: config.billing.customer,
              value: completionTokens.toString(),
            }
          }]
        });
      }
    }
  };

  return {
    wrapGenerate: async ({doGenerate}) => {
      const result = await doGenerate();

      if (config.billing) {
        await bill(result.usage);
      }

      return result;
    },

    wrapStream: async ({doStream}) => {
      const {stream, ...rest} = await doStream();

      const transformStream = new TransformStream<
        LanguageModelV1StreamPart,
        LanguageModelV1StreamPart
      >({
        async transform(chunk, controller) {
          if (chunk.type === 'finish') {
            if (config.billing) {
              await bill(chunk.usage);
            }
          }

          controller.enqueue(chunk);
        },
      });

      return {
        stream: stream.pipeThrough(transformStream),
        ...rest,
      };
    },
  };
}

This code measures both input (prompt) tokens and output (completion) tokens from the AI model and records them using the Stripe Meter EventStream API.

All you need to do is pass this to the middleware parameter of wrapLanguageModel():

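import { wrapLanguageModel } from 'ai';
import { createAnthropic } from '@ai-sdk/anthropic';

// STRIPE_* and CLAUDE_API_KEY hold the environment variables listed in the prerequisites.
const myStripeToolKit = createStripeUsageBasedBillingMiddleware({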
  secretKey: STRIPE_SECRET_API_KEY,
  billing: {
    customer: STRIPE_CUSTOMER_ID,
    meters: {
      input: STRIPE_METER_NAME_INPUT,
      output: STRIPE_METER_NAME_OUTPUT
    }
  }
})
const model = wrapLanguageModel({
  model: createAnthropic({
    apiKey: CLAUDE_API_KEY
  })('claude-3-5-sonnet-20241022'),
  middleware: [myStripeToolKit],
})

One important thing to note: Meter EventStream API requests don’t appear in the request logs in Stripe Workbench. This is an intentional design decision based on the assumption that these APIs will handle large volumes of requests.

Understanding the Custom Metering Middleware Code

Let’s break down the key components of our middleware implementation:

1. Configuration Object Structure

type StripeUsageBasedBillingMiddlewareConfig = {
  secretKey: string
  billing?: {
    type?: 'token';
    customer: string;
    meters: {
      input?: string;
      output?: string;
    };
  };
};

In this type definition:

secretKey: The Stripe secret key needed for authentication

billing: An optional object containing billing settings
- type: Currently only supports 'token' (reserved for future expansion)
- customer: The Stripe customer ID to track usage for
- meters: An object specifying meter names for input and output

2. Billing Function

const bill = async ({
  promptTokens,
  completionTokens,
}: {
  promptTokens: number;
  completionTokens: number;
}) => {
  const meterEventSession = await new Stripe(config.secretKey, {
    apiVersion: '2025-01-27.acacia'
  }).v2.billing.meterEventSession.create();
  const stripe = new Stripe(meterEventSession.authentication_token)
  
  if (config.billing) {
    if (config.billing.meters.input) {
      await stripe.v2.billing.meterEventStream.create({
        events: [{
          event_name: config.billing.meters.input,
          payload: {
            stripe_customer_id: config.billing.customer,
            value: promptTokens.toString(),
          }
        }]
      });
    }
    // Similar handling for output tokens...
  }
};

This bill function:

Creates a new Meter EventSession to get an authentication token

Initializes a new Stripe instance using that token

Sends events to the respective meters for input and output tokens, if configured

An important detail here is that the value is converted to a string. The event payload is a map of string keys to string values, so the Stripe API expects numeric values such as token counts as strings, not as numbers.
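To make the string conversion harder to forget, the event construction can be pulled into a small pure function. This is a sketch under my own assumptions: buildTokenEvents is a hypothetical helper, and skipping non-positive or non-finite counts is a design choice for the case where a provider does not report usage, not something Stripe requires.

```typescript
// Field names follow the middleware code above; the helper itself is hypothetical.
interface MeterEvent {
  event_name: string;
  payload: { stripe_customer_id: string; value: string };
}

interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
}

function buildTokenEvents(
  customer: string,
  meters: { input?: string; output?: string },
  usage: TokenUsage,
): MeterEvent[] {
  const pairs: Array<[string | undefined, number]> = [
    [meters.input, usage.promptTokens],
    [meters.output, usage.completionTokens],
  ];
  const events: MeterEvent[] = [];
  for (const [eventName, tokens] of pairs) {
    // Skip unconfigured meters and unusable counts (NaN, Infinity, zero, negative).
    if (!eventName || !Number.isFinite(tokens) || tokens <= 0) continue;
    events.push({
      event_name: eventName,
      // The payload carries the count as a string, truncated to a whole number.
      payload: { stripe_customer_id: customer, value: String(Math.trunc(tokens)) },
    });
  }
  return events;
}
```

The returned array can be passed directly as the events field of a single meterEventStream.create() call, which also sends the input and output events in one request instead of two.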

3. Middleware Object Structure

return {
  wrapGenerate: async ({doGenerate}) => {
    const result = await doGenerate();

    if (config.billing) {
      await bill(result.usage);
    }

    return result;
  },

  wrapStream: async ({doStream}) => {
    const {stream, ...rest} = await doStream();

    const transformStream = new TransformStream<
      LanguageModelV1StreamPart,
      LanguageModelV1StreamPart
    >({
      async transform(chunk, controller) {
        if (chunk.type === 'finish') {
          if (config.billing) {
            await bill(chunk.usage);
          }
        }

        controller.enqueue(chunk);
      },
    });

    return {
      stream: stream.pipeThrough(transformStream),
      ...rest,
    };
  },
};

Vercel AI SDK middleware has two important functions:

wrapGenerate

This function is called when using the AI model in non-streaming mode. After the response is fully generated, it performs billing using the result object that includes usage information.

wrapStream

This function is called when using the AI model in streaming mode. It uses a TransformStream to transform the response stream, only performing billing when it reaches the end of the stream (chunk.type === 'finish'), passing all chunks through unchanged until then.

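This approach allows the same billing logic to work in both streaming and non-streaming modes. In streaming mode particularly, this ensures we can bill based on the final usage after all tokens have been generated, without disrupting the user experience. One caveat: because bill is awaited inside transform before the finish chunk is enqueued, a slow Stripe call delays the end of the stream, and a failed call surfaces as a stream error, so production code should decide how to handle billing failures.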
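As an illustration of the pass-through idea, here is a self-contained variant that does not depend on either SDK. It differs from the middleware above in two deliberate ways, both my own design choices: the chunk is forwarded before billing starts, and billing errors are reported to a callback instead of erroring the stream. The StreamPart type is a simplified stand-in for LanguageModelV1StreamPart, and createBillingTap is a hypothetical name.

```typescript
// Simplified stand-in for the AI SDK's stream part type (an assumption of this sketch).
type Usage = { promptTokens: number; completionTokens: number };
type StreamPart =
  | { type: 'text-delta'; textDelta: string }
  | { type: 'finish'; usage: Usage };

function createBillingTap(
  bill: (usage: Usage) => Promise<void>,
  onError: (error: unknown) => void = () => {},
): TransformStream<StreamPart, StreamPart> {
  return new TransformStream<StreamPart, StreamPart>({
    async transform(chunk, controller) {
      // Forward the chunk first so the consumer receives it before billing starts.
      controller.enqueue(chunk);
      if (chunk.type === 'finish') {
        try {
          await bill(chunk.usage);
        } catch (error) {
          // Report and continue instead of turning a billing failure into a stream error.
          onError(error);
        }
      }
    },
  });
}
```

The trade-off is that swallowed errors mean unbilled usage unless onError records the failed event somewhere it can be retried later, so which behavior is right depends on whether you would rather fail the request or lose a billing record.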

Real-World Use Cases

This implementation is particularly effective in the following scenarios:

1. SaaS AI Assistant Services

For SaaS services that offer different usage plans to users and bill based on token consumption, high-speed and reliable usage tracking is essential. The V2 API’s high throughput really shines in scenarios where many users are simultaneously using AI features.

2. Enterprise AI Solutions

For customized AI solutions targeting large enterprises, you often need to accurately track usage by department or use case. Using Stripe’s V2 API allows you to efficiently process large volumes of API calls while maintaining detailed usage statistics.

3. Multi-tenant AI Platforms

For platforms providing AI services to multiple companies or organizations, precise usage tracking per tenant is crucial. The V2 API’s high rate limits ensure you can scale your service while maintaining accurate usage-based billing.

Troubleshooting and Important Considerations

Authentication Token Expiration

The Meter EventStream API’s authentication tokens have a 15-minute validity period. Without proper handling of token expiration, you may encounter “authentication errors.” Consider implementing logic to refresh tokens slightly before they expire (e.g., when 5 minutes remain).
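Proactive refresh can still miss, for example when clocks drift, so a reactive fallback is worth considering too. The sketch below retries exactly once with a fresh token when a call fails with an authentication error. Everything here is hypothetical scaffolding: getToken, send, and isAuthError are parameters you would supply, and how to recognize an authentication error depends on your SDK version.

```typescript
// Hypothetical wrapper. `getToken(forceNew)` returns a session token (creating a
// new session when forceNew is true), and `send` performs the Meter EventStream
// call with that token. `isAuthError` decides what counts as a rejected token.
async function sendWithFreshToken<T>(
  getToken: (forceNew: boolean) => Promise<string>,
  send: (token: string) => Promise<T>,
  isAuthError: (error: unknown) => boolean,
): Promise<T> {
  try {
    return await send(await getToken(false));
  } catch (error) {
    // Anything other than an authentication failure is not ours to handle.
    if (!isAuthError(error)) throw error;
    // The cached token was rejected: fetch a new one and retry exactly once.
    return send(await getToken(true));
  }
}
```

Limiting the fallback to a single retry keeps a misconfigured secret key from turning into a retry loop.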

Error Handling

In high-traffic environments, temporary network errors may occur. It’s recommended to implement retry logic:

async function sendMeterEvent(stripe, eventData, retries = 3) {
  try {
    return await stripe.v2.billing.meterEventStream.create(eventData);
  } catch (error) {
    // stripe-node sets error.type to the error class name, so a network
    // failure shows up as 'StripeConnectionError'.
    if (retries > 0 && error.type === 'StripeConnectionError') {
      // Wait a bit and retry on connection errors
      await new Promise(resolve => setTimeout(resolve, 500));
      return sendMeterEvent(stripe, eventData, retries - 1);
    }
    console.error('Failed to send meter event:', error);
    // Additional error handling like sending to error logging service
    throw error;
  }
}
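A fixed 500 ms delay works, but when many requests fail at once it can make them all retry at the same moment. A common alternative is exponential backoff with jitter, sketched below. The withRetry and backoffDelayMs helpers, their default values, and the isRetryable predicate are my own illustrative choices, not part of the Stripe SDK.

```typescript
// Upper bound for the delay before retry number `attempt` (0-based):
// baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 250, capMs = 4000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Hypothetical generic retry helper. `isRetryable` decides which errors are
// worth retrying; `sleep` is injectable so tests do not have to wait.
async function withRetry<T>(
  operation: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt >= maxRetries || !isRetryable(error)) throw error;
      // "Full jitter": wait a random time between 0 and the exponential bound.
      await sleep(Math.random() * backoffDelayMs(attempt));
    }
  }
}
```

Two related points are worth checking against the current Stripe documentation before relying on them. First, stripe-node has a built-in maxNetworkRetries client option that may cover simple cases. Second, a retry can deliver the same event twice if the first attempt actually reached Stripe, and meter events accept an identifier field intended for deduplication, so setting a stable identifier per event before retrying is worth considering.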

Batch Processing Optimization

When sending multiple events, batch processing is more efficient than sending individual events:

await stripe.v2.billing.meterEventStream.create({
  events: [
    {
      event_name: config.billing.meters.input,
      payload: { ... }
    },
    {
      event_name: config.billing.meters.output,
      payload: { ... }
    }
    // Up to 100 events can be sent at once
  ]
});
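If you buffer events before sending them, the buffer can grow past the per-request limit, so it needs to be split into batches. Below is a minimal sketch of that splitting step. The chunkEvents helper is hypothetical, and the limit of 100 is the figure quoted above, so confirm it against the current API reference.

```typescript
// Assumption taken from the text above: at most 100 events per request.
const MAX_EVENTS_PER_REQUEST = 100;

// Split `events` into consecutive batches of at most `size` items, preserving order.
function chunkEvents<T>(events: readonly T[], size = MAX_EVENTS_PER_REQUEST): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < events.length; i += size) {
    batches.push(events.slice(i, i + size));
  }
  return batches;
}
```

Each batch would then go out as its own request, for example by looping over chunkEvents(pendingEvents) and passing each batch as the events field of meterEventStream.create().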

Conclusion and Advanced Applications

I’ve covered how to implement AI usage tracking with Stripe’s V2 API. Its Meter EventStream API accepts usage data at 10x the throughput of the v1 API, which makes it practical to operate large-scale AI services.

Building on what we’ve discussed, here are some potential next steps:

Analytics Integration: Connect Stripe usage data with data warehouses like BigQuery or Redshift for detailed usage analysis

Real-time Dashboards: Build admin dashboards to visualize AI usage in real-time

Prepaid Model Implementation: Combine with a prepaid token model, deducting from token balance as usage occurs

Multi-model Management: Create integrated systems to manage multiple AI providers (OpenAI, Anthropic, Google, etc.) and track usage and costs across models

By leveraging Stripe V2 API’s high throughput, you can manage billing for larger and more complex AI systems. This implementation is particularly valuable for services with high concurrent connections or business models requiring precise, token-level billing.

Finally, since Stripe’s V2 API is relatively new, it may continue to evolve. I recommend regularly checking Stripe’s documentation to incorporate the latest features and improvements.