Efficiently Tracking AI Usage with Vercel AI SDK and Stripe V2 API

Unlock high-throughput AI usage tracking by combining Vercel AI SDK with Stripe’s V2 Meter EventStream API, enabling seamless token consumption metering and billing for modern AI applications at scale.

    In modern AI application development, accurately tracking API usage has become a critical component. This is especially true when building systems that need to measure token consumption and bill users accordingly.

    Stripe’s usage-based billing APIs provide a convenient solution for this challenge, with their recently released V2 API offering significant improvements. In this article, I’ll show you how to implement high-throughput AI usage tracking by combining the Stripe V2 API with Vercel AI SDK.

    Evolution of Stripe’s Usage Metering APIs and Their Differences

    As of February 21, 2025, Stripe’s Agent Toolkit SDK for usage metering still uses the v1 Meter Events API.

    Stripe usage metering API

    While the v1 Meter Events API can handle up to 1,000 requests per second, the V2 Meter EventStream API raises this limit to 10,000 events per second. This is a huge advantage for services where many users are simultaneously interacting with AI models.

    Prerequisites Before Implementation

    Before using the V2 API, you’ll need:

    1. A Stripe account
    2. Meters created for your project
    3. A basic application structure using Vercel AI SDK

    You’ll also need to set up the following environment variables:

    • STRIPE_SECRET_API_KEY: Your Stripe secret API key
    • STRIPE_CUSTOMER_ID: The Stripe customer ID you want to meter usage for
    • STRIPE_METER_NAME_INPUT: The meter name for input tokens
    • STRIPE_METER_NAME_OUTPUT: The meter name for output tokens
    • CLAUDE_API_KEY: Your Anthropic API key (or whichever AI provider you’re using)
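
    Reading these variables in one place lets a missing value fail at startup rather than in the middle of a billing call. The following is a minimal sketch; the loadBillingEnv helper and its return type are hypothetical names chosen for illustration and are not part of Stripe or the Vercel AI SDK:

```typescript
// Hypothetical helper: the names below are illustrative, not part of any SDK.
const REQUIRED_VARS = [
  'STRIPE_SECRET_API_KEY',
  'STRIPE_CUSTOMER_ID',
  'STRIPE_METER_NAME_INPUT',
  'STRIPE_METER_NAME_OUTPUT',
  'CLAUDE_API_KEY',
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type BillingEnv = Record<RequiredVar, string>;

// Takes the environment as an argument so it can be tested without touching process.env.
function loadBillingEnv(env: Record<string, string | undefined>): BillingEnv {
  // Collect every missing or empty variable so the error lists them all at once.
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  const result = {} as BillingEnv;
  for (const name of REQUIRED_VARS) {
    result[name] = env[name] as string;
  }
  return result;
}
```

    In an application, this would typically be called once at startup as loadBillingEnv(process.env).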

    Implementation Differences Between Meter Events and Meter EventStream

    Beyond the rate limit differences, a key aspect of the V2 API is its requirement for stateless authentication sessions. You create a session, which is valid for 15 minutes, and then use the authentication token from that session to call the Meter EventStream API until it expires. Because the token is meant to be reused across requests, your application needs a simple way to keep it around, such as cookies.

    Here’s the minimal setup. Note that you need to initialize the Stripe class twice, and the Stripe class that calls the Meter EventStream API must use the authentication_token obtained from v2.billing.meterEventSession:

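    import Stripe from 'stripe';
    // Create a short-lived session with the regular secret key (held in config.secretKey here)
    const meterEventSession = await new Stripe(config.secretKey, {
      apiVersion: '2025-01-27.acacia'
    }).v2.billing.meterEventSession.create();
    // Use the session's token, not the secret key, for Meter EventStream calls
    const stripe = new Stripe(meterEventSession.authentication_token);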
    

    In practice, you'd keep the session (identified by meterEventSession.id, with its expiry in meterEventSession.expires_at) in cookies or other state management, reusing it while it is still valid and recreating it once it expires.
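
    One way to reuse a session is a small in-memory cache that hands out the current token and creates a new session shortly before the old one expires. This is a sketch under assumptions: createSessionCache and MeterEventSessionLike are hypothetical names, the cache lives in a single process, and expires_at is assumed to be an ISO 8601 timestamp string, which is worth checking against your SDK version:

```typescript
// Only the fields this sketch relies on. Treating expires_at as an ISO 8601 string
// is an assumption; adjust the parsing if your SDK version returns another format.
type MeterEventSessionLike = {
  authentication_token: string;
  expires_at: string;
};

// Hypothetical helper: caches one session and recreates it shortly before expiry.
function createSessionCache(
  createSession: () => Promise<MeterEventSessionLike>,
  refreshMarginMs = 60_000,
  now: () => number = Date.now,
) {
  let cached: MeterEventSessionLike | undefined;

  return async function getToken(): Promise<string> {
    let session = cached;
    // Refresh when there is no session yet or the remaining lifetime is within the margin.
    if (!session || Date.parse(session.expires_at) - now() <= refreshMarginMs) {
      session = await createSession();
      cached = session;
    }
    return session.authentication_token;
  };
}
```

    The createSession argument would wrap the v2.billing.meterEventSession.create() call shown above, and the returned token would be passed to new Stripe(...) as before. Concurrent callers may each create a session during a refresh, which is harmless here but could be deduplicated if needed.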

    Building a Custom Metering Middleware

    The V2 API isn't supported by the Stripe Agent Toolkit SDK yet, so we'll build our own implementation based on the SDK's source code.

    Here’s a usage-based billing middleware customized for Vercel AI SDK:

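    // Type imports assume AI SDK 4.x, where these names are exported from 'ai'
    import Stripe from 'stripe';
    import type { LanguageModelV1Middleware, LanguageModelV1StreamPart } from 'ai';

    type StripeUsageBasedBillingMiddlewareConfig = {
      secretKey: string;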
      billing?: {
        type?: 'token';
        customer: string;
        meters: {
          input?: string;
          output?: string;
        };
      };
    };
    
    
    export const createStripeUsageBasedBillingMiddleware = (config: StripeUsageBasedBillingMiddlewareConfig): LanguageModelV1Middleware => {
      const bill = async ({
        promptTokens,
        completionTokens,
      }: {
        promptTokens: number;
        completionTokens: number;
      }) => {
        const meterEventSession = await new Stripe(config.secretKey, {
          apiVersion: '2025-01-27.acacia'
        }).v2.billing.meterEventSession.create();
        const stripe = new Stripe(meterEventSession.authentication_token)
        
        if (config.billing) {
          if (config.billing.meters.input) {
            await stripe.v2.billing.meterEventStream.create({
              events: [{
                event_name: config.billing.meters.input,
                payload: {
                  stripe_customer_id: config.billing.customer,
                  value: promptTokens.toString(),
                }
              }]
            });
          }
          if (config.billing.meters.output) {
            await stripe.v2.billing.meterEventStream.create({
              events: [{
                event_name: config.billing.meters.output,
                payload: {
                  stripe_customer_id: config.billing.customer,
                  value: completionTokens.toString(),
                }
              }]
            });
          }
        }
      };
    
      return {
        wrapGenerate: async ({doGenerate}) => {
          const result = await doGenerate();
    
          if (config.billing) {
            await bill(result.usage);
          }
    
          return result;
        },
    
        wrapStream: async ({doStream}) => {
          const {stream, ...rest} = await doStream();
    
          const transformStream = new TransformStream<
            LanguageModelV1StreamPart,
            LanguageModelV1StreamPart
          >({
            async transform(chunk, controller) {
              if (chunk.type === 'finish') {
                if (config.billing) {
                  await bill(chunk.usage);
                }
              }
    
              controller.enqueue(chunk);
            },
          });
    
          return {
            stream: stream.pipeThrough(transformStream),
            ...rest,
          };
        },
      };
    }
    

    This code measures both input (prompt) tokens and output (completion) tokens from the AI model and records them using the Stripe Meter EventStream API.

    All you need to do is pass this to the middleware parameter of wrapLanguageModel():

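    import { wrapLanguageModel } from 'ai';
    import { createAnthropic } from '@ai-sdk/anthropic';

    // The uppercase constants are assumed to hold the environment variables listed earlier
    const myStripeToolKit = createStripeUsageBasedBillingMiddleware({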
      secretKey: STRIPE_SECRET_API_KEY,
      billing: {
        customer: STRIPE_CUSTOMER_ID,
        meters: {
          input: STRIPE_METER_NAME_INPUT,
          output: STRIPE_METER_NAME_OUTPUT
        }
      }
    })
    const model = wrapLanguageModel({
      model: createAnthropic({
        apiKey: CLAUDE_API_KEY
      })('claude-3-5-sonnet-20241022'),
      middleware: [myStripeToolKit],
    })
    

    One important thing to note: requests to the Meter EventStream API don't appear in the request logs in Stripe Workbench. This is an intentional design decision based on the assumption that these APIs will handle large volumes of requests.

    Understanding the Custom Metering Middleware Code

    Let’s break down the key components of our middleware implementation:

    1. Configuration Object Structure

    type StripeUsageBasedBillingMiddlewareConfig = {
      secretKey: string
      billing?: {
        type?: 'token';
        customer: string;
        meters: {
          input?: string;
          output?: string;
        };
      };
    };
    

    In this type definition:

    • secretKey: The Stripe secret key needed for authentication
    • billing: An optional object containing billing settings
      • type: Currently only supports ‘token’ (for future expandability)
      • customer: The Stripe customer ID to track usage for
      • meters: An object specifying meter names for input and output

    2. Billing Function

    const bill = async ({
      promptTokens,
      completionTokens,
    }: {
      promptTokens: number;
      completionTokens: number;
    }) => {
      const meterEventSession = await new Stripe(config.secretKey, {
        apiVersion: '2025-01-27.acacia'
      }).v2.billing.meterEventSession.create();
      const stripe = new Stripe(meterEventSession.authentication_token)
      
      if (config.billing) {
        if (config.billing.meters.input) {
          await stripe.v2.billing.meterEventStream.create({
            events: [{
              event_name: config.billing.meters.input,
              payload: {
                stripe_customer_id: config.billing.customer,
                value: promptTokens.toString(),
              }
            }]
          });
        }
        // Similar handling for output tokens...
      }
    };
    

    This bill function:

    1. Creates a new Meter EventSession to get an authentication token
    2. Initializes a new Stripe instance using that token
    3. Sends events to the respective meters for input and output tokens, if configured

    An important detail here is that the value is converted to a string. The event payload is sent as string key-value pairs, so the token count has to be passed as a string such as "1234" rather than as a number.
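
    The string conversion and the two optional meters can be kept together in one small function. The sketch below assumes only the payload fields used in this article; buildTokenEvents and its types are hypothetical names, and skipping non-finite counts is an added precaution on the assumption that a provider may not always report usage:

```typescript
type TokenMeterEvent = {
  event_name: string;
  payload: { stripe_customer_id: string; value: string };
};

type TokenUsage = { promptTokens: number; completionTokens: number };

// Hypothetical helper: builds one event per configured meter.
// Meters that are unset, and counts that are not finite numbers, are skipped.
function buildTokenEvents(
  customer: string,
  meters: { input?: string; output?: string },
  usage: TokenUsage,
): TokenMeterEvent[] {
  const candidates: Array<[string | undefined, number]> = [
    [meters.input, usage.promptTokens],
    [meters.output, usage.completionTokens],
  ];
  const events: TokenMeterEvent[] = [];
  for (const [eventName, tokens] of candidates) {
    if (!eventName || !Number.isFinite(tokens)) continue;
    events.push({
      event_name: eventName,
      // Payload values are strings, so the count is serialized explicitly.
      payload: { stripe_customer_id: customer, value: tokens.toString() },
    });
  }
  return events;
}
```

    The returned array has the same shape as the events passed to meterEventStream.create in the bill function, so both meters could be sent in a single request.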

    3. Middleware Object Structure

    return {
      wrapGenerate: async ({doGenerate}) => {
        const result = await doGenerate();
    
        if (config.billing) {
          await bill(result.usage);
        }
    
        return result;
      },
    
      wrapStream: async ({doStream}) => {
        const {stream, ...rest} = await doStream();
    
        const transformStream = new TransformStream<
          LanguageModelV1StreamPart,
          LanguageModelV1StreamPart
        >({
          async transform(chunk, controller) {
            if (chunk.type === 'finish') {
              if (config.billing) {
                await bill(chunk.usage);
              }
            }
    
            controller.enqueue(chunk);
          },
        });
    
        return {
          stream: stream.pipeThrough(transformStream),
          ...rest,
        };
      },
    };
    

    Vercel AI SDK middleware has two important functions:

    wrapGenerate

    This function is called when using the AI model in non-streaming mode. After the response is fully generated, it performs billing using the result object that includes usage information.

    wrapStream

    This function is called when using the AI model in streaming mode. It uses a TransformStream to transform the response stream, only performing billing when it reaches the end of the stream (chunk.type === 'finish'), passing all chunks through unchanged until then.

    This approach allows the same billing logic to work in both streaming and non-streaming modes. In streaming mode particularly, this ensures we can bill based on the final usage after all tokens have been generated, without disrupting the user experience.
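
    One trade-off in the middleware shown earlier: bill() is awaited before the finish chunk is enqueued, so the last chunk waits for the Stripe calls, and a billing failure surfaces as a stream error. If that is not what you want, one option is to start billing without awaiting it and route failures to a separate handler. The sketch below uses simplified chunk types; Chunk, createBillingTransform, and onError are hypothetical names, not AI SDK APIs:

```typescript
type Usage = { promptTokens: number; completionTokens: number };

// Simplified stand-in for the SDK's stream part type, reduced to two variants.
type Chunk =
  | { type: 'text-delta'; textDelta: string }
  | { type: 'finish'; usage: Usage };

// Hypothetical helper: passes every chunk through immediately and starts billing
// in the background when the finish chunk arrives.
function createBillingTransform(
  bill: (usage: Usage) => Promise<void>,
  onError: (error: unknown) => void,
): TransformStream<Chunk, Chunk> {
  return new TransformStream<Chunk, Chunk>({
    transform(chunk, controller) {
      if (chunk.type === 'finish') {
        // Not awaited: a slow or failing billing call does not hold up the stream.
        bill(chunk.usage).catch(onError);
      }
      controller.enqueue(chunk);
    },
  });
}
```

    This trades reliability for latency. On serverless platforms, work that is not awaited may be cut short once the response completes, so the awaited version remains the safer default when every event must be recorded.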

    Real-World Use Cases

    This implementation is particularly effective in the following scenarios:

    1. SaaS AI Assistant Services

    For SaaS services that offer different usage plans to users and bill based on token consumption, high-speed and reliable usage tracking is essential. The V2 API’s high throughput really shines in scenarios where many users are simultaneously using AI features.

    2. Enterprise AI Solutions

    For customized AI solutions targeting large enterprises, you often need to accurately track usage by department or use case. Using Stripe’s V2 API allows you to efficiently process large volumes of API calls while maintaining detailed usage statistics.

    3. Multi-tenant AI Platforms

    For platforms providing AI services to multiple companies or organizations, precise usage tracking per tenant is crucial. The V2 API’s high rate limits ensure you can scale your service while maintaining accurate usage-based billing.

    Troubleshooting and Important Considerations

    Authentication Token Expiration

    The Meter EventStream API’s authentication tokens have a 15-minute validity period. Without proper handling of token expiration, you may encounter “authentication errors.” Consider implementing logic to refresh tokens slightly before they expire (e.g., when 5 minutes remain).

    Error Handling

    In high-traffic environments, temporary network errors may occur. It’s recommended to implement retry logic:

    import Stripe from 'stripe';

    type MeterEventStreamParams = Parameters<Stripe['v2']['billing']['meterEventStream']['create']>[0];

    async function sendMeterEvent(
      stripe: Stripe,
      eventData: MeterEventStreamParams,
      retries = 3,
    ): Promise<void> {
      try {
        await stripe.v2.billing.meterEventStream.create(eventData);
      } catch (error) {
        // stripe-node reports network failures as StripeConnectionError
        if (retries > 0 && error instanceof Stripe.errors.StripeConnectionError) {
          // Wait a bit and retry on connection errors
          await new Promise(resolve => setTimeout(resolve, 500));
          return sendMeterEvent(stripe, eventData, retries - 1);
        }
        console.error('Failed to send meter event after retries:', error);
        // Additional error handling like sending to error logging service
        throw error;
      }
    }
    

    Batch Processing Optimization

    When sending multiple events, batch processing is more efficient than sending individual events:

    await stripe.v2.billing.meterEventStream.create({
      events: [
        {
          event_name: config.billing.meters.input,
          payload: { ... }
        },
        {
          event_name: config.billing.meters.output,
          payload: { ... }
        }
        // Up to 100 events can be sent at once
      ]
    });
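
    If events are buffered and flushed in bulk, the list has to be split so that no request exceeds the per-request limit noted in the comment above. A minimal sketch; chunkEvents is a hypothetical helper, and the default of 100 simply mirrors that limit:

```typescript
// Hypothetical helper: splits a list into consecutive batches of at most `size` items.
function chunkEvents<T>(events: readonly T[], size = 100): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < events.length; i += size) {
    batches.push(events.slice(i, i + size));
  }
  return batches;
}
```

    Each batch can then be passed as the events array of its own meterEventStream.create call.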
    

    Conclusion and Advanced Applications

    I’ve covered implementing AI usage tracking with Stripe’s V2 API, which enables sending usage data at 10x the throughput of the previous API, making it possible to operate large-scale AI services.

    Building on what we’ve discussed, here are some potential next steps:

    1. Analytics Integration: Connect Stripe usage data with data warehouses like BigQuery or Redshift for detailed usage analysis
    2. Real-time Dashboards: Build admin dashboards to visualize AI usage in real-time
    3. Prepaid Model Implementation: Combine with a prepaid token model, deducting from token balance as usage occurs
    4. Multi-model Management: Create integrated systems to manage multiple AI providers (OpenAI, Anthropic, Google, etc.) and track usage and costs across models

    By leveraging Stripe V2 API’s high throughput, you can manage billing for larger and more complex AI systems. This implementation is particularly valuable for services with high concurrent connections or business models requiring precise, token-level billing.

    Finally, since Stripe’s V2 API is relatively new, it may continue to evolve. I recommend regularly checking Stripe’s documentation to incorporate the latest features and improvements.
