What happens if the completion window is not met?
For batch and asynchronous inference, we guarantee processing of up to 1 million tokens in total per user per hour, with a maximum of 4,000 output tokens per request. Requests that exceed these limits may carry over into the next completion window, in which case additional charges for that subsequent window apply. These limits and the carry-over behavior do not apply to real-time inference, which is designed for immediate responses.
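
As a rough illustration of the limits above, the following sketch checks whether one hour of planned batch work stays inside the guaranteed throughput. It is not part of any official client: `PlannedRequest` and `check_hourly_plan` are hypothetical names, and the sketch assumes that the 1 million token total counts input and output tokens together, which the answer above does not spell out.

```python
from dataclasses import dataclass

# Limits quoted in the answer above.
GUARANTEED_TOKENS_PER_HOUR = 1_000_000
MAX_OUTPUT_TOKENS_PER_REQUEST = 4_000


@dataclass
class PlannedRequest:
    """Hypothetical record of one queued request (not a real API type)."""
    input_tokens: int
    max_output_tokens: int


def check_hourly_plan(requests):
    """Return (fits, total_tokens, oversized_indices) for one hour of work.

    Assumption: the hourly total counts input plus output tokens.
    """
    # Requests asking for more output than the per-request cap.
    oversized = [
        i for i, r in enumerate(requests)
        if r.max_output_tokens > MAX_OUTPUT_TOKENS_PER_REQUEST
    ]
    # Worst-case token usage if every request produces its full output.
    total = sum(r.input_tokens + r.max_output_tokens for r in requests)
    fits = not oversized and total <= GUARANTEED_TOKENS_PER_HOUR
    return fits, total, oversized


if __name__ == "__main__":
    plan = [PlannedRequest(input_tokens=1_000, max_output_tokens=500)] * 100
    fits, total, oversized = check_hourly_plan(plan)
    print(f"fits={fits} total={total} oversized={oversized}")
```

A plan that does not fit is not rejected under the policy described above; it is simply one that may spill into the next completion window and incur that window's charges, so a check like this is mainly useful for estimating cost before submitting.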