Batch AI jobs can look cheap during testing and become expensive when the input queue grows. Prompt size controls should be part of the job design, not an afterthought.
Cap input material
Define the maximum source length before summarization or extraction. If long documents need special handling, route them through a separate workflow instead of letting them expand the default prompt.
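The routing rule above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the limit `MAX_SOURCE_CHARS`, the queue names, and the helpers `route` and `partition` are all hypothetical, and character count is used here only as a rough stand-in for token count.

```python
from typing import Dict, Iterable, List

# Hypothetical limit; character count is only a rough proxy for tokens.
MAX_SOURCE_CHARS = 12_000


def route(document: str, max_chars: int = MAX_SOURCE_CHARS) -> str:
    """Return the name of the workflow that should handle this document."""
    return "default" if len(document) <= max_chars else "long_document"


def partition(
    documents: Iterable[str], max_chars: int = MAX_SOURCE_CHARS
) -> Dict[str, List[str]]:
    """Split a batch into a default queue and a long-document queue."""
    queues: Dict[str, List[str]] = {"default": [], "long_document": []}
    for doc in documents:
        queues[route(doc, max_chars)].append(doc)
    return queues
```

Because oversized documents never enter the default queue, the default prompt size stays bounded by the limit no matter what arrives in the input.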
Cap output length
Set a maximum output size that matches what the downstream task actually consumes. Longer responses often increase review time and token spend without improving the result.
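One way to reason about an output cap is to compute the worst-case output spend it implies for a batch. The sketch below assumes per-token pricing quoted per 1,000 output tokens; the function names and the price parameter are illustrative and not tied to any particular provider.

```python
def worst_case_output_tokens(num_items: int, max_output_tokens: int) -> int:
    """Upper bound on output tokens if every item uses the full cap."""
    return num_items * max_output_tokens


def worst_case_output_cost(
    num_items: int,
    max_output_tokens: int,
    price_per_1k_output_tokens: float,  # hypothetical price, supplied by the caller
) -> float:
    """Upper bound on output spend for one pass over the batch (no retries)."""
    tokens = worst_case_output_tokens(num_items, max_output_tokens)
    return tokens / 1000 * price_per_1k_output_tokens


def hit_cap(output_tokens: int, max_output_tokens: int) -> bool:
    """True when a response reached the cap and may have been cut off."""
    return output_tokens >= max_output_tokens
```

Tracking how often `hit_cap` returns true is a cheap signal for whether the cap is too tight for the task or the prompt is inviting longer answers than the workflow needs.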
Measure retry cost
Retries are part of the cost model. Track the retry count and the length of failed outputs so that a fragile prompt does not quietly double the batch cost.
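A small accumulator is enough to make retry cost visible. The sketch below is one possible shape, assuming the job can report, for each item, whether every attempt succeeded and how many output tokens it produced; the class `RetryStats` and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class RetryStats:
    """Running totals for one batch (hypothetical structure)."""

    items: int = 0
    attempts: int = 0
    retries: int = 0
    failed_output_tokens: int = 0
    successful_output_tokens: int = 0

    def record(self, attempt_outputs: Sequence[Tuple[bool, int]]) -> None:
        """Record all attempts for one item as (succeeded, output_tokens) pairs."""
        self.items += 1
        self.attempts += len(attempt_outputs)
        # Every attempt after the first counts as a retry.
        self.retries += max(len(attempt_outputs) - 1, 0)
        for succeeded, tokens in attempt_outputs:
            if succeeded:
                self.successful_output_tokens += tokens
            else:
                self.failed_output_tokens += tokens

    @property
    def output_overhead(self) -> float:
        """Total output tokens divided by successful output tokens."""
        if self.successful_output_tokens == 0:
            return float("inf")
        total = self.successful_output_tokens + self.failed_output_tokens
        return total / self.successful_output_tokens
```

An `output_overhead` near 1.0 means retries are adding little; a value near 2.0 means failed attempts are roughly doubling output spend. This sketch counts output tokens only, so it understates the total: each retry also resends the prompt, and input tokens would need to be tracked the same way.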