Latency will only increase

The average latency is defined as (total duration for the batch) / max_tokens. As the batch size increases, the numerator (the total duration) increases while the denominator (max_tokens) stays the same, so the average latency can only increase.
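To make the arithmetic concrete, here is a minimal sketch of the definition above. The function name `average_latency` and the batch durations are hypothetical, made-up values for illustration only, not measurements from the course notebook.

```python
def average_latency(total_duration_s: float, max_tokens: int) -> float:
    """Average latency = (total duration for the batch) / max_tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return total_duration_s / max_tokens


# max_tokens is held fixed while the batch size grows.
max_tokens = 10

# Hypothetical total durations in seconds, keyed by batch size.
# These numbers are invented; only the trend (larger batch, longer duration) matters.
hypothetical_durations = {1: 0.5, 4: 0.8, 16: 2.0}

latencies = {
    batch_size: average_latency(duration, max_tokens)
    for batch_size, duration in hypothetical_durations.items()
}

for batch_size, latency in latencies.items():
    print(f"batch_size={batch_size:>2}  avg_latency={latency:.3f} s/token")
```

Because the denominator never changes, any increase in the total duration shows up directly as an increase in the average latency.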

Hi @nmurugesh

Can you elaborate on your doubt and how it relates to the course you have selected? If it is related to any code in the notebook, you can share the code here, since it is from the short courses.