Multi-LoRA inference issue with gathered version

It seems there is an issue affecting the gathered version for serving multiple LoRAs. It takes much more time than expected compared with the loop version.
[image: timing comparison chart]

I’ve tested it on my local machine and got the expected behavior, so it is likely something related to the environment.
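For context, here is a minimal sketch of the two strategies being compared. The function names (`loop_lora`, `gathered_lora`) and tensor shapes are illustrative assumptions, not the course's actual code: the loop version applies each request's adapter one at a time, while the gathered version indexes all adapters at once and uses a single batched matmul. Normally the gathered version should be at least as fast on a batch of mixed-adapter requests.

```python
# Hypothetical sketch of loop vs. gathered multi-LoRA application.
# Shapes and names are illustrative, not from the course notebook.
import torch

torch.manual_seed(0)

num_loras, rank, hidden, batch = 4, 8, 64, 16

# One low-rank adapter pair (A, B) per LoRA.
A = torch.randn(num_loras, hidden, rank)
B = torch.randn(num_loras, rank, hidden)

x = torch.randn(batch, hidden)                     # one row per request
lora_ids = torch.randint(0, num_loras, (batch,))   # adapter chosen per request

def loop_lora(x, lora_ids):
    # Apply each request's adapter sequentially in a Python loop.
    out = torch.empty_like(x)
    for i in range(x.shape[0]):
        a, b = A[lora_ids[i]], B[lora_ids[i]]
        out[i] = x[i] @ a @ b
    return out

def gathered_lora(x, lora_ids):
    # Gather every request's adapter, then apply them in one batched matmul.
    a = A[lora_ids]                                # (batch, hidden, rank)
    b = B[lora_ids]                                # (batch, rank, hidden)
    return torch.bmm(torch.bmm(x.unsqueeze(1), a), b).squeeze(1)

# Both strategies should produce the same result (up to float rounding).
assert torch.allclose(loop_lora(x, lora_ids), gathered_lora(x, lora_ids), atol=1e-3)
```

The symptom reported above is that the gathered approach is slower than the loop on the platform notebook, which is the opposite of the expected trade-off.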

Hello @Daniel_Casals

Can you share the results from your local Jupyter notebook?

Did you notice any differences in the code or module versions?

Please share screenshots so that others can understand the issue and respond.

Regards
DP

Hello @Deepti_Prasad

Maybe I did not explain it well: my local code ran fine and I got the expected result.
The issue I reported, along with the shared chart, is from the online DeepLearning.AI Jupyter notebook for the course.

You should be able to reproduce the issue in the embedded notebook at https://learn.deeplearning.ai/courses/efficiently-serving-llms/lesson/7/multi-lora-inference by running all cells.

I confirm I am having the same issue: multi-LoRA runs properly on my local laptop, but not on the DeepLearning.AI platform notebook.