Translation Models

Hi Team,

I am looking to fine-tune some models with internal data for translation.

  1. Which are the best models for translation?
  2. What sort of cleaning techniques should be applied?
  3. What sort of evaluation metrics can be used for Translation?

Can someone help me with the above questions?


Hello @avinash1229

Is this a natural-language translation model?

I was not completely sure what you meant by internal data for translation.

Do you already have a model, or are you asking in general?


Hi @Deepti_Prasad ,

Yes, I am working on developing a translation model for English to Spanish, German, and other languages.

When I mentioned internal data, I was referring to validated translation data within a company, specifically for English to Spanish.

I am currently trying to fine-tune a model.

I have fine-tuned Flan T5 XXL, XL, and T5-3b, but they did not give good results.

So what I am hoping to get is:

  1. Which evaluation metrics are suitable for assessing LLM translation? Since BLEU and CHRF3 are tailored for Machine Translation and COMET is acceptable but not ideal, I am seeking alternative methods to evaluate LLM output.
  2. Are there any models known for delivering excellent translation results?
  3. What cleaning techniques should be employed to preprocess data for translation?

Hello @avinash1229

That’s great :smiley:

Can you share the results you have obtained so far, with screenshots? That would help us understand which kind of evaluation would be a better fit before suggesting anything further.

I am also tagging an NLP mentor, @arvyzukai, who can share his views.



Did you try HELM (Holistic Evaluation of Language Models)? I am sharing a GitHub repo; it could probably give you a detailed assessment of your model.

The usual cleaning techniques include removing duplicate records, handling missing values, dealing with outliers, standardizing formats, resolving inconsistencies, feature scaling, and handling categorical variables.
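
For a parallel translation corpus, a few of these steps (missing values, format standardization, outliers, duplicates) might look like the sketch below. The function name `clean_parallel_pairs` and the length-ratio thresholds are illustrative assumptions, not something taken from your data, so treat it as a starting point and tune it for your own corpus.

```python
import unicodedata


def clean_parallel_pairs(pairs, min_ratio=0.5, max_ratio=2.0):
    """Filter (source, target) sentence pairs.

    min_ratio and max_ratio are hypothetical thresholds on the
    target/source token-length ratio; adjust them per language pair.
    """
    seen = set()
    cleaned = []
    for src, tgt in pairs:
        # Missing values: drop pairs where either side is absent.
        if src is None or tgt is None:
            continue

        # Standardize formats: Unicode NFC normalization, collapse whitespace.
        src = " ".join(unicodedata.normalize("NFC", src).split())
        tgt = " ".join(unicodedata.normalize("NFC", tgt).split())
        if not src or not tgt:
            continue

        # Outliers: drop pairs whose lengths differ too much, which
        # often indicates a misaligned or truncated translation.
        ratio = len(tgt.split()) / len(src.split())
        if ratio < min_ratio or ratio > max_ratio:
            continue

        # Duplicates: keep only the first occurrence (case-insensitive).
        key = (src.lower(), tgt.lower())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append((src, tgt))
    return cleaned
```

You would call it as `clean_parallel_pairs(list_of_pairs)` before tokenization. Feature scaling and categorical-variable handling are left out here because they do not have an obvious counterpart for raw sentence pairs.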


Hi @Deepti_Prasad ,

Thank you for this. I have tried BLEU and COMET scores. I will try HELM.

Is this the result from COMET scores?

@Deepti_Prasad This is from BLEU and METEOR scores. I still need to try COMET, actually.
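
For anyone reading along, here is a minimal sketch of what a corpus-level BLEU score computes: clipped n-gram precision for n = 1 to 4, combined by geometric mean and multiplied by a brevity penalty. It assumes a single reference per sentence and plain whitespace tokenization, and it returns a value between 0 and 1 rather than 0 to 100, so the numbers will not match a standard tool such as sacreBLEU exactly. It is only meant to show the mechanics, not to replace a proper implementation.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus BLEU (one reference per sentence, no smoothing)."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()  # assumption: whitespace tokenization
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clipped matches: an n-gram is credited at most as many
            # times as it appears in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())

    # Without smoothing, any zero precision makes the score zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0

    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output that is shorter than the reference.
    bp = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return bp * math.exp(log_precision)
```

Usage would be `corpus_bleu(model_outputs, reference_translations)` with two equal-length lists of strings. Because the score depends only on surface n-gram overlap, a fluent paraphrase can score low, which is part of why BLEU tends to undervalue LLM output.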