Hi there,
In week 2, there is a lecture on the HELM benchmark. When I visit the HELM leaderboard page, I find the results really difficult to interpret. Which metric(s) speak to the fairness, bias, and toxicity of a model?