Refinements to Beam Search

Hi Sir,

We have some doubts about the lecture notes below; could you please help clarify?

**"Just to wrap up how you run beam search: as you run beam search, you see a lot of sentences with length 1, a lot of sentences with length 2, a lot of sentences with length 3, and so on. Maybe you run beam search for 30 steps, so you consider output sentences of up to length 30. With beam width 3, you would be keeping track of the top three possibilities for each of these possible sentence lengths 1, 2, 3, and so on up to 30, and store them with their scores. You can then take your top sentences and compute this objective function on the sentences that you have seen through the beam search process. Finally, of all the sentences you evaluate this way, you pick the one that achieves the highest value on this normalized log probability objective (sometimes called a normalized log likelihood objective), and that is the final translation you output."**
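As we understand it, the normalized objective being described is the length-normalized log-likelihood from the lecture, with a softening exponent $\alpha$ (typically around 0.7):

$$
\frac{1}{T_y^{\alpha}} \sum_{t=1}^{T_y} \log P\!\left(y^{\langle t \rangle} \mid x, y^{\langle 1 \rangle}, \ldots, y^{\langle t-1 \rangle}\right)
$$

where $T_y$ is the number of words in the candidate translation.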

Our doubt is: after each step we end up with 30,000 possible outputs (10,000 softmax outputs × beam width 3), and we pick the top 3 from among those 30,000. Do we compute the objective function only on the top 3 sentences after picking them, or do we compute the normalized log objective on all 30,000 outputs first and then pick the top 3?
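To make our doubt concrete, here is roughly how we understand one beam step to work. This is just a sketch; `log_prob_fn` is a hypothetical stand-in for the model's next-token distribution, not an actual course function:

```python
import math

def beam_step(beams, log_prob_fn, beam_width=3):
    """One step of beam search.

    beams:       list of (token_ids, summed_log_prob) pairs, at most beam_width of them.
    log_prob_fn: hypothetical model call returning log P(next token | prefix)
                 as a list over the vocabulary (e.g. 10,000 entries).
    """
    candidates = []
    for tokens, score in beams:
        log_probs = log_prob_fn(tokens)
        for token_id, lp in enumerate(log_probs):
            # 3 beams x 10,000 vocabulary entries = 30,000 candidates in total
            candidates.append((tokens + [token_id], score + lp))
    # Keep the top beam_width of the 30,000 by raw summed log probability;
    # every candidate at this step has the same length.
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:beam_width]

# Toy usage with a dummy 4-token vocabulary (uniform next-token distribution):
dummy = lambda tokens: [math.log(0.25)] * 4
print(beam_step([([0], 0.0), ([1], -0.1), ([2], -0.2)], dummy))
```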

Use length normalization to pick the best output AFTER candidate generation (i.e., after running beam search and keeping the top 3 at each of the 30 steps).
Using length normalization during candidate generation won't make a difference, since you'd be comparing sentences of equal length at each step.
So you only need to compute the normalized objective on the candidates you kept (at most 3 per length, i.e. 3 × 30 = 90 sentences), not on all 30,000 per-step expansions.
The purpose of length normalization is to ensure that you don't penalize longer sentences, whose product of probabilities would otherwise be small.
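Here is a minimal sketch of that final rescoring step, assuming each kept candidate carries its summed log probability (the helper name is made up for illustration):

```python
def pick_final_translation(finished, alpha=0.7):
    """Length-normalized rescoring of completed candidates.

    finished: list of (token_ids, summed_log_prob) pairs collected across
              all lengths (up to beam_width per length, e.g. 3 x 30 = 90).
    alpha:    softening exponent from the lecture (alpha=1 is full
              normalization, alpha=0 is none).
    """
    def normalized_score(candidate):
        tokens, log_prob = candidate
        return log_prob / (len(tokens) ** alpha)

    return max(finished, key=normalized_score)

# A longer sentence with a lower raw score can still win after
# normalization, since summed log probabilities shrink with length:
short = ([1, 2, 3], -6.0)            # normalized: -6.0 / 3**0.7 ~ -2.78
long_ = ([1, 2, 3, 4, 5, 6], -9.0)   # normalized: -9.0 / 6**0.7 ~ -2.57
print(pick_final_translation([short, long_]))  # picks the longer sentence
```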
