in this formula if the images are too similar the diffrence is 0 so the value of sigmoid will be small and output 0 not 1 so how this work ?

Here parameters (w and b) come to play. Sigmoid is calculated on the final value, including w, b, f_xi, and f_xj. Not just the difference on both f’s.

Do you mean that b and w will minimize high value coming from the formula and maximize small value from formula?

Yep, this is the idea. If the difference is small, it means both are the same person, so, w and b will try to make the sigmoid to output the value greater than the threshold.