Hi, why didn’t we remove duplicates from the training dataset, given that it overlaps with the validation set (>90%)?