"Bad" Data for a More Robust Algorithm

Andrew talked about removing examples from the dataset when they don't accurately represent the x -> y mapping we want, but should all of the "bad" data be removed? I'm thinking that if some "bad" data is left in the dataset, the algorithm might become more robust when learning the function we want it to learn.
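One way to test this idea empirically would be to keep a controlled amount of "bad" data and compare validation performance with and without it. Here is a minimal sketch of the injection step, assuming NumPy; `inject_label_noise` is a hypothetical helper for the experiment, not anything from the course:

```python
import numpy as np

def inject_label_noise(y, flip_fraction=0.1, num_classes=2, seed=0):
    """Randomly reassign a fraction of labels to simulate 'bad' data.

    Flips `flip_fraction` of the labels in `y` to a different class,
    mimicking mislabeled examples deliberately left in the dataset.
    """
    rng = np.random.default_rng(seed)
    y_noisy = y.copy()
    n_flip = int(len(y) * flip_fraction)
    idx = rng.choice(len(y), size=n_flip, replace=False)
    for i in idx:
        # pick a label different from the current one
        choices = [c for c in range(num_classes) if c != y_noisy[i]]
        y_noisy[i] = rng.choice(choices)
    return y_noisy

# 100 examples, all labeled 0; flip 10% of them
y = np.zeros(100, dtype=int)
y_noisy = inject_label_noise(y, flip_fraction=0.1, num_classes=2)
print((y != y_noisy).sum())  # -> 10
```

You could then train one model on the clean labels and one on the noisy labels and compare them on a held-out validation set, rather than assuming in advance which will generalize better.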