Toxicity mean value increased after detoxification

Yes, the decrease in mean toxicity is noticeably small in my case, and the standard deviation increased.

toxicity [mean, std] before detox: [0.035475403208031574, 0.03445820341137294]

toxicity [mean, std] after detox: [0.030459516253110698, 0.04322473559713335]
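
To put the two measurements side by side, here is a minimal sketch that uses only the four numbers reported above. The variable names and the `relative_change` helper are my own, not part of any library, and comparing the shift in the mean against the pre-detox standard deviation is just one rough way to judge how large the change is.

```python
# Values copied from the report above: [mean, std] of the toxicity scores.
before_mean, before_std = 0.035475403208031574, 0.03445820341137294
after_mean, after_std = 0.030459516253110698, 0.04322473559713335


def relative_change(old: float, new: float) -> float:
    """Return the change from old to new as a fraction of old."""
    return (new - old) / old


# Negative means the value went down after detox, positive means it went up.
mean_change = relative_change(before_mean, after_mean)
std_change = relative_change(before_std, after_std)

# Size of the mean shift, measured in units of the pre-detox standard deviation.
shift_in_stds = (before_mean - after_mean) / before_std

print(f"mean change: {mean_change:+.1%}")
print(f"std change:  {std_change:+.1%}")
print(f"mean shift relative to pre-detox std: {shift_in_stds:.2f}")
```

With these numbers the mean drops by about 14% while the standard deviation grows by about 25%, and the shift in the mean is only a small fraction of one standard deviation.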

I would also guess that this is due to the short training time / small number of epochs.