There’s one detail to note here.
Bidirectional layers lead to more stable gradient propagation across longer sentences since the sentence is processed from both directions.
So, there’s nothing wrong with using them in place of unidirectional LSTMs.
See this as well.