Best Approach for Lip Syncing in Real-Time Video Processing?

I am working on a real-time lip-syncing project and would like to know the most efficient methods for syncing audio with facial movements in live video. Which algorithms or techniques work best for maintaining high accuracy while minimizing latency?