I would like to train a single machine learning (or deep learning) model on multiple datasets whose features only partially overlap.
Each dataset comes from a different source (e.g., different devices, systems, or environments), and while they share a set of common features, each dataset also contains some unique features that are not present in the others.
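To make the setup concrete, here is a minimal sketch in pandas. The two sources and all column names (`temp`, `humidity`, `vibration`, `pressure`, `label`) are hypothetical and chosen only to illustrate the shape of the data.

```python
import pandas as pd

# Hypothetical readings from two sources; every name and value is illustrative.
device_a = pd.DataFrame({
    "temp": [20.1, 21.3],
    "humidity": [0.40, 0.42],
    "vibration": [0.7, 0.9],   # recorded only by source A
    "label": [0, 1],
})
device_b = pd.DataFrame({
    "temp": [19.8, 22.0],
    "humidity": [0.38, 0.45],
    "pressure": [101.2, 100.9],  # recorded only by source B
    "label": [1, 0],
})

# Split the feature columns into shared and source-specific sets.
features_a = set(device_a.columns) - {"label"}
features_b = set(device_b.columns) - {"label"}
shared = sorted(features_a & features_b)
only_a = sorted(features_a - features_b)
only_b = sorted(features_b - features_a)

# Stack both sources on the union of columns and keep track of the origin.
combined = pd.concat(
    [device_a.assign(source="A"), device_b.assign(source="B")],
    ignore_index=True,
    sort=False,
)

print("shared:", shared, "| only A:", only_a, "| only B:", only_b)
print(combined)
```

Stacking on the union of columns leaves `NaN` in every column that a given source does not record, so whether a value is missing depends entirely on which source the row came from.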
Are there any known pitfalls or best practices for training on the combined data when some features are present in only one of the datasets?