Collaborative and content-based filtering hybrid

It seems to me that there are cases where it is easy to define potentially relevant features for the users, but not for the items. Would it make sense, then, to learn a feature vector for each item by optimization over the existing interaction data (as in collaborative filtering), and to compute each user's vector with a neural network applied to the known user features? I imagine the two could be optimized either jointly or separately.
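
To make the idea concrete, here is a minimal sketch of the joint version in plain NumPy. Everything in it is an assumption made for illustration: the names `train_hybrid` and `predict`, the single tanh hidden layer, the squared-error loss on observed ratings, and the hyperparameters are hypothetical choices, not an established method.

```python
import numpy as np

def train_hybrid(user_feats, ratings, mask, dim=4, hidden=8,
                 lr=0.05, epochs=500, seed=0):
    """Jointly fit a user network and free item vectors (illustrative only).

    user_feats: (n_users, n_feats) known user features
    ratings:    (n_users, n_items) rating matrix
    mask:       (n_users, n_items) 1.0 where a rating is observed, else 0.0
    """
    rng = np.random.default_rng(seed)
    n_feats = user_feats.shape[1]
    n_items = ratings.shape[1]
    # User side: a small network maps known features to a latent vector.
    W1 = rng.normal(0.0, 0.1, (n_feats, hidden))
    W2 = rng.normal(0.0, 0.1, (hidden, dim))
    # Item side: one free latent vector per item, learned from ratings alone.
    V = rng.normal(0.0, 0.1, (n_items, dim))
    n_obs = mask.sum()
    losses = []
    for _ in range(epochs):
        H = np.tanh(user_feats @ W1)       # hidden activations
        U = H @ W2                         # latent user vectors
        err = mask * (U @ V.T - ratings)   # errors on observed entries only
        losses.append((err ** 2).sum() / n_obs)
        # Backpropagate the mean squared error through both sides.
        G = 2.0 * err / n_obs
        dV = G.T @ U
        dU = G @ V
        dW2 = H.T @ dU
        dW1 = user_feats.T @ ((dU @ W2.T) * (1.0 - H ** 2))
        # Joint update: user network and item vectors move together.
        W1 -= lr * dW1
        W2 -= lr * dW2
        V -= lr * dV
    return (W1, W2, V), losses

def predict(params, user_feats):
    """Score every item for users described only by their features."""
    W1, W2, V = params
    return (np.tanh(user_feats @ W1) @ W2) @ V.T
```

Because the user vector is computed from features rather than looked up by ID, `predict` can also be called for users who have no ratings yet. The separate variant would first fit `V` with ordinary matrix factorization and then train the network with `V` held fixed.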