Alibaba releases AI model that can read emotions


Alibaba’s Tongyi Lab unveiled R1-Omni, an open vision model capable of inferring emotional states from video and audio inputs. The model, a version of the earlier HumanOmni enhanced with reinforcement learning, achieves state-of-the-art performance on emotion recognition benchmarks. R1-Omni adds another nascent layer of understanding to vision models and is freely available on GitHub and Hugging Face. (GitHub)
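For anyone who wants to poke at it, here is a minimal loading sketch using the standard transformers Auto classes. The repo ID and the need for trust_remote_code are assumptions on my part; check the Hugging Face model card for the actual entry point and the video/audio preprocessing example.

```python
# Hypothetical sketch: loading R1-Omni from Hugging Face.
# The repo ID below is an assumption; verify it on the model card.
import torch
from transformers import AutoModelForCausalLM, AutoProcessor

REPO_ID = "StarJiaxing/R1-Omni-0.5B"  # assumed repo ID

# Omni-modal checkpoints usually ship custom modeling code,
# hence trust_remote_code=True.
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
processor = AutoProcessor.from_pretrained(REPO_ID, trust_remote_code=True)

# Feeding video frames plus an audio track follows the model card's
# own example; the exact processor call is model-specific, so it is
# omitted here rather than guessed at.
```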


What classification of emotions [states of mind meant to elicit immediate action in oneself or others] do they use, though? :face_with_monocle:

The paper on this effort:

A paper on the database:

Each clip is annotated with a compound emotional category and a couple of sentences that describe the subjects’ affective behaviors in the clip. For the compound emotion annotation, each clip is categorized into one or more of the 11 widely-used emotions, i.e., anger, disgust, fear, happiness, neutral, sadness, surprise, contempt, anxiety, helplessness, and disappointment. To ensure high quality of the labels, we filter out the unreliable annotations by an Expectation Maximization (EM) algorithm, and then obtain 11 single-label emotion categories and 32 multi-label emotion categories.
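In code terms, each clip’s compound annotation is just a multi-label target over those 11 base emotions; the 32 multi-label categories would then be the distinct label combinations that survive the EM filtering. A minimal encoding sketch (the clip labels below are invented for illustration):

```python
# Sketch: a compound emotion annotation as a multi-hot vector over the
# 11 emotion categories listed in the quoted passage.
EMOTIONS = [
    "anger", "disgust", "fear", "happiness", "neutral", "sadness",
    "surprise", "contempt", "anxiety", "helplessness", "disappointment",
]
INDEX = {name: i for i, name in enumerate(EMOTIONS)}

def encode(labels: list[str]) -> list[int]:
    """Return a multi-hot vector: 1 for each annotated emotion, 0 elsewhere."""
    vec = [0] * len(EMOTIONS)
    for label in labels:
        vec[INDEX[label]] = 1
    return vec

# A single-label clip and a compound (multi-label) clip:
print(encode(["happiness"]))                           # [0,0,0,1,0,0,0,0,0,0,0]
print(encode(["sadness", "anxiety", "helplessness"]))  # three labels set to 1
```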

Nice.

Also, we can certainly build a system that determines psychopathy (N.B., not a classification in the DSM) through a simple interview. Attention Antony Blinken.
