AChemS 2025

Presentation Details

Poster #362
Predicting Olfactory Mixture Similarity Perception Through a Community Effort

Vahid Satarifard², Yikun Han³, Matej Hladiš⁵, Pedro Illidio⁴, Maxence Lalis⁵, Aharon Ravia⁶, Laura Sisson⁷, Gaia Andreotti⁸, Jake Albrecht⁸, Nicholas Christakis², Sebastien Fiorucci⁵, Ambuj Tewari³, Celine Vens⁴, Joel Mainland⁹, Pablo Meyer¹
¹IBM, Yorktown Heights, NY, United States
²Yale, New Haven, CT, United States
³University of Michigan, Ann Arbor, MI, United States
⁴Leuven University, Leuven, Belgium
⁵Université Côte d'Azur, Nice, France
⁶Cornell Tech, New York, NY, United States
⁷Talent.com, New York, NY, United States
⁸Sage Bionetworks, Seattle, WA, United States
⁹Monell, Philadelphia, PA, United States

A key goal of the sensory sciences is to establish rules that link changes in physical stimulus structure to predictable changes in perception. These rules are better defined in vision and audition than in olfaction, and their absence in olfaction hinders the digitization of this sensory domain. The quest to establish such rules includes predicting the verbal labels applied to odorants and predicting the pairwise perceptual similarity of stimuli. A strong framework for perceptual similarity is widely seen as key to labeling stimuli and, ultimately, to digitization. The DREAM Olfactory Mixtures Prediction Challenge aimed to highlight models that predict the perceptual similarity of molecular mixture pairs, using a dataset curated from multiple studies. This dataset comprises 850 unique mixtures, 235 mono-molecules, and 780 mixture pairs. Teams competed for three months to develop machine-learning models that predict, from chemical and semantic descriptors, how close two molecular mixtures are in odor perceptual space (on a scale from 0 to 1, where 0 indicates total overlap and 1 the greatest distance). Feedback on a leaderboard dataset of 46 mixture-pair comparisons helped teams refine their models and compare performance. Final predictions were made on a hidden test set of 46 comparisons. DREAM organizers evaluated the models using 10,000 bootstrap iterations, with RMSE and Pearson correlation as metrics. The competition ended in a quadruple tie. We built an ensemble model from the top eight teams' predictions, which outperformed all individual models with an RMSE of 0.08 and a Pearson correlation of 0.6. This ensemble model surpassed state-of-the-art models, including the Snitz Angle, Principal Odor Map, Semantic, and Pair models. Rules of this kind may enable smell digitization, with further improvements expected from larger datasets.
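
The evaluation described above (bootstrap resampling of the test comparisons, scored by RMSE and Pearson correlation) and the ensembling step can be illustrated with a minimal Python sketch. It is not the organizers' code. The abstract does not say how the ensemble was built, so a plain average of team predictions is assumed here, and the function names (`bootstrap_scores`, `ensemble_mean`) and the synthetic data are hypothetical.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted distances."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def pearson(y_true, y_pred):
    """Pearson correlation between observed and predicted distances."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def bootstrap_scores(y_true, y_pred, n_iter=10_000, seed=0):
    """Resample the comparisons with replacement and score each resample.

    Returns two arrays of length n_iter: RMSE and Pearson r per resample.
    """
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    rng = np.random.default_rng(seed)
    n = len(y_true)
    idx = rng.integers(0, n, size=(n_iter, n))  # one row of indices per resample
    t, p = y_true[idx], y_pred[idx]
    rmse_samples = np.sqrt(np.mean((t - p) ** 2, axis=1))
    # Row-wise Pearson correlation on centered values.
    tc = t - t.mean(axis=1, keepdims=True)
    pc = p - p.mean(axis=1, keepdims=True)
    r_samples = (tc * pc).sum(axis=1) / np.sqrt(
        (tc ** 2).sum(axis=1) * (pc ** 2).sum(axis=1)
    )
    return rmse_samples, r_samples


def ensemble_mean(predictions):
    """Assumed ensembling rule: unweighted average across team predictions."""
    return np.mean(np.asarray(predictions, float), axis=0)


if __name__ == "__main__":
    # Synthetic stand-in for a 46-comparison test set and 8 team submissions.
    rng = np.random.default_rng(42)
    y_true = rng.uniform(0.0, 1.0, size=46)
    teams = [np.clip(y_true + rng.normal(0.0, 0.15, size=46), 0.0, 1.0)
             for _ in range(8)]
    ens = ensemble_mean(teams)
    rm, r = bootstrap_scores(y_true, ens)
    print("ensemble RMSE (bootstrap mean):", rm.mean())
    print("ensemble Pearson r (bootstrap mean):", r.mean())
```

Under this averaging assumption, the ensemble's RMSE can never exceed the mean of the individual models' RMSEs. That is one reason averaging is a common baseline, although it does not by itself guarantee beating the best single model.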