Alessandro Conti

Multimodal Perception · Tavus

I am a machine learning researcher at Tavus, working on multimodal perception. I hold a PhD from the University of Trento, where my research focused on open-world visual recognition with vision-language models, spanning vocabulary-free classification, large multimodal model evaluation, and domain adaptation. Previously, I interned at Apple.

GitHub · Scholar · LinkedIn

01 About / 自

Education

Nov 2021 – Jul 2025
PhD in Artificial Intelligence, UniTrento

Sep 2019 – Oct 2021
MSc in Computer Science, UniTrento

Sep 2016 – Sep 2019
BSc in Computer Science, UniTrento

Experience

Jan 2026 – Present
Machine Learning Researcher, Tavus

Aug 2025 – Dec 2025
Machine Learning Engineer, Mountain Maps

Jun 2024 – Sep 2024
Research Intern, Apple

02 Recent Papers / 論

CVPR · 2026 Specificity-aware reinforcement learning for fine-grained open-world classification

CVPR Findings · 2026 Large multimodal models as general in-context classifiers

TPAMI · 2026 Vocabulary-free image classification and semantic segmentation

→ All papers