[Event/Seminar] Graduate School of Artificial Intelligence Invited Expert Seminar (Prof. José Hernández-Orallo @ UPV, Thu. 5/8, 16:00)
- College of Software Convergence
- 2025-04-14
Title: Capability Rubrics with LLM Annotations
Speaker: Prof. José Hernández-Orallo @ Universitat Politècnica de València
Time: 16:00 ~ 17:00, May 8th, 2025
Location: Online
- In-person: X
- Online: https://skku-edu.zoom.us/j/82657941165?pwd=MtwBIbSgbv9rV87Ij9YTJ0BYOpuFzJ.1
- ID: 826 5794 1165
- PW: 087053
Language: English speech & English slides
Abstract:
I'll present general scales for AI evaluation that can explain what common AI benchmarks really measure, extract ability profiles of AI systems, and predict their performance on new task instances, both in- and out-of-distribution. These scales are based on natural language rubrics that standard language models use to annotate thousands of instances from 63 textual tasks, with good inter-rater agreement. These demand levels enable high predictive power at the instance level, providing better estimates than black-box baseline predictors based on embeddings or finetuning, especially in out-of-distribution settings (new tasks and new benchmarks).
Bio:
José Hernández-Orallo is Professor at the Universitat Politècnica de València, Spain, and Senior Research Fellow at the Leverhulme Centre for the Future of Intelligence, University of Cambridge, UK. He received a B.Sc. and an M.Sc. in Computer Science from UPV, partly completed at the École Nationale Supérieure de l'Électronique et de ses Applications (France), and a Ph.D. in Logic and Philosophy of Science, with an extraordinary doctoral prize, from the University of Valencia. His academic and research activities have spanned several areas of artificial intelligence, machine learning, data science and intelligence measurement, with a focus on a more insightful analysis of the capabilities, generality, progress, impact and risks of artificial intelligence. He has published five books and more than two hundred journal articles and conference papers on these topics. His research in the area of machine intelligence evaluation has been covered by several popular outlets, such as The Economist, New Scientist and Nature. He continues to explore a more integrated view of the evaluation of natural and artificial intelligence, as advocated in his book "The Measure of All Minds" (Cambridge University Press, 2017, PROSE Award 2018). He is a member of AAAI, CLAIRE and ELLIS, and a EurAI Fellow.