

News: Paper at the BioASQ Workshop at CLEF 2025

by Samy Ateia and Udo Kruschwitz

September 9, 2025, by Melanie A. Kilian

  • Computer Science and Data Science
  • Research
  • Publication

🤖💬 Can an AI catch its own mistakes?

What if language models could critique themselves and refine their answers to complex questions without human help?

🧠 Our latest work, accepted at BioASQ @ CLEF 2025, puts this to the test using cutting-edge LLMs in a high-stakes professional search context.


📄 Authors: Samy Ateia & Udo Kruschwitz

🔬 We explore how current reasoning and non-reasoning Large Language Models (LLMs) such as Gemini-Flash 2.0, o3-mini, o4-mini and DeepSeek-R1 can generate, evaluate, and refine their own outputs to support domain-specific professional search, particularly in biomedical QA. 🧬

🔍 Our study tests an LLM self-feedback mechanism in a Retrieval-Augmented Generation (RAG) pipeline, asking:
Can LLMs effectively critique themselves?
Can this improve performance on expert tasks like those in BioASQ?
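
🧪 For readers who want a concrete picture, here is a minimal sketch of what a generate, critique, refine loop over retrieved snippets could look like. It is an illustration only, not the authors' pipeline: the prompt wording, the function names (`answer_with_self_feedback`, `build_prompt`), the "OK" stop signal and the `toy_llm` stand-in are all assumptions made for this example. In practice, `llm` would be a call to a real model.

```python
from typing import Callable, List

# Any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def build_prompt(task: str, question: str, snippets: List[str], extra: str = "") -> str:
    """Assemble a prompt from a task label, retrieved snippets and the question."""
    context = "\n".join(f"- {s}" for s in snippets)
    return f"TASK: {task}\nContext:\n{context}\nQuestion: {question}\n{extra}"


def answer_with_self_feedback(
    question: str, snippets: List[str], llm: LLM, rounds: int = 1
) -> str:
    """Draft an answer, then let the same model critique and revise it."""
    draft = llm(build_prompt("answer", question, snippets))
    for _ in range(rounds):
        critique = llm(
            build_prompt(
                "critique", question, snippets,
                f"Draft answer: {draft}\nReply OK if the draft needs no changes.",
            )
        )
        if critique.strip().upper() == "OK":
            break  # the model accepts its own draft
        draft = llm(
            build_prompt(
                "revise", question, snippets,
                f"Draft answer: {draft}\nFeedback: {critique}",
            )
        )
    return draft


def toy_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model, so the sketch runs offline."""
    if prompt.startswith("TASK: critique"):
        accepted = "Draft answer: yes, per the retrieved snippet" in prompt
        return "OK" if accepted else "Mention the supporting snippet."
    if prompt.startswith("TASK: revise"):
        return "yes, per the retrieved snippet"
    return "yes"


if __name__ == "__main__":
    print(answer_with_self_feedback(
        "Is gene X associated with disease Y?",  # made-up question
        ["Study A reports an association between X and Y."],  # made-up snippet
        toy_llm,
        rounds=2,
    ))
```

Passing the model in as a plain callable keeps the loop independent of any vendor API, which makes it easy to swap models and to compare answers with and without the feedback rounds (`rounds=0`).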

💡 Findings suggest that performance varies across models and task types.

🧭 Our study informs future research aiming to understand when we should trust self-correcting AI to work on its own and when expert human input still matters most in complex professional search tasks.

🔗 Find the pre-print version of our study here: https://arxiv.org/abs/2508.05366


We are looking forward to insightful discussions at CLEF 2025 and advancing the conversation around transparency, user involvement, and AI-supported expert search. See you in Madrid! 👋🇪🇸


#ProfessionalSearch
#LargeLanguageModels #LLMs #AI #NLP
#RetrievalAugmentedGeneration #RAG
#SelfFeedback #SelfCorrection #QueryExpansion
#PhDResearch #AIResearch #BiomedicalResearch
#BioASQ #CLEF2025
#ResearchSuccess #ResearchPaperAccepted
#InformationScienceRegensburg #StayInformed

Information/Contact

CLEF 2025, the 16th "Conference and Labs of the Evaluation Forum", is hosted by UNED in Madrid, Spain, on September 9-12, 2025. Find more information on CLEF 2025 here: https://clef2025.clef-initiative.eu/index.php

Find more information on the thirteenth BioASQ Workshop here: https://www.bioasq.org/workshop2025

More about Samy Ateia

More about Udo Kruschwitz
