ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Language Model Council: Benchmarking Foundation Models on Highly Subjective Tasks by Consensus

📝 Summary:
Benchmarking LLMs on subjective tasks like emotional intelligence is challenging. The Language Model Council (LMC) uses a democratic process in which a council of 20 LLMs formulates, administers, and evaluates the tests. This yields more robust, less biased rankings that align better with human leaderboards.
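
To make the consensus idea concrete, here is a minimal Python sketch of one way a council of judges could turn pairwise votes into a ranking. It is an illustration only, not the paper's actual procedure: the aggregation rule (majority vote per matchup, then ranking by win rate), the function names `majority_vote` and `council_ranking`, and the toy model names are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def majority_vote(votes):
    """Return the most common vote, or None on an empty list or a tie."""
    counts = Counter(votes).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def council_ranking(judgments):
    """Rank models by win rate over council-decided matchups.

    judgments maps a (model_a, model_b) pair to the list of winners
    named by each judge (hypothetical input format).
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for (a, b), votes in judgments.items():
        winner = majority_vote(votes)
        games[a] += 1
        games[b] += 1
        if winner is not None:  # tied matchups count as a game with no win
            wins[winner] += 1
    win_rate = {m: wins[m] / games[m] for m in games}
    # Highest win rate first; names break ties for a deterministic order.
    return sorted(win_rate, key=lambda m: (-win_rate[m], m))


# Hypothetical toy data: three judges vote on each matchup.
judgments = {
    ("model_x", "model_y"): ["model_x", "model_x", "model_y"],
    ("model_x", "model_z"): ["model_x", "model_z", "model_x"],
    ("model_y", "model_z"): ["model_z", "model_z", "model_y"],
}
ranking = council_ranking(judgments)
print(ranking)
```

Because each matchup is decided by the whole council, a single judge with an unusual preference cannot flip a result on its own, which is the intuition behind the robustness claim in the summary.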

🔹 Publication Date: Jun 12, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2406.08598
• PDF: https://arxiv.org/pdf/2406.08598
• GitHub: https://github.com/llm-council/llm-council

Datasets citing this paper:
https://huggingface.co/datasets/llm-council/emotional_application

Spaces citing this paper:
https://huggingface.co/spaces/llm-council/llm-council
https://huggingface.co/spaces/llm-council/sandbox

==================================

For more data science resources:
https://t.me/DataScienceT

#LLM #Benchmarking #AIEvaluation #FoundationModels #ConsensusAI