Docs

Technical references for the LLM intelligence hub, model catalog, citation policy, benchmark glossary, and weekly refresh process.

Intelligence hub quickstart

How to move between leaderboard rows, model profiles, news, and benchmark guides.

Citation requirements for source name, URL, retrieval date, and metric scope.

How to search by provider, license, context window, price, modality, and evidence.

What GPQA, AIME, SWE-bench, Code Arena, MMMU, Toolathlon, and long-context metrics mean.

How EvalKit separates releases, research, benchmark changes, and resources.

Manual weekly cadence for updating benchmark snapshots.
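
As a rough illustration of the source requirements listed above (source name, URL, retrieval date, and metric scope), a citation record could be checked along these lines. This is only a sketch: the `Citation` class, its field names, and the `missing_fields` helper are hypothetical and not part of EvalKit, and the specific checks (an http(s) URL, a retrieval date that is not in the future) are assumptions rather than documented rules.

```python
from dataclasses import dataclass
from datetime import date
from urllib.parse import urlparse


@dataclass(frozen=True)
class Citation:
    """Hypothetical record with the four fields the requirements name."""
    source_name: str
    url: str
    retrieved_on: date
    metric_scope: str


def missing_fields(c: Citation) -> list[str]:
    """Return the names of fields that are empty or malformed (assumed checks)."""
    problems = []
    if not c.source_name.strip():
        problems.append("source_name")
    parsed = urlparse(c.url)
    # Assumption: only absolute http(s) URLs count as a usable source link.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("url")
    # Assumption: a retrieval date in the future is treated as invalid.
    if c.retrieved_on > date.today():
        problems.append("retrieved_on")
    if not c.metric_scope.strip():
        problems.append("metric_scope")
    return problems


# Placeholder values for illustration only.
example = Citation(
    source_name="Example Leaderboard",
    url="https://example.com/board",
    retrieved_on=date(2024, 1, 15),
    metric_scope="GPQA accuracy",
)
print(missing_fields(example))
```

Returning a list of field names, rather than raising on the first problem, lets a reviewer see every gap in a citation at once.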