Realtime LLM news
Follow model releases, benchmark shifts, and research signals without losing the source.
EvalKit refreshes this feed from free public sources every 15 minutes and falls back to curated citations if a source is temporarily unavailable.
Barret Zoph is out at OpenAI again after just five months
Five months after returning to OpenAI, Barret Zoph - the company's head of enterprise AI sales - has departed, The Verge has learned. Zoph returned to OpenAI in mid-January after a stint as co-founder and CTO of Thinking Machines Lab, the...
2026-06-19 Press today press openai
Source: Elastic agrees to buy CRV-backed DeductiveAI for up to $85M
DeductiveAI, a startup that uses AI to catch and resolve bugs in software, was founded just three years ago.
2026-06-19 Press today press
‘Queer Eye’ life coach Karamo Brown launches Kē, a wellness app featuring his AI digital clone
After spending a year and a half focusing on his own journey — from fitness and nutrition to meditation, sobriety, relationships, and personal growth — Brown wants to help others do the same.
2026-06-18 Press today press
Adobe’s redesigned AI studio remembers what your creations look like
Adobe is introducing some new capabilities for its Firefly AI assistant, alongside a "reimagined" AI studio that lets you edit and generate new designs from a single interface. The new Firefly experience launching today in private beta is...
2026-06-18 Press today press
AI data centers just got a government-mandated fast lane to the grid
FERC told grid operators to give data centers a fast lane for interconnections, but it failed to address electricity supply shortages.
2026-06-18 Press today press
AI inference startup Baseten reportedly raising $1.5B months after its last mega-round
Startup Baseten is reportedly close to finalizing a $1.5 billion round at a $13 billion valuation as the “inference gold rush” marches on.
2026-06-18 Press today press inference
Almost half of US singles feel negatively about AI in dating, Match says
About 47% of singles look negatively at the use of AI in dating -- but many dating app users are open to AI helping with profile punch-ups and conversation starters.
2026-06-18 Press today press
Amazon hopes to challenge Nvidia more directly by selling its AI chips
AWS is in talks to sell its chips to other data centers. CEO Andy Jassy has said this represents a $50 billion opportunity for the company.
2026-06-18 Press today press
Midjourney goes from generating cat images to full-body ultrasound scans
Midjourney CEO David Holz just showed off the company's first hardware product and plans to build a San Francisco spa, which he admitted is a bit different from the "cat pictures" produced by its AI image generator. Dubbed The Midjourney S...
2026-06-18 Press today press
OpenAI is bringing on some big guns in the lead-up to its IPO
OpenAI is bulking up before its IPO, landing Transformer co-inventor Noam Shazeer from Google DeepMind and former Trump AI policy official Dean Ball in the same week.
2026-06-18 Press today press openai
Photoshop and Premiere now have AI assistants
Adobe's plan to stick AI assistants into all of its Creative Cloud suite is now fully underway, with new chatbots now rolling out to its biggest editing and design apps. As part of a public beta launching today, Photoshop, Premiere, Illust...
2026-06-18 Press today press
Snap spins off AI video team into new company, Dotmo, due to costs
The Snapchat maker is spinning off yet another internal unit. Dotmo will be composed of current Snap staff who are leaving the social media company to focus on AI video development.
2026-06-18 Press today press
Who decides when AI is too dangerous?
On today’s episode of Decoder, my guest is Hayden Field, senior AI reporter for The Verge. Often when Hayden comes on the show, it’s because something has gone wrong in the world of AI. Last weekend, that something was a pretty intense mix...
2026-06-18 Press today press
Anthropic got hit by export rules nobody understands
Anthropic has spent much of this week fighting to get its newest AI models back online after the Trump administration abruptly ordered the company to cut access for all foreign nationals, including users inside the US and its own employees...
2026-06-17 Press today press anthropic
Improving health intelligence in ChatGPT
Learn how GPT-5.5 Instant improves ChatGPT’s health and wellness responses with stronger reasoning, better context, clearer communication, and physician-informed evaluations.
2026-06-18 Official release official eval
New usage analytics and updated spend controls for enterprises
OpenAI introduces new spend controls and usage analytics for ChatGPT Enterprise, helping organizations manage costs and scale AI with confidence.
2026-06-18 Official release official gpt
Using AI to help physicians diagnose rare genetic diseases affecting children
Researchers used an OpenAI reasoning model to help diagnose rare diseases, identifying 18 new diagnoses in previously unsolved cases.
2026-06-18 Official release official model
A near-autonomous AI chemist improves a challenging reaction in medicinal chemistry
OpenAI and Molecule.one show how a near-autonomous AI chemist using GPT-5.4 improved a key drug-making reaction, advancing medicinal chemistry research.
2026-06-17 Official release official gpt
Introducing LifeSciBench
Introducing LifeSciBench, an expert-authored, expert-reviewed benchmark for evaluating how AI systems handle real-world life science research tasks and decisions.
2026-06-17 Official release official benchmark
Beyond Global Replanning: Hierarchical Recovery for Cross-Device Agent Systems
Real-world computer-use tasks often span multiple applications and devices, requiring agents to coordinate heterogeneous environments under dynamic runtime failures. Existing multi-device agent systems support task decomposition and cross-...
2026-06-18 Research research agent
CATCH-ME if you RAG: a dataset of Contextually Annotated multi-Turn Counterspeech against Hate and Misinformation Exchanges
Online hate speech and misinformation frequently overlap, yet NLP research has mainly treated them in isolation. While LLMs represent a scalable solution for assisting humans in the generation of counterspeech for both threats, zero-shot m...
2026-06-18 Research research llm
DeepSWIP: Quotient-WMC Counterfactuals for Neural Probabilistic Logic Programs
Neurosymbolic systems such as DeepProbLog combine neural perception with probabilistic logic, but standard inference is associational. Counterfactual reasoning additionally requires a causal semantics for interventions and evidence. We int...
2026-06-18 Research research inference reasoning
Execution-State Capsules: Graph-Bound Execution-State Checkpoint and Restore for Low-Latency, Small-Batch, On-Device Physical-AI Serving
Mainstream LLM serving systems reuse prefix work mainly through paged or radix key-value (KV) caches. This is highly effective for high-throughput, high-concurrency serving, but it manages only one positional fragment of execution state: t...
2026-06-18 Research research llm
How Transparent is DiffusionGemma?
LLM reasoning transparency is a critical affordance for understanding model decisions, mitigating misuse and misalignment, and debugging surprising model behaviors. However, DiffusionGemma performs a larger fraction of its computation in a...
2026-06-18 Research research llm model
LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents
Policy-adherent tool-calling agents in customer-service domains must maintain task states across turns while calling tools and obeying domain policies. Task states consist of relevant facts, identifiers, constraints, and conditions observe...
2026-06-18 Research research agent
Multi-Task Bayesian In-Context Learning
Bayesian predictive inference provides a principled framework for uncertainty quantification, data efficiency, and robust generalization. However, exact inference is often intractable, and scalable approximations may remain computationally...
2026-06-18 Research research inference
Optimal Deterministic Multicalibration and Omniprediction
A model is multicalibrated on a collection of group weights $G$ if it is calibrated -- i.e. unbiased even conditional on its prediction -- not just overall, but also after reweighting contexts by each $g \in G$. It is a useful property for...
2026-06-18 Research research model
PsyScore: A Psychometrically-Aware Framework for Trait-Adaptive Essay Scoring and ZPD-Scaffolded Feedback
Effective Automated Essay Scoring (AES) systems are expected to support both reliable assessment and actionable instructional feedback. However, existing approaches often treat scoring and feedback as separate components: neural scoring models pro...
2026-06-18 Research research model
SARLO-80: Worldwide Slant SAR Language Optic Dataset 80cm
Multimodal foundation models have advanced rapidly thanks to large optical benchmarks, but comparable resources for synthetic aperture radar (SAR) remain limited. Existing SAR-optical datasets largely rely on low-resolution, intensity-onl...
2026-06-18 Research research benchmark foundation
Scalable Training of Spatially Grounded 2D Vision-Language Models for Radiology
We study how to train visually grounded vision-language models (VLMs) for radiology without manual spatial annotations. We introduce RefRad2D, a large-scale bilingual (German/English) dataset of 1.2M CT and MR image-text pairs derived from...
2026-06-18 Research research language model
Sovereign Execution Brokers: Enforcing Certificate-Bound Authority in Agentic Control Planes
Autonomous agents are increasingly connected to cloud, deployment, and data-control workflows, but production mutation authority should not reside inside non-deterministic reasoning processes. Existing access-control mechanisms authorize i...
2026-06-18 Research research agent reasoning
Structuring and Tokenizing Distributed User Interest Context for Generative Recommendation
Generative recommendation is an emerging paradigm that has shown promise in industrial recommendation systems, aiming to predict users' next interactions from their historical behaviors. At the core of generative recommendation lies item t...
2026-06-18 Research research
StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMs
Multimodal large language models (MLLMs) are increasingly deployed in personally and societally consequential settings, yet the visual cues that shape how these models judge people remain poorly understood. Prior work often compares differ...
2026-06-18 Research research language large
Token-Operations-Oriented Inference Optimization Techniques for Large Models
Large model inference optimization serves as a key foundation for supporting the scalable, low-cost, and highly stable operation of large model services. Centered on token-oriented inference optimization technology, this paper proposes for...
2026-06-18 Research research inference model
UNIEGO: Proxies as Mediators for Unified Egocentric Video Representation Learning
Egocentric video understanding is inherently limited by the narrow perspective of wearable cameras: a single viewpoint, a single modality, a single model cannot capture the full richness of human action. We argue that a truly expressive eg...
2026-06-18 Research research model
Your Mouse and Eyes Secretly Leak Your Preference: LLM Alignment using Implicit Feedback from Users
To align a Large Language Model (LLM), most existing methods collect explicit human feedback and train a reward model to predict the human preference based on the response text. These existing methods have two key limitations. First, the u...
2026-06-18 Research research language large