Berlin Coyotiv and OpenServ Labs published a research paper introducing BRAID (Bounded Reasoning for Autonomous ...
B, an open-weight multimodal vision AI model designed to deliver strong math, science, document and UI reasoning with far less training data and compute than much larger systems.
Anthropic’s Claude Opus 4.6 introduces "Adaptive Thinking" and a "Compaction API" to solve context rot in long-running agents. The model supports a 1M token context window with 76% multi-needle ...
Scientists warn that current AI tests reward polite responses rather than real moral reasoning in large language models.
Microsoft's Phi-4-reasoning-vision-15B uses careful data curation and selective reasoning to compete with models trained on ...
The company mainly trained Phi-4-reasoning-vision-15B on open-source data. The data included images and text-based descriptions of the objects depicted in those images. Before it started training the ...
DescrybeLM answered all 200 bar exam questions correctly. ChatGPT, Claude, and Gemini each missed between 13 and 23—and ...
Metilience unveils a hybrid AI reasoning engine for high-stakes exams, leveraging structured cognitive error analysis ...
A prospective feasibility study in an urgent care clinic tested a conversational AI system (AMIE) with 100 real patients to evaluate whether it could safely collect medical histories before doctor ...
Open-sourcing a model allows researchers, developers, and companies to access and use the model’s weights and architecture, ...
The latest Gemini model makes impressive strides in benchmarks, but forthcoming models could give it a reality check.
OpenAI’s next GPT model is coming, and soon, according to a person with knowledge of it. Among the highlights, the new model, ...