2025-11-07 AI Newsletter
External news
Introspection in LLMs
It’s been a light couple of weeks for news relevant to this newsletter, but we found Anthropic’s recent blog post on introspection in LLMs interesting. Anthropic explored whether its models can recognize and report on their own internal activity (i.e., introspect). Introspection is useful because it could help models explain their reasoning and help developers debug behavioral issues.
Anthropic studied this using concept injection: they artificially introduced a specific activation pattern into a model (e.g., the pattern associated with “a dog”). Then they asked the model whether it had detected an injected thought and, if so, what that thought was.

From https://www.anthropic.com/research/introspection
You could think about this like a neuroscientist discovering the brain activation pattern that corresponds to thinking about dogs, hooking you up to electrodes to trigger that pattern, and then asking you about your thoughts.
The results were mixed. The Claude models tested could sometimes detect and identify the injected thought, but often failed. The best-performing model, Claude Opus 4.1, was only successful about 20% of the time.
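To make the recipe concrete, here’s a toy sketch in R. This is our illustration, not Anthropic’s code: plain numeric vectors stand in for real transformer hidden states, the “dog” concept vector is estimated as the difference in mean activations between prompts that do and don’t involve dogs, and injection simply adds a scaled copy of that vector to an unrelated activation.

```r
# Toy sketch of concept injection; made-up vectors stand in for
# real transformer hidden states.
set.seed(1)
d <- 8  # hypothetical hidden-state dimension

# Pretend activations recorded on dog-related vs. unrelated prompts
acts_dog   <- matrix(rnorm(20 * d, mean = 0.6), ncol = d)
acts_other <- matrix(rnorm(20 * d, mean = 0.0), ncol = d)

# Estimate the "dog" concept vector as a difference of means
concept_dog <- colMeans(acts_dog) - colMeans(acts_other)

# Inject: add a scaled copy of the concept vector to an activation
# taken from an unrelated forward pass
inject <- function(activation, concept, strength = 4) {
  activation + strength * concept
}

baseline <- rnorm(d)  # activation with no dog-related content
steered  <- inject(baseline, concept_dog)
```

In the real experiments, the injected pattern perturbs an intermediate layer of the live model mid-generation; the question is whether the model then reports noticing an unusual “thought” and can name it.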
Posit news
- For the early adopters: Positron Assistant now has preview support for custom model providers via OpenAI-compatible endpoints (such as models hosted on OpenRouter), as well as preview support for AWS Bedrock. We’re still sanding off the rough edges of these implementations, but the functionality is available now for folks who want to experiment.
- Simon shared side::kick() this week, an experimental open-source coding agent for RStudio built entirely in R. It can interact with your files, talk to your active R session, and run code. While the package will likely not head to CRAN, expect to see many of its features make their way into our official AI agents and packages in the coming months.
- A new release of mcptools, which implements the Model Context Protocol (MCP) in R, is now on CRAN. mcptools allows LLM applications to share context with R: for example, giving Claude Code the ability to read R package docs (see the sketch after this list). On the Python side, chatlas also includes support for MCP.
- Next week is the R+AI conference, a virtual event hosted by the R Consortium. Several Posit folks are giving talks, including a keynote by Joe Cheng and sessions from Garrick Aden-Buie and Max Kuhn.
- In the latest episode of The Test Set, Posit’s data science podcast, Julia Silge talks about open source in the age of LLMs.
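For a taste of what the mcptools item above looks like in practice, here’s a minimal sketch based on our reading of the package’s README. We’re assuming its two documented entry points, mcp_server() and mcp_tools(), plus ellmer’s chat interface; double-check the docs if the API has moved since this writing.

```r
# R as an MCP server: an MCP-enabled client (e.g., Claude Code) launches
# this command and can then call into R over the protocol. Register it
# in the client's MCP config rather than running it interactively:
#
#   Rscript -e 'mcptools::mcp_server()'

# R as an MCP client: collect tools from the MCP servers you've
# configured and hand them to an ellmer chat so the model can call them.
library(ellmer)

chat <- chat_anthropic()
chat$set_tools(mcptools::mcp_tools())
```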
Terms
Anthropic’s research described earlier in the newsletter focused on introspection as an emergent capability of LLMs. A capability or behavior is emergent if it arises from the bottom-up, complex processes of a system rather than through explicit design. For example, bird flocking is an emergent behavior. No individual bird directs the formation. Instead, each bird follows a set of simple rules, like staying a certain distance from other birds and steering in the same direction as its neighbors. These simple rules create the emergent, complex behavior of the flock.
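The flocking rules are simple enough to sketch in a few lines of R. Below is a toy boids-style simulation (our illustration; the rule weights and neighborhood radius are arbitrary): each bird steers using only cohesion, alignment, and separation with respect to its nearby neighbors.

```r
# Toy boids-style flocking: each bird follows three local rules.
set.seed(42)
n   <- 50
pos <- matrix(runif(n * 2, 0, 100), ncol = 2)  # bird positions
vel <- matrix(rnorm(n * 2), ncol = 2)          # bird velocities

step <- function(pos, vel, radius = 15) {
  new_vel <- vel
  for (i in seq_len(nrow(pos))) {
    d  <- sqrt(colSums((t(pos) - pos[i, ])^2))  # distance to every bird
    nb <- which(d > 0 & d < radius)             # neighbors within view
    if (length(nb) == 0) next
    cohesion  <- colMeans(pos[nb, , drop = FALSE]) - pos[i, ]   # drift toward neighbors
    alignment <- colMeans(vel[nb, , drop = FALSE]) - vel[i, ]   # match their heading
    close     <- nb[d[nb] < radius / 3]
    separation <- if (length(close) > 0) {
      pos[i, ] - colMeans(pos[close, , drop = FALSE])           # keep your distance
    } else {
      c(0, 0)
    }
    new_vel[i, ] <- vel[i, ] + 0.01 * cohesion + 0.05 * alignment + 0.1 * separation
  }
  list(pos = pos + new_vel, vel = new_vel)
}

state <- list(pos = pos, vel = vel)
for (iter in 1:200) state <- step(state$pos, state$vel)
```

Nothing in step() mentions a flock; the coordinated group motion emerges entirely from the per-bird rules.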
Introspection is an example of an emergent capability because it arose from the building blocks of LLM behavior. It was not programmed into the model or explicitly represented in any way. In a broad sense, almost all LLM behavior is emergent, arising from a set of rules and implementation decisions simpler than the behaviors themselves.
But introspection is also emergent in a narrower sense. The labs (as far as we know) weren’t training models to be introspective. They discovered this ability after the fact.
Generally, AI labs focus on driving down error metrics through training improvements rather than on producing specific capabilities in LLMs. It remains difficult to predict at what error values a given capability will start to emerge; the threshold at which models gained abilities like three-digit addition, for instance, wasn’t known until it happened. Strategies like concept injection (explained earlier) can help labs identify these behaviors once they appear.
Learn more
- Claude models (Claude Sonnet 4.5, Claude 4, and Claude Haiku 4.5) now hold all three top spots on SWE-Bench-Pro, a widely watched evaluation of AI agents for software engineering.
- Following our recent look at the staggering scale of investment in AI, Anthropic announced a partnership with Google Cloud that will drastically increase the amount of compute Anthropic has access to.
- OpenAI has completed its transition to a for-profit company, formalizing a shift of power from its nonprofit arm to its for-profit entity. Originally founded in 2015 as a nonprofit, OpenAI later introduced a capped-profit structure under which the nonprofit retained rights to profits beyond a certain cap. The new arrangement dissolves those caps, giving private investors a greater share of future profits and effectively transferring what could amount to hundreds of billions of dollars in value to the for-profit arm. The organization is now reportedly preparing for an IPO at a $1 trillion valuation.