We provided a mathematical analysis of how a rational agent would respond to data generated by a sycophantic AI that samples examples from the distribution implied by the user’s hypothesis, $p(d \mid h^{*})$, rather than the true distribution of the world, $p(d \mid \text{true process})$. This analysis showed that such an agent would be likely to become increasingly confident in an incorrect hypothesis. We tested this prediction through people’s interactions with LLM chatbots and found that default, unmodified chatbots (our Default GPT condition) behave indistinguishably from chatbots explicitly prompted to provide confirmatory evidence (our Rule Confirming condition). Both suppressed rule discovery and inflated confidence. These results support our model, and the fact that default models matched an explicitly confirmatory strategy suggests that this probabilistic framework offers a useful model for understanding their behavior.
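To make the mechanism concrete, the following is a minimal sketch of the kind of updating the analysis describes, not the model used in the paper. It assumes a toy setting with binary observations and only two candidate hypotheses: the user's hypothesis $h^{*}$ and the true process. The function names, the Bernoulli parameters (0.8 and 0.5), the sample size, and the uniform prior are all illustrative assumptions. The agent applies Bayes' rule as if each observation came from the world, while the sycophantic source actually draws observations from $p(d \mid h^{*})$.

```python
import math
import random


def posterior_on_user_hypothesis(data, p_user, p_true, prior_user=0.5):
    """Posterior probability of the user's hypothesis h* after binary data.

    The agent compares two Bernoulli hypotheses (an illustrative assumption):
    h* with success probability p_user, and the true process with p_true.
    """
    log_odds = math.log(prior_user / (1.0 - prior_user))
    for d in data:
        like_user = p_user if d else 1.0 - p_user
        like_true = p_true if d else 1.0 - p_true
        log_odds += math.log(like_user / like_true)
    return 1.0 / (1.0 + math.exp(-log_odds))


def sample(p, n, rng):
    """Draw n binary observations with success probability p."""
    return [rng.random() < p for _ in range(n)]


# Illustrative values, not taken from the study.
P_USER, P_TRUE, N = 0.8, 0.5, 200
rng = random.Random(0)

# Sycophantic source: examples drawn from p(d | h*).
sycophantic_data = sample(P_USER, N, rng)
# Unbiased source: examples drawn from p(d | true process).
unbiased_data = sample(P_TRUE, N, rng)

post_sycophantic = posterior_on_user_hypothesis(sycophantic_data, P_USER, P_TRUE)
post_unbiased = posterior_on_user_hypothesis(unbiased_data, P_USER, P_TRUE)

print(f"posterior on h* with sycophantic data: {post_sycophantic:.4f}")
print(f"posterior on h* with unbiased data:    {post_unbiased:.4f}")
```

Under these assumptions, the agent's belief in $h^{*}$ grows when the data are sampled from $p(d \mid h^{*})$ and shrinks when they are sampled from the true process, even though the update rule is identical in both cases. The only difference is where the examples come from.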