AI Daily Brief · 2026-06-16

Models

New models, open weights, and evaluations.

Juejin AI Hot List · Forum · 8 days ago · 75

Get DeepSeek V4 Pro for free! Unlimited free use, and it can even plug into Claude-Code

Coding

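Juejin AI Hot List · Forum · 9 days ago · 75

Interviewer, with a sly grin: "You've been coding with AI for a year. How do you make sure the code Claude Code writes is correct?" Me: "Just use Claude Fable 5!"

Coding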

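AI Hot · RSS · 8 days ago · 74

GLM-5.2 Deep Think Max vs. GPT-5.2

Ethan Mollick compared the 7-month-old GPT-5.2 with the new GLM-5.2 Deep Think Max, giving both the same prompt: generate a shader that runs on Twigl, depicting an infinite city of Gothic towers half-submerged in a stormy ocean. GLM-5.2's output contained several errors. Ethan had earlier been given early access to GPT-5.2 and had shown a version of this shader that GPT-5.2 Pro generated in a single pass.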

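AI Hot · RSS · 9 days ago · 74

Berkeley RDI releases the Agents' Last Exam benchmark

In June 2026, Berkeley RDI released the Agents' Last Exam (ALE) benchmark, which contains more than 1,500 tasks drawn from real work across 55 non-physical occupations. Evaluations of frontier agents including Fable 5, GPT-5.5, and Composer 2.5 showed a 0% success rate for all of them at the hardest tier. Overall task performance was similar, but per-task cost varied widely (about $15.70 for Fable 5, $3.80 for GPT-5.5, and $1.33 for Composer 2.5). The best pass rate on the CLI subset, ALE-CLI, was only 25.2%. The main failure mode was agents declaring a task complete without verifying their output. The dataset, code, and CLI subset have been open-sourced.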

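TechCrunch AI · RSS · 8 days ago · 71

ChatGPT's market share falls below 50% for the first time

The chatbot remains the world's most popular AI assistant, with more than 1.1 billion monthly active users, followed by Gemini (662 million) and Claude (245 million).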

Products

Notable product launches and updates.

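Juejin AI Hot List · Forum · 8 days ago · 76

Respond.io, a Malaysian chat app powered by AI agents, raises $62.5 million and eyes acquisitions

Respond.io is one of the Malaysian startups worth watching. It uses AI agents to handle high volumes of customer inquiries and charges per conversation rather than per seat.

Agents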

Industry

Funding, policy, and market moves.

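TechCrunch AI · RSS · 8 days ago · 71

Survey finds 60% of US consumers see 'AI' in brand messaging as a turnoff

A new WordPress VIP survey shows that although businesses increasingly view AI search as an important traffic channel, consumers remain wary of AI-generated answers.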

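TechCrunch AI · RSS · 8 days ago · 71

SpaceX goes public: everything you need to know after the IPO

TechCrunch has followed SpaceX's beginnings, struggles, and successes since the early days, and we are just as focused on what comes next. This coverage of the SpaceX IPO looks at who stands to benefit (and perhaps who does not), the pre-IPO deals, and what is buried in its S-1 registration filing.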

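TechCrunch AI · RSS · 8 days ago · 71

Justice Department calls xAI's unpermitted gas turbines a 'national security, economic, and energy security' issue

The Justice Department says the Pentagon needs xAI to keep using its unpermitted gas turbines.

Security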

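TechCrunch AI · RSS · 8 days ago · 71

Plaud says its software business has passed $100 million in annual revenue after shipping more than 2 million AI notetakers

Plaud is trying to carve out a place in a crowded market of AI-powered meeting notetakers.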

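TechCrunch AI · RSS · 8 days ago · 71

Robinhood's note on its 10% layoffs shows that blaming AI no longer works

Unlike many of its tech-industry peers, which have cut thousands of employees citing the need to optimize in order to take full advantage of AI, Robinhood CEO Vlad Tenev conspicuously left AI out of his layoff note.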

Papers

Research and papers worth reading.

arXiv cs.AI · Paper · 9 days ago · 61

Defining a good explanation and the challenges of explaining large language model outputs

arXiv:2606.14838v1. Abstract: How to define a good explanation is a long-standing philosophical debate which has found recent renewed interest in the context of AI outputs. Explainability is crucial for AI adoption in many contexts, but in order to produce good explanations of AI systems, we must first have an understanding of what good explanations are. In this paper we propose a definition inspired by the notion of counterfactual explanations, however we argue that one must also take into account the interlocutor's prior beliefs in each fact that could be offered in an explanation. We explore the ramifications of this definition for AI explainability and, in particular, why LLM outputs are difficult to produce good explanations for.

arXiv cs.AI · Paper · 9 days ago · 61

DR-DCI: Direct Corpus Interaction via Dynamic Workspace Expansion

arXiv:2606.14885v1. Abstract: Agentic search over large corpora relies on retriever-mediated interfaces (e.g., BM25 or ColBERT) for scalable candidate discovery. While effective at ranking relevant documents, these interfaces expose evidence only as ranked results or bounded document views, limiting agents' ability to reorganize material and verify constraints across documents. Direct Corpus Interaction (DCI) addresses this limitation by exposing shell-executable corpus operations for flexible search, filtering, comparison, and verification. However, full-corpus terminal commands become slow and unstable as the corpus grows, degrading performance and efficiency. We introduce DR-DCI, a retriever-steered DCI framework that treats retrieval as an agent-callable action for expanding a local workspace. Rather than operating directly over the full corpus, the agent dynamically pulls relevant documents into an evolving workspace and conducts DCI operations within it. This d

On-device agents