Agents Go Pro — Thursday, May 28, 2026

The best daily AI content from around the web to get you caught up on developments before your first cup of coffee.

2 videos, 33 articles

Executive Summary

# AI & Tech Executive Briefing

The biggest signal today is capital and timeline convergence around autonomous AI. Cognition raised $1B at a $26B valuation to scale Devin, marking autonomous software agents as enterprise-critical rather than experimental. That funding lands as Google DeepMind's Demis Hassabis publicly pegged AGI at just 3 to 4 years out, tightening already-aggressive industry timelines. Reinforcing the commercial maturation thesis, both OpenAI and Anthropic have moved enterprise customers off subsidized pricing onto full API rates, a low-key but decisive shift from land-grab to revenue engine that suggests product-market fit is now real, not aspirational.

Enterprise AI infrastructure saw meaningful moves across the stack. OpenAI launched a Secure MCP Tunnel letting enterprises connect firewalled internal servers to ChatGPT, Codex, and the API without public exposure—removing a key blocker for regulated industries. Google expanded Gemini for Business with shareable Projects and automated agents, narrowing the gap to its Enterprise tier and sharpening competition with Microsoft Copilot. Anthropic extended Claude Voice to 18 new languages with mid-conversation switching, closing a longstanding gap with ChatGPT and Gemini. AWS, meanwhile, is showcasing production playbooks from Mercedes-Benz, Yahoo, and Regeneron—evidence that enterprise AI is graduating from pilots to governed, at-scale deployments.

Specialization and self-improvement emerged as a second clear theme. A specialized React Native model called Apex is outperforming frontier generalists on domain-specific tasks at lower cost, validating the vertical-model thesis. A case study on self-improving tax agents built with Codex shows AI systems now closing their own quality loops in production without engineer intervention. On the research side, Hugging Face's TRL team solved a major RL bottleneck by syncing only delta weights via a hub bucket—cutting per-step transfers from gigabytes to megabytes for trillion-parameter training. Former Google and Apple researchers also launched Trajectory, targeting continuous visual AI feedback loops for robotics and autonomous systems.

Scientific and creative AI both saw category-defining releases. Chan Zuckerberg Biohub released ESM, an open-source "world model of protein biology" that compresses therapeutic antibody binder discovery from months or years into days—a potential inflection point for computational drug design. ElevenLabs launched Music v2, generating full, structurally coherent songs with granular editing and clean licensing, raising the bar for commercially usable AI music. YouTube also overhauled its AI content labeling system, automating disclosures for both creators and viewers as synthetic media volumes climb.

Geopolitically, Nvidia's commitment of $150B annually to Taiwan directly contradicts the Trump administration's push to anchor AI manufacturing in the US, laying bare the gap between industrial policy ambitions and supply chain reality. Combined with Hassabis's AGI timeline and the OpenAI/Anthropic pricing shift, the day's throughline is unmistakable: AI is hardening into critical infrastructure—commercially, scientifically, and geopolitically—faster than policy frameworks can adapt.

More Devins in More Places | Cognition

TLDR AI, The Rundown AI

Why it matters

  • Cognition's $1B raise at a $26B valuation signals that autonomous AI software agents have crossed from experimental to enterprise-critical infrastructure.

Key details

  • Devin's enterprise usage grew 10x in 2026 alone, with run-rate revenue hitting $492M and clients including Citi, Goldman Sachs, Mercedes-Benz, and the U.S. Army and Navy.
  • Cognition's own engineers now have 89% of their committed code written by Devin, making the company a live proof-of-concept for the "self-driving software development" model it's selling.

Bottom line

  • Devin is no longer a demo—it's a scaled revenue engine reshaping how the world's largest organizations build and maintain software.

Improving AI labels for viewers and creators

TLDR AI, The Rundown AI

## YouTube Overhauls AI Content Labels for Clarity and Automation

Why it matters

  • YouTube is making AI disclosures harder to ignore by moving labels to prime real estate—directly below videos or overlaid on Shorts—while adding automatic detection to catch creators who don't self-disclose.

Key details

  • Labels for photorealistic or meaningfully AI-altered content now appear above the description on long-form videos and as an on-screen overlay on Shorts, replacing the buried description placement.
  • Starting May 2026, YouTube's systems will automatically apply AI labels when significant photorealistic AI use is detected, though creators can dispute incorrect flags via YouTube Studio—except for content made with YouTube's own tools (Veo, Dream Screen) or carrying C2PA metadata.

Bottom line

  • AI disclosure labels on YouTube are becoming both more visible and more automatic, but they carry no penalty—they don't affect recommendations or monetization eligibility.

YouTube

AI News & Strategy Daily | Nate B Jones

I Built a Deck With AI, Then Made a Second AI Attack It.

Why it's interesting

  • Most people treat AI as a faster way to produce office files; this video argues that's exactly the wrong mental model — quality collapses unless you build a *system* around AI, not just bolt it onto your existing workflow.
  • The "hostile reviewer" technique — using Claude Opus to aggressively audit what Codex builds, in a repeatable loop — is a concrete, immediately usable counter to AI-generated documents that *look* right but contain silent errors.

Key concepts

  • Four-stage document workflow: Source prep → File specification/structure → Constrained artifact creation → Hostile verification review — replacing the naive "prompt → output" approach.
  • Task risk gradient: AI is lowest-risk for formatting and layout, medium-risk for source attribution, and highest-risk for numerical synthesis, financial calculations, and claims traveling to senior leadership — each tier requires a different review burden.
  • File specification as blueprint: Before any slide or formula is created, AI should produce a narrative spine (for decks) or tab architecture with calculation flow (for workbooks) — if the blueprint doesn't show where the truth lives, the finished file won't either.
  • Ralph loop: An autonomous edit cycle where Codex builds, Opus enumerates problems without fixing them, Codex patches, and Opus re-checks — repeated until the output reaches A-level quality.
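The build, enumerate, patch, re-check cycle described above can be sketched as a small control loop. This is a minimal illustration, not the video's actual setup: `ralph_loop`, `toy_build`, `toy_review`, and `toy_patch` are hypothetical names, and the toy functions stand in for calls to a builder model and a reviewer model.

```python
from typing import Callable, List, Tuple


def ralph_loop(
    build: Callable[[], str],
    review: Callable[[str], List[str]],
    patch: Callable[[str, List[str]], str],
    max_rounds: int = 5,
) -> Tuple[str, List[str], int]:
    """Build once, then alternate patch and review until no problems remain."""
    draft = build()
    problems = review(draft)  # reviewer only lists issues, it never edits
    rounds = 0
    while problems and rounds < max_rounds:
        draft = patch(draft, problems)  # builder applies fixes
        problems = review(draft)        # reviewer re-checks the new draft
        rounds += 1
    return draft, problems, rounds


# Toy stand-ins (hypothetical) so the loop can run without any model calls.
def toy_build() -> str:
    return "Revenue grew 40% [no source]. Margin is TBD."


def toy_review(draft: str) -> List[str]:
    problems = []
    if "[no source]" in draft:
        problems.append("unsupported claim")
    if "TBD" in draft:
        problems.append("untraceable number")
    return problems


def toy_patch(draft: str, problems: List[str]) -> str:
    # Fix only the first listed problem per round, to show the loop iterating.
    if problems[0] == "unsupported claim":
        return draft.replace("[no source]", "[source: Q1 ledger]")
    return draft.replace("TBD", "22% [source: Q1 ledger]")


final_draft, open_problems, rounds_used = ralph_loop(toy_build, toy_review, toy_patch)
print(final_draft, open_problems, rounds_used)
```

Keeping the reviewer's output as a plain list of problems mirrors the separation the summary describes: one party enumerates, the other fixes. The `max_rounds` cap is an added safeguard so a reviewer that never approves cannot loop forever.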

Main takeaways

  • Never ask AI to jump from messy source folders to a finished file — first ask it to inventory and index what it can see, flagging data status (current, superseded, estimated, raw) and conflicts.
  • The hostile reviewer prompt works because it flips the model's task from *generation* to *enumeration*: "Don't fix anything, just list every unsupported claim, untraceable number, and inconsistent formula."
  • Splitting PowerPoint creation into a storyboard pass (argument + evidence trail, no visuals) and a render pass prevents visual polish from hiding a weak underlying argument.
  • For Excel, the single reliability test is: *if I change one assumption, does the relevant output change for the right reason?* — a model that can't pass that test isn't a financial model, it's a costume.
  • Deep knowledge work can't be reduced to a push-button workflow because it's profoundly domain-specific — the human must own the truth layer even as AI handles the construction.

Bottom line

  • A prompt asks for an output; a workflow defines the stages that output must survive before it can be trusted — until you operate in workflow mode rather than prompt mode, AI-generated office documents will keep failing silently in consequential moments.

Every

We Automated Everything With AI and Tripled Our Headcount

Why it's interesting

  • Every, an AI-native media company, has grown from 4 to 30 people *while* aggressively automating with agents — directly contradicting the dominant narrative that AI shrinks headcount.
  • The host challenges the "AI kills jobs" doomer framing not with theory but with live operational evidence from inside a company where agents outnumber humans in Slack.

Key concepts

  • "AI makes yesterday's expert competence cheap" — models are trained on existing outputs, so they flood the zone with work that looks expert-level but is generically correct rather than situationally right, creating *more* demand for actual experts to fix and elevate it.
  • "The further an agent gets from a human, the less valuable it is" — close human-agent collaboration consistently outperforms fully autonomous pipelines; agents still need humans to define what matters.
  • The Achilles framing — AI sprints ahead on any articulated task, but always stops and looks back asking "what next?"; the inability to self-direct (true agency) is the structural gap that preserves human relevance.
  • Benchmark saturation problem — exponential benchmark improvement is real but misleading; every time a benchmark is saturated, a broader frame resets the model to near-zero, so progress ≠ human-equivalent capability.

Main takeaways

  • Automation creates a glut of "close but not quite right" work, which *increases* demand for experts who can build systems to quality-control and elevate that output.
  • Companies announcing layoffs alongside AI adoption often have pre-existing structural problems (bloat, bad strategy) and are using AI as cover — treat those announcements with skepticism.
  • Customer resistance to AI (e.g., call center callers demanding humans) is a genuine adoption brake; technology availability and technology adoption are very different timelines.
  • The right personal response to AI disruption is simple: ride the models — learn each new generation of tools as they arrive and apply them to your own work.
  • Employment contracts and compensation models may need rethinking once workers' expertise becomes training data; the value of human contribution depreciates fast once it's captured.

Bottom line

  • AI doesn't eliminate the need for humans — it eliminates the need for humans to do *articulated, repeatable tasks*, while expanding demand for the judgment, direction, and situational awareness that can't yet be fully specified.

No new videos: Greg Isenberg, Lenny's Podcast, Y Combinator, The Boring Marketer

Newsletter Articles

Introducing Music v2

via TLDR AI

Why it matters

  • ElevenLabs is pushing AI music generation from short clips to full, structurally coherent songs with granular editing control, raising the bar for what's commercially viable without licensing headaches.

Key details

  • Music v2 supports genre-switching within a single track, fast rap delivery, embedded sound effects, section-by-section composition, and improved inpainting for targeted edits.
  • Pricing drops up to 50% for API customers and up to 40% for ElevenCreative self-serve users, with all output cleared for commercial use on licensed training data.

Bottom line

  • Music v2 is a practical upgrade for developers and brands who need controllable, licensable AI music at scale, not just a demo-friendly novelty.

Biohub releases a world model of protein biology

via TLDR AI

Why it matters

  • Biohub's open-source protein AI toolkit can compress drug binder discovery from years of lab screening into days of computation.

Key details

  • ESMFold2 outperforms AlphaFold 3 on antibody-antigen prediction and achieved experimental hit rates of 36–88% designing binders against five cancer/immunology targets including PD-L1 and EGFR.
  • ESM Atlas indexes 6.8 billion protein sequences and 1.1 billion predicted structures, making vast amounts of unannotated biology searchable for the first time.

Bottom line

  • By releasing ESMC, ESMFold2, and ESM Atlas freely, Biohub hands every researcher a state-of-the-art protein design engine that turns computational predictions into lab-validated therapeutics.

Shipping a Trillion Parameters With a Hub Bucket: Delta Weight Sync in TRL

via TLDR AI

Why it matters

  • Shipping full model weights every RL training step is a massive bottleneck—this fix cuts per-step transfer from gigabytes to megabytes by sending only what actually changed.

Key details

  • Because bf16 arithmetic absorbs tiny Adam updates at RL learning rates, over 99% of weights are bit-identical between consecutive steps, shrinking a 1.2 GB Qwen3-0.6B payload down to 20–35 MB.
  • The implementation routes sparse weight diffs through a Hugging Face Bucket (backed by content-deduplicating Xet storage), letting the trainer and vLLM inference server sync weights across different clouds or regions with no shared network fabric.
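To make the "bit-identical weights" point concrete, here is a minimal sketch using only the standard library. It is not TRL's implementation: `to_bf16_bits`, `sparse_delta`, and `apply_delta` are hypothetical helper names, the four weights are toy values, and bf16 is emulated by rounding float32 bit patterns to their upper 16 bits.

```python
import struct
from typing import Dict, List


def to_bf16_bits(x: float) -> int:
    """Round a value to bfloat16 (round-to-nearest-even) and return its 16-bit pattern."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def sparse_delta(old: List[int], new: List[int]) -> Dict[int, int]:
    """Keep only the positions whose bit pattern actually changed."""
    return {i: n for i, (o, n) in enumerate(zip(old, new)) if o != n}


def apply_delta(weights: List[int], delta: Dict[int, int]) -> List[int]:
    """Receiver side: patch the previous weights with the sparse delta."""
    patched = list(weights)
    for index, bits in delta.items():
        patched[index] = bits
    return patched


# Toy example: three tiny updates are absorbed by bf16 rounding, one large update is not.
old_weights = [1.0, -0.5, 0.25, 2.0]
updates = [1e-5, 1e-5, -1e-5, 0.5]

old_bits = [to_bf16_bits(w) for w in old_weights]
new_bits = [to_bf16_bits(w + u) for w, u in zip(old_weights, updates)]

delta = sparse_delta(old_bits, new_bits)
print(f"{len(delta)} of {len(old_bits)} weights changed")
```

Near 1.0, adjacent bf16 values are about 0.0078 apart, so an update of 1e-5 rounds back to the same bit pattern. Only the changed positions need to travel to the inference server, which then rebuilds the full weights with `apply_delta`.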

Bottom line

  • A merged TRL pull request now makes frontier-scale async RL weight sync cheap and geographically flexible via `pip install`, requiring only two Hub API calls and ~30 lines of vLLM extension code.

Building self-improving tax agents with Codex

via TLDR AI

Why it matters

  • AI agents can now close their own improvement loop in production without waiting for engineers to manually find and fix failures.

Key details

  • Tax AI processed 7,000 returns this season, and the share of returns reaching 75% accuracy rose from 25% to 86% within six weeks of deployment.
  • The system uses Codex to automatically inspect production traces, convert recurring practitioner corrections into structured evals, and propose targeted code fixes for review.

Bottom line

  • The real breakthrough isn't the tax automation itself—it's the reusable architecture where practitioner feedback, production traces, and Codex-driven iteration compound into continuous, measurable self-improvement.

I think Anthropic and OpenAI have found product-market fit

via TLDR AI

Why it matters

  • Both OpenAI and Anthropic have quietly shifted enterprise customers to full API pricing, turning AI from a subsidized perk into a serious revenue engine.

Key details

  • A power user burning ~$2,180/month in API tokens pays just $200 via consumer plans, but enterprises now pay full API rates—driving Anthropic toward its first profitable quarter at a rumored $10.9B in Q2 revenue.
  • Anthropic signed a $1.25B/month compute deal with SpaceX's Colossus for inference capacity, signaling that demand for coding agents like Claude Code has grown large enough to require massive new infrastructure.
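The gap between consumer-plan and API pricing quoted above works out as follows. This is just the arithmetic on the two figures from the summary; the variable names are illustrative.

```python
# Figures quoted in the summary above (approximate, per month).
api_cost_per_month = 2180.0   # token usage priced at API rates
plan_price_per_month = 200.0  # what the same user pays on a consumer plan

monthly_gap = api_cost_per_month - plan_price_per_month
cost_multiple = api_cost_per_month / plan_price_per_month

print(f"Unbilled usage: ${monthly_gap:,.0f}/month")
print(f"API cost is {cost_multiple:.1f}x the plan price")
```

On these numbers, roughly $1,980 of monthly usage goes unbilled on a consumer plan, which is the revenue that full API pricing recovers from enterprise customers.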

Bottom line

  • Coding agents finally cracked the monetization problem: high-token, high-value professional use cases at enterprise API pricing are generating real money where consumer subscriptions never could.

Secure MCP Tunnel | OpenAI API

via TLDR AI

Why it matters

  • OpenAI now lets enterprises connect private, firewalled MCP servers to ChatGPT, Codex, and its API without exposing those servers to the public internet.

Key details

  • The `tunnel-client` tool runs inside your network, makes outbound-only HTTPS calls to `api.openai.com:443`, and long-polls for queued MCP work—no inbound firewall ports required.
  • Deployment is flexible (Kubernetes sidecar, dedicated VM, systemd service) and supports enterprise requirements like outbound proxies, custom CA bundles, and mutual TLS.
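The outbound-only long-polling pattern described above can be illustrated with an in-memory stand-in. This sketch does not use or reproduce the real `tunnel-client` or any OpenAI endpoint: `FakeRelay` and `run_tunnel_client` are hypothetical names, and a `queue.Queue` plays the role of the remote service holding queued work.

```python
import queue
from typing import Callable, Dict, Optional, Tuple


class FakeRelay:
    """In-memory stand-in for a remote relay (hypothetical, not a real service)."""

    def __init__(self) -> None:
        self._pending: "queue.Queue[Tuple[str, str]]" = queue.Queue()
        self.results: Dict[str, str] = {}

    def enqueue(self, job_id: str, payload: str) -> None:
        self._pending.put((job_id, payload))

    def long_poll(self, timeout_s: float) -> Optional[Tuple[str, str]]:
        """Block up to timeout_s waiting for work; return None if nothing arrives."""
        try:
            return self._pending.get(timeout=timeout_s)
        except queue.Empty:
            return None

    def post_result(self, job_id: str, result: str) -> None:
        self.results[job_id] = result


def run_tunnel_client(
    relay: FakeRelay,
    handle: Callable[[str], str],
    max_polls: int,
    timeout_s: float = 0.05,
) -> int:
    """Client inside the private network: every interaction is a call it initiates."""
    handled = 0
    for _ in range(max_polls):
        job = relay.long_poll(timeout_s)
        if job is None:
            continue  # poll timed out with no work; ask again
        job_id, payload = job
        relay.post_result(job_id, handle(payload))
        handled += 1
    return handled


relay = FakeRelay()
relay.enqueue("job-1", "list_tools")
relay.enqueue("job-2", "ping")
handled_count = run_tunnel_client(relay, handle=str.upper, max_polls=3)
print(handled_count, relay.results)
```

Because the client both fetches work and returns results through calls it starts itself, nothing ever has to connect inward, which is why no inbound firewall port is needed in this pattern.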

Bottom line

  • If your organization has private MCP servers behind a firewall, Secure MCP Tunnel is the official path to safely connect them to OpenAI products without any public ingress exposure.

Introducing Apex: A Fast, Specialized Model for React Native

via TLDR AI

Why it matters

  • Specialized AI models are proving cheaper and faster than frontier generalists for domain-specific work, and Apex is a concrete example targeting React Native's notoriously complex cross-platform ecosystem.

Key details

  • Built on Gemma 4 with SFT and GRPO training on curated React Native repos, Apex delivers 2,000–4,000+ tokens/second on dedicated NVIDIA RTX PRO 6000 Blackwell GPUs.
  • Callstack prepared ~50 model variants since February and is now running a private beta, with a full public release pending legal and commercial groundwork.

Bottom line

  • If your team ships React Native at scale, Apex promises meaningfully lower inference costs and faster, more accurate answers than general-purpose models—worth applying for the beta to validate that claim against your real workload.

LocateAnything

via TLDR AI

Why it matters

  • Current vision-language models waste time decoding bounding boxes token-by-token; fixing this unlocks faster, more accurate object localization for robotics, GUIs, and document AI.

Key details

  • Parallel Box Decoding predicts all four box coordinates in one step, hitting 12.7 boxes-per-second—over 10× faster than Qwen3-VL's 1.1 BPS—while improving LVIS F1 by +3.8% over the prior best.
  • The accompanying 138M-sample dataset spans general detection, GUI grounding, OCR, and layout tasks across 12M unique images, giving the model broad localization coverage.

Bottom line

  • LocateAnything proves you can simultaneously beat the speed *and* accuracy of autoregressive coordinate decoding by treating bounding boxes as atomic parallel units rather than sequential token strings.

Thread by @llama_index on Thread Reader App

via TLDR AI

The scraped text contains no substantive content from the @llama_index thread; only Thread Reader App's donation/membership page was captured.

  • Cannot summarize: The actual thread content failed to load; only the site's paywall and support messaging are present in the scraped text.

Recommendation

  • Visit the original thread directly on X (formerly Twitter) by searching @llama_index to access the full content.

Nvidia bets $150B on Taiwan as Trump's plan to make US an AI hub backfires

via TLDR AI

Why it matters

  • Nvidia's $150B/year Taiwan bet directly undercuts Trump's push to make the US the global AI hub, exposing a fundamental tension between corporate supply chain reality and US industrial policy.

Key details

  • Nvidia is building a new Taiwan HQ operational by 2030, deepening ties with TSMC, Foxconn, and others for advanced chip packaging not yet available at US facilities.
  • Trump's plan to take a 25% cut of Nvidia chips sold to China backfired completely—China refused to buy the chips over fears the US would tamper with them.

Bottom line

  • Despite public commitments to US investment, Nvidia's leadership clearly believes Taiwan's manufacturing ecosystem is irreplaceable for meeting surging AI demand, and is acting accordingly.

Epicure: Navigating the Emergent Geometry of Food Ingredient Embeddings

via TLDR AI

Why it matters

  • Structured ingredient embeddings that blend culinary co-occurrence with flavor chemistry could power smarter recipe recommendation, substitution, and food-pairing tools.

Key details

  • The models are trained on 4.14M recipes across seven languages, normalized to 1,790 canonical ingredients using an LLM-assisted pipeline.
  • Three Metapath2Vec variants (Cooc, Chem, Core) let users dial between pure recipe-context signals and pure flavor-chemistry signals, or blend both.
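One simple way to picture "dialing" between a recipe-context signal and a flavor-chemistry signal is a weighted mix of two cosine similarities. This is an assumption made for illustration only, not how Epicure combines its variants: `blended_similarity`, the `alpha` parameter, and the 2-dimensional toy vectors are all hypothetical.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def blended_similarity(
    cooc_a: Sequence[float],
    cooc_b: Sequence[float],
    chem_a: Sequence[float],
    chem_b: Sequence[float],
    alpha: float,
) -> float:
    """alpha=1.0 uses only co-occurrence similarity, alpha=0.0 only chemistry."""
    return alpha * cosine(cooc_a, cooc_b) + (1.0 - alpha) * cosine(chem_a, chem_b)


# Toy vectors (made up): the pair co-occurs often but shares no flavor compounds.
tomato_cooc, basil_cooc = [1.0, 0.0], [1.0, 0.0]
tomato_chem, basil_chem = [1.0, 0.0], [0.0, 1.0]

for alpha in (1.0, 0.5, 0.0):
    score = blended_similarity(tomato_cooc, basil_cooc, tomato_chem, basil_chem, alpha)
    print(alpha, score)
```

With these toy vectors the score moves from 1.0 (pure co-occurrence) through 0.5 to 0.0 (pure chemistry), which is the kind of dial the summary describes.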

Bottom line

  • Epicure is a publicly grounded, multilingual ingredient embedding toolkit that systematically encodes both how ingredients are cooked together and how they share chemical compounds.

Former Google and Apple researchers launch Trajectory to enhance AI feedback loops

via TLDR AI

Why it matters

  • Visual AI reasoning remains a critical unsolved problem, and Trajectory's continuous feedback loop approach could unlock real-world applications in robotics, autonomous vehicles, and manufacturing.

Key details

  • The founding team includes Andrew Dai (14+ years at Google DeepMind, led Gemini pre-training) and Apple's former chief research scientist Yinfei Yang, giving the startup rare top-tier pedigree.
  • Trajectory is targeting ~$50M in seed funding to build multimodal AI systems that learn continuously from visual data rather than relying on infrequent, large-scale training runs.

Bottom line

  • Rather than chasing another general-purpose chatbot, Trajectory is placing a focused bet on closing the gap between AI's strong text performance and its still-primitive visual reasoning capabilities.

FINDING HIGH-SEVERITY SECURITY ISSUES WITH PUBLICLY AVAILABLE MODELS

via TLDR AI

The article content failed to load due to a privacy extension or access issue on X (formerly Twitter), so the details of this post could not be summarized.

Why it matters

  • Cannot be determined: the article body did not load successfully.

Key details

  • The source is a post by @RampLabs on X, titled about finding high-severity security issues using publicly available models.
  • No specific facts, numbers, or technical details are accessible from the loaded content.

Bottom line

  • To read this content, disable privacy extensions and visit the URL directly, or search for RampLabs' post on AI-assisted security vulnerability discovery.

Google DeepMind’s Hassabis: AGI is 3 to 4 years away

via TLDR AI

Why it matters

  • AGI timelines are compressing fast, and the world's leading AI researchers are now betting it arrives within this decade.

Key details

  • Hassabis shortened his AGI forecast from 2030–2035 to 2029–2030, citing the rapid acceleration of AI agents as the key driver.
  • Other tech leaders are split: Sutskever puts AGI between 2030 and 2045, while Jensen Huang controversially claims it has already arrived.

Bottom line

  • Hassabis now targets roughly 2029–2030, even as other leaders give wider ranges, which would leave society about 3–4 years to prepare for what he calls "the foothills of the singularity."

Google expands Gemini for Business with shareable Projects

via TLDR AI

Why it matters

  • Google is closing the gap between its Business and Enterprise Gemini tiers, giving paying teams shared workspaces and automated agents that directly challenge Microsoft Copilot, Anthropic, and OpenAI.

Key details

  • Projects on Gemini for Business act as multi-user container workspaces with dedicated folders, uploaded files, custom system instructions, and shared chat access for collaborators.
  • A new workflow agent builder lets Business users configure scheduled, automated tasks that connect Gmail, Drive, Calendar, and third-party tools—a capability previously limited to Enterprise.

Bottom line

  • Google's core competitive bet is that project-level memory paired with scheduled multi-step agents can make Gemini the default always-on assistant for entire teams, not just individual users.

Anthropic to expand Claude Voice Mode to more languages

via TLDR AI

Why it matters

  • Claude Voice has lagged behind ChatGPT and Gemini on multilingual support; this update closes that gap with 18 new languages and mid-conversation language switching.

Key details

  • The upcoming update adds German, Portuguese, Chinese, Japanese, Russian, Ukrainian, and more, with 1–2 voices per language versus English's five existing personas.
  • Language switching will work on-the-fly via voice command, a new capability built on top of existing text-to-speech providers like ElevenLabs rather than a native audio model.

Bottom line

  • Anthropic is patching one of Claude's most visible competitive weaknesses in voice, though no launch date is confirmed and the language list may still change.

ESM: A World Model of Protein Biology

via The Rundown AI

Why it matters

  • Computational protein design can now produce therapeutic-grade antibody binders in days rather than the months-to-years required by traditional lab methods.

Key details

  • ESMFold2 correctly predicts 55% of antibody-antigen complexes (outperforming AlphaFold 3) and designed validated nanomolar-affinity binders against five cancer/immunology targets including PD-L1 and EGFR.
  • The underlying language model ESMC was trained on 2.8 billion protein sequences, and scaling compute during design directly improved scFv binder success rates from 12% to 21%.

Bottom line

  • Biohub's ESM system has crossed a practical threshold where purely computational protein design produces lab-validated, therapeutically relevant binders—compressing years of early drug discovery work into days.

Amazon Web Services (AWS) - Cloud Computing Services

via The Rundown AI

Why it matters

  • Enterprise AI is moving beyond pilots — leaders from Mercedes-Benz, Yahoo, and Regeneron are sharing real playbooks for scaling AI to production with governance built in.

Key details

  • The 90-minute on-demand panel covers four concrete focus areas: building data foundations, aligning AI to business decisions, continuous learning systems, and database selection for AI apps.
  • AWS Marketplace is positioned as the procurement layer, streamlining how enterprises discover and deploy AI/data solutions, with Amazon Bedrock highlighted for secure generative AI adoption.

Bottom line

  • If your organization is stuck in AI pilot purgatory, this panel offers practitioner-level strategies — not vendor pitches — from companies already operating AI at enterprise scale.

Economic Futures in the Age of AI

via The Rundown AI

Why it matters

  • AI-driven economic disruption is accelerating faster than existing institutions can measure or respond to, creating an urgent window to build safety nets before they're desperately needed.

Key details

  • The OpenAI Foundation is deploying $250M toward three areas: measuring AI's economic impact, supporting displaced workers, and designing long-term wealth-sharing mechanisms like sovereign funds and capital taxation.
  • The initiative explicitly targets scenarios where wage income shrinks dramatically, proposing adaptive fiscal tools—such as taxing capital over labor and Alaska Permanent Fund-style dividends—to redistribute AI-generated gains broadly.

Bottom line

  • This is the largest AI-focused economic resilience investment announced to date, signaling that even AI developers believe proactive policy architecture is necessary to prevent extreme wealth concentration from AI productivity gains.

Tely Health — Get booked patients from AI search

via The Rundown AI

Why it matters

  • Healthcare practices are losing patients to competitors already appearing in AI-generated search results from ChatGPT, Perplexity, and Google.

Key details

  • Tely Health automates the full patient acquisition funnel — AI search visibility, 24/7 chat/voice booking, insurance verification, SMS follow-up, retargeting, and direct EHR integration (Epic, Athena, Cerner, and others).
  • A Miami cardiology client reportedly reached 30,000+ patients/month and generated an estimated $3.68M in new-patient revenue with 1,050 new patients monthly.

Bottom line

  • Tely Health is positioning itself as an end-to-end AI patient acquisition layer for U.S. healthcare practices, replacing ad agencies and front-desk intake with fully automated, HIPAA-compliant infrastructure.

Former Google and Apple Researchers Launch a Startup to Build AI’s Missing Feedback Loop

via The Rundown AI

Why it matters

  • Most AI models stop learning after training, and Trajectory is building infrastructure to fix that gap for any company, not just elite labs.

Key details

  • The startup raised a $15M seed round at a $115M valuation, backed by Conviction, Bessemer, Jeff Dean, and Fei-Fei Li.
  • Its platform post-trains open-source models on real user failure data—currently weekly—with early customers including Clay, Harvey, and Decagon.

Bottom line

  • Trajectory is betting that continuous, data-driven model improvement—already quietly powering AI coding tools like Cursor—can be productized and sold to every company that deploys AI.

launched

via The Rundown AI

  • The article content could not be loaded due to a technical error or privacy extension interference on X (formerly Twitter).

Why it matters

  • Unable to determine significance without accessible article content.

Key details

  • No factual details, numbers, or developments could be retrieved from the source.
  • The URL points to an X post by user @rronak_ that remains inaccessible.

Bottom line

  • The content cannot be summarized until the source is accessible; disabling privacy extensions or trying a different browser may resolve the issue.

Sesame

via The Rundown AI

Why it matters

  • Sesame is positioning ambient AI as a wearable, always-on experience rather than a screen-based one.

Key details

  • The company is building a suite of "personal agents" designed for casual, exploratory use in everyday moments.
  • Sesame eyewear with high-quality audio and hands-free AI access is slated for a 2027 launch.

Bottom line

  • Sesame is betting that ambient, voice-first AI embedded in glasses will be the next natural interface for personal computing.

Harvey – Professional Class AI

via The Rundown AI

Why it matters

  • AI purpose-built for legal work is gaining serious enterprise traction, signaling the legal industry's shift from AI skepticism to adoption.

Key details

  • Harvey is used by 60+ AmLaw 100 firms and 1,500+ law firms/in-house teams, with 142,000+ professionals saving 20+ hours per month.
  • The platform offers enterprise-grade security including SAML SSO, audit logs, IP allow-listing, and data lifecycle management across 60+ countries.

Bottom line

  • Harvey has become the dominant AI infrastructure layer for elite legal teams, with adoption at the top of the market already largely locked in.

announced

via The Rundown AI

The content failed to load: the page returned an error message rather than the article text, likely due to X's privacy/access restrictions, so no summary is available.

Why it matters

  • Without readable content, there is no verifiable information to report on.

Key details

  • The URL points to a post by @thsottiaux on X, but no post text was retrieved.
  • The error suggests a privacy extension or access block prevented content from loading.

Bottom line

  • The post cannot be summarized until the source is accessible; opening the URL directly in a browser with privacy extensions disabled may resolve the issue.

debuted

via The Rundown AI

The source content could not be loaded: the page returned an error message rather than the post text, likely because of X's privacy or access restrictions.

Why it matters

  • No content was retrieved, so no meaningful analysis can be made.

Key details

  • The URL points to a Google Gemma post on X, but the body text contains only an error prompt.
  • The only context available is the label "debuted," suggesting a product or feature launch was announced.

Bottom line

  • Without accessible content, any summary would be speculation rather than fact-based reporting.

rolled out

via The Rundown AI

The source content could not be loaded: the page returned an error message rather than the post text, likely because of X's access restrictions or privacy-related blocking.

  • No factual content is available to summarize from this source.


Robinhood is Now Open to Agents

via The Rundown AI

Why it matters

  • Robinhood is the first major retail brokerage to offer official AI agent access via MCP servers, letting users automate both trading and spending without relying on unofficial workarounds.

Key details

  • Agentic Trading launches in beta for equities only, with a sandboxed account, a real-time P&L feed, and a push notification for each trade; support for options, crypto, and futures is coming later.
  • The Agentic Credit Card gives AI agents a dedicated virtual card with a user-set spending limit, manual approval toggle, and 3% cash back, available now to Robinhood Gold Card holders.

Bottom line

  • Robinhood has made autonomous AI-driven investing and spending a retail-accessible reality, but its own disclosures make clear that users bear full responsibility for whatever their agents do.

Improving AI labels for viewers and creators

via The Rundown AI

Why it matters

  • YouTube is making AI disclosures harder to miss, moving labels to prime real estate directly below videos and as Shorts overlays starting May 2026.

Key details

  • Automatic AI detection will now apply labels even when creators skip manual disclosure, with permanent labels for content made with YouTube's own tools (Veo, Dream Screen) or carrying C2PA metadata.
  • AI labels do not affect a video's recommendation ranking or monetization eligibility, keeping the changes purely informational.

Bottom line

  • YouTube is shifting from an honor-system disclosure model to an automated one, reducing creators' ability to quietly skip labeling realistic AI-generated content.

More Devins in More Places | Cognition

via The Rundown AI

Why it matters

  • Cognition's $1B raise at a $26B valuation signals that autonomous AI software agents have crossed from experiment to enterprise-grade infrastructure.

Key details

  • Devin's enterprise usage grew 10x in 2026 and run-rate revenue reached $492M, with major clients including Citi, Goldman Sachs, Mercedes-Benz, and the U.S. Army and Navy.
  • At Cognition itself, Devin writes 89% of all committed code, offering a concrete benchmark for how far AI-driven development has already progressed.

Bottom line

  • Cognition is positioning Devin not as a coding assistant but as a replacement for routine engineering execution, with humans reserved for high-level problem structuring.

Exclusive: Demis Hassabis on AGI, curing diseases with AI

via The Rundown AI

Why it matters

  • AGI arriving by ~2030 would compress decades of scientific progress into years, with direct implications for drug discovery, job markets, and human identity.

Key details

  • Hassabis pinpoints four unsolved gaps blocking AGI: world physics, memory, consistency, and continual learning. Oncology and immunology are cited as AI's first disease targets.
  • A Stanford study of 4M job applications found AI hiring tools disproportionately screen out Black and Asian applicants, with shared models amplifying bias across 42 employers simultaneously.

Bottom line

  • AGI is close enough that its specific technical gaps and embedded societal biases demand urgent attention now, not after arrival.