Read Briefing · 2026-05-12

Briefing

100 items · 2026-05-12T22:22

MUST READ

Read these first.

4 items

substack.com 2026-05-11 3 min read

Research|LITE: From Cyclical Optical Name to AI Datacenter Bottleneck

Why it matters

LITE rallied on 2026-05-11 following its Nasdaq-100 inclusion, but FundaAI treats the inclusion as a catalyst layered on top of last week’s operational validation: results showed strong demand in 1.6T, InP/EML, and OCS alongside improving revenue, gross margin, and operating leverage.

Key details

Innolight (China) investor communication flagged 2.4T and “coherent-lite” (May 2026), shifting market focus from a 1.6T ramp toward next‑gen 2.4T architectures.
2.4T/coherent‑lite is more than a bandwidth bump: it requires higher lane rates, more complex modulation schemes, longer‑reach links, and substantially greater photonic‑electronic integration (EML, InP lasers, OCS/CPO).
FundaAI’s prior coverage (OFC 2026 preview, Mar 11) flagged 400G per lane, and the note identifies LITE and COHR as the best-positioned suppliers given their depth in EML, InP, laser, and OCS capabilities.

Brief

LITE rallied on May 11 after Nasdaq‑100 inclusion, but FundaAI argues the price action reflects a deeper re‑rating: last week’s results validated strong demand for 1.6T modules and InP/EML/OCS technologies with improvements in revenue, gross margin, and operating leverage, and market attention is already moving to 2.4T/coherent‑lite. Innolight’s investor note calling out 2.4T and coherent‑lite signals that next‑gen datacenter interconnects will demand higher lane rates, more complex modulation, longer reaches, and much tighter photonic‑electronic integration. Those technical requirements favor vendors with real depth in EML, InP lasers, and OCS/CPO hardware—specifically LITE and COHR. FundaAI had earlier flagged 400G per lane (OFC 2026 preview), reinforcing the view that optical suppliers capable of scaling per‑lane speed and integration will become bottleneck assets for AI datacenter buildouts.

By FundaAI

stratechery.com 2026-05-11 11 min read

The Inference Shift (Stratechery Article 5-11-2026)

Why it matters

Cerebras’ WSE-3 is optimized for low-latency inference, with 44 GB of on-chip SRAM and ~21 PB/s of bandwidth, roughly 6,000× the memory bandwidth of NVIDIA’s H100 (80 GB HBM at 3.35 TB/s). Wafer-scale manufacturing yields raise cost, however, and in May 2026 Cerebras was raising its IPO range to $150–$160/share and marketing 30 million shares.

Key details

GPUs (notably NVIDIA’s H100 and CUDA ecosystem) remain dominant for training because training is highly parallel but requires aggregating memory and interconnect across many chips; Anthropic’s SpaceX deal cites access to >300 MW of capacity (over 220,000 NVIDIA GPUs) to support both training and inference.
Inference consists of prefill (parallel, compute-heavy) and two interleaved decode steps (serial, memory-bandwidth bound): each output token requires reading the KV cache (which grows per token) and the model weights in full, making decode sensitive to memory capacity and bandwidth.
Agentic inference (agents accomplishing tasks without humans in the loop) shifts priorities from tokens-per-second speed to large, persistent context and state: high-capacity, lower-cost memory (DRAM, SSDs, databases, embeddings) and a memory-centric hierarchy become more important than ultra-low-latency compute.
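
The decode bottleneck described above can be made concrete with a back-of-envelope bound: if every output token requires one full read of the weights and the current KV cache, the time per token is at least the bytes read divided by memory bandwidth. The sketch below uses the bandwidth figures quoted in this item; the model size (70B parameters at 2 bytes each) and the 10 GB KV cache are illustrative assumptions, not figures from the article, and the bound ignores batching, compute, and whether the model fits in a single device's memory.

```python
# Back-of-envelope decode bound: tokens/s <= bandwidth / bytes read per token.
# Bandwidth figures are the ones quoted in the item above; the model and
# KV-cache sizes are hypothetical values chosen only for illustration.

H100_BW = 3.35e12   # bytes/s (3.35 TB/s HBM, as quoted)
WSE3_BW = 21e15     # bytes/s (~21 PB/s on-chip SRAM, as quoted)


def bandwidth_ratio(fast: float, slow: float) -> float:
    """How many times more memory bandwidth `fast` offers than `slow`."""
    return fast / slow


def max_decode_tokens_per_s(weight_bytes: float, kv_bytes: float, bw: float) -> float:
    """Upper bound on single-stream decode speed when each token needs
    one full read of the weights plus the current KV cache."""
    return bw / (weight_bytes + kv_bytes)


if __name__ == "__main__":
    weights = 70e9 * 2   # assumed: 70B parameters at 2 bytes each
    kv = 10e9            # assumed: 10 GB of KV cache
    print(f"WSE-3 / H100 bandwidth: {bandwidth_ratio(WSE3_BW, H100_BW):,.0f}x")
    print(f"H100 bound: {max_decode_tokens_per_s(weights, kv, H100_BW):.1f} tok/s")
```

Because the KV cache grows with every token, the denominator grows over a long agentic session, which is one way to read the argument that capacity starts to matter more than peak speed.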

Brief

The Inference Shift (Ben Thompson, May 11, 2026) argues that the next architectural inflection in AI will come from agentic inference — agents doing work without humans in the loop — and that this will reorient infrastructure from ultra-low-latency compute toward memory-centric systems. Thompson contrasts the GPU era (NVIDIA H100, CUDA, HBM and chip-to-chip networking) — optimal for training and flexible inference — with wafer-scale designs like Cerebras’ WSE-3 (44 GB SRAM at ~21 PB/s) that deliver extreme token-speed but at high cost and limited scalability when working sets exceed on-chip memory. He breaks inference into prefill and two interleaved decode stages, noting decode is serial and memory-bandwidth bound because each token requires full reads of both the growing KV cache and model weights.

Because true agents require long-lived context, state, databases, embeddings and large KV caches, Thompson predicts agentic inference will trade latency for capacity: slower, cheaper DRAM/SSD-backed hierarchies with "good enough" compute will be preferable when no human waits for a response. That view preserves NVIDIA’s dominance in training and a market for ultra-fast answer-inference hardware, but forecasts a much larger agentic market that will unbundle GPUs, favor large-capacity memory stacks, enable hyperscalers and non-leading-edge ecosystems (including China and space-based datacenters), and reduce the premium on bleeding-edge silicon.

By Ben Thompson

substack.com 2026-05-11 4 min read

Nvidia On The Verge Of Losing GPU Lead For A Couple Of Generations

Why it matters

The author (Beyond The Hype, published 2026-05-11) argues that Nvidia rushed Rubin to beat AMD’s MI450/MI455 and that Jensen Huang’s CES 2026 claim that Rubin was in “full production” was likely a PR-driven “risk production” move rather than evidence of completed validation.

Key details

Rubin’s ramp faces three concrete technical hurdles: HBM4 qualification (vendors Micron, Samsung, SK Hynix can meet original HBM4 but not Nvidia’s eleventh-hour over‑clocked HBM4 pin‑speed demands), ConnectX‑9 integration issues, and cooling challenges — problems the author links to earlier delays on Blackwell B200/GB200.
SemiAnalysis TCO/performance estimates (cited in the article) show AMD MI455 ahead of Rubin on TCO, leading the author to warn Nvidia could lose GPU leadership for a couple of generations if Rubin misses H2 2026 timelines.

Brief

Nvidia is portrayed as rushing its Rubin GPU program to preempt AMD’s MI450/MI455, with the author claiming Jensen Huang’s CES 2026 “full production” announcement was a PR maneuver and that Rubin validation was incomplete. Citing a SemiAnalysis TCO/performance comparison that places AMD’s MI455 ahead of Rubin, the piece identifies three primary Rubin ramp risks: HBM4 qualification (Micron, Samsung and SK Hynix meet original HBM4 but struggle with Nvidia’s higher, last‑minute pin‑speed targets), ConnectX‑9 connectivity problems, and cooling design issues. The note recalls an earlier delayed and immature Blackwell B200/GB200 ramp as precedent, concludes Nvidia’s execution is no longer the unquestioned industry gold standard, and warns investors Rubin slip-ups could cost Nvidia GPU leadership for multiple generations.

By Beyond The Hype from EnerTuition

divenewsletter.com 2026-05-11 5 min read

Tutor Perini told investors it expects a “blowout” 2026 after reporting a $19.8…

Why it matters

Tutor Perini told investors it expects a “blowout” 2026 after reporting a $19.8 billion backlog and is pursuing numerous megaproject/data-center bids (reported in the May 11, 2026 Daily Dive earnings roundup).

Key details

Skanska said in its Q1 2026 earnings that U.S. construction work is insulating the firm from major economic headwinds and helping it manage risk while delivering profits.
Three builders—Skanska, Turner and Balfour Beatty—are using AI on jobsites for training, situational analysis and to prevent serious injuries on roadwork projects (highlighted during Safety Week 2026).
Life sciences: Eli Lilly will add $4.5 billion in manufacturing investment across Indiana to meet demand for genetic therapies and weight-loss drugs; engineering firm WSP reported Q1 growth led by power-generation work and AI capabilities and flagged M&A (including the recent TRC buy) as a priority.

Brief

Construction industry headlines on May 11, 2026 center on earnings, AI-driven safety and major corporate investments. Tutor Perini told investors it expects a “blowout” 2026, with a $19.8 billion backlog and multiple megaproject/data‑center bids. Skanska’s Q1 results show U.S. work helping it manage risk and sustain profits. Safety Week coverage detailed how Skanska, Turner and Balfour Beatty deploy AI for training, situational analysis and serious‑injury prevention on roadwork sites. Eli Lilly expanded manufacturing spend by $4.5 billion across Indiana to scale production of genetic therapies and weight‑loss drugs. WSP reported Q1 growth fueled by power‑generation projects and AI, with M&A (notably the TRC acquisition) remaining a strategic priority amid a booming and capacity‑constrained U.S. data‑center construction market.

By Construction Dive

WORTH READING

Deeper context and second-pass items.

30 items

ArXiv 2026-05-08 1 min read

Beyond Pairs: Your Language Model is Secretly Optimizing a Preference Graph

Why it matters

GraphDPO generalizes Direct Preference Optimization (DPO) to directed acyclic preference graphs induced by multiple rollouts per prompt, encoding dominance relations as edges, enforcing transitivity, and recovering standard DPO as a special case.

Key details

GraphDPO optimizes a Plackett–Luce–inspired graph objective with an equivalence-class construction that assigns zero loss to intra-layer edges for discrete/sparse signals, and it maintains linear per-prompt complexity via efficient log-sum-exp aggregation.
GraphDPO supports optional ground-truth anchoring by inserting verified dominant nodes and applying an annealed schedule to stabilize early training; experiments on reasoning and program synthesis (Ning Liu et al., arXiv 2026-05-08) report superior performance over pairwise DPO.
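
To illustrate how a layered preference graph can reduce to pairwise DPO, here is a minimal sketch. The pairwise loss is the standard DPO form, -log sigmoid(r_w - r_l), where r is the implicit reward beta * (log pi(y) - log pi_ref(y)). The layered objective is an assumption made for illustration and is not taken from the paper: each response competes only against responses in strictly lower layers, same-layer responses contribute no loss, and a running log-sum-exp keeps the cost linear in the number of responses. The function names are hypothetical.

```python
import math
from typing import List, Optional


def logaddexp(a: float, b: float) -> float:
    """Numerically stable log(exp(a) + exp(b))."""
    hi, lo = (a, b) if a >= b else (b, a)
    return hi + math.log1p(math.exp(lo - hi))


def dpo_pair_loss(r_win: float, r_lose: float) -> float:
    """Standard pairwise DPO loss: -log sigmoid(r_win - r_lose)."""
    return logaddexp(0.0, -(r_win - r_lose))


def layered_graph_loss(layers: List[List[float]]) -> float:
    """Illustrative (assumed) layered objective.

    `layers` holds implicit rewards grouped by preference layer, best first.
    Each node i adds -log(exp(r_i) / (exp(r_i) + sum of exp(r_j) over all
    nodes j in strictly lower layers)). Nodes in the same layer never
    compete, so intra-layer edges carry zero loss.
    """
    total = 0.0
    lse_below: Optional[float] = None  # log-sum-exp of rewards in lower layers
    for layer in reversed(layers):     # walk from worst layer to best
        if lse_below is not None:
            for r in layer:
                total += logaddexp(r, lse_below) - r
        # Fold this layer in only after scoring it, so peers are excluded.
        for r in layer:
            lse_below = r if lse_below is None else logaddexp(lse_below, r)
    return total
```

With exactly two layers of one response each, `layered_graph_loss([[r_w], [r_l]])` equals `dpo_pair_loss(r_w, r_l)`, which mirrors the claim that pairwise DPO is recovered as a special case.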

Brief

GraphDPO, introduced by Ning Liu et al. (arXiv 2026-05-08), extends DPO to operate over directed acyclic preference graphs built from multiple rollouts per prompt. It optimizes a Plackett–Luce–inspired, graph-aggregated objective, uses equivalence-class layers to avoid spurious gradients, keeps linear per-prompt cost via log-sum-exp, and—with optional annealed ground-truth anchoring—outperforms pairwise DPO on reasoning and program synthesis tasks.

Authors: Ning Liu, Chuanneng Sun, Kristina Klinkner...

Twitter/X 2026-05-11 4 min read

February 2007: KKR, TPG, and Goldman Sachs acquired TXU for ~$45B enterprise…

Why it matters

February 2007: KKR, TPG, and Goldman Sachs acquired TXU for ~$45B enterprise value at $69.25/share (≈$32B equity purchase, $13B refinanced debt). The LBO used roughly $8B of sponsor equity and $37B of new debt, leaving net debt ≈7.0x LTM EBITDA and EV/NTM EBITDA ≈8x.

Key details

TXU was vertically integrated into three segments: Luminant (18,300 MW generation, ≈$3B EBITDA, sells into ERCOT wholesale), TXU Energy retail (≈$400M EBITDA), and Oncor T&D (regulated monopoly, ≈$1B EBITDA, ~10% allowed ROE). About 75% of consolidated EBITDA was exposed to natural gas price swings.
Macroeconomic shocks and supply shifts crushed the model: the 2008 global financial crisis cut industrial demand, hydraulic fracturing drove Henry Hub gas from ~$13 to ~$3/MMBtu within a year, and by 2011 the sponsors had extended maturities; KKR wrote its equity to zero in 2013 and TXU filed for bankruptcy in 2014 with roughly $42B of debt.
Bankruptcy outcomes and precedents: noteholders won a Third Circuit make-whole ruling that affected indenture drafting; Luminant + retail were sold to first-lien creditors and became Vistra (now market cap >$50B), Sempra bought 80% of Oncor for ~$9.5B cash at a TEV ≈$19B (~11x EBITDA), and KKR/TPG/Goldman took an ≈$8.3B equity wipeout.
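
The capital structure above can be checked with a few lines of arithmetic. This sketch uses only the rounded figures quoted in this item (in $B); the helper names are hypothetical. The ratio of new debt to summed segment EBITDA comes out higher than the ≈7.0x net debt/LTM EBITDA quoted above, which suggests the quoted multiple rests on a different EBITDA base; the sketch does not try to reconcile the two.

```python
from typing import Dict, Iterable

# All figures in $B, copied from the item above (rounded as quoted).
USES = {"equity purchase": 32.0, "refinanced debt": 13.0}
SOURCES = {"sponsor equity": 8.0, "new debt": 37.0}
SEGMENT_EBITDA = {"Luminant": 3.0, "TXU Energy retail": 0.4, "Oncor": 1.0}


def total(items: Dict[str, float]) -> float:
    """Sum of all line items."""
    return sum(items.values())


def leverage(debt: float, ebitda: float) -> float:
    """Debt expressed as a multiple of EBITDA."""
    return debt / ebitda


def unregulated_share(segments: Dict[str, float], regulated: Iterable[str]) -> float:
    """Fraction of EBITDA earned outside the regulated segments."""
    regulated = set(regulated)
    exposed = sum(v for k, v in segments.items() if k not in regulated)
    return exposed / total(segments)


if __name__ == "__main__":
    print(f"Uses {total(USES):.0f} vs sources {total(SOURCES):.0f}")
    print(f"Debt share of funding: {SOURCES['new debt'] / total(SOURCES):.0%}")
    print(f"New debt / segment EBITDA: "
          f"{leverage(SOURCES['new debt'], total(SEGMENT_EBITDA)):.1f}x")
    print(f"Unregulated EBITDA share: "
          f"{unregulated_share(SEGMENT_EBITDA, ['Oncor']):.0%}")
```

The unregulated share (Luminant plus retail) is in line with the "about 75%" of EBITDA described as exposed to gas prices.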

Brief

TXU was a vertically integrated Texas power company (post-1999 deregulation) whose 2007 $45B leveraged buyout by KKR, TPG, and Goldman was financed with roughly $37B of new debt against ~$4.4B of EBITDA concentrated in three segments: Luminant generation (18,300 MW, ≈$3B EBITDA, ERCOT-exposed), TXU Energy retail (≈$400M EBITDA), and regulated Oncor T&D (≈$1B EBITDA, ~10% allowed return). The consortium assumed elevated gas prices and high merchant spreads; the 2008 demand shock plus a fracking-driven plunge in Henry Hub (roughly $13→$3/MMBtu) collapsed merchant margins. After an amend-and-extend in 2011, sponsors wrote down equity by 2013 and TXU filed Chapter 11 in 2014 with ≈$42B debt. The long bankruptcy produced a Third Circuit make-whole precedent, split the assets (merchant/retail → Vistra; Oncor → Sempra for ~$9.5B cash), and resulted in an ≈$8.3B equity wipeout for the sponsors.

By @BoringBiz_

Twitter/X 2026-05-11 2 min read

Korean DRAM export prices hit $64,000 per kilogram (May 2026), up from under…

Why it matters

Korean DRAM export prices hit $64,000 per kilogram (May 2026), up from under $11,000/kg a year earlier, an increase of more than 5.8x.

Key details

AI demand is consuming memory capacity: NVIDIA takes roughly 90% of SK Hynix HBM output; SK Hynix sold out 2025 allocation by mid‑2024, locked 2026 by mid‑2025 and is now pricing 2027; AI is projected to consume ~20% of total DRAM production in 2026; OpenAI signed an LOI with Samsung and SK Hynix for 900,000 wafers per month for 'Stargate.'
Market impacts: retail DDR5 prices rose 123% in 2025 with a further 45% forecast for 2026; a 32GB DDR5‑6000 kit moved from $80–$100 (early 2025) to $364–$529 today; Xiaomi expects DRAM cost per device +25% in 2026; NVIDIA cut RTX gaming production 30–40% in H1 2026; SK Hynix now runs >50% operating margin and SK Group says the wafer shortage will persist until 2030 due to five‑year cleanroom buildouts.

Brief

Korean DRAM export prices have surged to $64,000/kg (from under $11,000/kg a year earlier) as AI demand for HBM and conventional DRAM strains wafer capacity. NVIDIA consumes ~90% of SK Hynix HBM; SK Hynix has locked its 2026 allocation and is now pricing 2027, while OpenAI has a 900,000‑wafers/month LOI with Samsung and SK Hynix. Consumer DDR5 and phone memory costs have jumped, and manufacturers are cutting or downgrading products.

By @aakashgupta

ArXiv 2026-05-08 1 min read

Fast Byte Latent Transformer

Why it matters

BLT-D (Byte Latent Transformer Diffusion) adds a block-wise diffusion auxiliary objective to next-byte training, which enables multi-byte parallel generation per decoding step; it is reported as the fastest BLT variant.

Key details

Two speculative/verification extensions—BLT Self-speculation (BLT-S), where the local decoder drafts bytes past patch boundaries and they are verified with a single full-model forward pass, and BLT Diffusion+Verification (BLT-DV), which adds an autoregressive verification after diffusion—trade speed for improved quality.
All proposed methods can achieve an estimated memory-bandwidth cost over 50% lower than the original BLT on generation tasks, reducing a key practical bottleneck for byte-level LMs.
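
The draft-then-verify idea behind the speculative variants can be sketched generically. The code below is a plain greedy speculative-decoding loop, not BLT-S itself: a cheap drafter proposes `k` bytes, the full model checks them, and the longest agreeing prefix is kept plus one byte from the full model. Both models are hypothetical callables that return a greedy next byte, and the verification loop only emulates what a real system would do in a single batched forward pass.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: a model maps a byte context to its greedy next byte.
NextByte = Callable[[Sequence[int]], int]


def speculative_step(prefix: List[int], draft: NextByte, full: NextByte, k: int) -> List[int]:
    """Draft k bytes cheaply, then keep what the full model agrees with."""
    # 1) Draft k bytes autoregressively with the cheap model.
    drafted: List[int] = []
    for _ in range(k):
        drafted.append(draft(prefix + drafted))

    # 2) Verify. A real system scores all k positions in one full-model pass;
    #    this loop emulates that check position by position.
    accepted: List[int] = []
    for b in drafted:
        target = full(prefix + accepted)
        if target != b:
            accepted.append(target)   # first mismatch: take the full model's byte
            return accepted
        accepted.append(b)
    accepted.append(full(prefix + accepted))  # all drafts passed: one bonus byte
    return accepted


def generate(prefix: List[int], draft: NextByte, full: NextByte, k: int, n: int) -> List[int]:
    """Generate n bytes by repeating speculative steps."""
    out = list(prefix)
    while len(out) - len(prefix) < n:
        out.extend(speculative_step(out, draft, full, k))
    return out[len(prefix):len(prefix) + n]
```

Under greedy verification the output is identical to decoding with the full model alone; the saving comes from amortizing full-model passes over several bytes whenever the drafter is right.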

Brief

Fast Byte Latent Transformer (BLT) variants tackle slow byte-by-byte autoregressive generation by introducing BLT-D, a block-wise diffusion-trained model enabling multi-byte parallel decoding, plus the BLT-S and BLT-DV speculative/verification extensions. The abstract reports an estimated >50% reduction in memory-bandwidth cost. Summary based on the abstract only; full text not available.

Authors: Julie Kallini, Artidoro Pagnoni, Tomasz Limisiewicz...

substack.com 2026-05-11 7 min read

Weekly | Why Memory / CPU Rally Accelerated and Outperformed Optics, SMTC & AMD Deep Dive, DDOG, APP, LITE, COHR, …

Why it matters

Coatue increased Semis + Infra long exposure to 58% (from 35% at the start of 2026), signaling large funds are materially re‑allocating from Software/Internet into AI semiconductors and infrastructure.

Key details

Altimeter’s Brad Gerstner said memory shortages could persist into 2028/29 and will discuss LTAs with Micron’s CEO, a narrative cited as catalyzing FOMO among hedge funds and long-onlys.
Long‑Term Agreements (LTAs) are changing market classification: SNDK’s LTA announcement and growing LTA mix (>50% cited) are being used to argue NAND can be viewed as non‑cyclical; some investors now benchmark NAND to HDD at ~8–10x FCF.
Selected company results: DDOG 1Q26 met EPS/rev and guided 2Q26 ~+30% YoY (implying ~35% on a typical 5pt beat) and FY26 to 26–27% (low‑to‑mid 30s on actuals); APP 1Q26 revenue +59% YoY and management guides U.S. e‑commerce to ~20% of revenue with +25% QoQ in 2Q26; RDW 1Q26 revenue +58% YoY but missed by 7%, GM 26.6%, backlog $498M, book‑to‑bill 1.92x and a $350M ATM raised after hours.

Brief

FundaAI’s weekly (May 11, 2026) frames the recent acceleration in AI semiconductor rallies — led by memory and CPUs — as driven by large funds rotating out of software into hardware. Evidence includes Coatue’s jump in Semis+Infra long exposure to 58% (from 35% at the year start) and Altimeter’s Brad Gerstner publicly signaling multi‑year memory tightness (possible shortage until 2028/29) ahead of a podcast with Micron’s CEO. The note argues that LTAs (highlighted by SNDK’s LTA disclosure) are reframing NAND as less cyclical once LTAs exceed ~50% of mix, and that some investors now value NAND closer to HDD at ~8–10x FCF. FundaAI also summarizes earnings flow: Datadog’s 1Q26 showed a large guidance beat (2Q26 ~+30% YoY; FY26 raised), AppLovin’s 1Q26 revenue +59% YoY with e‑commerce reaccelerating (+25% QoQ guide), and Redwire’s 1Q26 saw 58% revenue growth but a 7% miss, with GM 26.6% and a $498M backlog (book‑to‑bill 1.92x).

By FundaAI

ArXiv 2026-05-08 1 min read

GLiGuard: Schema-Conditioned Classification for LLM Safeguard

Why it matters

GLiGuard is a 0.3B-parameter schema-conditioned bidirectional encoder (adapted from GLiNER2) that encodes task definitions and label semantics as structured token schemas to evaluate prompt safety, response safety, refusal detection, 14 fine-grained harm categories, and 11 jailbreak strategies in a single non-autoregressive forward pass.

Key details

Across nine established safety benchmarks (paper published 2026-05-08), GLiGuard achieves F1 scores competitive with 7B–27B decoder-based guard models while being 23–90× smaller, delivering up to 16× higher throughput and 17× lower latency; code and models available at https://github.com/fastino-ai/GLiGuard.

Brief

GLiGuard is a 0.3B-parameter, schema-conditioned bidirectional encoder (adapted from GLiNER2) that encodes task definitions and label semantics into structured token schemas to evaluate prompt safety, response safety, refusal detection, 14 fine-grained harm categories, and 11 jailbreak strategies in one non‑autoregressive forward pass. On nine safety benchmarks its F1 is competitive with 7B–27B decoder guards while it is 23–90× smaller, with up to 16× higher throughput and 17× lower latency. Summary based on the paper's abstract only.

Authors: Urchade Zaratiana, Mary Newhauser, George Hurn-Maloney...

ArXiv 2026-05-08 1 min read

LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

Why it matters

AutoTTS is an environment-driven framework that automates test-time scaling (TTS) strategy discovery by formulating width–depth TTS as controller synthesis over pre-collected reasoning trajectories and probe signals; controllers choose when to branch, continue, probe, prune, or stop.

Key details

On mathematical reasoning benchmarks the discovered strategies improve the accuracy–cost tradeoff over strong manually designed baselines, generalize to held-out benchmarks and model scales, and were discovered for just $39.9 in 160 minutes using beta parameterization and fine-grained execution-trace feedback.

Brief

AutoTTS introduces an environment-driven approach that replaces hand-designed TTS heuristics with automatic discovery: width–depth TTS is cast as controller synthesis over stored reasoning trajectories and probe signals. The method uses beta parameterization and fine-grained execution-trace feedback to make search tractable and efficient, yielding better accuracy–cost tradeoffs on math reasoning tasks; discovery cost $39.9 and took 160 minutes.

Authors: Tong Zheng, Haolin Liu, Chengsong Huang...

ArXiv 2026-05-08 1 min read

Normalizing Trajectory Models

Why it matters

Normalizing Trajectory Models (NTM) represent each reverse diffusion step as an expressive conditional normalizing flow and optimize exact trajectory likelihood; the architecture pairs shallow invertible blocks inside each step with a deep parallel predictor and can be trained end-to-end or initialized from pretrained flow-matching models.

Key details

NTM's exact-trajectory likelihood enables self-distillation: a lightweight denoiser trained on the model's own score produces high-quality text-to-image samples in four sampling steps, and NTM matches or outperforms strong image-generation baselines on text-to-image benchmarks.
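
For readers unfamiliar with why a flow per step gives an exact trajectory likelihood, the standard identities are sketched below. They are the textbook Markov-chain factorization and change-of-variables formula, not equations copied from the paper; the symbols (T steps, an invertible map f_θ conditioned on x_t, and a base density p_Z) are notation assumed here for illustration.

```latex
% Reverse trajectory as a Markov chain: the joint log-likelihood is a sum over steps.
\log p_\theta(x_{0:T}) = \log p(x_T) + \sum_{t=1}^{T} \log p_\theta(x_{t-1} \mid x_t)

% Each step is a conditional normalizing flow, so its density is exact
% by the change-of-variables formula.
\log p_\theta(x_{t-1} \mid x_t)
  = \log p_Z\!\big(f_\theta(x_{t-1};\, x_t)\big)
  + \log \left| \det \frac{\partial f_\theta(x_{t-1};\, x_t)}{\partial x_{t-1}} \right|
```

A Gaussian denoising step corresponds to the special case where f_θ is affine in x_{t-1}; a more expressive invertible f_θ is what lets each step depart from the Gaussian assumption.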

Brief

Normalizing Trajectory Models (NTM) address the failure of Gaussian denoising assumptions when compressing diffusion to few steps by modeling each reverse step as a conditional normalizing flow trained with exact trajectory likelihood. The design stacks shallow invertible blocks per step with a deep parallel trajectory predictor, is trainable from scratch or from pretrained flow-matching models, and, per the abstract, enables self-distillation that yields high-quality text-to-image samples in four steps while retaining exact likelihood.

Authors: Jiatao Gu, Tianrong Chen, Ying Shen...

ghost.io 2026-05-11 16 min read

Why does Resource Extraction Produce Such Concentrated Wealth?

Why it matters

Mining produces outsized private fortunes: 3.1% of Forbes billionaires made their money in metals & mining vs. ~1.9% of global listed-equity market cap; among the 100 largest countries by GDP, eight national richest-people fortunes trace to mining (Byrne Hobart, The Diff, 11 May 2026).

Key details

Public-market performance is mixed: the S&P Metals & Mining ETF returned 6.4% annualized vs. SPY 11.4% over the same timeframe, but over the last ten years metals & mining returned 19.7% vs. 15.5% for a broad equity index.
Cyclicality concentrates outcomes: resource sectors absorb the most capital at cycle peaks, enabling extreme upside for some insiders while ordinary shareholders can be wiped out in restructurings (example: Warrior Met Coal shareholders have compounded ~33% since its 2017 IPO after Walter Energy shareholders were zeroed in 2016).
Idiosyncratic luck and single-point failures amplify returns: timing and rare disruptions matter (e.g., a 2009 outage knocked offline a mine producing ~10% of global uranium for about seven months; Mukluk was a billion-dollar dry hole in 1982).

Brief

Resource extraction generates highly concentrated wealth because of a confluence of capital-cycle dynamics, technical risk-taking, idiosyncratic luck, political rent capture, and information asymmetry. Byrne Hobart (The Diff, 11 May 2026) notes concrete disparities (3.1% of Forbes billionaires come from metals & mining even though the sector is ~1.9% of listed market cap) and shows how cyclical capital demand lets insiders capture outsized upside while public shareholders endure restructurings (Warrior Met Coal compounded ~33% since a 2017 IPO that followed Walter Energy’s 2016 wipeout). Technical “engineering ego” fuels large, hard-to‑verify projects; operational single points of failure and tight supply chains create rare but market-moving outages (a 2009 outage took a mine producing ~10% of global uranium offline for about seven months).

Political and institutional factors matter as much as geology: royalties and state control of resources in weak-rule-of-law countries make resource rents easy to capture for elites, while commodity-exchange interventions (e.g., the 2022 London Metal Exchange nickel halt) and optimistic public “reserve” claims magnify volatility and winner-take-most outcomes. Hobart argues that these frictions explain why outside investors and founders have very different returns, and why improving corporate governance and lowering cost of capital is the path to narrowing the valuation gap: resources ultimately flow to whoever can extract them at the lowest capital cost.

By Byrne @ The Diff

Twitter/X 2026-05-11 2 min read

Ask any teammate why the team chose its current pricing model and you’ll likely…

Why it matters

Ask any teammate why the team chose its current pricing model and you’ll likely hear “I wasn't here for that” or “let me dig up the deck”; Aakash Gupta says three months is enough for a real decision to become unrecoverable.

Key details

Gupta studied implementations at DoorDash, Pendo, and Google (Hannah Stulberg, Dave Killeen, Gabor Meyer) plus a solo build by Carl Vellotti; all converged on a shared-repo 'Team OS' that stores PRDs, customer call summaries, decision logs, and analytics queries.
The Team OS rolls out in three phases: Phase 1 (foundation) — PMs move PRDs and customer calls into the repo and onboard core team (one week); Phase 2 — analysts check in queries, engineers add investigation templates, non‑technical contributors open PRs; Phase 3 — the repo compounds and answers questions no single teammate remembers.
Quantified impacts and examples: new hires typically take 6–7 months to settle; 47% of companies name institutional knowledge loss as their top offboarding problem; 10 context questions/day at ~10 minutes each costs ~8+ hours/week; one new engineer retrieved a three‑month‑old decision rationale from the repo in 15 seconds, freeing the PM from being a bottleneck.

Brief

Aakash Gupta argues that teams stop losing critical decisions by building a shared 'Team OS': a repo where every function checks in PRDs, customer-call summaries, decision logs, and analytics queries. He reports studying four implementations (Hannah Stulberg at DoorDash, Dave Killeen at Pendo, Gabor Meyer at Google, and Carl Vellotti) and describes a three‑phase rollout: Phase 1 (one week) moves foundational docs into the repo and designs for context engineering; Phase 2 brings analysts, engineers, and nontechnical contributors into regular PR workflows; Phase 3 lets the system compound so it answers questions nobody individually remembers. Gupta cites concrete metrics (a decision can become unrecoverable within three months, new hires take 6–7 months to settle, 47% of companies flag institutional knowledge loss as their top offboarding issue, and routine context questions can cost 8+ hours/week) and gives an example where a new engineer retrieved a three‑month‑old decision in 15 seconds. A downloadable guide (six assets) and a one‑command skill promise to convert personal PM OSes into team OSes without leaking personal context.

By @aakashgupta

YouTube 2015-11-20 3 min read

BP Texas City Animation - Spanish

Why it matters

On March 23, 2005, an overfill and vapor-cloud explosion at BP’s Texas City refinery killed 15 workers and injured more than 180; the event began in the raffinate splitter (isomerization) unit during a startup.

Key details

The CSB animation (Spanish) reconstructs the failure sequence: faulty or insufficient level indication and a disabled high‑level alarm, operator misjudgment during startup, overfilling of the splitter, liquid carryover into vent/blowdown lines, formation of a large hydrocarbon vapor cloud, and subsequent ignition.
The U.S. Chemical Safety Board found root causes including inadequate process safety management, poor procedures and training, deficient instrumentation maintenance, and corporate cost‑cutting that degraded safety systems; the CSB issued recommendations to BP and regulators.
The Spanish animated reconstruction is intended to teach Spanish‑speaking refinery workers and managers the specific technical failure modes and preventive measures: reliable level instrumentation and alarms, safer startup procedures, proper vent/blowdown design, and stronger safety oversight.

Brief

BP Texas City Animation - Spanish is a U.S. Chemical Safety Board animated reconstruction (a presentation/visual tutorial) of the March 23, 2005 BP Texas City refinery disaster, which killed 15 and injured over 180. The video walks step‑by‑step through the isomerization/raffinate splitter startup sequence, showing how inaccurate level indication and a disabled high‑level alarm led operators to overfill the tower, producing liquid carryover into vents and blowdown systems, a large hydrocarbon vapor cloud, and a catastrophic ignition. The animation highlights systemic failures identified by the CSB (weak process safety management, inadequate procedures and training, poor maintenance of key instrumentation, and cost‑driven decisions that left safety defenses unreliable) and emphasizes concrete preventive measures for Spanish‑speaking plant personnel and managers: robust level controls and alarms, safer startup practices, improved venting/blowdown design, and stronger oversight.

By USCSB

Twitter/X 2026-05-11 2 min read

Physical AI scales by building factories, not buying GPUs

Why it matters

Physical AI scales by building factories, not buying GPUs: datacenters take ~18 months to build while factories take ~4 years, shifting the bottleneck from capital to geography.

Key details

The 2026 robotics race is consolidating to two countries because manufacturing is the gate; one country already produces ~30% of global manufactured goods, so Tesla, Figure, Boston Dynamics, Agility, and Unitree are competing on supply chain rather than AI models.
Gupta predicts robot control models will be open-sourced on Hugging Face by Q3 2026 while factories remain proprietary; Frantz Lohier (AWS Robotics) will map the 2026 physical AI landscape at the AI Skills conference on May 14, 2026 (5,000+ registered, free on Zoom).

Brief

Aakash Gupta argues that unlike software AI (which scaled by buying GPUs), physical AI scales by building factories: datacenters take ~18 months to build versus ~4 years for factories, moving the bottleneck from capital to geography. He claims manufacturing will concentrate robotics in two countries (one making ~30% of global manufactured goods), that companies compete on supply chains, and that robot control models will be open-sourced by Q3 2026 while factories stay proprietary; Frantz Lohier will map the 2026 physical AI landscape at the AI Skills conference on May 14, 2026 (5,000+ registered).

By @aakashgupta

Twitter/X 2026-05-11 1 min read

Ryan Fedasiuk (tweet, 2026-05-11) argues U.S. export controls on AI chips are…

Why it matters

Ryan Fedasiuk (tweet, 2026-05-11) argues U.S. export controls on AI chips are intended to restrict compute availability — not because of beliefs about AGI or first‑mover advantage — and that compute is the singular bottleneck blocking labs like DeepSeek, Qwen, and Moonshot from hosting competitive inference services against OpenAI, Anthropic, and Google.

Key details

Compute scarcity in 2026 — including global memory shortages and Mac Studio M3 Ultra sellouts — is forcing Chinese labs to push compute costs onto consumers, and this supply‑chain strain advantages U.S. labs that have stronger partnerships with U.S. hyperscalers and allied manufacturers.

Brief

Ryan Fedasiuk argues that U.S. export controls on AI chips are aimed at limiting compute rather than driven by beliefs about AGI, and that compute scarcity in 2026 is the main barrier preventing Chinese labs like DeepSeek, Qwen, and Moonshot from hosting competitive inference services. Global memory shortages, M3 Ultra Mac Studio sellouts, and strained supply chains favor U.S. labs with hyperscaler and allied-manufacturer ties.

By @RyanFedasiuk

ArXiv 2026-05-08 1 min read

Asymptotically Log-Optimal Bayes-Assisted Confidence Sequences for Bounded Means

Why it matters

Valentin Kilian, Stefano Cortinovis, and François Caron (ArXiv 2026-05-08) introduce a Bayes-assisted framework that adaptively constructs time-uniform confidence sequences for the mean of bounded IID observations by using a Bayesian working predictive to select one-step martingale factors that maximize predictive expected log-growth while preserving validity under misspecification.

Key details

They prove that if the predictive distribution is Wasserstein-consistent the procedure is asymptotically log-optimal, matching the per-sample log-growth rate of an oracle that knows the true data distribution.
The framework is instantiated with Dirichlet-process mixture predictives and Bayesian exponentially tilted empirical likelihood; experiments on synthetic data, sequential best-arm identification for LLM evaluation, and prediction-powered inference show informative priors can substantially reduce confidence-sequence width and sampling effort while retaining anytime-valid coverage.

Brief

Asymptotically Log-Optimal Bayes-Assisted Confidence Sequences for Bounded Means introduces a Bayes-assisted method that uses a Bayesian working predictive model to choose one-step martingale updates maximizing predictive expected log-growth, yielding time-uniform confidence sequences for bounded IID means. The authors prove Wasserstein-consistency of the predictive implies asymptotic log-optimality (matching an oracle per-sample log-growth). They instantiate methods with Dirichlet-process mixtures and Bayesian exponentially tilted empirical likelihood and demonstrate reduced interval widths and sampling in synthetic and real tasks (LLM best-arm, prediction-powered inference).
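
As a rough illustration of the betting-martingale construction that confidence sequences of this kind build on, the sketch below tests every candidate mean m on a grid with a wealth process and keeps the candidates whose wealth has not yet reached 1/α. The fixed bet size `bet`, the grid resolution, and the function name are assumptions made for illustration; the paper instead selects the one-step factors from a Bayesian working predictive, which is not implemented here.

```python
def betting_confidence_sequence(xs, alpha=0.05, bet=0.5, grid_size=201):
    """Time-uniform confidence intervals for the mean of observations in [0, 1].

    Returns one (lower, upper) pair per observation. `bet` in (0, 1) is a
    fixed, hypothetical bet size; it is not chosen adaptively.
    """
    # Candidate means strictly inside (0, 1), so both bet sizes below are finite.
    candidates = [k / (grid_size - 1) for k in range(1, grid_size - 1)]
    wealth_up = [1.0] * len(candidates)    # bets that the true mean exceeds m
    wealth_down = [1.0] * len(candidates)  # bets that the true mean is below m
    alive = [True] * len(candidates)       # candidates not yet rejected
    threshold = 1.0 / alpha
    intervals = []
    for x in xs:
        for i, m in enumerate(candidates):
            if not alive[i]:
                continue
            # Each factor is nonnegative for x in [0, 1] and has expectation 1
            # when the true mean equals m, so each wealth is a martingale.
            wealth_up[i] *= 1.0 + (bet / m) * (x - m)
            wealth_down[i] *= 1.0 - (bet / (1.0 - m)) * (x - m)
            # Ville's inequality: under mean m, the averaged wealth reaches
            # 1/alpha at any time with probability at most alpha.
            if 0.5 * (wealth_up[i] + wealth_down[i]) >= threshold:
                alive[i] = False
        survivors = [m for m, ok in zip(candidates, alive) if ok]
        if not survivors:
            break  # every grid point rejected; stop rather than report an empty set
        intervals.append((survivors[0], survivors[-1]))
    return intervals
```

Because rejected candidates are never revived, the intervals are nested over time. A better-informed choice of bets would be expected to reject wrong candidates sooner, which is the role the Bayesian predictive plays in the paper's framework.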

Authors: Valentin Kilian, Stefano Cortinovis, François Caron

datacenterdynamics.com 2026-05-11 1 min read

Broadcast: Why industrial connectivity is becoming a data center priority

Why it matters

Data Centre Dynamics' Telecoms & Connectivity episode (published 2026-05-11) features experts from Belden addressing industrial and edge connectivity as AI, automation and distributed infrastructure expand into industrial sites.

Key details

The session targets concrete priorities for operators: simplifying connectivity, streamlining network architectures, and improving reliability, visibility and performance to support rising data demands and next‑generation digital operations.

Brief

Data Centre Dynamics' Telecoms & Connectivity episode (published 11 May 2026) brings Belden experts to examine how industrial and edge environments must evolve connectivity to handle AI, automation and distributed infrastructure. The briefing focuses on simplifying network architectures, boosting reliability and visibility, and meeting growing data volumes across complex operational sites; viewers can register to stream the session.

By DCD Telecoms & Connectivity Channel

Twitter/X 2026-05-11 1 min read

Amanda Vandyke (@AmandaVandyke13) tweeted on 2026-05-11 that we must mine as much…

Why it matters

Amanda Vandyke (@AmandaVandyke13) tweeted on 2026-05-11 that we must mine as much copper in the next 25 years as was mined in the previous 125 years.

Key details

She states the vast majority of global copper reserves grade 0.14–0.34% Cu, which implies exponentially larger open pits, more waste and tailings, greater chemical use, and higher energy per unit of recovered copper.
Vandyke asserts copper recoveries are not improving and that economies of scale are becoming diseconomies of scale, calling for a complete rethink of how we find, mine and refine copper and rare earths.
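
To make the grade figures concrete, the short sketch below converts an ore grade into tonnes of ore that must be processed per tonne of copper. The function name and the `recovery` parameter are assumptions for illustration; the tweet gives only the grade range, and waste rock stripped to reach the ore is not counted.

```python
def ore_tonnes_per_tonne_copper(grade_percent, recovery=1.0):
    """Tonnes of ore processed per tonne of copper recovered.

    grade_percent: copper content of the ore, in percent by mass.
    recovery: fraction of contained copper actually recovered (hypothetical
              parameter; 1.0 means perfect recovery).
    """
    if not 0 < grade_percent <= 100:
        raise ValueError("grade_percent must be in (0, 100]")
    if not 0 < recovery <= 1:
        raise ValueError("recovery must be in (0, 1]")
    return 100.0 / (grade_percent * recovery)


# Grade range quoted in the tweet, assuming perfect recovery.
for grade in (0.14, 0.34):
    print(f"{grade}% Cu: {ore_tonnes_per_tonne_copper(grade):.0f} t ore per t Cu")
```

At the quoted grades this works out to roughly 290 to 710 tonnes of ore per tonne of copper before any recovery losses.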

Brief

Amanda Vandyke (@AmandaVandyke13) warns that we must mine as much copper in the next 25 years as was mined in the previous 125. She says most global reserves are low grade (0.14–0.34% Cu), forcing exponentially larger pits, more waste/tailings, greater chemical use and energy; with stagnant recoveries and failing economies of scale, she calls for rethinking discovery, mining and refining, including rare earths.

By @AmandaVandyke13

Twitter/X 2026-05-11 1 min read

Holding an Opportunity Zone (OZ) investment for 10 years produces a tax-free…

Why it matters

Holding an Opportunity Zone (OZ) investment for 10 years produces a tax-free gain: no capital gains tax and no depreciation recapture on the OZ gain.

Key details

Congress made Opportunity Zones permanent in July and rewrote the rules: the old fixed 2026 cutoff is replaced by a rolling 5-year deferral, a new rural designation adds enhanced benefits, and reporting requirements are tighter.
A 2025 partnership K‑1 gain can still be deferred as late as September 11, 2026 — even if a Q4 estimated tax payment was made (the estimate becomes a refund). Author @DallasAptGP is presenting a free CPA Academy webinar on Monday, May 18 at 1:00 pm CT.

Brief

Opportunity Zones were made permanent by Congress in July and the rules were rewritten: holding an OZ investment 10 years yields a tax-free gain (no capital-gains tax or depreciation recapture). The reform swaps the fixed 2026 deadline for a rolling 5-year deferral, adds a rural designation with enhanced benefits, tightens reporting, and allows deferral of 2025 K‑1 gains through Sept 11, 2026; a free CPA Academy webinar on May 18 at 1pm CT will walk through scenarios.

By @DallasAptGP

Twitter/X 2026-05-11 1 min read

Peter Boockvar frames the economy as a "two-lane highway"

Why it matters

Peter Boockvar frames the economy as a "two-lane highway": Highway #1 is an AI and data-center boom with ~650 million square feet announced or under construction, boosting demand for semiconductors, electrical equipment, cement, and gravel.

Key details

Highway #2 is a struggling consumer sector: lower- and middle-income households are squeezed by cumulative inflation and high interest rates while upper-income spending stays resilient; food bank demand is rising.
Author @anthonysagami endorses Boockvar's conclusion that tech stocks remain the place to be.

Brief

Peter Boockvar sees a 'two-lane highway': one lane is an AI/data-center build-out—about 650 million sq ft announced or under construction—powering demand for chips, electrical gear, cement and gravel; the other lane is lower- and middle-income consumers hit by inflation and high rates, with rising food bank use. The poster endorses tech stocks.

By @anthonysagami

e.economist.com 2026-05-11 9 min read

The World in Brief: Trump calls Iran’s proposal “unacceptable”

Why it matters

Donald Trump on May 11 called Iran’s response to the U.S. proposal to end the war “totally unacceptable”; Iranian state TV said the response included compensation for war damage and recognition of Iran’s sovereignty over the Strait of Hormuz, while it was unclear if Iran mentioned its stock of highly enriched uranium; Israel’s PM Binyamin Netanyahu said the war won’t end until Iran relinquishes that material; Brent crude rose to about $105/barrel.

Key details

China’s factory (producer) prices rose 2.8% year‑on‑year in April—the biggest increase since 2022—while consumer prices rose 1.2%; Beijing blamed rising international commodity prices and stronger demand in some domestic sectors, helping end a prior deflationary slide.
Xi Jinping and Donald Trump will meet as planned this week; Trump is due to arrive in China on May 14, with the Iran conflict and recent U.S. sanctions on three Chinese satellite firms on the agenda.
Saudi Aramco reported Q1 net profit of $33.6bn (up about 25% year‑on‑year) and used pipelines to reroute exports away from the Strait of Hormuz; CEO Amin Nasser warned prolonged hostilities could keep supply disrupted into 2027 and the company faces pipeline‑damage risks and dividend pressure.

Brief

Xi Jinping and Donald Trump will meet in China this week (Trump arrives May 14), with the Iran war and recent U.S. sanctions on Chinese satellite firms likely to figure. Corporate moves include Alphabet’s decision to issue yen‑denominated bonds as it funds an anticipated $190bn AI infrastructure push in 2026 (it raised nearly $17bn in euro and Canadian‑dollar bonds last week). Saudi Aramco posted Q1 net profit of $33.6bn (≈+25% y/y) after rerouting exports via pipelines, but the company faces risks from potential pipeline damage, dividend pressures and a warning that market normalisation may not occur until 2027. Humanitarian and domestic stories include over 20,000 Ukrainian children reported deported since 2022 (about 2,000 returned), the release on bail and hospital transfer of Nobel laureate Narges Mohammadi, and evolving U.S. abortion access where remote provision of mifepristone accounted for a large share of 2025 procedures amid pending Supreme Court disputes.

By The Economist

Twitter/X 2026-05-11 1 min read

MiniCPM-V 4.6 (1.3B) launched by @OpenBMB on 2026-05-11 and uses LLaVA-UHD v4 to…

Why it matters

MiniCPM-V 4.6 (1.3B) launched by @OpenBMB on 2026-05-11 and uses LLaVA-UHD v4 to cut vision encoding costs by 55%, enabling native edge deployment and optimization for consumer-grade and mobile hardware.

Key details

OpenBMB claims MiniCPM-V 4.6 outperforms Gemma4-E2B-it and Qwen3.5-0.8B on key multimodal/Artificial Analysis benchmarks — scoring higher than Qwen3.5-0.8B while using just 2.5% of its token budget; reported TTFT = 75.7 ms (2.2× faster on 3136² images) and ~1.5× token throughput vs Qwen3.5-0.8B on a single RTX 4090.
Model and demos are publicly available: Hugging Face (openbmb/MiniCPM-V), GitHub (OpenBMB/MiniCPM-V), Modelscope, a Hugging Face web demo, and an app demo (links posted by OpenBMB).

Brief

MiniCPM-V 4.6 (1.3B) is a high-resolution multimodal model from OpenBMB released 2026-05-11 that applies LLaVA-UHD v4 to cut vision encoding costs by 55%, claiming superior benchmark results versus Gemma4-E2B-it and Qwen3.5-0.8B while using just 2.5% of Qwen's token budget; OpenBMB reports 75.7 ms TTFT (2.2× faster on 3136² images) and ~1.5× token throughput on an RTX 4090, with model and demos available on Hugging Face, GitHub and Modelscope.

By @OpenBMB

ArXiv 2026-05-08 1 min read

Conformal Path Reasoning: Trustworthy Knowledge Graph Question Answering via Path-Level Calibration

Why it matters

Conformal Path Reasoning (CPR) applies query-level conformal calibration over path-level scores and introduces a Residual Conformal Value Network (RCVNet) trained via PUCT-guided exploration to improve calibration validity and score discriminability in KGQA.

Key details

On benchmarks reported in the paper, CPR increases Empirical Coverage Rate by 34% and reduces average prediction set size by 40% compared to prior conformal baselines, producing more compact, statistically guaranteed answer sets.

Brief

Conformal Path Reasoning (CPR) is a trustworthy Knowledge Graph Question Answering framework that performs conformal calibration at the query level over path-level scores and adds a lightweight Residual Conformal Value Network (RCVNet) trained with PUCT-guided exploration to produce discriminative nonconformity scores. According to the abstract, CPR substantially improves empirical coverage (+34%) while shrinking prediction sets (−40%) versus conformal baselines, yielding more reliable and compact answer sets. Full text was not used beyond the provided abstract.
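
For readers unfamiliar with conformal calibration, the sketch below shows the generic split-conformal step that this kind of method relies on: a threshold is taken from held-out nonconformity scores, and every candidate scoring at or below it enters the prediction set. It assumes one calibration score per query (the nonconformity of that query's correct answer path) and uses hypothetical function names; it does not reproduce CPR's RCVNet scoring or its PUCT-guided exploration.

```python
import math


def conformal_threshold(calibration_scores, alpha=0.1):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(calibration_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: keep every candidate.
        return math.inf
    return sorted(calibration_scores)[rank - 1]


def prediction_set(candidate_scores, threshold):
    """Keep candidates whose nonconformity score does not exceed the threshold.

    candidate_scores: dict mapping a candidate answer to its score
                      (lower means more plausible).
    """
    return [cand for cand, score in candidate_scores.items() if score <= threshold]
```

Under exchangeability of calibration and test queries, a set built this way contains the correct answer with probability at least 1 − α; more discriminative scores shrink the set without changing that guarantee.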

Authors: Shuhang Lin, Chuhao Zhou, Xiao Lin...

ArXiv 2026-05-08 1 min read

Object Hallucination-Free Reinforcement Unlearning for Vision-Language Models

Why it matters

HFRU (Object Hallucination-Free Reinforcement Unlearning) removes sensitive knowledge by operating on the vision encoder (not the language decoder) via a two-stage pipeline: alignment disruption followed by GRPO-based optimization with a composite reward that includes an 'abstraction reward' to reduce hallucinations.

Key details

On object-recognition and face-identity benchmarks reported in the abstract, HFRU achieves over 98% forgetting while preserving over 98% retention and produces negligible object hallucination, substantially outperforming prior decoder-fine-tuning methods; code: https://github.com/XMUDeepLIT/HFRU. Published 2026-05-08.

Brief

HFRU is a reinforcement unlearning method for VLMs that removes sensitive visual knowledge by modifying the vision encoder through a two-stage process: alignment disruption followed by GRPO optimization with a composite reward (including an abstraction reward to avoid object-hallucination). On object-recognition and face-identity tests reported in the abstract, it achieves >98% forgetting and retention with negligible hallucination. Summary based on abstract; full paper not reviewed.

Authors: Kaidi Jia, Yujie Lin, Chengyi Yang...

Twitter/X 2026-05-11 1 min read

Thinking Machines' launch highlighted a 200ms micro-turn architecture that makes…

Why it matters

Thinking Machines' launch highlighted a 200ms micro-turn architecture that makes interactivity a native model capability rather than a harness-level hack (announced May 11, 2026).

Key details

The team identified that existing LLM inference libraries aren't optimized for frequent small prefills, implemented streaming sessions to solve it, and contributed the feature upstream to SGLang.
@GenAI_is_real reports their team is tackling the same serving challenges with SGLang Omni for streaming speech and video inference—continuous input/output with strict latency budgets—and will share results soon; tags: @thinkymachines, @lmsysorg.

Brief

Thinking Machines' launch (May 11, 2026) emphasized two technical advances: a 200ms micro-turn architecture that embeds interactivity in-model, and streaming sessions to address inference libraries' poor handling of frequent small prefills, a feature contributed to SGLang. @GenAI_is_real says their SGLang Omni work targets the same continuous-I/O, low-latency speech/video serving challenges and will be shared soon.

By @GenAI_is_real

ArXiv 2026-05-08 1 min read

TAVIS: A Benchmark for Egocentric Active Vision and Anticipatory Gaze in Imitation Learning

Why it matters

TAVIS (released 2026-05-08) provides two task suites — TAVIS-Head (5 tasks, global pan/tilt search) and TAVIS-Hands (3 tasks, wrist-camera occlusion) — on two humanoid torso embodiments (GR1T2, Reachy2) built on IsaacLab; dataset LeRobot v3.0 contains ~2,200 demonstration episodes and code/models are on GitHub and Hugging Face.

Key details

Evaluation primitives include a paired headcam-vs-fixedcam protocol, GALT (Gaze-Action Lead Time) — a new metric grounded in cognitive science/HRI to quantify anticipatory gaze — and procedural in-distribution/out-of-distribution (ID/OOD) splits.
Baselines (Diffusion Policy and π_0) show that active vision generally improves imitation performance but benefits are task-conditional; multi-task policies suffer sharp degradation under controlled distribution shifts; and imitation alone yields anticipatory gaze with median lead times comparable to the human teleoperator reference.

Brief

TAVIS is a benchmark for egocentric active vision and anticipatory gaze in imitation learning that provides two complementary suites (Head: 5 tasks; Hands: 3 tasks) on GR1T2 and Reachy2 torsos in IsaacLab, plus ~2,200 demonstrations. It introduces a paired headcam vs fixedcam protocol, the GALT anticipatory-gaze metric, and ID/OOD splits; baselines show selective benefits from active gaze, brittleness of multi-task policies under shift, and human-like anticipatory gaze emerging from imitation.

Authors: Giacomo Spigler

ArXiv 2026-05-08 1 min read

Many-to-Many Multi-Agent Pickup and Delivery

Why it matters

The paper introduces M2M (Many-to-Many Multi-Agent Pickup and Delivery) with two variants—M2M (duration-minimizing) and M2M-wSKU (incorporates SKU distribution)—to solve the many-to-many MAPD problem formulated as an NP-hard four-dimensional assignment.

Key details

In simulation over 8-hour warehouse operations, M2M variants consistently match or outperform prior state of the art, with M2M completing up to 22,000 more tasks on average across different environments and inventory densities.

Brief

Many-to-Many Multi-Agent Pickup and Delivery (M2M) tackles warehouse many-to-many MAPD, where SKUs can be picked up from or delivered to multiple locations, yielding an NP-hard four-dimensional assignment. The authors propose two algorithmic variants (duration-minimizing M2M and SKU-aware M2M-wSKU) and report simulations over 8-hour operations showing up to 22,000 more tasks completed on average versus prior methods. Full text was not provided with the abstract.

Authors: Ethan Schneider, Jingkai Chen, Tianyi Gu...

Twitter/X 2026-05-11 1 min read

Brown University’s Watson School real-time tracker, led by researcher Jeff…

Why it matters

Brown University’s Watson School real-time tracker, led by researcher Jeff Colgan, estimates American consumers have paid more than $37 billion in extra gasoline and diesel costs since the war with Iran began on February 28, 2026.

Key details

The tracker compares current prices to a projected pre-war baseline and finds the national average gasoline price at about $4.52 per gallon, up from under $3 when the war began; analysts note summer driving season has yet to peak.

Brief

Brown University’s real-time tracker at the Watson School, led by Jeff Colgan, calculates American consumers have incurred over $37 billion in additional gasoline and diesel costs since the Iran war began on February 28, 2026. The tracker compares present prices to a projected pre-war baseline and reports a national average gas price near $4.52/gal, up from under $3, with summer driving season not yet peaked.

By @DropSiteNews

substack.com 2026-05-11 11 min read

10 Things About Berkshire Hathaway's 10-Q

Why it matters

Berkshire closed Q1 2026 with $380.2 billion in cash (Greg Abel cited this at the May 2026 annual meeting), largely held in T‑bills.

Key details

The company bought just $235 million of its own stock in Q1 (≈$225 million executed on March 4); Berkshire has been a net seller of equities for 14 consecutive quarters ($15.9B purchased vs $24.1B sold in Q1).
Public equity portfolio value was $288 billion at quarter end; the five largest holdings (Apple, American Express, Bank of America, Coca‑Cola, Chevron) accounted for 61% of that total and Apple remains the largest single position.
Operating earnings were $11.3 billion in Q1 2026 (up 17.7% year‑over‑year, or +7.2% YoY after stripping foreign currency swings of +$249M this quarter vs a -$713M FX effect in 2025).

Brief

Berkshire Hathaway's Q1 2026 10‑Q and annual meeting comments show a conglomerate sitting on extraordinary optionality: $380.2 billion in cash (mostly T‑bills) even as the company restarted buybacks but repurchased only $235 million in the quarter. The public equity portfolio stood at $288 billion and remains highly concentrated—Apple, AmEx, Bank of America, Coca‑Cola and Chevron made up 61%—while reported net selling ($24.1B sold vs $15.9B bought) may include one‑time liquidations tied to Todd Combs’ departure. Operating earnings were $11.3 billion (a 17.7% nominal YoY increase; +7.2% ex‑FX), a cleaner gauge of subsidiary performance than mark‑to‑market moves.

Insurance results were firm: after‑tax underwriting earnings of $1.7 billion on an 87.8% combined ratio and float of $176.9 billion (up $0.5B). GEICO posted an 87.3% combined ratio but saw pre‑tax earnings fall to $1.4B as growth lags peers. BNSF improved efficiency—$1.4B earnings, 2.2% volume growth, 260 fewer locomotives—but still trails top Class I margins. BHE eked out $1.1B in earnings (pipelines +24.2%), with management emphasizing disciplined data‑center deal terms and a favorable appellate development in PacifiCorp wildfire litigation. Manufacturing, Service & Retailing produced $3.2B in earnings with mixed results: Precision Castparts and IMC strong, Brooks Running notable for +20% North America and +136% China growth, while Pilot remains loss‑making as it invests $500–700M annually to modernize stores.

By Kingswell

divenewsletter.com 2026-05-11 4 min read

Office leasing demand hit its highest post‑pandemic level in Q1 2026, driven by…

Why it matters

Office leasing demand hit its highest post‑pandemic level in Q1 2026, driven by heightened activity in the technology, finance and legal sectors despite an AI surge and geopolitical pressure from the Iran conflict (Facilities Dive, May 11, 2026).

Key details

Fraunhofer Institute researchers developed building‑integrated PV modules that mimic roof tiles and other materials using a light‑sensitive film, enabling visually unobtrusive solar installation on facades and roofs.
Carrier Saia consolidated roughly 10,000 devices — including warehouse cameras, access control, intercoms and other security hardware — into a single physical security management system to streamline monitoring.
A New York City–based energy management firm reported that adding AI‑enabled sensors to legacy HVAC systems can reduce emissions by about 22% and reveal issues that hinder compliance with Local Law 97.

Brief

Facilities Dive's May 11, 2026 Daily Dive highlights resilient facilities trends: office demand reached its strongest post‑pandemic quarter in Q1 2026, led by technology, finance and legal tenants even amid an AI adoption surge and tensions related to the Iran conflict. On sustainability and tech, Fraunhofer Institute work on light‑sensitive film enables building‑integrated PV that resembles tiles and other materials, offering low‑profile rooftop and facade solar options. In security and operations, trucking carrier Saia integrated about 10,000 edge devices (cameras, access controllers, intercoms) into a single physical‑security system for centralized monitoring. And in emissions compliance, an NYC energy‑management firm says retrofitting legacy HVAC with AI and sensors can cut emissions ~22% and surface faults that impede compliance with Local Law 97.

By Facilities Dive

Twitter/X 2026-05-11 1 min read

@ginacostag_ (published 2026-05-11) promoted OpenBMB’s MiniCPM-V and argued…

Why it matters

@ginacostag_ (published 2026-05-11) promoted OpenBMB’s MiniCPM-V and argued “Speed and cost matter more than hype” while linking the GitHub repo and Hugging Face model.

Key details

OpenBMB released MiniCPM-V 4.6 (1.3B) and says LLaVA‑UHD v4 reduces vision‑encoding costs by 55%, enabling native edge deployment optimized for consumer‑grade and mobile hardware.
OpenBMB claims MiniCPM‑V achieves TTFT 75.7 ms (2.2× faster than Qwen3.5‑0.8B on 3136² images), ~1.5× token throughput on a single RTX 4090, and higher multimodal/Artificial Analysis benchmark scores while using just 2.5% of Qwen3.5‑0.8B's token budget.

Brief

MiniCPM‑V 4.6 (1.3B) from OpenBMB is presented as a mobile/edge‑optimized multimodal model: LLaVA‑UHD v4 reportedly cuts vision encoding costs 55%, enabling native edge deployment. The team claims TTFT 75.7 ms (2.2× faster than Qwen3.5‑0.8B at 3136²), ~1.5× token throughput on an RTX 4090, and superior benchmark scores while using only 2.5% of Qwen’s token budget.

By @ginacostag_

ArXiv 2026-05-08 1 min read

Penalty-Based First-Order Methods for Bilevel Optimization with Minimax and Constrained Lower-Level Problems

Why it matters

Yiyang Shen, Yutian He, Weiran Wang, and Qihang Lin (arXiv 2026-05-08) propose penalty-based first-order methods for bilevel problems where both upper- and lower-level problems are minimax; their deterministic algorithm finds an ε-KKT point with Õ(ε^{-4}) oracle complexity.

Key details

They show convex constrained lower-level minimization can be reformulated via Lagrangian duality as a special case of their framework, yielding an Õ(ε^{-4}) complexity that improves on the prior Õ(ε^{-7}) bound.
The paper extends to the stochastic setting (only stochastic gradient oracles) and proves a stochastic method finds a near-ε-KKT point with Õ(ε^{-9}) stochastic oracle complexity.

Brief

Bilevel minimax optimization with minimax lower-level problems is addressed via penalty-based first-order methods that remove the need for lower-level strong convexity. Based on the abstract, the deterministic method attains an ε-KKT point in Õ(ε^{-4}) oracle calls, the constrained lower-level case improves the prior Õ(ε^{-7}) to Õ(ε^{-4}) via Lagrangian duality, and a stochastic variant achieves near-ε-KKT with Õ(ε^{-9}).

Authors: Yiyang Shen, Yutian He, Weiran Wang...

QUICK SKIM

Fast scan items.

66 items

Twitter/X 2026-05-11 1 min read

@bscholl invites viewers to a shop-floor walkthrough with Boom Supersonic’s R&D…

Why it matters

@bscholl invites viewers to a shop-floor walkthrough with Boom Supersonic’s R&D shop manager showing how cold-section engine vanes are manufactured from raw materials.

Key details

Boom Supersonic launched a 'Build Supersonic' video series; Episode 1 (posted May 11, 2026) spotlights 'Symphony' hardware and rapid iterative testing—'how we break things, learn, and iterate to move fast'—and the post encourages people to join the team.

Brief

@bscholl invites viewers to walk the shop floor with Boom Supersonic’s R&D shop manager to see how cold-section engine vanes are made from raw materials. Boom launched the 'Build Supersonic' video series; Episode 1 highlights Symphony hardware and their rapid iterate-and-test approach ('how we break things, learn, and iterate to move fast') and asks what viewers want to see next while encouraging new hires.

By @bscholl

substack.com 2026-05-11 11 min read

Let's not compare data center heat exhaust to nuclear bombs

Why it matters

A 1 GW continuous data center emits 24 billion watt‑hours (24 GWh) of heat per day — comparable in magnitude to the Trinity Test’s ~27 GWh total heat release, but the Trinity blast released that energy in ~1 second (≈86,000× more concentrated).

Key details

Reporting waste heat in “nuclear bombs” is misleading: the article models a proposed site with 16 GW thermal (9 GW data center + 7 GW gas turbines), giving 16 GW × 24 h = 384 GWh/day; using 1 GWh ≈ 0.86 kilotons TNT, one bomb is ~16.7 GWh (~14.3 kt, similar to Little Boy). The resulting bomb count is alarming, but the comparison is not like-for-like because a bomb releases its energy in about a second rather than over a full day.
Local and meteorological heat flows dwarf or contextualize data‑center heat: Washington DC’s total human energy use ≈65 GWh/day (~28 GWh electricity + 37 GWh transport/other) ≈ 3.7 ‘nukes’/day, while incoming solar ≈750–1000 GWh/day (≈43–57 ‘nukes’/day) and 1 mm of condensation over DC ≈108 GWh (~6 ‘nukes’).
A more useful metric is local temperature change: using an estimate of ~0.025 °C per W/m² of human energy use implies DC’s human energy adds ≈0.4 °C, and the largest proposed data center (dissipating ~16 GW) could raise very local air temperature by ~2.4 °C (~4.3 °F).
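
The unit conversions behind these figures can be checked with a few lines of arithmetic. The sketch below is a minimal illustration: the function names are hypothetical, the kiloton value is the standard 4.184 × 10¹² J definition rather than a number from the article, and the 160 km² footprint in the usage lines is an arbitrary assumption, since no site area is given here.

```python
SECONDS_PER_DAY = 24 * 60 * 60          # 86,400: how much longer a day is than ~1 s
JOULES_PER_GWH = 3.6e12                 # 1 GWh = 1e9 W * 3600 s
JOULES_PER_KILOTON_TNT = 4.184e12       # standard definition of a kiloton of TNT


def daily_heat_gwh(power_gw):
    """Heat released in one day by a source running continuously at power_gw."""
    return power_gw * 24.0


def gwh_to_kilotons(gwh):
    """Convert an energy in GWh to kilotons of TNT equivalent."""
    return gwh * JOULES_PER_GWH / JOULES_PER_KILOTON_TNT


def local_warming_c(power_w, area_m2, sensitivity_c_per_w_m2=0.025):
    """Temperature rise implied by the ~0.025 °C per W/m² estimate quoted above."""
    return (power_w / area_m2) * sensitivity_c_per_w_m2


print(daily_heat_gwh(1))                  # 1 GW site, GWh per day
print(daily_heat_gwh(16))                 # 16 GW site, GWh per day
print(round(gwh_to_kilotons(1), 2))       # kilotons of TNT per GWh
print(local_warming_c(16e9, 160e6))       # 16 GW over a hypothetical 160 km²
```

The warming estimate depends entirely on the area over which the heat is assumed to spread, which is why an energy total alone, in bombs or in GWh, says little about local effect.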

Brief

continued

By Andy Masley

Twitter/X 2026-05-11 1 min read

Varick (author @vasuman) insists “AI is a services game”

Why it matters

Varick (author @vasuman) insists “AI is a services game”: implementers must refactor business processes for an AI-native reality and build custom agents rather than relying on tool adoption alone.

Key details

Varick claims a perfect track record: 100% of their clients have reached production and 100% returned for a second project; OpenAI announced the OpenAI Deployment Company on 2026-05-11, majority-owned by OpenAI and backed by 19 investment firms/consultancies/system integrators.
Author warns against vendor lock-in risks—price hikes, model quantization/regressions, rate limits, and migration costs—and is recruiting Engineers, FDEs, and Consultants, inviting top 1% talent to join Varick Agents.

Brief

Varick argues AI success requires implementation services: teams that analyze and refactor processes and build custom agents, not just selling tools. He touts Varick’s record (100% production, 100% repeat clients), notes OpenAI launched an OpenAI Deployment Company on 2026-05-11 backed by 19 partners, warns of vendor lock-in risks, and is hiring Engineers, FDEs, and Consultants.

By @vasuman

Twitter/X 2026-05-11 2 min read

@ancerj asserts "One way or another Taiwan will be reunified with China," arguing…

Why it matters

@ancerj asserts "One way or another Taiwan will be reunified with China," arguing that the next decade is about timing and terms: moving key technology assets out while buying time with a regional military build-up to deny a Chinese Monroe Doctrine.

Key details

On May 8 the KMT and Taiwan People's Party passed a NT$780 billion (~$25 billion) defense bill (vote 59-0; DPP abstained), cutting President Lai Ching-te's proposed $40 billion budget to roughly two-thirds and eliminating all domestic procurement: 210,000 military drones, sea-attack drone programs, the Chiang Kung anti-ballistic missile (T‑Dome backbone), and NT$64 billion for Taiwan–US joint R&D.
Political and strategic fallout: DPP lawmaker Chen Kuan-ting warned reliance on imported US weapons risks ammunition and sustainment shortfalls if blockaded; KMT chair Cheng Li-wun met Xi Jinping in April 2026 weeks before the cuts; Sen. Roger Wicker expressed disappointment and Adm. Paparo has warned the US cannot want Taiwan's defense more than Taiwan itself—author argues the opposition handed Beijing a capability gap and proof of "dysfunctional" democracy.

Brief

@ancerj argues "One way or another Taiwan will be reunified with China," and that the next decade will determine timing/terms while Taiwan moves key tech assets out and builds regional military capacity to deter a Chinese Monroe Doctrine. On May 8 the KMT and TPP passed a NT$780 billion (~$25bn) defense bill (59-0; DPP abstained) stripping domestic programs—210,000 drones, sea-attack drones, Chiang Kung ABM and NT$64bn for Taiwan–US R&D—creating capability gaps and political controversy.

By @ancerj

Twitter/X 2026-05-11 1 min read

Author @cryptopunk7213 claims the Cerebras IPO was 20x oversubscribed and that…

Why it matters

Author @cryptopunk7213 claims the Cerebras IPO was 20x oversubscribed and that demand led the company to raise an extra $1.3B since last week’s IPO spec, signaling investor hunger for an NVIDIA alternative.

Key details

The author asserts inference will be 10-50x the value of training and highlights Cerebras’ SRAM chips as specialized for low-latency inference, noting Codex Spark and Amazon Bedrock run on their chips and provide distribution.
Reuters/Bloomberg reporting: IPO price range was increased to $150–$160 (from an original $115–$125), shares rose to 30 million from 28 million, potential raise ≈ $4.8B at $160, indications of interest > $10B, pricing set for May 13, 2026.

Brief

Cerebras’ IPO surge: the author claims a 20x oversubscription and an extra $1.3B raise, arguing strong investor appetite for an NVIDIA alternative if Cerebras captures even 10% market share. He emphasizes Cerebras’ SRAM chips for low-latency inference (Codex Spark, Amazon Bedrock) and notes Reuters reports the range rose to $150–$160 with pricing on May 13, 2026.

By @cryptopunk7213

ArXiv 2026-05-08 1 min read

Black-box model classification under the discriminative factorization

Why it matters

Introduces the 'discriminative factorization' to evaluate query-set quality for black-box model-level classification; under this framework the probability of chance-level classification decays exponentially with the query budget.

Key details

Empirical validation on three auditing tasks (Helm, Ohata, Priebe; arXiv 2026-05-08) shows estimated factorization parameters predict the observed performance-decay rate, and query sets chosen by the estimated discriminative field reproduce the empirical ordering of oracle query sets.

Brief

Discriminative factorization is proposed to quantify and distinguish high- versus low-quality query sets for black-box model-level classification. The framework yields a theoretical result: the probability of chance-level classification decays exponentially with query budget. On three auditing tasks the authors show estimated parameters track empirical decay and enable query selection that mirrors oracle ordering.

Authors: Hayden Helm, Merrick Ohata, Carey Priebe

ArXiv 2026-05-08 1 min read

Don't Get Your Kroneckers in a Twist: Gaussian Processes on High-Dimensional Incomplete Grids

Why it matters

CUTS-GPR introduces an extremely fast kernel matrix–vector product that attains near-linear or linear scaling with training size N and low-order polynomial scaling with dimensionality D by combining an additive kernel with an incomplete grid structure.

Key details

Authors demonstrate scalability with benchmarks involving billions of data points and thousands of dimensions; a full Gaussian process regression (including hyperparameter optimization) was completed in hours for N = 447,265 and D = 24, enabling Bayesian modeling of high-dimensional potential energy surfaces.
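As background for why grid structure makes kernel matrix–vector products cheap, the sketch below shows the textbook identity for a complete two-factor grid: with row-major flattening, (A ⊗ B) x equals the flattened product A X Bᵀ, so the full Kronecker matrix never has to be formed. This is not the CUTS-GPR algorithm, which the item says combines an additive kernel with an incomplete grid; the matrices here are arbitrary illustrative values.

```python
def matmul(P, Q):
    """Plain dense matrix product of two lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(M):
    return [list(col) for col in zip(*M)]


def kron_matvec(A, B, x):
    """Compute (A kron B) @ x without building the Kronecker product.

    A is m x m, B is n x n, and x has length m * n (row-major).
    """
    m, n = len(A), len(B)
    X = [x[i * n:(i + 1) * n] for i in range(m)]   # reshape x to m x n
    Y = matmul(matmul(A, X), transpose(B))          # A X B^T
    return [v for row in Y for v in row]            # flatten row-major


def kron_dense(A, B):
    """Explicit Kronecker product, used only to check the shortcut."""
    m, n = len(A), len(B)
    return [[A[i][k] * B[j][l] for k in range(m) for l in range(n)]
            for i in range(m) for j in range(n)]


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 1.0], [1.0, 0.0]]
x = [1.0, 2.0, 3.0, 4.0]
print(kron_matvec(A, B, x))  # [10.0, 7.0, 22.0, 15.0]
```

For square factors of sizes m and n, the shortcut costs on the order of mn(m + n) multiplications instead of (mn)² for the dense product.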

Brief

CUTS-GPR targets the computational bottleneck of Gaussian process regression in high-dimensional problems by exploiting an additive kernel on an incomplete grid to reveal structure that yields an extremely fast kernel matrix–vector product. The method shows near-linear (or linear) scaling in N and low-order polynomial scaling in D, with benchmarks on billions of points and thousands of dimensions and a full GPR + hyperparameter run in hours for N=447,265, D=24, enabling Bayesian modeling of high-dimensional potential energy surfaces in computational chemistry.

Authors: Mads Greisen Højlund, August Smart Lykke-Møller, Henry Moss...

Twitter/X 2026-05-11 1 min read

Zou, Poeppel and Ding (Nature Neuroscience) — highlighted by @ValerioCapraro on…

Why it matters

Zou, Poeppel and Ding (Nature Neuroscience) — highlighted by @ValerioCapraro on 2026-05-11 — show the human brain predicts words but prediction precision is constrained by linguistic structure.

Key details

When a word continues the current phrase, brain activity tracks word surprisal in a way resembling LLMs; when a word crosses a major phrase boundary, the match with LLM-style prediction weakens.
The authors and poster assert this challenges the view that humans are mere next-token predictors: the brain asks not just "What is the next word?" but also "What structure am I currently building?"
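Word surprisal, the quantity the brain activity is said to track, is the negative log probability of a word given its context. The sketch below computes it in bits; the probabilities are made-up illustrative values, not outputs of any model or data from the paper.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Surprisal of an event with the given probability, in bits."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


# Hypothetical next-word probabilities for one context.
next_word_probs = {"dog": 0.5, "cat": 0.25, "axolotl": 0.001}

for word, p in next_word_probs.items():
    print(f"{word}: {surprisal_bits(p):.2f} bits")
```

A predictable continuation carries low surprisal and an unexpected one carries high surprisal; the item's claim is that neural tracking of this quantity weakens at major phrase boundaries.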

Brief

Zou, Poeppel and Ding's Nature Neuroscience paper, highlighted by @ValerioCapraro on 2026-05-11, reports that human word prediction varies with syntactic structure: surprisal tracking aligns with LLM-like next-word prediction within phrases but breaks down across major phrase boundaries, arguing human language processing involves structural tracking beyond next-token prediction.

By @ValerioCapraro

ArXiv 2026-05-08 1 min read

Semiparametric Efficient Test for Interpretable Distributional Treatment Effects

Why it matters

DR-ME, proposed by Houssam Zenati and Arthur Gretton (arXiv 2026-05-08), is the first semiparametrically efficient finite-location test that returns interpretable causal-discrepancy coordinates for distributional treatment effects rather than only a global rejection.

Key details

The method constructs orthogonal, doubly robust kernel features from observational data whose centered oracle form is the canonical gradient of the finite witness; for fixed locations the test is chi-square calibrated under the null and has noncentral chi-square local power, employing covariance whitening that optimizes local signal-to-noise.
DR-ME uses a principled location-learning criterion with sample splitting to preserve post-selection validity; experiments report near-nominal Type-I error, competitive power versus global doubly robust kernel tests, and interpretable learned locations in a semi-synthetic medical-imaging study.
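A statistic that is "chi-square calibrated" with "covariance whitening" generally takes the form n · z̄ᵀ S⁻¹ z̄, where z̄ is the mean feature vector and S its sample covariance. The sketch below shows that generic form only. It assumes two hypothetical feature coordinates per sample, so the 2x2 inverse and the chi-square tail with 2 degrees of freedom, exp(-x/2), have closed forms. It does not construct DR-ME's doubly robust features.

```python
import math


def whitened_chi2_test(features):
    """Return (statistic, p_value) for H0: mean feature vector is zero.

    `features` is a list of (z0, z1) pairs, one per sample. The
    statistic is n * mean^T S^{-1} mean, compared to chi-square(2).
    """
    n = len(features)
    if n < 3:
        raise ValueError("need at least 3 samples")
    m0 = sum(f[0] for f in features) / n
    m1 = sum(f[1] for f in features) / n
    # Unbiased sample covariance entries.
    s00 = sum((f[0] - m0) ** 2 for f in features) / (n - 1)
    s11 = sum((f[1] - m1) ** 2 for f in features) / (n - 1)
    s01 = sum((f[0] - m0) * (f[1] - m1) for f in features) / (n - 1)
    det = s00 * s11 - s01 * s01
    if det <= 0.0:
        raise ValueError("sample covariance is singular")
    # Quadratic form mean^T S^{-1} mean via the closed-form 2x2 inverse.
    quad = (s11 * m0 * m0 - 2.0 * s01 * m0 * m1 + s00 * m1 * m1) / det
    stat = n * quad
    p_value = math.exp(-stat / 2.0)  # chi-square(2) survival function
    return stat, p_value
```

Each coordinate of the whitened mean can be inspected separately, which is the sense in which a finite-location test can point to where two distributions differ rather than only rejecting globally.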

Brief

Distributional treatment effects that leave means unchanged are targeted by DR-ME, a semiparametrically efficient finite-location test (Zenati & Gretton, arXiv:2605.08034v1, 2026-05-08). From observational data it derives orthogonal doubly robust kernel features whose centered oracle is the canonical gradient; for fixed locations the test is chi-square calibrated and has noncentral chi-square local power with covariance whitening optimizing local SNR. Sample splitting preserves post-selection validity; experiments show near-nominal Type-I error and competitive power, with learned locations that localize effects in a semi-synthetic medical-imaging study.

Authors: Houssam Zenati, Arthur Gretton

ArXiv 2026-05-08 1 min read

Evaluation of an Actuated Spine in Agile Quadruped Locomotion

Why it matters

Bohlinger et al. (arXiv 2026-05-08) evaluate an actuated 1-DOF spine in the sagittal plane on MAB Robotics' Silver Badger in MuJoCo, finding that the spine provides increased agility and enables the robot to overcome higher stairs, steeper slopes, taller hurdles, and smaller passages versus the non-spined configuration.

Key details

The empirical simulation study covers multiple tasks—high-speed running, stair climbing, high-angle slope climbing, hurdling, and crawling—demonstrating consistent performance gains from adding a single-DOF spinal actuator for learned agile quadruped locomotion.

Brief

The paper investigates whether a single-DOF actuated sagittal spine improves learned agile locomotion for a quadruped. Using MuJoCo simulations of the Silver Badger robot, the authors evaluate high-speed running, stair and high-angle slope climbing, hurdling, and crawling. Results show the spine increases agility and enables traversing higher stairs, steeper slopes, taller obstacles, and narrower passages, suggesting spine actuation is a promising design extension for agile robots.

Authors: Nico Bohlinger, Piotr Kicki, Davide Tateo...

Twitter/X 2026-05-11 2 min read

@swyx posted what he believes are the first public photos of Cognition’s "Cog…

Why it matters

@swyx posted what he believes are the first public photos of Cognition’s "Cog House" and, as an advisor, says the company will be worth $100B by the end of 2026 (his opinion).

Key details

Scott Wu, Cognition co‑founder, is a competitive‑programming prodigy: three IOI gold medals, US national middle‑school math champion, widely regarded as America’s top IOI gold‑medalist and coach.
Cognition was founded in November 2023, when Wu was 26, and shipped its AI engineer "Devin" in March 2024. After initial criticism, Devin helped the company reach a $445M revenue run rate within 18 months, with usage doubling every eight weeks and customers including the US Army, Goldman Sachs, and Mercedes‑Benz. The company is raising at about a $25B valuation.

Brief

@swyx shared what he believes are the first public photos of Cognition’s secretive "Cog House" and praised the company’s trajectory, predicting a $100B valuation by the end of 2026. The post highlights a Colossus profile of co‑founder Scott Wu, a three‑time IOI gold medalist who founded Cognition in Nov 2023, launched Devin in Mar 2024, and grew the company to a $445M run rate with major customers; it is now raising at a valuation of about $25B.

By @swyx

ArXiv 2026-05-08 1 min read

Accurate and Efficient Statistical Testing for Word Semantic Breadth

Why it matters

Introduces a Householder-aligned permutation test that applies a single Householder reflection to align mean directions of two token-vector clouds before nonparametric permutation testing, isolating dispersion (semantic breadth) from directional differences in contextualized embeddings.

Key details

On evaluated cases the alignment reduced Type‑I error by 32.5% while preserving sensitivity to true breadth differences (compared with naive tests that confound direction and dispersion).
Provides a GPU-oriented, batched implementation that achieves a 23× speedup over the CPU baseline; paper by Yo Ehara posted to arXiv 2026-05-08 and accepted to ACL 2026 Main Conference.
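The alignment step can be illustrated with the standard Householder construction: for unit vectors a and b, the reflection H = I − 2vvᵀ/(vᵀv) with v = a − b maps a onto b and preserves all distances, so a cloud's dispersion is unchanged while its mean direction moves. The sketch below is a plain-Python illustration with hypothetical function names; it is not the paper's GPU implementation and omits the permutation test itself.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def householder_align(points, src_dir, dst_dir):
    """Reflect `points` so that direction src_dir maps onto dst_dir."""
    a, b = normalize(src_dir), normalize(dst_dir)
    v = [ai - bi for ai, bi in zip(a, b)]
    vv = sum(x * x for x in v)
    if vv < 1e-24:  # directions already coincide; nothing to do
        return [list(p) for p in points]
    reflected = []
    for p in points:
        coef = 2.0 * sum(pi * vi for pi, vi in zip(p, v)) / vv
        reflected.append([pi - coef * vi for pi, vi in zip(p, v)])
    return reflected


def mean_vector(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


# Toy cloud whose mean points along x; reflect it to point along y.
cloud = [[1.0, 0.2], [2.0, -0.2]]
aligned = householder_align(cloud, mean_vector(cloud), [0.0, 1.0])
print(mean_vector(aligned))
```

Because the reflection is an isometry, any dispersion statistic computed on `aligned` equals the one computed on `cloud`, which is what lets a subsequent test compare breadth without the mean directions interfering.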

Brief

Yo Ehara proposes a Householder-aligned permutation test to more accurately compare word semantic breadth from contextualized token embeddings. By reflecting one token cloud to align mean directions before permutation testing, the method prevents directional differences from masquerading as dispersion effects. Empirically it cut Type‑I error by 32.5% and, with a GPU-batched implementation, ran 23× faster than a CPU baseline; accepted to ACL 2026.

Authors: Yo Ehara

mg1.substack.com 2026-05-10 1 min read

Tpayneful liked The Factory's Quote is the Product

Why it matters

On 2026-05-10 23:34:48+00:00 Substack (mg1.substack.com) recorded that user 'Tpayneful' liked the post titled "The Factory's Quote is the Product."

Key details

The notification references author Spencer Burleigh (©2026) and includes a San Francisco contact address (548 Market Street PMB 72296, San Francisco, CA 94104) and a displayed figure "681" (likely a reader/engagement metric).

Brief

A Substack notification shows that user Tpayneful liked Spencer Burleigh’s post "The Factory's Quote is the Product" on mg1.substack.com, with a published timestamp of 2026-05-10 23:34:48+00:00 (created 2026-03-11, updated 2026-04-04). The message includes author metadata (©2026), profile images, action links and a displayed metric "681," plus a San Francisco mailing address.

By Tpayneful

mg1.substack.com 2026-05-10 1 min read

Tpayneful liked Oil’s Financials are Falling Apart

Why it matters

On 2026-05-10 23:34:31+00:00 Substack user 'Tpayneful' liked Spencer Burleigh’s post titled "Oil’s Financials are Falling Apart" on mg1.substack.com.

Key details

Article metadata shows it was created 2026-03-11 12:44 and last updated 2026-04-04 00:00; the content is a Substack like/notification email linking to the post and Burleigh’s profile (Spencer Burleigh, 548 Market Street PMB 72296).

Brief

Tpayneful liked Spencer Burleigh’s Substack post "Oil’s Financials are Falling Apart" (notification timestamp 2026-05-10 23:34:31+00:00). The message is a Substack email/like notification linking to the original post (created 2026-03-11 12:44; last updated 2026-04-04 00:00) and Burleigh’s author profile.

By Tpayneful

mg1.substack.com 2026-05-10 1 min read

Tpayneful liked Snap - When your Core Competency is a Loss Leader

Why it matters

On 2026-05-10 23:38:33+00:00 Substack user 'Tpayneful' liked the post titled 'Snap - When your Core Competency is a Loss Leader' hosted on mg1.substack.com.

Key details

Document metadata shows Created: 2026-03-11 12:44 and Last Updated: 2026-04-04 00:00; the page also displays © 2026 Spencer Burleigh and the address 548 Market Street PMB 72296, San Francisco, CA 94104.
The submitted content is a Substack notification/UI page with images and links but contains no substantive article body or analysis text beyond the like/notification details.

Brief

A Substack notification records that user 'Tpayneful' liked the post 'Snap - When your Core Competency is a Loss Leader' on mg1.substack.com (liked at 2026-05-10 23:38:33 UTC). Metadata lists Created 2026-03-11 12:44 and Last Updated 2026-04-04; the page includes © 2026 Spencer Burleigh and a San Francisco mailing address but no article body.

By Tpayneful

Twitter/X 2026-05-11 1 min read

On 2026-05-11 Al Mayadeen English shared Professor Mohammad Marandi's claim that…

Why it matters

On 2026-05-11 Al Mayadeen English shared Professor Mohammad Marandi's claim that UAE President Sheikh Mohamed bin Zayed Al Nahyan (MBZ) has taken his alignment with Israel "far too far," calling the UAE "very destructive for the Islamic world, for the Arab world, and for humanity."

Key details

Marandi asserted the UAE's regional policies are "in line with Zionist interests in the Horn of Africa, in the Arabian Peninsula and in the Persian Gulf," and @ianmiles responded to the post with the blunt rebuttal: "Oh fuck off."

Brief

Professor Mohammad Marandi, quoted by Al Mayadeen English on 2026-05-11, accused UAE President Sheikh Mohamed bin Zayed Al Nahyan of moving "far too far" toward Israel and described the UAE as "very destructive for the Islamic world, for the Arab world, and for humanity," alleging alignment with "Zionist interests" across the Horn of Africa, Arabian Peninsula and Persian Gulf; the post drew a curt reply from @ianmiles: "Oh fuck off."

By @ianmiles

ArXiv 2026-05-08 1 min read

Text-to-CAD Evaluation with CADTests

Why it matters

Mallis, Wang, Karadeniz, Ricci, Kacem, and Aouada (arXiv 2026-05-08) introduce CADTestBench and CADTests: the first test-based benchmark for Text-to-CAD where CADTests are executable software tests that verify geometric and topological requirements of generated CAD models.

Key details

Using CADTestBench the authors benchmark recent Text-to-CAD methods and show CADTests can also guide model generation, producing simple baselines that surpass current methods; code and data are published on GitHub and as a Hugging Face dataset.
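To give a feel for what an executable test of geometric and topological requirements might look like, the sketch below checks a toy model against two requirements: a bounding-box size and an Euler characteristic of 2 (a closed surface with no holes). The mesh representation and function names are hypothetical and are not the CADTests API or the benchmark's actual format.

```python
def bounding_box_size(vertices):
    """Extent of the model along x, y, and z."""
    return tuple(max(v[i] for v in vertices) - min(v[i] for v in vertices)
                 for i in range(3))


def euler_characteristic(vertices, faces):
    """V - E + F for a polygonal surface given as vertex-index faces."""
    edges = set()
    for face in faces:
        for i in range(len(face)):
            a, b = face[i], face[(i + 1) % len(face)]
            edges.add((min(a, b), max(a, b)))
    return len(vertices) - len(edges) + len(faces)


# Toy "generated model": a unit cube; vertex index is 4x + 2y + z.
cube_vertices = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
cube_faces = [
    (0, 1, 3, 2), (4, 5, 7, 6),  # x = 0 and x = 1
    (0, 1, 5, 4), (2, 3, 7, 6),  # y = 0 and y = 1
    (0, 2, 6, 4), (1, 3, 7, 5),  # z = 0 and z = 1
]


def test_prompt_requirements():
    # Geometric requirement, e.g. from a prompt like "a 1 x 1 x 1 block".
    assert bounding_box_size(cube_vertices) == (1, 1, 1)
    # Topological requirement: closed solid with no through-holes.
    assert euler_characteristic(cube_vertices, cube_faces) == 2


test_prompt_requirements()
```

A pass/fail signal of this kind is also something a generator could be re-prompted against, which is one plausible reading of how tests can guide generation.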

Brief

Text-to-CAD evaluation is framed as automated testing in this work: Mallis et al. propose CADTestBench and CADTests, executable checks that validate geometric and topological constraints of generated CAD models. They benchmark recent Text-to-CAD systems on CADTestBench and demonstrate that using CADTests to guide generation yields simple baselines that outperform prior methods; code and datasets are open-sourced.

Authors: Dimitrios Mallis, Marco Wang, Ahmet Serdar Karadeniz...

Twitter/X 2026-05-11 1 min read

Vibe-Trading has shipped daily updates for an entire month since its open-source…

Why it matters

Vibe-Trading has shipped daily updates for an entire month since its open-source release and surpassed 6,000 GitHub stars (post published 2026-05-11).

Key details

The framework couples AI agents with quantitative trading tools, directly integrates with data sources Tushare and AKShare, and supports Chinese A‑shares, crypto, and international markets with exports to TradingView, 通达信, and MetaTrader.
Key features include brokerage-statement upload for instant analytics and a "Shadow Account" performance view; persistent-memory AI agents that self-improve and support 13+ AI models; and enterprise-ready security with API authentication, secure execution environments, and Docker deployment.

Brief

Vibe-Trading is an open-source AI + quant trading framework that hit 6k+ GitHub stars and has shipped daily updates for a month (published 2026-05-11). It connects to Tushare and AKShare, supports A‑shares/crypto/global markets, exports to TradingView/通达信/MetaTrader, and offers brokerage analytics, a Shadow Account, persistent-memory agents (13+ models), and Docker-ready security for AI+Human workflows.

By @huang_chao4969

substack.com 2026-05-11 2 min read

Citrini, Sam Bowman, and Internal Tech Emails posted new notes

Why it matters

Citrini (note posted ~7 days before the digest) reports Elon Musk saying actuators make up 56% of the Bill of Materials for Tesla's Optimus robot; that note recorded 84 likes, 8 comments, and 8 restacks.

Key details

Sam Bowman (note posted ~6 days before the digest) highlights emerging hair‑loss treatments that may be effective without travel; the post received 73 likes, 4 comments, and 6 restacks.
Internal Tech Emails (note posted ~5 days before the digest) republishes messages including a ‘Sam Altman texts Mira Murati…’ item; it drew the largest engagement in the digest with 116 likes, 9 comments, and 13 restacks.

Brief

Three Substack notes (Citrini, Sam Bowman, Internal Tech Emails) featured in the May 11, 2026 digest. Citrini cites Elon Musk claiming actuators account for 56% of Optimus’s BOM. Sam Bowman summarizes promising hair‑loss treatments. Internal Tech Emails republishes internal messages (including Sam Altman–Mira Murati texts). Engagement counts were 84/73/116 likes and 8/4/9 comments respectively.

By Substack

mg1.substack.com 2026-05-10 1 min read

Tpayneful liked American Words - Still Everywhere

Why it matters

User Tpayneful liked the Substack post "American Words - Still Everywhere" on 2026-05-10 23:46:16+00:00; the post metadata shows it was created 2026-03-11 (12:44) and last updated 2026-04-04 (00:00).

Key details

The notification is from mg1.substack.com / Spencer Burleigh (© 2026) and includes publisher contact/address 548 Market Street PMB 72296, San Francisco, CA 94104, with links to the post and user profile.

Brief

American Words - Still Everywhere is a Substack post that received a like from user Tpayneful on 2026-05-10 23:46:16 UTC. The post was created 2026-03-11 (12:44) and last updated 2026-04-04 (00:00); the notification originates from mg1.substack.com under Spencer Burleigh (©2026) and contains links and UI metadata but not the article body.

By Tpayneful

ArXiv 2026-05-08 1 min read

Proxy3D: Efficient 3D Representations for Vision-Language Models via Semantic Clustering and Alignment

Why it matters

Proxy3D builds compact 3D proxy representations from only video frames using semantic and geometric encoders plus semantic-aware clustering to produce scene proxies in 3D space.

Key details

The authors curated the SpaceSpan dataset and use multi-stage training to align these proxies with vision-language models; the approach yields competitive or state-of-the-art performance on 3D visual question answering, visual grounding, and spatial intelligence benchmarks while using shorter vision sequences.
Paper by Jerry Jiang, Haowen Sun, Denis Gudovskiy et al., posted to arXiv 2026-05-08 and accepted to CVPR 2026; project page: https://wzzheng.net/Proxy3D

Brief

Proxy3D proposes compact 3D proxy representations extracted from video frames via semantic and geometric encoders and semantic-aware clustering. The authors curate the SpaceSpan dataset and apply multi-stage training to align proxies with vision-language models; on 3D visual question answering, visual grounding, and spatial intelligence benchmarks the method achieves competitive or state-of-the-art results while using shorter vision sequences. (Abstract-only summary.)

Authors: Jerry Jiang, Haowen Sun, Denis Gudovskiy...

mg1.substack.com 2026-05-10 1 min read

Tpayneful liked Founder's Guide to Building a Second Brain

Why it matters

On 2026-05-10 23:37:13+00:00 Substack user 'Tpayneful' liked the post 'Founder's Guide to Building a Second Brain' (notification contains links to the post and the liker’s profile).

Key details

The notification originates from Spencer Burleigh's Substack (© 2026 Spencer Burleigh) and includes action links ('View profile', 'Mute post') and the number '843' displayed on the page.
Document metadata shows Created: 2026-03-11 12:44 and Last Updated: 2026-04-04 00:00; the footer lists 548 Market Street PMB 72296, San Francisco, CA 94104.

Brief

Tpayneful liked the Substack post 'Founder's Guide to Building a Second Brain' on 2026-05-10 23:37:13+00:00. The notification email/page is from Spencer Burleigh's Substack (© 2026 Spencer Burleigh), includes links to the post and the liker’s profile plus controls (View profile, Mute post), and shows metadata Created 2026-03-11 12:44 and Last Updated 2026-04-04.

By Tpayneful

ArXiv 2026-05-08 1 min read

Flow-OPD: On-Policy Distillation for Flow Matching Models

Why it matters

Flow-OPD applies on-policy distillation to Flow Matching text-to-image models with a two-stage alignment: (1) single-reward GRPO fine-tuning to produce domain-specialized teacher experts; (2) Flow-based Cold-Start and a three-step student consolidation (on-policy sampling, task‑routing labeling, dense trajectory-level supervision). It also introduces Manifold Anchor Regularization (MAR) to anchor outputs to a high-quality manifold.
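GRPO is usually described as scoring each sample relative to the other samples drawn for the same prompt, instead of using a learned value baseline. The sketch below shows only that group-relative normalization step, with made-up reward values; it is not Flow-OPD's training code, and the function name and `eps` default are illustrative choices.

```python
import math


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group: (r - mean) / (std + eps)."""
    n = len(rewards)
    if n == 0:
        raise ValueError("need at least one reward")
    mean = sum(rewards) / n
    variance = sum((r - mean) ** 2 for r in rewards) / n
    std = math.sqrt(variance)
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four images sampled from one prompt.
print(group_relative_advantages([0.2, 0.9, 0.5, 0.4]))
```

When every sample in a group receives the same reward, all advantages are zero and the group contributes no learning signal, which is one way the reward sparsity mentioned in the item can arise.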

Key details

Built on Stable Diffusion 3.5 Medium, Flow-OPD raises GenEval from 63 to 92 and OCR accuracy from 59 to 94, claims an overall improvement of roughly 10 points over vanilla GRPO, preserves image fidelity and human-preference alignment, and exhibits an emergent 'teacher-surpassing' effect (paper published 2026-05-08).

Brief

Flow-OPD addresses reward sparsity and gradient interference in multi-task alignment for Flow Matching text-to-image models by combining single-reward GRPO teachers with a Flow-based Cold-Start and on-policy distillation into a single student, plus Manifold Anchor Regularization to prevent aesthetic degradation. Built on Stable Diffusion 3.5 Medium, the abstract reports GenEval 63→92 and OCR 59→94; full text was not available.

Authors: Zhen Fang, Wenxuan Huang, Yu Zeng...

ArXiv 2026-05-08 1 min read

Characterizing and Correcting Effective Target Shift in Online Learning

Why it matters

The authors derive a closed-form expression proving online kernel regression is equivalent to offline kernel regression with systematically shifted (inaccurate) target outputs, and they show that compensating for this effective target shift can provably recover the offline predictor.

Key details

They provide both a closed-form target-correction and an iterative sequential form; empirically, online SGD with iteratively corrected targets outperforms learning with the true targets on CIFAR-10 and CORe50 in continual-learning settings (paper by Ziyan Li and Naoki Hiratani, arXiv:2605.07886v1, published 2026-05-08; 22 pages, 6 figures).

Brief

Online kernel regression: Li and Hiratani (2026) derive a closed-form expression showing online kernel regression is equivalent to offline kernel regression with shifted, inaccurate target outputs. They give a closed-form and an iterative target-correction that provably recovers the offline predictor. Experiments on CIFAR-10 and CORe50 show online SGD with corrected targets outperforms using true targets in continual learning.

Authors: Ziyan Li, Naoki Hiratani

mg1.substack.com 2026-05-10 1 min read

Tpayneful liked Storytelling and Hope - Everywhere in Life

Why it matters

On 2026-05-10 23:39:12+00:00 Substack user Tpayneful liked the post titled "Storytelling and Hope - Everywhere in Life" (mg1.substack.com).

Key details

The notification is from a Substack newsletter associated with Spencer Burleigh (© 2026) and includes links to view the original post, the liker’s profile, and a mute option; the footer lists 548 Market Street PMB 72296, San Francisco, CA 94104.

Brief

Tpayneful liked the Substack post "Storytelling and Hope - Everywhere in Life" on 2026-05-10 23:39:12 UTC. The item is a Substack notification (mg1.substack.com) linked to Spencer Burleigh’s newsletter (© 2026), containing direct links to the post and profile, a mute control, and a San Francisco mailing address in the email footer.

By Tpayneful

mg1.substack.com 2026-05-10 1 min read

Tpayneful liked Chicken - Inconsistent with the UK's Brand

Why it matters

On 2026-05-10 23:38:07+00:00 Substack user Tpayneful liked the post titled "Chicken - Inconsistent with the UK's Brand" (notification shows the like event and the post title).

Key details

Metadata shows the item was created 2026-03-11 12:44, last updated 2026-04-04 00:00, is associated with Spencer Burleigh (© 2026) and includes a footer listing "894" and the address 548 Market Street PMB 72296, San Francisco, CA 94104.

Brief

The Substack notification records that user Tpayneful liked the post "Chicken - Inconsistent with the UK's Brand" on 2026-05-10 23:38:07 UTC. The page metadata lists creation on 2026-03-11 12:44 and last update on 2026-04-04 00:00, attributes the content to Spencer Burleigh (© 2026), and includes a footer showing "894" and a San Francisco address.

By Tpayneful

awardwallet.com 2026-05-11 2 min read

Your Favorite Dining Card Just Got a Lot More Valuable

Why it matters

Anniversary refresh (reported May 11, 2026) adds a 5X earning rate on hotel spending to a popular dining-focused credit card.

Key details

The update includes nearly $100 in limited-time travel and dining credits and enrollable rental-car elite status; enrollment is required to activate these benefits.
The card is being promoted with a top-tier welcome offer as high as 100,000 points; AwardWallet notes full details are behind a blog link and says checking the offer won’t hurt your credit score.

Brief

A popular dining-focused credit card received an anniversary refresh (reported May 11, 2026) that adds 5X points on hotels, nearly $100 in limited-time travel and dining credits, and enrollable rental-car elite status; the promotion also advertises a welcome bonus up to 100,000 points and requires enrollment for some benefits.

By AwardWallet

Twitter/X 2026-05-11 1 min read

On 2026-05-11, Gary Marcus replied to Geoffrey Hinton, explicitly denying he ever…

Why it matters

On 2026-05-11, Gary Marcus replied to Geoffrey Hinton, explicitly denying he ever said AI systems “JUST regurgitate” and insisting that claim is false.

Key details

Marcus concedes AI systems sometimes regurgitate and calls the evidence for that “overwhelming,” but distinguishes regurgitation from hallucinations and says he has warned about hallucinations since 2001.
Marcus says he cannot find Hinton’s alleged quote outside Hinton’s own webpage and promises to discuss Hinton’s reply and the alleged quote in a further reply.

Brief

Gary Marcus, replying to Geoffrey Hinton on May 11, 2026, rejects Hinton’s characterization that Marcus said AI systems “JUST regurgitate,” stating he never made that claim. He accepts that models sometimes regurgitate (calling the evidence overwhelming), distinguishes regurgitation from hallucinations (which he’s warned about since 2001), and notes he cannot source Hinton’s alleged quote beyond Hinton’s webpage, promising further comment.

By @GaryMarcus

Twitter/X 2026-05-11 1 min read

@0xd1namit (published 2026-05-11) says prop trading is now on Polymarket and…

Why it matters

@0xd1namit (published 2026-05-11) says prop trading is now on Polymarket and tells the community it "can't sleep on it," crediting @BagCalls and his team for strong marketing and predicting top-tier execution. The post defines prop trading as firms providing their own capital after traders pass challenges.

Key details

Funding Predicts Beta Competition (powered by Polymarket) is live: offers over 1.4m in funded accounts and 10K+ in cash prizes, gives entrants 14 days to prove they can trade, and is promoted at fundingpredicts.com (tweet/video shared by @FundingPredicts).

Brief

Prop trading on Polymarket is now live, according to @0xd1namit (2026-05-11), who credits @BagCalls and his team for standout marketing and expects strong execution. The Funding Predicts Beta Competition (powered by Polymarket) launches with over 1.4m in funded accounts and 10K+ in cash prizes, giving traders 14 days to qualify at fundingpredicts.com.

By @0xd1namit

mg1.substack.com 2026-05-10 1 min read

Tpayneful liked Founder’s Guide to Working with Filipino VAs

Why it matters

On 2026-05-10 23:35:53+00:00 Substack user Tpayneful liked the post "Founder’s Guide to Working with Filipino VAs".

Key details

The liked post is authored by Spencer Burleigh; the notification originates from mg1.substack.com, shows "742" in the footer, and includes © 2026 Spencer Burleigh with a San Francisco mailing address (548 Market Street PMB 72296, CA 94104).

Brief

A Substack notification records that user Tpayneful liked Spencer Burleigh’s post "Founder’s Guide to Working with Filipino VAs" on 2026-05-10 23:35:53+00:00. The message was sent from mg1.substack.com, displays the number "742" in the footer, and carries a © 2026 Spencer Burleigh copyright plus a San Francisco mailing address.

By Tpayneful

ArXiv 2026-05-08 1 min read

SCOPE: Structured Decomposition and Conditional Skill Orchestration for Complex Image Generation

Why it matters

SCOPE (Structured Decomposition and Conditional Skill Orchestration) maintains persistent semantic commitments via an evolving structured specification and conditionally invokes retrieval, reasoning, and repair skills to resolve the identified 'Conceptual Rift' in complex text-to-image intent realization.

Key details

The paper introduces the human-annotated Gen-Arena benchmark with entity- and constraint-level specifications and the Entity-Gated Intent Pass Rate (EGIP). SCOPE achieves 0.60 EGIP on Gen-Arena and strong results on WISE-V (0.907) and MindBench (0.61).

Brief

SCOPE proposes a specification-guided framework that tracks semantic commitments across generation by decomposing intents into structured specifications and conditionally calling retrieval, reasoning, and repair skills to fix violations (the 'Conceptual Rift'). Evaluated on the new Gen-Arena benchmark with the EGIP metric, SCOPE yields 0.60 EGIP and also performs strongly on WISE-V (0.907) and MindBench (0.61). Summary based on the abstract (full text not provided).

Authors: Tianfei Ren, Zhipeng Yan, Yiming Zhao...

ArXiv 2026-05-08 1 min read

Consistency Regularised Gradient Flows for Inverse Problems

Why it matters

Proposes a Euclidean–Wasserstein-2 gradient-flow framework that jointly performs posterior sampling and prompt optimization in the latent space, aligning the generative prior and posterior with observed data.

Key details

Combines the single flow with few-step latent text-to-image models to enable low-NFE inference without backpropagation through autoencoders, addressing the high-NFE and heavy backprop costs of prior LDM-based solvers (e.g., Rombach et al., 2022).
Authors Alessio Spagnoletti, Tim Y. J. Wang, Marcelo Pereyra, and O. Deniz Akyildiz (arXiv:2605.07907v1, 2026-05-08) report state-of-the-art reconstructions on several canonical imaging inverse problems with significantly reduced computational cost.

Brief

The paper tackles inverse imaging with Vision–Language Latent Diffusion Models by introducing a unified Euclidean–Wasserstein-2 gradient-flow that jointly samples the posterior and optimizes prompts in latent space. By pairing this flow with few-step latent text-to-image models, the method achieves low-NFE inference and avoids backprop through autoencoders, yielding state-of-the-art results on canonical inverse problems with much lower compute; summary based on the abstract only.

Authors: Alessio Spagnoletti, Tim Y. J. Wang, Marcelo Pereyra...

mg1.substack.com 2026-05-10 1 min read

Tpayneful liked Basic Capital's Big Idea: Bet Your Retirement on Interest Rates

Why it matters

On 2026-05-10 23:35:16+00:00 user Tpayneful liked the Substack post titled "Basic Capital's Big Idea: Bet Your Retirement on Interest Rates" (post by Spencer Burleigh).

Key details

The notification email originates from mg1.substack.com; the item lists creation on 2026-03-11 12:44 and last updated 2026-04-04 00:00 and includes © 2026 Spencer Burleigh with a San Francisco mailing address.

Brief

Tpayneful liked a Substack post titled "Basic Capital's Big Idea: Bet Your Retirement on Interest Rates," attributed to Spencer Burleigh; the like was recorded at 2026-05-10 23:35:16+00:00. The content is a platform notification (mg1.substack.com) showing metadata: created 2026-03-11 12:44, last updated 2026-04-04 00:00, and © 2026 Spencer Burleigh with a San Francisco address.

By Tpayneful

ArXiv 2026-05-08 1 min read

Position: Mechanistic Interpretability Must Disclose Identification Assumptions for Causal Claims

Why it matters

A purposive audit of 10 mechanistic-interpretability papers across four methodological strands (Lin & Liu, arXiv 2026-05-08) found no dedicated "identification-assumptions" section; papers commonly report validation metrics (faithfulness, completeness, monosemanticity, alignment, ablation effects) as causal support without stating the assumptions that would make those metrics identifying.

Key details

A two-human-coder replication on n=30 reproduced the main direction: dedicated identification sections are absent and validation-metric substitution is common (exact counts are coding-rule sensitive). The authors propose a five-item disclosure norm: state if the claim is causal, name the identification strategy, enumerate assumptions, highlight at least one key assumption, and explain how conclusions change if assumptions fail.

Brief

Lin and Liu show that mechanistic interpretability work often uses causal vocabulary (circuits, mediators, causal abstraction) while omitting explicit identification assumptions. Via a purposive audit of 10 papers and a two-coder check on 30 items, they document widespread substitution of validation metrics for identification and offer a five-step disclosure norm to make causal claims explicit. Submitted to NeurIPS 2026 (Position Track).

Authors: Zezheng Lin, Fengming Liu

ArXiv 2026-05-08 1 min read

It Just Takes Two: Scaling Amortized Inference to Large Sets

Why it matters

Method: Train a mean-pool Deep Set encoder on sets of size at most two to learn representations that generalize to arbitrary deployment set size N; then finetune the inference head on pre-aggregated embeddings so training compute/memory is essentially independent of N.

Key details

Results & provenance: Authors Wehenkel, Kagan, Heinrich, and Pollard (arXiv 2026-05-08) evaluate on scalar, image, multi-view 3D, molecular, and high-dimensional conditional-generation benchmarks with N in the thousands, matching or outperforming standard baselines at a fraction of the compute.

Brief

It Just Takes Two introduces a simple, theoretically grounded strategy for amortized neural posterior estimation that decouples representation learning from posterior modeling. The authors train a mean-pool Deep Set on sets of size ≤2 to produce an encoder that generalizes to arbitrary set sizes, then finetune an inference head on aggregated embeddings; this makes training cost essentially independent of deployment set size N. Across diverse benchmarks (scalar, image, multi-view 3D, molecular, high-dimensional generation) with N in the thousands, the method matches or outperforms baselines while using much less compute.
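The compute claim rests on a property of mean pooling that a small sketch can make concrete: the pooled embedding has the same size for any set, and it can be accumulated as a running sum, so the inference head only ever sees one fixed-size vector. The encoder `phi`, the embedding size, and the helper names below are hypothetical stand-ins, not the authors' architecture.

```python
import math
import random


def phi(x):
    """Hypothetical per-element encoder: maps a scalar to a 3-dim embedding."""
    return [x, x * x, math.tanh(x)]


def encode_set(xs):
    """Mean-pool the per-element embeddings of a non-empty set."""
    embeddings = [phi(x) for x in xs]
    n = len(embeddings)
    return [sum(e[d] for e in embeddings) / n for d in range(len(embeddings[0]))]


def encode_set_streaming(xs, chunk=100):
    """Same mean embedding, accumulated chunk by chunk with a running sum."""
    total = [0.0, 0.0, 0.0]
    count = 0
    for start in range(0, len(xs), chunk):
        for x in xs[start:start + chunk]:
            total = [t + v for t, v in zip(total, phi(x))]
            count += 1
    return [t / count for t in total]


random.seed(0)
small_set = [random.gauss(0, 1) for _ in range(2)]     # training-time set size
large_set = [random.gauss(0, 1) for _ in range(5000)]  # deployment-time set size

z_small = encode_set(small_set)
z_large = encode_set(large_set)
z_stream = encode_set_streaming(large_set)
```

Both `z_small` and `z_large` are 3-dimensional, so a head trained on embeddings of tiny sets accepts the embedding of a large one unchanged. Whether it generalizes is the paper's claim, not something this sketch shows.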

Authors: Antoine Wehenkel, Michael Kagan, Lukas Heinrich...

Twitter/X 2026-05-11 1 min read

Andrew Ng announced on 2026-05-11 that Coursera and Udemy have merged into a…

Why it matters

Andrew Ng announced on 2026-05-11 that Coursera and Udemy have merged into a single company and he will serve as Chairman, working alongside Greg Hart and the combined leadership team.

Key details

Ng asserts the merger combines broader learning content, trusted instructors, and more engaging experiences to make learning more personalized, applied, and accessible at scale.
He frames the move as critical because AI is changing the nature of work and increasing demand for continuous, job-relevant skill-building worldwide.

Brief

Andrew Ng announced on May 11, 2026 that Coursera and Udemy have joined as one company and named him Chairman alongside Greg Hart. He argues the combined platform will unite content, instructors, and experiences to deliver more personalized, applied, and accessible learning at scale to meet rising AI-driven demand for job-relevant skills.

By @AndrewYNg

Twitter/X 2026-05-11 2 min read

Royal Pop is a Pump.fun Solana token tied to the Swatch x Audemars Piguet Royal…

Why it matters

Royal Pop is a Pump.fun Solana token tied to the Swatch x Audemars Piguet Royal Oak collab dropping May 16; it's ~1 day old with an estimated $130K–$240K market cap (ATH $247K), ~$380K 24h volume, ~$41K in liquidity and ~450 holders.

Key details

The watch (bioceramic Royal Oak) retails for ~$300–$500, in-store only, with global queues forming, and the watch press frames it as 'MoonSwatch 2.0'. The author says event-driven tokens around confirmed luxury catalysts have historically done 10–50x in days when timing and framing align.
Significant downside risk: the author notes that >95% of Solana tokens go to zero, that multiple copycats (each < ~$20K) exist, and that the name invites meme risk ('Royal Poop'). Royal Pop could 3–4x or halve in hours; 10–20x into May 16 is plausible, but post–May 20 odds fall unless a real community (like Labubu) forms. 99% of top-50 holders bought their own supply.

Brief

Royal Pop is a Pump.fun Solana meme token built around the Swatch × Audemars Piguet Royal Oak release on May 16; it's trading with ~$130K–$240K market cap, ~$380K 24h volume and ~450 holders. The author argues momentum and MoonSwatch framing could drive a 10–20x run into launch, but >95% of Solana tokens fail and post–May 20 value depends on forming a real community.

By @ianmiles

awardwallet.com 2026-05-10 1 min read

‼️Visa Business Card Companion Fare Offer from Alaska Air expiring in 3 months on August 8, 2026 ‼️

Why it matters

Visa Business Card Companion Fare Offer in Spencer Burleigh’s Alaska Airlines (Atmos Rewards) account expires on August 8, 2026 (approximately three months from the 2026-05-10 notice); the account was last updated 101 days ago.

Key details

Alert published by AwardWallet on 2026-05-10 advises logging into AwardWallet to verify the expiration and use auto-login links; message includes unsubscribe and account-restore links and AwardWallet contact/address details.

Brief

The Visa Business Card Companion Fare Offer in Spencer Burleigh’s Alaska Airlines (Atmos Rewards) account is set to expire on August 8, 2026 (about three months after the 2026-05-10 AwardWallet notice). AwardWallet reports the account was last updated 101 days ago and urges logging into AwardWallet to verify the coupon using provided auto-login and account-restore links.

By AwardWallet

ArXiv 2026-05-08 1 min read

Inferring Asteroseismic Parameters from Short Observations Using Deep Learning: Application to TESS and K2 Red Giants

Why it matters

TESS has an estimated >300,000 oscillating red giants with mostly 1–2 month observations; the authors develop a deep-learning method to infer global seismic parameters from such short-duration data.

Key details

On one-month Kepler and K2 samples the ML algorithm recovers Δν and ν_max accurately for ≈50% of targets; for one-sector TESS data reliable Δν is recovered for only ≈23% of stars.
From K2 the method yields reliable dipolar period spacings (ΔΠ1) for ≈200 young red giants, reproducing the well-known Δν–ΔΠ1 degenerate sequence seen in Kepler red giants.

Brief

The authors apply deep learning to infer global asteroseismic parameters (Δν, ν_max, and for K2 also ΔΠ1) from short, one-month light curves to enable scalable analysis of TESS/K2 red-giant samples. Their model recovers Δν and ν_max for ~50% of one-month Kepler/K2 targets but reliable Δν for only ~23% of single-sector TESS stars; it yields ~200 reliable ΔΠ1 measurements that match the Kepler Δν–ΔΠ1 sequence. (Summary based on the abstract; full text was not provided.)

Authors: Nipun Ghanghas, Siddharth Dhanpal, Shravan Hanasoge...

mg1.substack.com 2026-05-10 1 min read

Tpayneful liked Dancing with Missiles

Why it matters

On 2026-05-10T23:35:29+00:00 Substack user "Tpayneful" liked the post titled "Dancing with Missiles".

Key details

The notification originates from mg1.substack.com, references Spencer Burleigh (© 2026), and is an engagement email containing layout and image assets rather than the article text.

Brief

A Substack notification dated 2026-05-10 23:35:29+00:00 showing that user 'Tpayneful' liked the post 'Dancing with Missiles'. The message, delivered via mg1.substack.com and credited to Spencer Burleigh (© 2026), is an engagement/like email containing layout and image assets rather than substantive article content.

By Tpayneful

ArXiv 2026-05-08 1 min read

Tool Calling is Linearly Readable and Steerable in Language Models

Why it matters

Tool identity is linearly readable and steerable: adding the mean-difference between two tools' average internal activations flips the model's chosen tool with 77–100% accuracy on name-only single-turn prompts (93–100% for models ≥4B), and the autoregressive JSON arguments follow the new tool's schema.

Key details

The causal effect concentrates on the output row for the target tool and a small set of mid/late-layer attention heads: injecting a unit vector along that output-row reaches 93–100% success; activation patching localises responsibility to those heads; a within-topic probe across 14 airline tools achieves 61–89% top-1 on five 4B–14B models.
Pretraining encodes tool identity before generation: cosine readout from base models recovers 69–82% tool identity while base generation is only 2–10%; model suite tested includes 12 instruction-tuned models (Gemma 3, Qwen 3, Qwen 2.5, Llama 3.1) from 270M to 27B. Also, on Gemma 3 12B/27B, queries with smallest top-1 vs top-2 activation gaps produce 14–21× more wrong calls.

Brief

The paper shows that language models (270M–27B, including Gemma 3, Qwen, Llama 3.1) encode tool selection as a linearly readable vector: adding per-tool mean-difference vectors reliably flips name-only single-turn tool choices and causes downstream JSON arguments to match the new schema. Causal attribution concentrates on one output-row and a few mid/late attention heads; base-model representations already carry tool identity (69–82% recoverable), while instruction tuning wires it to generation. Measurements are for single-turn fixed-menu settings; multi-turn transfer is noted as more fragile.
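The mean-difference steering described above can be illustrated on toy data. This is a minimal sketch under stated assumptions: the 8-dimensional "activations" are synthetic, and `readout` is a hypothetical stand-in for the model's tool choice. Nothing here touches a real model or reproduces the paper's code.

```python
import random


def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


random.seed(0)
DIM = 8


def sample(center):
    """Synthetic activation: a tool-specific center plus small Gaussian noise."""
    return [c + random.gauss(0, 0.1) for c in center]


# Assumption: tool A and tool B activations differ along a single direction.
center_a = [1.0] + [0.0] * (DIM - 1)
center_b = [-1.0] + [0.0] * (DIM - 1)
acts_a = [sample(center_a) for _ in range(50)]
acts_b = [sample(center_b) for _ in range(50)]

mu_a, mu_b = mean_vector(acts_a), mean_vector(acts_b)
steer_a_to_b = [b - a for a, b in zip(mu_a, mu_b)]  # mean-difference vector


def readout(h):
    """Toy linear readout: pick the tool whose mean activation aligns best."""
    return "tool_a" if dot(h, mu_a) > dot(h, mu_b) else "tool_b"


h = sample(center_a)                      # an activation for a tool-A prompt
before = readout(h)
steered = [x + s for x, s in zip(h, steer_a_to_b)]
after = readout(steered)
```

In this toy setup `before` is `"tool_a"` and `after` is `"tool_b"`: adding the difference of class means moves the activation from one cluster to the other, which is the operation the summary reports flipping tool choice in real models.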

Authors: Zekun Wu, Ze Wang, Seonglae Cho...

ArXiv 2026-05-08 1 min read

123D: Unifying Multi-Modal Autonomous Driving Data at Scale

Why it matters

123D is an open-source framework that unifies multi-modal autonomous-driving data under a single API by storing each sensor modality as an independent timestamped event stream (no prescribed rate), enabling synchronous or asynchronous access across heterogeneous datasets; it consolidates eight real-world datasets totaling 3,300 hours and 90,000 kilometers plus a configurable synthetic dataset.

Key details

The authors use 123D to systematically compare annotation statistics and evaluate pose/calibration accuracy across datasets, and showcase two applications enabled by the framework: cross-dataset 3D object-detection transfer and reinforcement-learning for planning; code and docs are at https://github.com/kesai-labs/py123d.

Brief

123D unifies multi-modal driving datasets by representing each sensor or annotation as a timestamped event stream, allowing flexible synchronization across heterogeneous formats. The authors merge eight real-world datasets (3,300 hours, 90,000 km) and a synthetic generator, perform systematic analyses of annotations and pose/calibration, and demonstrate cross-dataset 3D detection transfer and RL planning; the framework and tools are released open-source.
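The event-stream idea can be sketched in a few lines. The `EventStream` class and `synchronized_frame` helper below are hypothetical illustrations of storing each modality at its own rate and synchronizing on demand; they are not the py123d API.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class EventStream:
    """One modality stored as (timestamp, payload) events at its own rate."""
    name: str
    timestamps: list = field(default_factory=list)  # seconds, ascending
    payloads: list = field(default_factory=list)

    def append(self, t, payload):
        # Assumes events are appended in time order.
        self.timestamps.append(t)
        self.payloads.append(payload)

    def nearest(self, t):
        """Return the (timestamp, payload) closest to t; stream must be non-empty."""
        i = bisect_left(self.timestamps, t)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(self.timestamps)]
        best = min(candidates, key=lambda j: abs(self.timestamps[j] - t))
        return self.timestamps[best], self.payloads[best]


def synchronized_frame(streams, t):
    """Synchronous access: sample every stream at the event nearest to t."""
    return {s.name: s.nearest(t) for s in streams}


camera = EventStream("camera")
for k in range(10):                 # 10 Hz camera
    camera.append(k * 0.1, f"img{k}")

lidar = EventStream("lidar")
for k in range(5):                  # 5 Hz lidar with a 50 ms offset
    lidar.append(k * 0.2 + 0.05, f"sweep{k}")

frame = synchronized_frame([camera, lidar], t=0.42)
```

Asynchronous access is just iterating one stream's events directly; synchronization becomes a query-time choice (nearest neighbor here) rather than a property baked into the stored data.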

Authors: Daniel Dauner, Valentin Charraut, Bastian Berle...

Twitter/X 2026-05-11 1 min read

Karpathy (created 2026-03-11) recommends ending LLM prompts with the exact phrase…

Why it matters

Karpathy (created 2026-03-11) recommends ending LLM prompts with the exact phrase 'structure your response as HTML' to view generated output in a browser and reports similar success asking for slideshow-style output.

Key details

He asserts audio is humans' preferred input and vision is AIs' preferred output, noting ~one-third of the brain is dedicated to vision, and predicts a progression: raw text → markdown → HTML → interactive neural videos/simulations, ultimately from diffusion neural nets.
He urges better multimodal inputs (pointing/gesturing on-screen) before moving to Neuralink‑style BCIs, calling the current phase an ongoing 'input/output mind‑meld' with substantial work remaining.

Brief

Karpathy recommends appending 'structure your response as HTML' to LLM prompts to render outputs in a browser, argues audio will be humans' preferred input while vision becomes AI's preferred output, and outlines a progression from text→markdown→HTML→interactive neural videos (eventually diffusion‑generated), urging improved multimodal inputs before BCIs.
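A minimal sketch of the workflow, assuming the model's reply is already in hand as a string: the helpers below only append the quoted phrase and write the reply to a file a browser can open. The function names are hypothetical, and no LLM API is called.

```python
import tempfile
import webbrowser
from pathlib import Path

HTML_SUFFIX = "structure your response as HTML"


def with_html_suffix(prompt):
    """Append the suffix phrase to the end of a prompt."""
    return f"{prompt.rstrip()}\n\n{HTML_SUFFIX}"


def save_html(html, directory=None):
    """Write the model's HTML reply to response.html and return its path."""
    directory = Path(directory) if directory else Path(tempfile.mkdtemp())
    path = directory / "response.html"
    path.write_text(html, encoding="utf-8")
    return path


def open_in_browser(path):
    """Open the saved file in the default browser (not called in this sketch)."""
    return webbrowser.open(Path(path).resolve().as_uri())


prompt = with_html_suffix("Explain gradient descent with a worked example.")
```

After sending `prompt` to a model of your choice, pass the reply to `save_html` and then `open_in_browser` to view it rendered.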

By @karpathy

Twitter/X 2026-05-11 1 min read

Alex Lupsasca (guest on Latent.Space, episode published 2026-05-11) reports…

Why it matters

Alex Lupsasca (guest on Latent.Space, episode published 2026-05-11) reports GPT‑5.x derived new results in scattering amplitudes, including a simplification for single‑minus gluon tree amplitudes (discussed ~14:38–31:26) and a reconstructed proof from scratch (~38:07) to verify validity.

Key details

GPT‑5 helped solve a year‑long physics puzzle (~20:56–23:02); GPT Pro then generalized those techniques to graviton amplitudes (~42:27–53:57). Lupsasca also credits GPT‑5 with solving a black‑hole perturbation problem (~1:12:46).
Lupsasca characterizes AI as a 'scout' and collaborator that accelerates theoretical discovery but warns of 'AI slop' risks to publishing quality and says the bottleneck is shifting to human curation, taste, and writing papers (~53:57–1:30:19).

Brief

Alex Lupsasca, on Latent.Space (published 2026-05-11), describes using GPT‑5.x to derive new scattering‑amplitude results (single‑minus gluon trees), solve a year‑long puzzle, generalize methods to graviton amplitudes, and crack a black‑hole perturbation problem. He portrays AI as a productive research 'scout' while warning of 'AI slop' risks to publishing quality and a new bottleneck in paper writing and curation.

By @ALupsasca

Garry's List 2026-05-11 1 min read

It's Here: The Garry's List Action Voter Guide

Why it matters

Garry's List published the Action Voter Guide on 2026-05-11 (authors: Garry Tan, Shaudi Fulp, Forrest Liu) compiling endorsements for the California primary on June 2, 2026.

Key details

The guide synthesizes recommendations from local housing groups, labor unions, and civic reformers, is labeled 'transparent, sourced, searchable' at garrysguide.org/elections, and invites additions via hello@garryslist.org or X.

Brief

The Garry's List Action Voter Guide, published May 11, 2026 by Garry Tan, Shaudi Fulp, and Forrest Liu, aggregates endorsements from local housing groups, labor unions, and civic reform organizations ahead of the California June 2 primary. The resource is presented as transparent, sourced, and searchable at garrysguide.org/elections and accepts suggested additions via email or X.

By Garry Tan, Shaudi Fulp

substack.com 2026-05-11 6 min read

I spent my whole career building passive income. Here's what I got wrong.

Why it matters

Darius Foroux (Substack, published May 11, 2026) says that after spending his adult life building passive income he still experiences anxiety — the worry didn’t disappear, it simply shifted to new targets.

Key details

He invokes psychiatrist Gordon Livingston and Livingston’s book And Never Stop Dancing to argue that life’s complexity produces persistent unwanted emotions and that eliminating the ‘burden of striving’ raises the question, “What relevance do we retain?”
Foroux proposes a behavioral remedy: treat passive income as a practical foundation but prioritize daily, slightly uncomfortable habits (examples: write when you don’t feel like it, work out, do taxes, mow the lawn) to stay sharp and derive satisfaction; he recounts mowing his overgrown grass and feeling genuinely good afterward.
His conclusion: financial freedom buys time and options but is the wrong sole organizing goal — continued striving, contribution, and small daily challenges preserve purpose and reduce anxiety better than passive income alone.

Brief

Darius Foroux, writing on Substack (published May 11, 2026), reflects on decades spent building passive income and concludes that money alone did not remove anxiety — it only redirected it. Drawing on Gordon Livingston’s And Never Stop Dancing, Foroux frames the problem as a human response to life’s complexity and warns that eliminating striving threatens one’s sense of relevance. He recommends treating passive income as a foundation rather than an end and cultivating daily, slightly uncomfortable habits to stay engaged: examples include writing when unmotivated, exercising, doing taxes, and his recent anecdote of mowing overgrown grass and feeling rewarded. Foroux’s practical prescription is to use financial freedom to buy time, then invest that time in continued growth, contribution, and routines that preserve purpose and mental sharpness.

By Darius Foroux

mg1.substack.com 2026-05-10 1 min read

Tpayneful liked Founder’s Guide to Hiring an Operations Team in the Philippines

Why it matters

On 2026-05-10 23:36:30+00:00 Substack user 'Tpayneful' liked the post 'Founder’s Guide to Hiring an Operations Team in the Philippines' (notification sent via mg1.substack.com).

Key details

Notification metadata shows the item was created 2026-03-11 12:44 and last updated 2026-04-04; the footer credits Spencer Burleigh © 2026 with mailing address 548 Market Street PMB 72296, San Francisco, CA 94104.
The Substack notification UI displays a numeric '261' indicator and includes standard controls (view profile, mute post) with images and assets served from substackcdn.com.

Brief

Notification records that Substack user Tpayneful liked 'Founder’s Guide to Hiring an Operations Team in the Philippines' on 2026-05-10 23:36:30+00:00. The email-style notice includes creation timestamp 2026-03-11 12:44, last updated 2026-04-04, author credit Spencer Burleigh © 2026, San Francisco mailing address, and UI elements showing '261' and profile/mute links.

By Tpayneful

ArXiv 2026-05-08 1 min read

Towards Highly-Constrained Human Motion Generation with Retrieval-Guided Diffusion Noise Optimization

Why it matters

Proposes a retrieval-guided diffusion noise optimization method that mixes retrieved noise with random noise via a reward-guided mask to better initialize diffusion sampling, and claims this enables generation that satisfies highly-constrained spatiotemporal goals (e.g., severe spatial obstacles or specified numbers of walking steps).

Key details

Adds relational task parsing (using an LLM) to identify the hardest constraints and select retrieval references; the framework is training-free and was released on arXiv 2026-05-08 (authors: Hanchao Liu et al.), and the paper is accepted to CVPR 2026.

Brief

The paper addresses zero-shot human motion generation under very challenging spatiotemporal constraints by augmenting training-free diffusion noise optimization with retrieval guidance. It parses task constraints into groups (relational task parsing, powered by an LLM), retrieves reference motions for the hardest constraints, and forms a reward-guided mask to blend retrieved and random noise for improved diffusion initialization. The authors report this approach successfully handles tasks that prior methods struggle with, enabling more reliable constrained motion synthesis; accepted to CVPR 2026 (arXiv 2026-05-08).
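The blending step can be pictured with a toy sketch. The summary does not say how the reward-guided mask is built, so the per-position reward threshold below is purely an assumption, and the function names are hypothetical rather than the paper's.

```python
def reward_mask(rewards, threshold):
    """Assumed rule: keep retrieved noise where the reference scores well."""
    return [r >= threshold for r in rewards]


def blend_noise(retrieved, random_noise, mask):
    """Take retrieved noise where the mask is set, random noise elsewhere."""
    return [r if m else z for r, z, m in zip(retrieved, random_noise, mask)]


# Toy per-frame values; a real diffusion latent would be a large tensor.
retrieved_noise = [1.0, 2.0, 3.0, 4.0]    # noise tied to a retrieved reference motion
random_noise = [-1.0, -2.0, -3.0, -4.0]   # fresh noise (fixed here for clarity)
rewards = [0.9, 0.2, 0.7, 0.1]            # how well the reference meets each constraint

mask = reward_mask(rewards, threshold=0.5)
initial_noise = blend_noise(retrieved_noise, random_noise, mask)
```

The blended vector would then serve as the starting point for noise optimization, with the retrieved portions giving the sampler a head start on the hardest constraints.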

Authors: Hanchao Liu, Fang-Lue Zhang, Shining Zhang...

ArXiv 2026-05-08 1 min read

Uncertainty-Aware Structured Data Extraction from Full CMR Reports via Distilled LLMs

Why it matters

CMR-EXTR converts free-text cardiac magnetic resonance (CMR) reports into auditable structured data and provides per-field confidence; evaluated results report 99.65% variable-level accuracy (ArXiv: 2026-05-08; authors include Yi Yu and Parker Martin).

Key details

The system uses a teacher–student distillation pipeline to enable fully offline inference with limited manual annotation and an uncertainty scheme combining distribution plausibility, sampling stability, and cross-field consistency to triage human review.
Authors claim this is the first CMR-specific extraction system with integrated confidence estimation; code is available at https://github.com/yuyi1005/CMR-EXTR and the work was accepted to ISBI 2026.

Brief

CMR-EXTR converts free-text cardiac magnetic resonance (CMR) reports into auditable structured data with per-field confidence. The method employs a teacher–student distillation pipeline for fully offline inference and reduced annotation effort, plus an uncertainty model that blends distribution plausibility, sampling stability, and cross-field consistency to prioritize human review. It achieves 99.65% variable-level accuracy; code on GitHub; accepted to ISBI 2026.

Authors: Yi Yu, Parker Martin, Zhenyu Bu...

ArXiv 2026-05-08 1 min read

A Note on Non-Negative $L_1$-Approximating Polynomials

Why it matters

Lee, Mehrotra, and Zampetakis (arXiv 2026-05-08) prove that every class of sets with Gaussian surface area ≤ Γ admits non-negative degree-k polynomials that ε-approximate its indicator function in L1 under the standard Gaussian, with k = Õ(Γ²/ε²).

Key details

The approximants have range contained in [0,∞) (a stronger pointwise guarantee than ordinary L1-approximation but weaker than sandwiching polynomials), match the best known Gaussian L1-approximation degree up to constant factors, and are motivated by applications to smoothed learning from positive-only examples.

Brief

The note proves that finite Gaussian surface area Γ implies existence of non-negative degree-k polynomials that ε-approximate indicator functions in L1 under the standard Gaussian, with k = Õ(Γ²/ε²). This adds a pointwise non-negativity guarantee (range [0,∞)), sits between plain L1-approximation and sandwiching polynomials, matches prior degree bounds up to constants, and targets smoothed positive-only learning. Only the abstract was available for this summary.
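Written out, the guarantee summarized above reads roughly as follows. The ambient dimension $d$ and the set name $S$ are notation introduced here for illustration; the abstract may state it differently.

```latex
% For every set S \subseteq \mathbb{R}^d in a class with Gaussian surface area
% at most \Gamma, and every \varepsilon > 0, there is a polynomial p with
\[
  \deg p \le k = \tilde{O}\!\left(\Gamma^{2}/\varepsilon^{2}\right),
  \qquad
  p(x) \ge 0 \quad \text{for all } x \in \mathbb{R}^{d},
  \qquad
  \mathbb{E}_{x \sim \mathcal{N}(0, I_d)}
    \bigl[\, \lvert p(x) - \mathbf{1}_{S}(x) \rvert \,\bigr] \le \varepsilon .
\]
```

Ordinary L1-approximation drops the middle condition, which is why non-negativity is the stronger guarantee.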

Authors: Jane H. Lee, Anay Mehrotra, Manolis Zampetakis

ArXiv 2026-05-08 1 min read

Reinforcement Learning for Exponential Utility: Algorithms and Convergence in Discounted MDPs

Why it matters

The authors derive two Q-value–style extensions of the Bellman equation for exponential-utility optimization in discounted MDPs and show the associated operators are contractions in the L_∞ and sup-log/Thompson metrics; they characterize fixed points and prove the induced greedy stationary policy is optimal among stationary policies.

Key details

They propose two model-free algorithms: a two-timescale Q-learning–style method with almost-sure convergence and finite-time convergence rates obtained via timescale separation, and a one-timescale algorithm driven by a sublinear power-law operator that lacks a global contraction but is shown to converge using local Lipschitzness, monotonicity, homogeneity, and Dini-derivative arguments (scalar finite-time analysis only).
Preprint: Gugan Thoppe, L. A. Prashanth, Ankur Naskar, Sanjay Bhat; arXiv:2605.08053v1 (cs.LG), published 2026-05-08; builds on Bellman-type exponential-utility work (e.g., Porteus 1975) to provide a foundation for value-based RL under fixed risk-aversion.

Brief

Reinforcement learning for exponential-utility optimization in discounted MDPs: the paper derives two Q-value–style Bellman extensions whose operators are contractions in the L_∞ and sup-log/Thompson metrics, proves fixed-point structure and optimality of the induced greedy stationary policy among stationary policies, and presents two model-free algorithms: a two-timescale Q-learning method with a.s. convergence and finite-time rates, and a one-timescale power-law method whose convergence is established via delicate local arguments. Full text on arXiv (abstract used).

Authors: Gugan Thoppe, L. A. Prashanth, Ankur Naskar...

Twitter/X 2026-05-11 1 min read

Chevy Chase told a female interviewer, “No sh*t?!

Why it matters

Chevy Chase told a female interviewer, “No sh*t?! You’re not bright enough,” then added, “I know you’re not gonna put that on the air… and I hope not,” in a clip from Marina Zenovich’s film I’m Chevy Chase and You’re Not (video credited to SkyTV/CNN) that went viral on X on May 11, 2026.

Key details

@HustleBitch_ posted the clip (May 11, 2026) framing it as “CHEVY CHASE COMPLETELY INSULTS INTERVIEWER,” asked “Is he misunderstood… or just an asshole?”, and noted commenters say the exchange exemplifies Chase’s long-standing contentious reputation.

Brief

Chevy Chase insulted a female interviewer in a clip from Marina Zenovich’s I’m Chevy Chase and You’re Not, saying “No sh*t?! You’re not bright enough” and “I know you’re not gonna put that on the air… and I hope not.” A viral X post by @HustleBitch_ (May 11, 2026) highlighted the exchange and linked it to Chase’s contentious reputation.

By @HustleBitch_

ghost.io 2026-05-11 6 min read

WEEKLY RADAR #400 (5/10/2026) 400 Radars. 350 Transmissions. The Hustle Is Still Very Real.

Why it matters

Real Brokerage agreed to acquire RE/MAX Holdings for approximately $880 million in a deal expected to close in H2 2026. RE/MAX shareholders can elect $13.80 cash or 5.152 shares of the new Real RE/MAX Group, and Real shareholders retain ~59% ownership. The combined company reports pro forma 2025 revenue of $2.3 billion and projects $30 million in annual cost savings by end of 2027.

Key details

GEM's Proptech Index (25 stocks) had a combined market cap of $234.758 billion as of 5/8/2026, up 0.51% week-over-week.
The first batch of Q1 earnings summaries published this week cover Zillow, CoStar, Procore and Compass (full set of ten summaries to be released to Crystal subscribers once complete); exclusive decks and summaries are available to VCs/angels via application to community@geekestate.com.
GEM updates and events: Weekly Radar #400 (published 11 May 2026) marks 400 radars and 350 transmissions; upcoming member events include Fundraising office hours with Ryan Coon on 5/14 and an Innovators Roundtable Dinner during the Housing Innovation Alliance summit in Charlotte on 5/19; active fundraising deals listed include America Housing Corp ($50M Series A), LotRoll ($3M Seed), Zavvie (10% mezzanine), and Quiet Cove ($750k Seed).

Brief

GEM's Weekly Radar #400 (published 11 May 2026) combines market intel, member news and sector analysis across proptech and real estate. Key macro data: the GEM Proptech Index (25 stocks) reached a $234.758B combined market cap as of 5/8/2026 (+0.51% w/w). Earnings coverage for Q1 began with summaries of Zillow, CoStar, Procore and Compass (ten-company series planned for subscribers). Strategically significant M&A: Real Brokerage will acquire RE/MAX for ~$880M to form Real RE/MAX Group (RE/MAX holders may take $13.80 cash or 5.152 shares), creating a combined pro forma 2025 revenue base of $2.3B, >180,000 agents across 120+ countries, and expected $30M annual run-rate savings by end of 2027. The newsletter also lists active fundraising opportunities (e.g., America Housing Corp $50M Series A) and upcoming GEM member events on 5/14 and 5/19.
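Two figures above lend themselves to quick back-of-the-envelope arithmetic. The sketch assumes the cash and share options are quoted per RE/MAX share and that "+0.51% w/w" is measured against the prior week's market cap; neither reading is confirmed by the newsletter summary.

```python
CASH_PER_SHARE = 13.80        # USD cash election
STOCK_EXCHANGE_RATIO = 5.152  # new Real RE/MAX Group shares per election
INDEX_CAP_BN = 234.758        # GEM Proptech Index market cap, USD billions
WEEKLY_CHANGE = 0.0051        # +0.51% week over week

# New-share price at which the stock election is worth exactly the cash election.
breakeven_price = CASH_PER_SHARE / STOCK_EXCHANGE_RATIO

# Implied prior-week index market cap and the dollar gain over the week.
prior_week_cap_bn = INDEX_CAP_BN / (1 + WEEKLY_CHANGE)
weekly_gain_bn = INDEX_CAP_BN - prior_week_cap_bn

print(f"Break-even new-share price: ${breakeven_price:.2f}")
print(f"Prior-week index cap: ${prior_week_cap_bn:.1f}B (gain ${weekly_gain_bn:.2f}B)")
```

Above the break-even price the share election is worth more than the cash; below it, the cash is.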

By Crystal, Powered by GEM

memelord.blog 2026-05-11 2 min read

The link to download AGI

Why it matters

On May 11, 2026, founder Jason "The Memelord" Levin announced that the Memelord app has "AGI for memes" and claimed the capability is available inside the app.

Key details

The newsletter corrected a previously wrong download/link, directs readers to the corrected app link and Memelord.com, offers a free trial, and includes promo code MYTHOS for 50% off.

Brief

Memelord’s May 11, 2026 newsletter from founder Jason “The Memelord” Levin claims the Memelord app has in‑app 'AGI for memes' and corrects a prior wrong download/link. The note links to the app, promotes a free trial at Memelord.com, and offers promo code MYTHOS for 50% off signups.

By Memelord Magazine

news.bloomberg.com 2026-05-11 14 min read

Money Stuff: KKR Buys Back Some Private Credit

Why it matters

KKR committed $300 million to FS KKR Capital Corp. (FSK): $150 million in convertible preferred (5% cash / 7% PIK) convertible at $18.83 (FSK’s quarter-end NAV) and a $150 million tender offer for common shares at $11 per share (FSK stock closed the prior week at $10.84).

Key details

Public business-development companies (BDCs) are trading below NAV; Apollo is in talks to sell MidCap Financial Investment Corp. (MFIC), whose stock trades at ~85% of NAV, likely in a share-for-share deal with another BDC rather than cash at full NAV.
Private credit lending volume fell 14% in Q1 2026 while US banks’ lending rose 12.7% (the fastest growth since 2022), amid higher funding costs for private credit and regulatory easing that benefits banks, per OCC commentary.
Virginia’s redistricting referendum received 51.7% 'Yes' (48.3% 'No') in April but the Virginia Supreme Court later voided the referendum process; Kalshi’s prediction markets ($5.7M and ~$10M of bets) resolved inconsistently because an accelerated-resolution clause used four of eight designated media (NYT, AP, DDHQ, CNN, Fox, NBC, CBS, ABC) to settle one contract in April.

Brief

KKR’s $300 million intervention in FS KKR Capital Corp. (FSK) typifies how private-credit sponsors are responding to persistent BDC discounts to NAV. KKR will put in $150 million of convertible preferred—paying 5% cash or 7% in kind—with a conversion price equal to FSK’s quarter-end NAV ($18.83), and simultaneously tender for $150 million of common at $11 per share (FSK traded at $10.84). The package mixes an NAV-linked show of confidence with an economically accretive purchase of discounted shares. The move comes as many public BDCs trade below reported NAVs and dealmaking (e.g., Apollo’s talks to sell MFIC) increasingly uses stock-for-stock structures because buyers won’t pay full cash NAV.
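The numbers above can be checked with a few lines of arithmetic. This is a rough sketch using only the figures quoted in the summary; it assumes the tender is fully subscribed at $11 and ignores fees, taxes, and any change in NAV.

```python
NAV_PER_SHARE = 18.83   # FSK quarter-end NAV, also the preferred conversion price
TENDER_PRICE = 11.00    # tender offer price per common share
LAST_CLOSE = 10.84      # prior week's closing price
TENDER_SIZE = 150e6     # USD committed to the tender
PREFERRED_SIZE = 150e6  # USD of convertible preferred

tender_premium = TENDER_PRICE / LAST_CLOSE - 1             # premium over market
tender_discount_to_nav = 1 - TENDER_PRICE / NAV_PER_SHARE  # discount to NAV
market_discount_to_nav = 1 - LAST_CLOSE / NAV_PER_SHARE

shares_tendered = TENDER_SIZE / TENDER_PRICE   # shares bought if fully subscribed
nav_acquired = shares_tendered * NAV_PER_SHARE # reported NAV behind those shares
conversion_shares = PREFERRED_SIZE / NAV_PER_SHARE  # shares if preferred converts

print(f"Tender premium to last close: {tender_premium:.1%}")
print(f"Tender discount to NAV:       {tender_discount_to_nav:.1%}")
print(f"Shares tendered:              {shares_tendered / 1e6:.2f}M")
print(f"NAV acquired for $150M:       ${nav_acquired / 1e6:.0f}M")
print(f"Shares on conversion:         {conversion_shares / 1e6:.2f}M")
```

On these inputs the tender pays a small premium to market while buying shares at roughly a 40% discount to reported NAV, which is the sense in which the purchase is accretive.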

Broader market context: private credit originations fell ~14% in Q1 2026 while bank lending jumped 12.7%, reflecting higher funding costs for private lenders and deregulatory shifts that let banks compete on leveraged lending. Separately, a legal technicality produced conflicting outcomes in prediction markets: Virginia’s redistricting referendum drew 51.7% 'Yes' in April but was later voided by the Virginia Supreme Court; Kalshi’s contracts resolved differently because an accelerated oracle clause relied on four of eight designated media to declare a winner. And in tech labor markets, OpenAI’s expanded tender cap (to $30M per person) let ~75 employees hit the cap, illustrating how private liquidity is reshaping compensation and retention dynamics in AI firms.

By Matt Levine

substack.com 2026-05-11 6 min read

Have we reached peak lounge?

Why it matters

Heathrow Terminal 3 hosts at least ten airside lounges (including oneworld lounges from American Airlines, British Airways, Cathay Pacific and Qantas, plus Emirates and Virgin Atlantic Clubhouse), making it a candidate for 'peak lounge' density (Oliver Ranson, May 11, 2026).

Key details

UK’s four largest airports (Heathrow, Gatwick, Stansted, Manchester) generated £673 million (~$915M) from car parking in 2023 — roughly £1.8M (~$2.5M) per day — and Heathrow added 900 parking spaces in 2024; non-aeronautical revenue rose from 36.3% of airport revenue in Q1‑23 to 37.2% in Q1‑24.
Ranson proposes monetising the departures environment by charging for access to retail and hospitality beyond basic gates/amenities (while keeping basic seating, toilets and gate access free), noting risks such as tenants demanding lower rents, passengers arriving later (increasing pressure on check‑in/security), and operational delays.
The article classifies lounge models (airport free departures areas; pay‑to‑access airport lounges; third‑party operators such as Collinson with >80 venues; airline‑branded lounges often outsourced; and rare special terminals like Lufthansa First at Frankfurt) and flags special terminals as high‑opportunity for monetisation.

Brief

Airport lounges are nearing saturation in some hubs — Heathrow Terminal 3 alone offers at least ten airside lounges (oneworld’s American, British Airways, Cathay Pacific and Qantas among them, plus Emirates and Virgin’s Clubhouse) — prompting Oliver Ranson (May 11, 2026) to ask whether we’ve hit “peak lounge.” He outlines five lounge categories (free airport departure areas; paid airport lounges; third‑party operators such as Collinson with 80+ venues; airline‑branded lounges; and rare premium terminals like Lufthansa First) and focuses on monetising the ordinary departures hall. Drawing parallels to car‑park pricing (UK top airports earned £673M from parking in 2023; Heathrow added 900 spaces in 2024) and rising non‑aeronautical share (36.3% → 37.2% from Q1‑23 to Q1‑24), Ranson suggests gating retail/hospitality behind paid access while keeping essential gate access free, and warns of tenant pushback, later passenger arrivals, and pressure on check‑in/security.

By Oliver Ranson from Airline Revenue Economics

substack.com 2026-05-11 3 min read

Open Thread 433

Why it matters

Scott Alexander published Open Thread 433 on Astral Codex Ten on May 11, 2026; it is the weekly visible open thread and links to the ACX subreddit, Discord, bulletin board, and in-person meetups.

Key details

An agriculture bill pending in the US Congress would preempt existing state animal‑welfare laws, for example revoking California’s ban on keeping pigs in crates too small to turn around; animal‑welfare groups call it “the most important legislative threat to farmed animal welfare in US history.”
Hampshire College professor Ethan Ludwin‑Peery warned the campus is shutting down and issued a rescue plea estimating $30–$60 million to buy the campus plus about $40 million to operate it for the first few years; he invited interested parties to email ethanludwinpeery@gmail.com.

Brief

Open Thread 433 (Astral Codex Ten, published May 11, 2026) is the weekly community open thread by Scott Alexander that aggregates reader items and links to ACX social channels. Two top highlights: (1) a US agriculture bill now under consideration would federally preempt all state animal‑welfare protections (the post cites California’s sow‑crate prohibition as an example); animal‑welfare organizations are framing the measure as the single biggest legislative threat to farmed‑animal protections in US history, with calls to contact Senators. (2) Ethan Ludwin‑Peery, a former book‑review contest winner and current Hampshire College professor, says administrative financial errors have left the college facing closure and requests $30–60M to acquire the campus plus ~$40M to fund operations during a transitional period, providing his email for inquiries.

By Astral Codex Ten

substack.com 2026-05-11 6 min read

13 Videos From the Cerebral Valley Voice Summit: Sierra's Bret Taylor, Wispr Flow's Tanay Kothari, MiniMax's Linda…

Why it matters

The inaugural Cerebral Valley Voice Summit (May 2026) gathered 200+ founders, investors, and operators and hosted 13 talks from leaders including Bret Taylor (Sierra), Justin Uberti (OpenAI), Anastasis Germanidis (Runway) and founders from Abridge, Wispr Flow, Deepgram, Cartesia and more.

Key details

Sierra’s Bret Taylor spoke two days after securing a fresh $950M financing, reporting >$165M in revenue and adoption by 40% of the Fortune 50; he described voice AI as being in its ‘early innings’, comparable to the internet before broadband.
OpenAI’s Justin Uberti and the company’s realtime team released new realtime voice models that can perform reasoning during a conversation; panelists debated tradeoffs between highly human‑like speech and accuracy/clarity for task‑oriented agents.
Healthcare and enterprise traction were highlighted: Assort Health reports ~150 million patient interactions across 5,000 providers via voice agents; startups noted low‑latency TTS (Cartesia), the value of shaving milliseconds off inference latency (LiveKit), and the rise of multimodal/world models (Runway, MiniMax).

Brief

The Cerebral Valley Voice Summit convened 200+ builders and investors in May 2026 for 13 recorded panels on voice AI, featuring Bret Taylor (Sierra), Justin Uberti (OpenAI), Anastasis Germanidis (Runway) and founders from Abridge, Wispr Flow, Deepgram and Cartesia. Taylor announced Sierra’s recent $950M raise, >$165M revenue and deployments at 40% of the Fortune 50, while OpenAI released realtime voice models capable of reasoning mid‑conversation. Panels debated whether agents should sound fully human versus prioritizing accurate, task‑oriented behavior; Deepgram’s Scott Stephenson said voice models hadn’t yet passed his five‑minute “voice Turing Test” but predicted that advances in context memory could get them there by year‑end. Speakers also emphasized enterprise wins in healthcare (Assort Health: ~150M interactions across 5,000 providers), low‑latency TTS, the importance of reducing inference latency, and a near‑term shift toward multimodal/world models that fuse voice with video and motion.

By Newcomer

ArXiv 2026-05-08 1 min read

CA-SQL: Complexity-Aware Inference Time Reasoning for Text-to-SQL via Exploration and Compute Budget Allocation

Why it matters

CA-SQL reports 51.72% execution accuracy on the "challenging" tier of the Bird-Bench (BIRD) development set, state of the art among in-context approaches using GPT-4o-mini (reported 2026-05-08).

Key details

The pipeline uses complexity-aware, difficulty-scaled exploration, evolutionary-search-inspired prompt seeding, and a novel voting selector; overall results on BIRD dev are 61.06% execution accuracy and 68.77% Soft F1.

Brief

CA-SQL is a complexity-aware Text-to-SQL inference pipeline that scales exploration breadth by estimated task difficulty, employs evolutionary-search-inspired prompt seeding to elicit diverse candidates, and uses a novel voting method to pick final queries. On the Bird-Bench development set it reports 51.72% execution accuracy on the "challenging" tier (SOTA among in-context approaches with GPT-4o-mini), and 61.06% execution accuracy and 68.77% Soft F1 overall.
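
As a rough illustration of two ideas named above, difficulty-scaled exploration and vote-based selection, here is a minimal Python sketch. It is not the paper's implementation: the budget values, the `candidate_budget` and `select_by_vote` helpers, the toy table, and the plain majority vote over execution results (a stand-in for the paper's voting selector, whose details are not given here) are all assumptions made for illustration.

```python
import sqlite3
from collections import Counter

# Hypothetical budgets: harder questions get more candidate queries.
BUDGET = {"simple": 2, "moderate": 4, "challenging": 8}

def candidate_budget(difficulty):
    """Return how many candidates to sample for an estimated difficulty tier."""
    return BUDGET.get(difficulty, BUDGET["moderate"])

def execute(conn, sql):
    """Run a query; return an order-insensitive result key, or None on error."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(map(repr, rows)))

def select_by_vote(conn, candidates):
    """Return the first candidate whose execution result is the most common one."""
    results = [(sql, execute(conn, sql)) for sql in candidates]
    votes = Counter(key for _, key in results if key is not None)
    if not votes:
        return None  # every candidate failed to execute
    winner, _ = votes.most_common(1)[0]
    return next(sql for sql, key in results if key == winner)

# Toy database standing in for a benchmark schema (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, score INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

# Hand-written candidates standing in for sampled LLM outputs (assumption).
candidates = [
    "SELECT id FROM t WHERE score > 15",
    "SELECT id FROM t WHERE score >= 20",  # same rows as the first candidate
    "SELECT id FROM t WHERE score > 25",   # a different result set
    "SELECT id FROM missing_table",        # fails to execute, gets no vote
]
chosen = select_by_vote(conn, candidates)
print(candidate_budget("challenging"), chosen)
```

Voting on execution results rather than on query text lets syntactically different but equivalent queries reinforce each other, which is one common reason to prefer this style of selector.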

Authors: James Petullo, Nianwen Xue

e.economist.com 2026-05-11 8 min read

El Boletín: Honduras-gate, explained

Why it matters

In December 2025 President Donald Trump pardoned former Honduran president Juan Orlando Hernández, who had been serving a 45-year prison sentence for drug trafficking.

Key details

On April 30, 2026 Canal Red published leaked audio recordings of unclear origin and unverified authenticity in which a voice purported to be Hernández alleges Israel financed and brokered his release while Trump allies would smooth his political return in exchange for Honduras expanding special economic zones, hosting a new US military base, and passing laws favorable to American and Israeli corporate interests.
The leaks claim a broader influence operation — including a disinformation campaign against Mexico and Colombia — allegedly funded in part by Argentina’s president Javier Milei; Hernández denies the tapes, and neither Trump nor Israel has publicly responded.
Mexican president Claudia Sheinbaum acknowledged the report but minimized its likely effect, while Colombia’s Gustavo Petro cited the recordings as evidence of efforts to undermine progressive governments; independent verification of the recordings remains absent, underscoring institutional fragility in parts of Central America.

Brief

Honduras-gate centers on a set of leaked audio files published on April 30, 2026 by Canal Red that purport to record Juan Orlando Hernández describing a deal in which Israel helped finance and broker his release and US-aligned actors would facilitate his political comeback after Donald Trump’s December 2025 pardon removed a 45-year drug-trafficking sentence. The tapes allege concrete quid pro quos — expanded special economic zones, a new American military base and laws favoring US and Israeli corporate interests — and describe a wider campaign including disinformation against Mexico and Colombia allegedly funded in part by Argentina’s Javier Milei. Hernández denies the recordings, Trump and Israel have not replied, and no independent verification of provenance or authenticity has been published, leaving the episode as both a potential example of interstate influence operations and a marker of Central America’s fragile democratic institutions.

By The Economist

ArXiv 2026-05-08 1 min read

Zero-Shot Imagined Speech Decoding via Imagined-to-Listened MEG Mapping

Why it matters

The authors collected paired listened and imagined MEG recordings from trained musicians for rhythmic, melodic, and spoken stimuli, then trained six linear and neural mapping models to predict listened responses from imagined MEG.

Key details

A contrastive word decoder was trained solely on listened MEG using four embedding strategies (including semantic, acoustic, and phonetic). Applying it to the mapped imagined→listened responses from held-out subjects produced word decoding significantly above chance by rank-based analysis, and performance improved with more training data.

Brief

Imagined speech decoding: the authors recorded paired listened and imagined MEG from trained musicians and built a three-stage pipeline that maps imagined MEG to listened MEG (six mapping models), decodes words with a contrastive listened-only decoder (four embedding strategies), and applies the decoder to mapped imagined data from held-out subjects. Proof-of-concept results show significant above-chance decoding and scalability with more training data; only the abstract was available for this summary.
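
To make "above chance by rank-based analysis" concrete, the sketch below shows one common way such an evaluation can be set up: rank the true word among all candidates by similarity to the decoder's output and compare the mean rank with the chance level. The toy embeddings, the cosine similarity measure, and the `rank_of_target` helper are assumptions for illustration; the paper's actual metric and significance test are not described in the abstract and are not reproduced here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_of_target(pred, vocab, target):
    """1-based rank of `target` when words are sorted by similarity to `pred`."""
    order = sorted(vocab, key=lambda w: cosine(pred, vocab[w]), reverse=True)
    return order.index(target) + 1

# Toy word embeddings (assumption; real ones would be semantic, acoustic, etc.).
vocab = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "sun": [0.0, 0.0, 1.0],
    "sea": [0.7, 0.7, 0.0],
}

# (decoder output, true word) pairs standing in for decoded trials (assumption).
trials = [
    ([0.9, 0.1, 0.0], "cat"),
    ([0.2, 0.9, 0.1], "dog"),
    ([0.1, 0.3, 0.8], "sun"),
    ([0.6, 0.7, 0.0], "cat"),  # a miss: closer to "sea" and "dog" than to "cat"
]

ranks = [rank_of_target(pred, vocab, word) for pred, word in trials]
mean_rank = sum(ranks) / len(ranks)
chance_rank = (len(vocab) + 1) / 2  # expected rank under random guessing

print(ranks, mean_rank, chance_rank)
```

A mean rank below the chance rank suggests the decoder carries word information; a real analysis would add a statistical test, for example a permutation test over shuffled labels.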

Authors: Maryam Maghsoudi, Shihab Shamma

ArXiv 2026-05-08 1 min read

GRAPHLCP: Structure-Aware Localized Conformal Prediction on Graphs

Why it matters

GRAPHLCP is a structure-aware, proximity-based localized conformal prediction framework for GNNs introduced by Peyman Baghershahi, Fangxin Wang, Debmalya Mandal, and Sourav Medya (arXiv 2026-05-08).

Key details

The method adds a feature-aware densification step and a Personalized PageRank (PPR)-based kernel for topology-dependent anchor sampling and calibration weighting, explicitly modeling local and long-range graph dependencies.
GRAPHLCP provably guarantees marginal coverage with finite samples and, according to experiments on multiple regression and classification datasets, attains improved test conditional coverage and more efficient prediction sets versus embedding-only localization (paper: 20 pages, 9 figures, 8 tables).

Brief

GRAPHLCP tackles conformal prediction for graph neural networks by addressing failures of embedding-space localization on graphs. It combines feature-aware densification to reduce locality bias in sparse graphs with a Personalized PageRank kernel to capture structural proximity for anchor sampling and calibration weighting. The approach yields finite-sample marginal coverage and empirically better test conditional coverage and smaller prediction sets on several regression and classification benchmarks compared to prior embedding-only localization methods.
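
The sketch below illustrates the general idea of using Personalized PageRank as a proximity kernel for calibration weighting. It is a simplified illustration under stated assumptions, not GRAPHLCP itself: it omits feature-aware densification and anchor sampling, and the function names, the toy path graph, and the residual scores are all hypothetical. The weighted quantile places the test node's own weight on an infinite score, as is standard in weighted conformal prediction.

```python
def personalized_pagerank(adj, source, alpha=0.15, iters=100):
    """Power iteration for PPR with restart probability `alpha` at `source`.

    `adj` maps each node to its neighbour list (no isolated nodes).
    """
    nodes = list(adj)
    p = {v: 1.0 if v == source else 0.0 for v in nodes}
    for _ in range(iters):
        nxt = {v: (alpha if v == source else 0.0) for v in nodes}
        for u in nodes:
            share = (1 - alpha) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        p = nxt
    return p

def weighted_quantile(scores, weights, level):
    """Smallest score whose cumulative normalised weight reaches `level`."""
    total = sum(weights)
    cum = 0.0
    for score, weight in sorted(zip(scores, weights)):
        cum += weight / total
        if cum >= level:
            return score
    return float("inf")

def localized_threshold(adj, test_node, cal_scores, level=0.9):
    """PPR-weighted conformal threshold for `test_node` (illustrative only)."""
    ppr = personalized_pagerank(adj, test_node)
    cal_nodes = list(cal_scores)
    # Calibration scores weighted by proximity, plus a point mass at infinity
    # carrying the test node's own weight.
    scores = [cal_scores[v] for v in cal_nodes] + [float("inf")]
    weights = [ppr[v] for v in cal_nodes] + [ppr[test_node]]
    return weighted_quantile(scores, weights, level)

# Toy path graph 0-1-2-3-4; node 0 is the test node (assumption).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
# Hypothetical nonconformity scores (e.g. absolute residuals) on calibration nodes.
cal_scores = {1: 0.2, 2: 0.5, 3: 1.0, 4: 3.0}

ppr = personalized_pagerank(adj, 0)
threshold = localized_threshold(adj, 0, cal_scores, level=0.5)
print(ppr, threshold)
```

With only a handful of calibration nodes the test node's own weight can dominate, in which case the threshold at high coverage levels is infinite; larger calibration sets dilute that mass.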

Authors: Peyman Baghershahi, Fangxin Wang, Debmalya Mandal...

ArXiv 2026-05-08 1 min read

The Memory Curse: How Expanded Recall Erodes Cooperative Intent in LLM Agents

Why it matters

Expanding context windows across 7 LLMs and 4 games over 500 rounds degraded cooperation in 18 of 28 model–game settings (Liu et al., 2026).

Key details

Mechanism analyses on 378,000 reasoning traces attribute the collapse to eroded forward-looking intent (not paranoia). A LoRA adapter fine-tuned on forward-looking traces mitigates the decay and transfers zero-shot; memory sanitization (replacing history with synthetic cooperative records) restores cooperation; and ablating explicit Chain-of-Thought often reduces the collapse.

Brief

The paper 'The Memory Curse' (Liu et al., 2026) shows that expanding LLMs' context windows often erodes cooperation in multi-agent social dilemmas: across 7 models and 4 games over 500 rounds cooperation fell in 18 of 28 model–game settings. Analyses of 378,000 reasoning traces implicate loss of forward-looking intent; targeted LoRA fine-tuning, memory sanitization, and CoT ablations partly restore cooperation. (Based on the abstract.)

Authors: Jiayuan Liu, Tianqin Li, Shiyi Du...

e.economist.com 2026-05-11 6 min read

The US in Brief: A petrol-tax suspension floated

Why it matters

Energy secretary Chris Wright said the Trump administration would back suspending the federal petrol tax (currently 18¢ per gallon) while average US pump prices sit at about $4.34/gal. Any suspension would require Congressional approval, and Wright withdrew a March prediction that prices would fall below $3 by summer.

Key details

President Donald Trump publicly attacked two of his appointees on the Supreme Court, Neil Gorsuch and Amy Coney Barrett, saying “it’s really OK for them to be loyal” after they voted to strike down some of his tariffs.
Virginia’s recently drawn Democratic-leaning congressional map was struck down by the state supreme court; Democrats held a testy call (reported by the New York Times) weighing responses including imposing a judicial age limit or launching a fresh challenge to the state’s independent-redistricting law.
Two Americans from a hantavirus outbreak aboard a Dutch cruise ship have tested positive or have mild symptoms after passengers disembarked in Spain’s Canary Islands; they are being airlifted to a specialised centre at the University of Nebraska in Omaha.

Brief

The US in Brief rounds up fast-moving political and domestic developments: Energy secretary Chris Wright said the administration would support suspending the federal petrol tax (18¢/gal) to ease pump pain—average prices are about $4.34/gal—but Congress must act and Wright retracted a March expectation of sub-$3 summer fuel. On politics, President Trump publicly blasted two of his Supreme Court appointees, Neil Gorsuch and Amy Coney Barrett, for recent tariff rulings, urging loyalty. In Virginia, the state supreme court struck down a Democratic-favouring congressional map; party leaders are considering measures from imposing judicial age limits to fresh legal challenges to independent redistricting. Health officials are managing a hantavirus cluster from a Dutch cruise-ship visit to the Canary Islands, flying affected Americans to the University of Nebraska’s specialised centre in Omaha. Also, Abe Foxman, longtime head of the ADL, died at 86.

By The Economist

ArXiv 2026-05-08 1 min read

NoiseGate: Learning Per-Latent Timestep Schedules as Information Gating in World Action Models

Why it matters

NoiseGate (Wen Huang et al., arXiv:2605.07794v1, published 2026-05-08) replaces the common single shared timestep t in Mixture-of-Transformers (MoT) world-action models with learnable per-latent timestep schedules, treating each predicted latent frame's noise level as an information-gating policy.

Key details

The method combines independent per-latent timestep sampling during backbone training, a lightweight Gating Policy Network that emits per-latent time increments during denoising, and task-reward optimization to train schedules without hand-crafted shape priors.
Built on a joint video–action MoT backbone, NoiseGate yields consistent gains on diverse RoboTwin random-scene manipulation tasks (reported in the paper's abstract).

Brief

NoiseGate reframes per-latent timestep selection in joint video–action world-action models as a learnable information-gating policy: by adjusting each predicted latent frame's noise level, a Gating Policy Network controls its Key/Value reliability for action generation. The approach (independent per-latent timestep sampling plus task‑reward optimization) improves RoboTwin random-scene manipulation performance. Summary based on the paper's abstract; full text not available here.

Authors: Wen Huang, Haoran Sun, Yongjian Guo...

substack.com 2026-05-11 4 min read

Hantavirus is a reminder we should prepare for the next pandemic

Why it matters

Matthew Yglesias (Slow Boring, May 11, 2026) flags a hantavirus outbreak linked to the cruise ship MV Hondius with five confirmed cases and three deaths reported in coverage of the incident.

Key details

Yglesias recounts preparing personally, locating a household emergency box and an elastomeric respirator he first heard recommended by a biosecurity expert on a podcast in October, and notes that the respirator company recently went out of business, illustrating market fragility for personal protective gear.
He argues that while consulted medical professionals are not highly worried and public messaging urges people not to panic, structural trends (population growth, increased connectivity, rising prosperity) and advances in biotechnology are raising both natural and engineered pandemic risks, and he criticizes U.S. government under‑reaction and calls for more aggressive countermeasures.

Brief

Matthew Yglesias uses a May 11, 2026 piece to treat the recent hantavirus episodes aboard the cruise ship MV Hondius — reported as five cases and three deaths — as a reminder of broader pandemic preparedness failures. He describes personal steps (locating an emergency box and a specific elastomeric respirator first recommended on a podcast in October) and notes the vendor has since gone out of business, signaling weak consumer supply chains for PPE. Although clinicians he consulted are “not particularly worried,” Yglesias worries about reflexive reassurances and systemic complacency. He emphasizes that demographic and connectivity trends increase the frequency of novel outbreaks, and that biotechnology advances raise engineered‑pathogen risk, concluding that the American government has under‑reacted and that aggressive preemptive measures are warranted.

By Matthew Yglesias
