Dispatches

Recent posts and lab notes.

“LLMs are only as good as the context you give them.”

Google just dropped a public preview of the Developer Knowledge API and an MCP server to go with it. The pitch is simple: a machine-readable gateway to Google’s official developer docs. No scraping, no outdated info, just the real stuff pulled directly from firebase.google.com, developer.android.com, docs.cloud.google.com, and the rest.

Here’s what you get:

  • Search and retrieve docs as Markdown
  • Freshness - docs get re-indexed within 24 hours of updates
  • Coverage across Firebase, Android, Cloud, and more

The MCP server is where it gets interesting. MCP (the Model Context Protocol) is the open standard that lets AI assistants tap into external data sources cleanly. Hook it up to your IDE or agentic tool and suddenly your AI can answer questions like “What’s the best way to implement push notifications in Firebase?” or “How do I fix that ApiNotActivatedMapError?” with actual, current documentation backing it.

But here’s the thing: is MCP actually better than just having a good API or a well-designed CLI? MCP adds another layer, another protocol, another thing to configure and debug. For a developer who knows what they’re looking for, typing firebase deploy or clicking through developer.android.com is often faster than asking an AI to route through MCP and generate a response. The API is the real workhorse here. The MCP wrapper is nice-to-have, not essential.
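Google hasn’t spelled out the API shape in what I’ve seen, so here’s a minimal sketch of what “search and retrieve docs as Markdown” could look like from a script. To be clear: the endpoint, route, parameter names, and key-based auth below are all my assumptions, not documented values - check the official docs before relying on any of it.

```python
import requests

# Hypothetical endpoint and parameters - the real Developer Knowledge API
# may differ. Everything here is illustrative, not documented.
API_BASE = "https://developerknowledge.googleapis.com/v1"  # assumption
API_KEY = "YOUR_API_KEY"  # assumption: key-based auth

def search_docs(query: str, limit: int = 5) -> list[dict]:
    """Search Google's developer docs; return hits as plain dicts."""
    resp = requests.get(
        f"{API_BASE}/docs:search",  # hypothetical route
        params={"query": query, "pageSize": limit, "key": API_KEY},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("results", [])

for hit in search_docs("implement push notifications in Firebase"):
    # Assumes each hit carries a title, a URL, and a Markdown body.
    print(hit.get("title"), "-", hit.get("url"))
```

If something this boring covers your use case, that’s the point: the value is in the indexed, fresh docs, not in the protocol wrapped around them.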

That said, if you’re building AI-powered developer tools for less technical users, or you want a natural language interface to Google’s docs, this is one to keep on your radar. The docs are live and the API is in public preview.


Source: Hacker News | Original Article

“The more context you give an LLM, the better it performs.” That’s what we thought anyway.

Tencent’s HY Research just dropped a paper that says maybe not. In-context learning - the whole “here’s some examples, figure out the pattern” thing - turns out to be a lot messier than the hype suggested.

The paper looks at how LLMs actually learn from in-context examples versus how we assumed they would. The gap between “should work in theory” and “works in practice” is apparently pretty wide.

Look, in-context learning was always oversold. People treated it like you could just dump a few examples and the model would magically get it. But that’s not how it shakes out. Performance is inconsistent. It varies by model. Sometimes adding more examples makes things worse.
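You can see this for yourself with the cheapest possible experiment: hold the task fixed and vary only the examples. A minimal sketch below - the prompt format is just one reasonable choice, and the scoring function is a stub you’d replace with a real model call and a held-out eval set:

```python
import itertools
import random

# Three labeled examples for a toy sentiment task.
SHOTS = [
    ("The battery lasts all day.", "positive"),
    ("It broke within a week.", "negative"),
    ("Setup took five minutes.", "positive"),
]

def build_prompt(shots, query: str) -> str:
    """Format few-shot examples plus the query into a single prompt."""
    lines = [f"Review: {text}\nLabel: {label}" for text, label in shots]
    lines.append(f"Review: {query}\nLabel:")
    return "\n\n".join(lines)

def score(prompt: str) -> float:
    """Stub: call your model here and return accuracy on a held-out set."""
    return random.random()  # placeholder so the sketch runs end to end

# Same three examples, every ordering. If in-context learning were robust,
# these scores would be near-identical. In practice they often aren't.
for perm in itertools.permutations(SHOTS):
    prompt = build_prompt(perm, "The screen scratches easily.")
    print(f"{score(prompt):.3f}", "<-", [label for _, label in perm])
```

If reordering the same examples moves your numbers, you’re not measuring the model’s understanding - you’re measuring prompt luck.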

This isn’t a knock on LLMs - they’re still genuinely useful. But the narrative that context is a free lunch? That narrative needs to die.

The real takeaway: if you’re building something that depends on consistent behavior, don’t lean too hard on in-context magic. Fine-tuning or RAG is probably your friend.


Source: Hacker News | Original Article

“Something like Raspberry Pi, but without the overhead of a full server-grade OS.”

BreezyBox turns an ESP32-S3 into a tiny instant-on PC with its own shell, editor, compiler, and app installer. No Linux, no filesystem bloat, no boot time. Just FreeRTOS and a hand-rolled text mode driver running ANSI demos at 30 FPS on a display the chip probably shouldn’t be able to drive.

The ESP32-S3 has the resource constraints of a DOS-era PC and the coding experience to match. You write code, you compile it on-device, you run it. The elf_loader handles dynamic linking. The app installer pulls compatible ELF files from any git repo, no app store, no approvals, no waiting.

It’s the kind of project that makes you wonder why we bother with full operating systems for so many things.


Source: Hacker News | Original Article

“Our vision is to make Civ3 as it could have been, rebuilt for today’s modders and players: removing arbitrary limits, fixing broken features, expanding mod capabilities, and supporting modern graphics and platforms.”

The Civ3 fan community built OpenCiv3 in Godot, and it’s actually playable now. The v0.3 “Dutch” preview just dropped with standalone mode, so you don’t even need the original files to try it. You get placeholder graphics instead, which is a fair trade for not having to track down a 25-year-old CD key.

What makes this interesting is the scope. They’re not just modding Civ3, they’re rebuilding it with modern tooling while keeping everything that made the original tick. The Godot Engine choice is smart - cross-platform by default, open source, and actually good for 2D games. They’re fixing the arbitrary limits Firaxis never got around to, expanding what mods can do, and making it run on anything with a 64-bit processor.

If you’ve ever wanted to see what Civ3 could have been with another decade of development, this is as close as it gets.

Civ3 was and is one of my favorite games of all time. I’ve spent countless hours conquering the world, one turn at a time. The combination of strategic depth, the culture system, and those incredible tile graphics still holds up. I’m looking forward to checking this out and seeing how close OpenCiv3 gets to recapturing that magic with modern tooling.

Fan projects like this are the best argument for open source. Civilization III is a great game trapped in 2001 tech, and the community is doing what the original developers never could - giving it a proper modernization without killing the soul of the game. The standalone mode with placeholder graphics is brilliant for accessibility. Not everyone has a working copy of a 25-year-old PC game lying around. This is what preserving gaming history looks like in 2026.


Source: Hacker News | Original Article

“Agentic coding supercharges productivity and creativity, streamlining the development workflow so developers can focus on innovation.”

Apple dropped Xcode 26.3 with built-in support for Anthropic’s Claude Agent and OpenAI’s Codex. This isn’t just another Copilot competitor, it’s a fundamental shift in how Xcode approaches the development workflow. Agents can now search documentation, explore file structures, update project settings, and verify their work visually through Xcode Previews.

The key detail is the Model Context Protocol integration. By exposing Xcode’s capabilities through MCP, Apple isn’t locking developers into Claude or Codex. Any compatible agent can plug in. That’s the right move, and it’s how you build a platform rather than a feature.

And honestly? Agentic coding has been a real win. The productivity gains are there, once you get past the initial “wait, the AI is writing my code” weirdness. Apple’s approach of building it directly into Xcode, rather than making you configure external tools, is exactly how this should work. Yeah, Apple moves at their own pace, and the AI industry is moving fast. But Apple catching up here is a good thing for developers who live in their ecosystem. The best tool is the one you actually use, and making agentic coding part of the default Xcode experience means more developers will actually use it.


Source: Apple Newsroom

“I kept finding myself using a small amount of the features while the rest just mostly got in the way.”

A solo dev spent four years building Vecti, a design tool that deliberately skips everything you don’t need. No collaborative whiteboarding. No plugin ecosystem. No enterprise features. Just pixel-perfect grid snapping, a performant canvas, shared assets, and export options.

The pitch is simple: tools like Figma have grown into platforms with feature matrices that rival enterprise software. For solo designers or small teams who just want to make things, that’s overhead, not value. Vecti is the counterargument - build exactly what you use and nothing more.

The privacy angle is nice too. Hosted in the EU, basic analytics only, zero tracking inside the app. In a world where every tool wants to instrument your every move, that matters.


Source: Hacker News | Original Article

“The Waymo World Model is a frontier generative model that sets a new bar for large-scale, hyper-realistic autonomous driving simulation.”

Waymo has built a generative world model on top of Genie 3 from Google DeepMind, and the results are genuinely wild. We’re talking simulations of tornadoes, elephants, flooded cul-de-sacs, and T-Rex costumes. The kind of edge cases that would take millions of real miles to encounter, now generated on demand.

What makes this interesting isn’t just the novelty. It’s the architecture. Genie 3 gives them broad world knowledge from training on massive video datasets, and Waymo adapted it for their specific lidar and camera hardware. The controllability is the real magic: language prompts to change weather, driving inputs for counterfactual scenarios, scene layouts to place traffic exactly where you want it.

The scale is worth noting too. Waymo’s driven nearly 200 million autonomous miles in the real world, but they’re now simulating billions more in virtual environments. That’s the advantage of world models over traditional simulation approaches, which struggle with rare events. If you can generate an elephant crossing your path because the model understands what elephants are and how they move, you’ve solved the long-tail problem in a way that pure data collection never could.


Source: Hacker News | Original Article

“GitHub Actions is not good. It’s not even fine. It has market share because it’s right there in your repo, and that’s about the nicest thing I can say about it.”

This is a brutal takedown from someone who has used every CI system under the sun, from Jenkins to CircleCI to Buildkite and back again. The author has the scars and the credibility to make the case that the most popular CI tool in the world is actually a productivity vampire in disguise.

The log viewer alone sounds like a nightmare. Browser crashes, scrollbars that don’t scroll, loading spinners that lead to more loading spinners. After years of dealing with GitHub Actions’ UI quirks, it’s cathartic to see someone articulate exactly why it feels so broken. The DMV bureaucracy analogy lands.

But here’s where it gets interesting. The author isn’t just complaining, they’re pointing at Buildkite as the answer. And honestly? They’re right about the compute piece. When an entire cottage industry exists just to solve “GitHub Actions is slow,” that’s a signal, not noise. Multiple startups are profitable purely because the default option is inadequate. Let that sink in.

The YAML expression language critique is also spot on. We’ve all written ${{ }} expressions that failed for reasons that made no sense, waited four minutes for a runner to spin up, only to discover a missing quote ate our entire string. This is what debugging CI looks like in 2026.

The bash script trap is a particular favorite. Every team hits this moment where the CI config gets so complicated that someone says “what if we just wrote a shell script?” and the answer is always the same: you didn’t escape CI, you just built a worse CI in bash. No tests, no guardrails, just spaghetti with set -euo pipefail.
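If you do pull logic out of YAML, the escape hatch doesn’t have to be bash. A minimal sketch of a CI step as a plain script you can actually unit-test - the specific commands are illustrative, nothing here is from the article:

```python
#!/usr/bin/env python3
"""A CI step as a testable script instead of YAML-embedded bash."""
import subprocess
import sys

def run(cmd: list[str]) -> None:
    """Run a command, echoing it first; fail the build on any error."""
    print("+", " ".join(cmd), flush=True)
    subprocess.run(cmd, check=True)

def main() -> int:
    run(["python", "-m", "pytest", "--maxfail=1"])
    run(["python", "-m", "build"])  # assumes the 'build' package is installed
    return 0

if __name__ == "__main__":
    try:
        sys.exit(main())
    except subprocess.CalledProcessError as err:
        # A real exit code and a real error message, not a swallowed failure.
        print(f"step failed with exit code {err.returncode}", file=sys.stderr)
        sys.exit(err.returncode)
```

The point isn’t the specific commands. It’s that logic in a real language gets tests, linters, and stack traces - things set -euo pipefail will never give you.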

Look, GitHub Actions won because it’s convenient, not because it’s good. Free for public repos, built into the platform everyone already uses, Good Enough for most teams. But if you’re running a real production system with real build times, the question worth asking is whether the convenience is worth the cumulative cost. The author makes a compelling case that it isn’t.


Source: Hacker News | Original Article

“Today, Apple is proud to report a remarkable, record-breaking quarter, with revenue of $143.8 billion.”

Okay, we are writing about this a little late. Apple announced these results on January 29, 2026. But the numbers are worth revisiting.

Apple posted $143.8 billion in revenue, up 16 percent year over year. Diluted EPS of $2.84, up 19 percent. These are not typos. That is the scale Apple operates at.

iPhone had its best quarter ever. All-time records across every geographic segment. Every single one. When people say iPhone sales are slowing, you would not know it from these numbers. The installed base of over 2.5 billion active devices keeps growing.

Services hit an all-time revenue record too, up 14 percent year over year. This is the part that keeps investors happy - recurring revenue that keeps giving. App Store, iCloud, Apple Music, Apple TV+, Apple Pay. The ecosystem keeps expanding.

Tim Cook said it best - this is a testament to incredible customer satisfaction. When you build products that work together, people stay. They upgrade within the ecosystem. They buy more devices. They subscribe to services.

The outlook remains strong. Apple has navigated tariffs, antitrust pressure, and market uncertainty better than most. The hardware still sells. The services keep growing. The margins stay healthy.

Sometimes late is better than never. These numbers are worth noting. Apple keeps doing what Apple does best - shipping products people actually want to buy.


Source: Hacker News | Original Article

“I do agree, I don’t know why more people don’t just use Postgres. If I’m doing data exploration with lots of data (e.g., GIS, nD vectors), I’ll just spin up a Postgres.app on my macOS laptop, install what little I need, and it just works and is plenty fast for my needs.”

This echoes what a lot of us have been saying for years. Postgres just works. It is the database you want when you actually need a database. Not some shim layer that adds indirection. Not an abstraction that hides what your database can do. Just Postgres.

The ecosystem around Postgres is ridiculous now. Full-text search. JSON support. Vector search. Time-series data. Spatial queries. Replication that actually works. Extensions for days. pg_cron for scheduled jobs. It is not just a relational database anymore - it is a platform.
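To make the “platform” point concrete, here’s a minimal sketch of two of those features - JSONB and full-text search - driven from Python with psycopg2, assuming a local Postgres with a database named demo:

```python
import psycopg2

# Assumes a local Postgres instance with a database named "demo".
conn = psycopg2.connect("dbname=demo")
cur = conn.cursor()

# JSONB: schemaless data inside your relational database.
cur.execute("""
    CREATE TABLE IF NOT EXISTS events (
        id serial PRIMARY KEY,
        payload jsonb NOT NULL
    )
""")
cur.execute(
    "INSERT INTO events (payload) VALUES (%s::jsonb)",
    ('{"kind": "signup", "plan": "pro"}',),
)

# Full-text search: no external search engine required.
cur.execute("""
    SELECT payload
    FROM events
    WHERE to_tsvector('english', payload::text)
          @@ plainto_tsquery('english', 'signup')
""")
print(cur.fetchall())

conn.commit()
cur.close()
conn.close()
```

Two “you’d normally reach for another system” features, one database, zero extra infrastructure.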

The performance is there too. Query optimizer that actually knows what it is doing. Index types for every use case. Partitioning that does not require a PhD to understand. Materialized views for caching complex queries. The list goes on.

Look, I get it. Some people love their document stores. Some people swear by key-value databases. Some people think their specialized time-series database is somehow better at time-series than Postgres with the Timescale extension. And you know what? They are usually wrong.

Pick your poison. Oracle with its licensing nightmares. MySQL with its quirky replication. MongoDB with its eventual consistency surprises. Or Postgres - open source, rock solid, actually maintained, and used by everyone who knows what they are doing.

The tooling is everywhere. ORMs support it. GUIs support it. Migration tools support it. Your ops team probably already knows how to run it. Your backups are already configured for it.

Sometimes the simple answer is the right answer. Postgres is not flashy. It does not chase trends or need a hype cycle. It just stores your data and does it well.


Source: Hacker News | Original Article

“Agent teams let you coordinate multiple Claude Code instances working together.”

Anthropic dropped agent teams for Claude Code and it is an interesting shift. One session acts as the team lead, coordinating work, assigning tasks, and synthesizing results. Teammates work independently, each in its own context window, and communicate directly with each other.

The use cases they highlight are compelling. Research and review where multiple teammates investigate different aspects simultaneously. Debugging with competing hypotheses tested in parallel. Cross-layer coordination spanning frontend, backend, and tests. Each teammate owns a separate piece without stepping on each other.

The comparison with subagents is useful. Subagents report back to the main agent only. Agent teams let teammates message each other directly. Subagents are cheaper on tokens. Agent teams add coordination overhead but work best when teammates can operate independently.

Display modes matter too. In-process runs inside your main terminal with Shift+Up/Down to select teammates. Split panes show everyone at once and require tmux or iTerm2. You can specify the model for each teammate and require plan approval before implementation.

For complex tasks, delegate mode restricts the lead to coordination-only tools. No code directly, just spawning, messaging, shutting down teammates, and managing tasks. It keeps the lead focused on orchestration.

This feels like the next step in agentic workflows. Not just one model doing work, but multiple models working together and talking to each other. The parallel exploration angle is particularly interesting for research and review tasks. I have been using subagents with Opus 4.5 and they have been working well for focused tasks. Agent teams feel like the natural next evolution - taking what works about parallel agentic work and scaling it up. Having multiple perspectives working on a problem at once, sharing findings, and converging on answers. That is where things get interesting.


Source: Hacker News | Original Article

“We’re introducing a new model that unlocks even more of what Codex can do: GPT‑5.3-Codex, the most capable agentic coding model to date.”

OpenAI dropped GPT-5.3-Codex and it is wild. The model is 25% faster than its predecessor and it built itself. The Codex team used early versions to debug training, manage deployment, and diagnose evaluations. They say they were blown away by how much it accelerated its own development.

The benchmarks are impressive too - new state of the art on SWE-Bench Pro and Terminal-Bench 2.0. It can take on multi-day projects, building complex games and apps from scratch, iterating autonomously over millions of tokens. The videos they shared show it building fully functional games with just a few prompts.

What stands out is the agentic shift. This is not just a coding model anymore. It can debug, deploy, monitor, write PRDs, run tests, and manage GPU clusters. The gap is moving from what agents can do to how easily humans can work with them. Real-time interaction, steering, and feedback while it works. Much like a colleague.

The cyber safety side is interesting as well. They classify this as the first model with High capability for cybersecurity under their framework. They are being precautionary about it. Defensive use cases get a lot of emphasis.

GPT-5.2-Codex has been tough to use - an overall great model that has had performance issues. The fixes over the last couple of days seemed promising, but now with 5.3-Codex they may not mean much. I am looking forward to digging into this model as well. I will report back soon with more details on 5.3-Codex, Opus 4.6, and some real-world comparisons between them.


Source: Hacker News | Original Article

“Across agentic coding, computer use, tool use, search, and finance, Opus 4.6 is an industry-leading model, often by a wide margin.”

Anthropic dropped Opus 4.6 and the benchmarks are eye-opening. 144 Elo points ahead of GPT-5.2 on economic reasoning tasks. 190 points ahead of Claude Opus 4.5. On terminal-based coding tasks, it scored highest in the industry. The numbers tell a clear story - the frontier keeps moving.

What caught my attention is the practical stuff. One million token context window. Agent teams that work in parallel. Context compaction that summarizes conversations automatically so you don’t hit limits. These aren’t just benchmark wins - they’re real improvements for anyone actually using these tools day to day.

The safety side is worth noting too. They say Opus 4.6 is as well-aligned as their previous best model, with lower rates of over-refusals. The model actually answers more queries while staying aligned. That’s the balance everyone is trying to hit.

I’ve been using Opus 4.5 heavily and really enjoying the results. It has been my go-to model for some time now. I am looking forward to digging into Opus 4.6 and seeing what has changed first hand.


Source: Hacker News | Original Article

“VGA is a signaling protocol that maps almost exactly 1:1 with what a CRT actually does.”

Someone built a custom display adapter from scratch to drive an arcade CRT. Not because they had to, but because they wanted 24-bit colour instead of the 18-bit mess you get from off-the-shelf VGA adapters. Sometimes you just gotta build it yourself.

The journey is classic hardware hacker fare. Started with an RP2040, wrote PIO assembly for precise VGA timing, hit the USB bandwidth wall, upgraded to an STM32, discovered the chip needed an external PHY, redesigned the whole board, bodged on a resistor to stabilize the crystal, and drilled out a via that shorted the ground plane. You know, the usual.
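The “maps almost exactly 1:1 with what a CRT actually does” point is easy to see in the numbers. Here’s the standard 640x480@60 VGA timing worked out in a few lines - every parameter below is part of the published VESA timing, not something the adapter gets to choose:

```python
# Standard 640x480@60 VGA timing: visible area plus front porch,
# sync pulse, and back porch, in pixels (horizontal) and lines (vertical).
h_visible, h_front, h_sync, h_back = 640, 16, 96, 48
v_visible, v_front, v_sync, v_back = 480, 10, 2, 33

h_total = h_visible + h_front + h_sync + h_back   # 800 pixel clocks per line
v_total = v_visible + v_front + v_sync + v_back   # 525 lines per frame

pixel_clock = 25.175e6  # Hz, the canonical VGA dot clock
line_rate = pixel_clock / h_total                 # ~31.47 kHz horizontal
frame_rate = line_rate / v_total                  # ~59.94 Hz vertical

print(f"{line_rate/1e3:.2f} kHz horizontal, {frame_rate:.2f} Hz vertical")
# Miss these timings by a few cycles and the beam is in the wrong place -
# which is why the PIO assembly has to be cycle-exact.
```

There’s no framebuffer abstraction saving you here. The signal is the electron beam’s schedule, which is exactly why this project is hard and exactly why it’s fun.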

What I love is the ending. After all that, they got it working and the first thing they noticed was that the colour banding was gone. Sometimes the smallest improvements feel the biggest. The RCade at Recurse Center now looks properly amazing.


Source: Hacker News | Original Article

“I occasionally envy the retro gamers on YouTube with an entire wall full of such physical media. But do you know what I like more than collecting? Playing! Anywhere. Anything. Anytime.”

DHH tried GeForce NOW again recently. Used to think it was garbage. Now? “Holy smokes!!” That’s the quote.

Here’s the thing - he grew up on cassettes, floppies, cartridges. The whole physical media nostalgia trip. But he’s over it. Streaming won for music and movies, and now it’s finally winning for games. Netflix stumbled, Google Stadia was too early, but NVIDIA kept shipping.

Fortnite at 2880x1800, 120 fps, on a remote 4080. That’s the pitch. Input lag exists but it’s shockingly playable. Even for competitive shooters.

What’s cool is he’s also setting up local streaming with Apollo and Moonlight. Turn an old gaming PC downstairs into a cloud you can access from anywhere in the house. His laptop pulls 18 watts, stays cool and silent, while pushing ultra settings.

This isn’t some tech bro fantasy either. He’s doing it with the kids. Lounging on the couch, iPad gaming, now upgraded to remote 4090 action.

The Omarchy integration is coming too. Install > Gaming > NVIDIA GeForce NOW. Just works.

I dig the practicality here. Not arguing about ownership philosophically. Just saying streaming won because it’s cheaper and easier. And for gaming? It’s finally actually good.


Source: DHH Blog

“If the basic file structure or cross-reference information is incorrect, various software might draw different conclusions.”

The PDF Association dropped a technical deep dive on the Epstein PDFs released by the DoJ. Here’s the thing - these files are showing up on malware analysis sites with garbage analysis floating around. Someone had to actually look at this stuff properly.

The bottom line? DoJ actually did the redaction right on these ones. The PDFs in Datasets 01-07? No recoverable hidden text. The “revealed secrets” going viral on Twitter? They’re looking at completely different files that weren’t part of this release.

Some interesting finds though. Only one minor defect across 4,000+ PDFs - a font descriptor value issue that’s basically a rounding error. The files are technically clean. The version numbers are all over the place, which says something about what the DoJ is running on their end.

But here’s what caught my attention. The DoJ has messed up redactions in OTHER cases. Like the JPMorgan Chase case and some other documents they released separately. Those have the lazy black box problem where you can copy-paste the hidden text right out. So they’re capable of both good and bad redaction work. Which is weird.

Look, I’m not here to comment on the politics. But the PDF forensics are genuinely interesting. The difference between “properly redacted” and “looks redacted but isn’t” matters. And it turns out most of the viral “bombshell” claims about recoverable text are just misinformation.
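The “looks redacted but isn’t” failure is easy to check for yourself. A minimal sketch using pypdf - the file path and search term are illustrative. If extract_text() returns strings the black boxes were supposed to cover, the redaction was cosmetic:

```python
from pypdf import PdfReader

# If a redaction only drew a black rectangle over the text, the text
# objects are still in the content stream and extraction recovers them.
reader = PdfReader("released_document.pdf")  # path is illustrative

for page_number, page in enumerate(reader.pages, start=1):
    text = page.extract_text() or ""
    # Search for a term you expect to be redacted (illustrative term).
    if "confidential" in text.lower():
        print(f"page {page_number}: supposedly-redacted text is recoverable")
```

Proper redaction removes the text objects entirely before flattening. That’s what the PDF Association found in Datasets 01-07, and why the viral claims don’t hold up.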

The technical details are worth a read if you’re into that sort of thing. The PDF Association knows their stuff.


Source: Hacker News | Original Article

“If you want to control your own destiny, you must run your own compute.”

Comma.ai runs their own data center. Not renting. Not leasing. Owning. $5M worth of hardware sitting in their office, 600 GPUs humming away, 4PB of storage, the whole nine yards.

Why? Because cloud providers make onboarding easy and offboarding hard. You sleepwalk into high costs with no way out. And honestly? Maintaining a data center forces better engineering. You’re dealing with watts and FLOPs instead of billing system APIs.

The numbers are wild. $5M spent on the data center. $25M+ would have been the cloud equivalent. That’s not chump change.

There’s something refreshing about this. Self-reliance that actually makes economic sense instead of just vibes. They even built their own servers in-house because it was cheaper and they could fix things themselves.

Look, not everyone can do this. Most companies shouldn’t. But if you’re running compute-heavy workloads and the numbers pencil out? The cloud convenience tax is real. Building your own infrastructure isn’t nostalgia - it’s sometimes just cheaper.

The piece is worth reading for the technical details alone. Outside air cooling in San Diego. 450kW of power. Custom training frameworks. Open-sourced tools like miniray for distributed computing. These guys actually ship.

I’ll take “build it yourself when it makes sense” over “rent everything and hope vendor lock-in doesn’t hit us later” any day.


Source: Hacker News | Original Article

“After replacing it with the new one, Samsung 980 1TB, I put the old one on sale.”

This post covers “How not to securely erase a NVMe drive” (2022): the author swapped in a new Samsung 980, put the old drive up for sale, and needed to make sure nothing on it was recoverable first. That turns out to be harder than it sounds. On SSDs, wear leveling and overprovisioning mean an overwrite pass never reliably touches every flash cell, so dd- and shred-style wipes give you a false sense of security.

The lesson generalizes: the right tool is the drive’s own firmware. An NVMe Format with secure erase, or the Sanitize command, tells the controller to destroy the data (or its encryption key) everywhere it actually lives - not just at the addresses the OS can see.
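For the “do this instead” side, a minimal sketch - assuming Linux, nvme-cli installed, and that /dev/nvme0n1 is the old drive. Double-check the device node before running anything like this, because a secure erase is irreversible:

```python
import subprocess

DEVICE = "/dev/nvme0n1"  # assumption: the drive being sold, NOT your boot disk

def run(cmd: list[str]) -> None:
    """Echo and run a command, failing loudly on error."""
    print("+", " ".join(cmd), flush=True)
    subprocess.run(cmd, check=True)

# Inspect the controller before erasing (nvme-cli must be installed).
run(["nvme", "id-ctrl", DEVICE])

# NVMe Format with Secure Erase Settings = 1 (erase all user data).
# --ses=2 requests a cryptographic erase on self-encrypting drives.
run(["nvme", "format", DEVICE, "--ses=1"])
```

A plain nvme format with --ses=1 at the terminal does the same thing; the wrapper just makes the steps explicit.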


Source: Hacker News | Original Article

“They could have charged $500 more per device and people would have paid it.”

Mac Minis are selling out everywhere - not for Final Cut or Logic, but for running AI agents. OpenClaw, the open-source framework that lets Claude or GPT-5 actually control your computer, has become the killer app for Apple hardware. The author argues this is exactly what Apple Intelligence should have been - an agentic AI that automates your workflows instead of just summarizing notifications. Apple had everything: hardware, ecosystem, and decades of trust that could have justified charging premium prices for genuine automation.

The missed opportunity is staggering. Apple could have owned the agent layer - the API layer that platforms need to integrate with. They had all your data, all your apps, all your devices. An agent that works seamlessly across iPhone, Mac, iPad, and Watch would have created an insurmountable moat. Instead, they’re watching third parties capture the platform revenue while Apple settles for hardware margins.

This is what happens when you optimize for this quarter’s legal risk instead of the next decade’s platform power. Apple built trust over decades, then let someone else use it. The Mac Mini rush is a preview of the future - people want agents, they’re willing to pay, and they’re buying Apple hardware to run someone else’s AI. Classic Apple - capturing the hardware revenue while missing the bigger prize.

But Apple isn’t out of the game yet. They still have the best hardware, the tightest ecosystem, and most importantly - the trust that comes from decades of “it just works.” They could acquire, partner, or build their way back to the agent layer. The moat isn’t gone - it’s just being rented out to someone else for now. Apple has recovered from bigger mistakes before.


Source: Hacker News | Original Article

“Unlike approaches that adapt offline models by processing audio in chunks, Realtime uses a novel streaming architecture that transcribes audio as it arrives.”

Mistral has released Voxtral Transcribe 2, a two-model family delivering state-of-the-art transcription with speaker diarization and configurable latency as low as 200ms. The batch model (Voxtral Mini) targets offline transcription at $0.003/min with ~4% word error rate, while Voxtral Realtime is optimized for live voice agents under Apache 2.0 open weights. Both support 13 languages and enterprise features like context biasing for domain-specific vocabulary.

What makes this significant is the sub-200ms latency at near-offline accuracy - a breakthrough for voice-first applications. Most transcription APIs still process in chunks, creating lag that breaks conversational flow. Mistral’s streaming architecture fundamentally changes what’s possible for real-time AI agents, enabling truly natural voice interactions without awkward pauses.


Source: Hacker News | Original Article

“Everything we hear is an opinion, not a fact. Everything we see is a perspective, not the truth.” - Marcus Aurelius

Welcome. This is a blog about technology, artificial intelligence, and whatever else catches our attention throughout the day. We’re not here to churn out hot takes or chase engagement. We’re here to think clearly, state opinions directly, and occasionally find something worth sharing.

We believe in good tools. macOS for getting real work done. GitHub because it’s still the gold standard for developer collaboration. Ruby and Rails because sometimes the simple way is the best way. We appreciate craftsmanship - whether it’s DHH shipping hot reload in 37signals products or Apple building hardware that just works.

We’ll be skeptical of hype, suspicious of ideology masquerading as tech analysis, and consistently pro-America because building things here still matters. This space will cover AI agents, automation, development workflows, and the occasional deep dive into something interesting we found. No ads, no tracking, just posts.

Thanks for reading.