Good morning, everyone, and welcome to the second of my regular columns that keep you up to speed with the latest in artificial intelligence news, with a focus on informing novices and laypeople. Over the past 86 weeks, I’ve manually organized 29,700 AI headlines per week into 45 categories. This has given me a spidey sense of what’s happening, and I’m excited to share it with you.
We’re in a chapter of technology where the whole will become greater than the sum of its parts. It’s like the early days of the Internet. While it’s tempting to seek surgical examples of “AI for journalism”, it’s equally important to understand the big picture so you can see and prepare for opportunities and challenges as they emerge.
Recent major AI headlines with broad impact across industries
Anthropic’s new models are out. They perform as well as some of the best human coders and can process entire enterprise code bases within their memory.
https://www.anthropic.com/news/claude-4
The release of Claude 4 Opus marks the first time my serious developer friends have told me that they see the potential to use AI for enterprise development. One friend told me that Opus is as good as a mid-career PhD-level computer programmer.
The building blocks for AI agents to make a real impact have been put in place. I think 2025 is going to be the year that people start losing jobs in entry-level operational customer service roles, from health to finance to law.
Google has launched an Internet agent that can manage up to 10 web-based tasks at the same time, including things like booking flights or making restaurant reservations. Even though it runs inside a web browser, a system that takes a request and goes out across the entire Internet to fulfill it might as well have no graphical interface at all. You just ask Google for what you want and Chrome goes and does it.
https://labs.google.com/mariner/landing
https://blog.google/products/search/google-search-ai-mode-update
Google’s Internet agent can also save tasks to repeat on a regular basis. For example, if you want to check real estate listings, instead of opening the apps, Google will check them every day for you and even schedule viewings of properties.
Anthropic has created the standard for interfacing artificial intelligence with existing systems. Zapier is now connected to Anthropic’s system and can connect to and automate over 8,000 applications without requiring development expertise.
https://www.anthropic.com/news/agent-capabilities-api
https://zapier.com/blog/zapier-mcp-claude-guide
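For readers curious what that standard looks like under the hood, here is a minimal sketch in Python. It assumes the standard in question is the Model Context Protocol (MCP), which exchanges JSON-RPC 2.0 messages; the tool name `send_email` and its arguments are hypothetical and only illustrate the shape of a request an AI model might send to a connected application.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP-style JSON-RPC 2.0 request asking a server to run one tool."""
    return {
        "jsonrpc": "2.0",          # protocol version marker required by JSON-RPC
        "id": request_id,          # lets the caller match the reply to this request
        "method": "tools/call",    # MCP method name for invoking a tool
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    1,
    "send_email",
    {"to": "editor@example.com", "subject": "Draft ready"},
)

# This text is what would travel between the AI system and the application.
wire_text = json.dumps(request)
print(wire_text)
```

The point of the sketch is that every connected application receives requests in the same shape, which is why a single connector can reach thousands of apps without custom development for each one.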
AI search tool Perplexity reports that people are booking hotels directly through its AI search platform more and more every day. Hotel advertising is Google’s second-largest category.
A lot of companies are forcing each other’s hands. Perplexity is forcing Google to make its search more agentic. Meanwhile, Google introducing app usage is forcing Apple to integrate AI into its App Store. For a long time, I’ve said that Apple is sitting on a large action model that could destroy the App Store overnight.
If agents weren’t enough, interfaces are officially beginning to be disrupted. OpenAI acquired former Apple designer Jony Ive’s company for $6.5 billion. The rumor is that they are going to build a device that has no graphical user interface.
https://openai.com/sam-and-jony
As artificial intelligence models grow stronger, both Anthropic and Google have strengthened their security systems and raised their threat levels as measured by proximity to AGI (aka better than humans).
https://www.anthropic.com/news/activating-asl3-protections
https://deepmind.google/discover/blog/advancing-geminis-security-safeguards
OpenAI’s software engineering agent Codex continues to get rave reviews in its second week of public availability. It’s a genuine horse race between Google, Anthropic, and OpenAI. The consensus is that Anthropic’s Claude 4 Opus is the best; however, there are strong cases to be made for Google’s Gemini 2.5 (and its insane multimodal context window) as well as OpenAI’s o3.
https://openai.com/index/introducing-codex
Google has launched a new video generation model called Veo that has gone viral. It’s getting the same amount of buzz that GPT images received. However, Google’s video model costs ~$200 per month.
https://gemini.google/overview/video-generation
https://blog.google/technology/ai/google-flow-veo-ai-filmmaking-tool
Viral video: https://www.youtube.com/watch?v=-IUUCTiIIkc | https://www.instagram.com/reel/DJ9toydMnRW/?hl=en
Google quietly launched a tool that allows you to describe a web interface, and it will build the entire code for you.
Microsoft launched a science agent which discovered a new material in a few hours. Not only was the new material discovered, but scientists were able to synthesize the compound in the laboratory.
Nvidia continues to quietly launch robot training simulation models. There is a very good chance Nvidia leapfrogs everybody in robot training. The interesting twist is that Nvidia open-sources all of its models (that I know of).
https://developer.nvidia.com/isaac/gr00t
OpenAI has partnered with the United Arab Emirates to build the first international deployment of its massive Stargate AI infrastructure platform.
https://openai.com/index/introducing-stargate-uae
Minor recent AI headlines with real publisher impact
When possible, I try to find original sources outside of social media. Apologies in advance for leaning on tweets below. Sometimes that’s the only place I see these stories.
You can now clone any YouTube channel’s thumbnail style—and automate the process of generating thumbnails for your own videos in that same style. In this sneak peek, I show how I used the Agent Development Kit (ADK) to replicate Alex Hormozi’s exact thumbnail look using OpenAI’s
“You can now crawl entire websites and extract LLM-ready data with a single tool. Crawl4AI is an open-source repo built for AI agents, RAG, and data pipelines. It supports both browser-based and HTTP crawling, with real-time Markdown generation from any site.” https://x.com/LiorOnAI/status/1925930945137254629
I just generated a 5:30 min Multi-Speaker Podcast on Agentic Patterns using Gemini 2.5 Flash and our new Text-to-speech (TTS) Model! At I/O we launched native controllable Audio Generation for Gemini 2.5 Pro & Flash. > Controllable style, accent, pace, tone.
https://x.com/_philschmid/status/1925888544175734873
https://ai.google.dev/gemini-api/docs/speech-generation
NLWeb: This is a new open project that lets you use natural language to interact with any website. Think of it like HTML for the agentic web.
Microsoft releases NLWeb. NLWeb uses MCP to make it simple to interact with websites in a standardized way. Devs can now convert any website into an AI app. MCP is to NLWeb what HTTP is to HTML. This went largely unnoticed this week, but it looks like a big deal. https://x.com/omarsar0/status/1925900575666733207
A Step-by-Step Tutorial on Connecting Claude Desktop to Real-Time Web Search and Content Extraction via Tavily AI and Smithery using Model Context Protocol (MCP) In this hands-on tutorial, we’ll learn how to seamlessly connect Claude Desktop to real-time web search and https://x.com/Marktechpost/status/1918877427335622673
https://www.marktechpost.com/2025/05/03/a-step-by-step-tutorial-on-connecting-claude-desktop-to-real-time-web-search-and-content-extraction-via-tavily-ai-and-smithery-using-model-context-protocol-mcp/
Here’s a quick demo of searching, running and using the browser-tools MCP using OneMCP. https://x.com/Ipenywis/status/1921213033973772350
Apple to Open AI Models to Developers in Bid to Spur New Apps – Bloomberg https://www.bloomberg.com/news/articles/2025-05-20/apple-to-open-ai-models-to-developers-betting-that-it-will-spur-new-apps?embedded-checkout=true
Chicago Sun-Times publishes made-up books and fake experts in AI debacle | The Verge https://www.theverge.com/ai-artificial-intelligence/670510/chicago-sun-times-ai-generated-reading-list
