Big moment for Postgres!
A 100% open-source solution that fixes a major issue with AI coding agents.
3 days left before lifetime access price increases by 50%
In 3 days, the price of lifetime access to DailyDoseofDS will increase by 50%.
Secure your Lifetime Access here before the price increases →
Pay once and get lifetime access to our existing all-in-one hands-on AI Engineering blueprints + everything we’ll release in the future:
The 17-part course that covers how to build Agentic systems.
Our 18-part MLOps course that goes from first principles to production.
The full 9-part course on MCPs.
Our 7-part course on building RAG systems.
LLM fine-tuning techniques and implementations.
Our courses on graph neural networks, PySpark, model interpretability, model calibration, causal inference, and more.
Scaling ML models with implementations.
Building privacy-preserving ML systems.
Mathematical deep dives on core DS topics, clustering, etc.
From-scratch implementations of several core ML algorithms.
Building 100% reproducible ML projects.
50+ more existing industry-relevant topics.
You will get all 100+ existing resources plus every new weekly deep dive for life.
Secure your Lifetime Access at a discount here →
P.S. Our last sale was over 12 months ago. We don’t do Black Friday. We don’t do Cyber Monday. This discount disappears in 3 days, and we have no plans to offer it again.
Join 100k+ people we’ve helped get promoted, land a better job, or start their own company →
P.P.S. If you are an existing monthly or yearly member and wish to upgrade to lifetime, please reply to this email.
Big moment for Postgres!
AI coding tools have been surprisingly bad at writing Postgres code.
Not because the models are dumb, but because of how they learned SQL in the first place.
LLMs are trained on the internet, which is full of outdated Stack Overflow answers and quick-fix tutorials.
So when you ask an AI to generate a schema, it gives you something that technically runs but misses decades of Postgres evolution, like:
No GENERATED ALWAYS AS IDENTITY (added in PG10)
No expression or partial indexes
No NULLS NOT DISTINCT (PG15)
Missing CHECK constraints and proper foreign keys
Generic naming that tells you nothing
But this is actually a solvable problem.
You can teach AI tools to write better Postgres by giving them access to the right documentation at inference time.
This is exactly what the newly released pg-aiguide by Tiger Data does. It’s an open-source MCP server that gives coding tools access to 35 years of Postgres expertise.
In a nutshell, the MCP server provides:
Semantic search over the official PostgreSQL manual (version-aware, so it knows the differences between PG14 and PG17); a toy sketch of this idea follows the list.
Curated skills with opinionated best practices for schema design, indexing, and constraints.
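To make the retrieval idea concrete, here is a minimal, hypothetical sketch of version-aware search over documentation chunks in Python. This is not pg-aiguide’s implementation: the `docs` list and `search_docs` function are made up for illustration, and TF-IDF stands in for real embeddings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical documentation chunks, each tagged with the Postgres version that introduced the feature.
docs = [
    {"version": 10, "text": "GENERATED ALWAYS AS IDENTITY is the standard way to define auto-incrementing keys."},
    {"version": 15, "text": "UNIQUE NULLS NOT DISTINCT treats NULLs as equal in unique constraints."},
    {"version": 17, "text": "JSON_TABLE converts JSON data into a relational view that can be queried like a table."},
]

def search_docs(query: str, target_version: int, top_k: int = 2):
    """Return the chunks most relevant to the query that apply to the target Postgres version."""
    candidates = [d for d in docs if d["version"] <= target_version]  # version-aware filtering
    matrix = TfidfVectorizer().fit_transform([d["text"] for d in candidates] + [query])
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    return [doc["text"] for _, doc in ranked[:top_k]]

# The retrieved chunks would be injected into the coding agent's context at inference time.
print(search_docs("how do I define an auto-incrementing primary key?", target_version=17))
```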
We worked with the Tiger Data team to put this together and ran an experiment with Claude Code to see how well it works.
Prompt: “Generate a schema for an e-commerce site twice: once with the MCP server disabled and once with it enabled. Finally, run an assessment to compare the generated schemas.”
The run with the MCP server led to:
420% more indexes (faster queries, especially on filtered and computed columns)
235% more constraints (bad data gets rejected at the database, not discovered in production)
60% more tables (less redundancy, easier to maintain and extend)
11 automation functions and triggers (business logic enforced consistently, less app code)
Modern PG17 patterns throughout (future-proof and optimized for current Postgres)
The MCP-assisted schema had proper data integrity, performance optimizations baked in, and followed naming conventions that actually make sense in production.
pg-aiguide works with Claude Code, Cursor, VS Code, and any MCP-compatible tool.
It’s free and fully open source.
You can find the GitHub repo here →
Thanks for reading!
DeepSeek just fixed one of AI’s oldest problems (using a 60-year-old algorithm)
When deep learning took off around 2012-2013, researchers hit a wall. You couldn’t just stack layers endlessly because gradients either exploded or vanished.
So training deep networks was nearly impossible.
ResNets solved this in 2016 with residual connections:
output = input + what the layer learned
That “+” creates a direct highway for information. This is why we can now train networks with hundreds of layers.
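In code, a residual block is just one extra addition. Here’s a toy NumPy sketch (not the original ResNet block; `layer` is a stand-in for any learned transformation):

```python
import numpy as np

def layer(x, W):
    """A toy learned transformation: a linear map followed by ReLU."""
    return np.maximum(0.0, x @ W)

def residual_block(x, W):
    # output = input + what the layer learned; the "+" is the identity highway
    return x + layer(x, W)

x = np.random.randn(4, 8)          # a batch of 4 activations, width 8
W = 0.1 * np.random.randn(8, 8)    # the layer's weights
out = residual_block(x, W)
print(out.shape)                   # (4, 8): same shape as the input, so blocks stack cleanly
```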
Recently, researchers asked: What if we had multiple highways instead of one?
Hyper-Connections (HC) expanded that single lane into 4 parallel lanes with learnable matrices that mix information between streams.
The performance gains were real, but this approach had a problem of its own:
Those mixing matrices compound across layers. A tiny 5% amplification per layer becomes 18x after 60 layers. The paper measured amplification reaching 3000x, which led to a training collapse.
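That 18x figure is just compounding at work:

```python
# 5% amplification per layer, compounded over 60 layers
print(1.05 ** 60)  # ≈ 18.7, i.e. roughly 18x
```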
The usual fixes were gradient clipping and careful initialization (and hoping things worked out), but these are hacks that don’t scale.
DeepSeek went back to first principles. What mathematical constraint would guarantee stability?
And the answer was sitting in a 1967 paper: the Sinkhorn-Knopp algorithm.
It forces each mixing matrix to be “doubly stochastic”, a matrix with three properties (a minimal sketch of the algorithm follows this list):
All entries are non-negative
Each row sums to 1
Each column sums to 1
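The Sinkhorn-Knopp algorithm itself is remarkably simple: alternately normalize the rows and the columns until both sum to 1. Here is a minimal NumPy sketch (illustrative, not DeepSeek’s training code):

```python
import numpy as np

def sinkhorn_knopp(M, n_iters=50):
    """Push a non-negative matrix toward a doubly stochastic one by
    alternately normalizing its rows and columns (Sinkhorn & Knopp, 1967)."""
    M = np.abs(M) + 1e-9                          # ensure all entries are non-negative
    for _ in range(n_iters):
        M = M / M.sum(axis=1, keepdims=True)      # make each row sum to 1
        M = M / M.sum(axis=0, keepdims=True)      # make each column sum to 1
    return M

mix = sinkhorn_knopp(np.random.rand(4, 4))        # a 4x4 mixing matrix for 4 streams
print(mix.sum(axis=1))                            # row sums    -> all ~1
print(mix.sum(axis=0))                            # column sums -> all ~1

# Because columns sum to 1, mixing preserves the total signal across streams:
streams = np.random.randn(4)
print(streams.sum(), (mix @ streams).sum())       # the two totals match (up to float error)
```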
And this small change led to:
3000x instability reduced to 1.6x
Stability guaranteed by math, not luck
Only 6.7% additional training overhead
The intuitive reason behind this is simple.
Think of it like shuffling money between 4 bank accounts. You can move funds around however you want, but the total must stay the same. So you cannot create money or destroy it.
Recall that in Hyper-Connections, multiple parallel streams of information need to mix and interact as they flow through the network.
The problem was that unconstrained mixing caused signals to amplify at each layer, compounding into that catastrophic 3000x explosion.
By forcing the mixing matrices to be doubly stochastic, DeepSeek ensured that information could flow freely between streams while the total signal energy stayed constant. The mixing and learning still happen, but the explosion becomes mathematically impossible.
This means that while signals can mix freely across streams, they can’t explode or vanish.
We will likely cover this paper, along with its mathematical details, in a future issue.
Until then, you can find this paper here →
Thanks for reading!









