Getting AI to write good SQL

423 points | 255 comments | 18 hours ago
wewewedxfgdf

Can I just say that Google AI Studio with the latest Gemini is stunningly, amazingly, game-changingly impressive.

It leaves Claude's and ChatGPT's coding looking like they are from a different century. It's hard to believe these changes are coming within a matter of weeks and months. Last month I could not believe how good Claude was. Today I'm not sure how I could continue programming without Google Gemini in my toolkit.

Gemini AI Studio is such a giant leap ahead in programming I have to pinch myself when I'm using it.

mykowebhn

I understand from a technical POV how this could be considered great news.

But I don't see how this is good news at all from a societal POV.

The last 15 or so years have seen an unprecedented rise in salaries for engineers, especially software engineers. This has drawn interest in the profession from people who would not normally have considered software as a career. I think this is both good and bad: it has brought newfound wealth to more people, but it may have also diluted the quality of the talent pool. That said, I think it was mostly good.

Now with this game-changing efficiency from these AI tools, I'm sure we've seen an end to the glory days in terms of salaries for the SW profession.

With this gone, where else could relatively normal people achieve financial independence? Definitely not in the service industry.

Very sad.

mark_l_watson

Nice! A little off topic, but I spent years experimenting with writing AI-like natural language wrappers for relational databases that would query metadata to get column names, etc. Peter Norvig, while doing a tech review for the second edition of my Java AI book, commented that the NLP database example was much better than anything else in the book, so the code I sweated over off and on for years was probably pretty good. BUT! Compared to what you can build with LLMs today, my old NLP wrappers aren't good at all.

LLMs make some things that were difficult very easy now.

Good article!

levocardia

In one of Stephen Boyd's lectures on convex optimization, he has some quip like "if your optimization problem is computationally intractable, you could try really hard to improve the algorithm, or you could just go on vacation for a few weeks and by the time you get back, computers will be fast enough to solve it."

I feel like that's actually true now with LLMs -- if some query I write doesn't get one-shotted, I don't bother with a galaxy-brain prompt; I just shelve it 'til next month and the next big OpenAI/Anthropic/Google model will usually crush it.

pcblues

Can someone please answer these questions because I still think AI stinks of a false promise of determinable accuracy:

Do you need an expert to verify whether the answer from AI is correct? How is time saved by refining prompts instead of SQL? Is it typing time? How can you know the results are correct if you aren't able to do it yourself? Why should a junior (sorcerer's apprentice) be trusted to be in charge of using AI? No matter the domain, from art to code to business rules, you still need an expert to verify the results. Would they (and their company) be in a better place designing a solution to the problem themselves, knowing their own assumptions? Or just checking off a list of happy-path results without FULL knowledge of the underlying design?

This is not just a change from hand-crafting to line production; it's a change from deterministic problem-solving to near-enough-is-good-enough, sold as the new truth in problem-solving. It smells wrong.

hakanito

The game changer for me will be when AI stops hallucinating SDK methods. I often find myself asking "show me how to do advanced concept X in somewhat niche SDK Y", and while it produces confident answers, 90% of the time it suggests SDK methods that do not exist, so a lot of time is wasted just arguing about that.

mritchie712

the short answer: use a semantic layer.

It's the cleanest way to give the right context and the best place to pull a human in the loop.

A human can validate and create all the important metrics (e.g. what does "monthly active users" really mean), and then an LLM can use that metric definition whenever asked for MAU.

With a semantic layer, you get the added benefit of writing queries in JSON instead of raw SQL. LLMs are much more consistent at writing a small JSON query than hundreds of lines of SQL.

We[0] use Cube[1] for this. It's the best open source semantic layer, but there are a couple of closed-source options too.

My last company wrote a post on this in 2021[2]. Looks like the acquirer stopped paying for the blog hosting, but the HN post is still up.

0 - https://www.definite.app/

1 - https://cube.dev/

2 - https://news.ycombinator.com/item?id=25930190
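
To make the "small JSON query vs. hundreds of lines of SQL" point concrete, here is a rough sketch of what an LLM-generated query against a Cube-style semantic layer might look like. The cube and measure names, host, and token are made-up placeholders, and the exact REST contract should be checked against the Cube docs:

  import requests

  # Hypothetical cube/measure names -- the real ones come from the semantic layer,
  # where a human has already pinned down what "monthly active users" means.
  llm_generated_query = {
      "measures": ["ActiveUsers.monthlyCount"],
      "timeDimensions": [{
          "dimension": "ActiveUsers.activityDate",
          "granularity": "month",
          "dateRange": "last 6 months",
      }],
  }

  # Cube exposes the semantic layer over a REST API; endpoint and auth details
  # depend on the deployment.
  resp = requests.post(
      "https://your-cube-host/cubejs-api/v1/load",
      headers={"Authorization": "YOUR_CUBE_API_TOKEN"},
      json={"query": llm_generated_query},
      timeout=30,
  )
  resp.raise_for_status()
  print(resp.json()["data"])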

fourfun

Google may be getting AI to write good SQL, but they aren’t getting it to write good blog posts.

insin

Is it too late to rescue the phrase "one-shotted" or is it already too far gone, like "AI" and "agent"?

danjc

The article comments that "out of the box, LLMs are particularly good at tasks like creative writing", but I think this actually demonstrates the problem with the AI.

A writer won't think that LLMs are good at creative writing. In fact, I'm pretty sure they'd say LLMs are terrible at creative writing.

In other words, to an expert in their field, they're not that good - at least not yet.

But to someone who is not an expert, they're unbelievably good - that person is enabled to do something they had zero ability to do before.

bob1029

For the problems where it would matter the most, these tools seem to help the least. The hardest problem domains don't have just one schema to worry about; they have hundreds. If you need to spin up a personal blog or todo list tracker, I have no doubt that Google et al. can take you exactly where you want to go.

pcblues

A question: does anyone know how well AI does at generating performant SQL against years-old production databases? In terms of speed of execution, locking, accuracy, etc.?

I see the promise for green-field projects.

dcrimp

I wonder if, for a given dialect (and even DDL), you could use that token masking technique similar to how that Structured Outputs [1] thing went:

Quote: "While sampling, after every token, our inference engine will determine which tokens are valid to be produced next based on the previously generated tokens and the rules within the grammar that indicate which tokens are valid next. We then use this list of tokens to mask the next sampling step, which effectively lowers the probability of invalid tokens to 0. Because we have preprocessed the schema, we can use a cached data structure to do this efficiently, with minimal latency overhead."

I.e., mask any tokens that would produce something that isn't valid SQL in the given dialect, or, going further, that isn't a valid query for the given schema. I assume some structured-outputs capability is built into most assistants nowadays, so they have probably already explored this.

[1] https://openai.com/index/introducing-structured-outputs-in-t...
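
For illustration, a deliberately naive sketch of that masking loop, assuming a Hugging Face-style model and tokenizer. The is_valid_prefix function is a hypothetical incremental grammar checker for the target SQL dialect (and optionally the schema); as the quote notes, a real implementation would precompile the grammar into a cached, token-level structure instead of re-checking the whole vocabulary at every step:

  import torch

  def constrained_generate(model, tokenizer, prompt, is_valid_prefix, max_new_tokens=256):
      # Greedy decoding where any token that cannot extend to valid SQL is masked out.
      input_ids = tokenizer(prompt, return_tensors="pt").input_ids
      generated = ""
      for _ in range(max_new_tokens):
          logits = model(input_ids).logits[0, -1]     # scores for the next token
          mask = torch.full_like(logits, float("-inf"))
          for token_id in range(logits.shape[0]):     # naive full-vocabulary scan
              candidate = generated + tokenizer.decode([token_id])
              if is_valid_prefix(candidate):
                  mask[token_id] = 0.0                # keep only grammar-legal continuations
          next_id = int(torch.argmax(logits + mask))
          if next_id == tokenizer.eos_token_id:
              break
          generated += tokenizer.decode([next_id])
          input_ids = torch.cat([input_ids, torch.tensor([[next_id]])], dim=1)
      return generated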

rectang

> We will cover state-of-the-art [...] how we approach techniques that allows the system to offer virtually certified correct answers.

I don't need AI to generate perfect SQL, because I am never going to trust the output enough to copy/paste it — the risk of subtle semantic errors is too high, even if the code validates.

Instead, I find it helpful for AI to suggest approaches — after which I will manually craft the SQL, starting from scratch.

tango12

What’s the eventual goal of text to sql?

Is it to build a copilot for a data analyst or to get business insight without going through an analyst?

If it's the latter, then IMHO no amount of text-to-SQL sophistication will solve the problem, because it's impossible for a non-analyst to understand whether the SQL is correct or sufficient.

These don’t seem like text2sql problems:

> Why did we hit only 80% of our daily ecommerce transactions yesterday?

> Why is customer acquisition cost trending up?

> Why was the campaign in NYC worse than the same in SF?

zeroq

Every once in a while I give AI a try, since everyone and their mother has told me to, so I comply.

My recent endeavour was with Gemini 2.5:

  - Write me a simple todo app on cloudflare with auth0 authentication.
  - Here's a simple todo on cloudflare. We import the @auth0-cloudflare and...
  - Does that @auth0-cloudflare exists?
  - Oh, it doesn't. I can give you a walkthrough on how to set up an account on auth0. Would you like me to?
  - Yes, please.
  - Here. I'm going to write the walkthrough in a document... (proceed to create an empty document)
  - That seems to be an empty document.
  - Oh, my bad. I'll produce it once more. (proceed to create another empty document)
  - Seems like your md parsing library is broken, can you write it in chat instead?
  - Yes... (your gemini trial has expired, would you like to pay $100 to continue?)
rawgabbit

Regarding the first issue: "For example, even the best DBA in the world would not be able to write an accurate query to track shoe sales if they didn't know that cat_id2 = 'Footwear' in a pcat_extension table means that the product in question is a kind of shoe. The same is true for LLMs."

I wish developers would make use of long table names and column names. For example, pcat_extension could have been named release_schema_1_0.product_category_extension. And cat_id2 could have been named category_id2.

zxexz

I find Gemini excellent for SQL. I wouldn't consider myself an expert in many things, but in SQL and database design I'd consider myself close. I like writing queries and doing the architecture, and that's where it's exceptionally helpful. The massive context length combined with pointed questions means I can just dump the entire DDL and ask "what am I missing?". It really is an excellent tool for things like checks and catching dumb errors on complex databases.

JodieBenitez

Wait... people need AI to write SQL ?

todotask2

These days we have many types of database tools: ORMs, query builders, and more. AI can help reduce the complexity and avoid lock-in to a specific tech stack. I love to write raw SQL.

neuroelectron

No mention of knowing anything about the tables, versions or relational structure? Are we just assuming that's already given to the AI?

iddan

For anybody wanting to use best-in-class AI SQL, I highly recommend checking out Sherloq (W23): https://www.sherloqdata.io/

msvana

Problem no. 2 (Understanding user intent) is relevant not only to writing SQL but also to software development in general. Follow-up questions are something I had in mind for a long time. I wonder why this is not the default for LLMs.

antman

This is about how to write good SELECTs, not SQL in general. AI is good enough to also create schemas from a spec, migrate, explore databases, do testing, etc., which this article does not touch upon.

stefap2

I have done this using the OpenAI 4o model. I had to pass in a prompt with business-specific instructions, industry jargon, and descriptions of tables, including foreign keys. Then it would generate even complex join queries and return data. In my case, I was more interested in providing results to users not knowledgeable about SQL, but the SQL was displayed for information.
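
For reference, a minimal sketch of that kind of setup using the current OpenAI Python SDK. The schema notes and jargon definitions are invented placeholders standing in for the real business context:

  from openai import OpenAI

  client = OpenAI()

  # Placeholder schema/jargon notes -- in practice these describe the real
  # production tables, foreign keys, and industry terms.
  schema_notes = """
  -- orders(order_id PK, customer_id FK -> customers.customer_id, total_cents, created_at)
  -- customers(customer_id PK, region, signup_date)
  -- "GMV" means sum(total_cents) / 100 over non-refunded orders.
  """

  question = "What was GMV by region last quarter?"

  response = client.chat.completions.create(
      model="gpt-4o",
      messages=[
          {"role": "system",
           "content": "You translate business questions into SQL for the schema below. "
                      "Use only the listed tables and columns.\n" + schema_notes},
          {"role": "user", "content": question},
      ],
  )

  generated_sql = response.choices[0].message.content
  print(generated_sql)  # shown to the user alongside the results, as described above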

treebeard901

If a lot of the value in a company is the software and over time a handful of AI companies start writing all the software, who really ends up owning all the value of the company?

AdrianB1

In real life I find using AI for SQL dangerous. It allows people who don't know what they're doing to write queries that can significantly impact servers. In my world databases are relatively big for most developers, but not huge.

Sometimes when I want to fine-tune a query I challenge the AI to provide a better solution. I give it the already optimized query and ask for better. I have never gotten a better answer, sometimes because the AI hallucinates and sometimes because the changes it proposes don't actually help. It's like a parrot repeating what it overheard in a brothel: useful intelligence if it's a wartime brothel frequented by enemy officers in 1916, but not these days.

jgalt212

For me the Flash model is way better than the Pro model. I don't want to wait all the extra time to get some code back that I'm going to have to read and modify anyway. I much prefer getting a 92.5%-right answer now to a 95%-right answer a minute or more from now.

mousetree

Out of all the AI tools and models I've tried, the most disappointing is the Gemini built into BigQuery. Despite having well-named columns with good descriptions, it consistently gets nowhere close to solving the problem.

benjbrooks

o3 has yet to fail me on complex, multi-table queries. Not a fan of BigQuery’s Gemini integration.

nashashmi

AI text-to-regex solutions would be incredibly handy.

IncreasePosts

Can't believe I'm seeing something from Google involving shoes but it isn't named gShoe.

LAC-Tech

If LLMs are so wonderful we can just read from B+ Tree storage engines directly. SQL, ORMs, Query Planners... all bloat.

cloudking

This is pretty simple in any foundation model: provide a well-commented schema and ask for the query.

westurner

From "Show HN: We open sourced our entire text-to-SQL product" (2024) https://news.ycombinator.com/item?id=40456236 :

> awesome-Text2SQL: https://github.com/eosphoros-ai/Awesome-Text2SQL

> Awesome-code-llm > Benchmarks > Text to SQL: https://github.com/codefuse-ai/Awesome-Code-LLM#text-to-sql

dangus

> Even with a high-quality model, there is still some level of non-determinism or unpredictability involved in LLM-driven SQL generation. To address this we have found that non-AI approaches like query parsing or doing a dry run of the generated SQL complements model-based workflows well. We can get a clear, deterministic signal if the LLM has missed something crucial, which we then pass back to the model for a second pass. When provided an example of a mistake and some guidance, models can typically address what they got wrong.

Sounds like a bunch of bespoke not-AI work is being done to make up for LLM limitations that point blank can’t be resolved.
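
For what it's worth, the deterministic checks the quote describes are easy to picture. A rough sketch, using sqlglot for parsing and BigQuery's dry-run mode purely as illustrative choices (the post doesn't name its tooling):

  import sqlglot
  from google.cloud import bigquery

  def deterministic_checks(sql: str) -> str | None:
      # Returns an error message to feed back to the model, or None if the SQL looks OK.

      # 1. Parse: catches syntactically invalid SQL without touching the warehouse.
      try:
          sqlglot.parse_one(sql, dialect="bigquery")
      except sqlglot.errors.ParseError as e:
          return f"SQL failed to parse: {e}"

      # 2. Dry run: validates table/column references and types, processes no data.
      client = bigquery.Client()
      job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
      try:
          client.query(sql, job_config=job_config)
      except Exception as e:
          return f"Dry run failed: {e}"

      return None

  # error = deterministic_checks(llm_sql)
  # if error is not None: send `error` back to the model for a second pass.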

deadbabe

All this LLM written SQL stuff sounds great until you realize if you don’t really know SQL you won’t be able to debug or fix any broken SQL an LLM generates.

Thus, this is mainly just a tool for true experts to do less work and still get paid the same, not a tool for beginners to rise to the level of experts.
