Ever asked Claude or ChatGPT to write a complex SQL query, only to get back something that looks right but crashes spectacularly when you run it? You’re not alone.

I’ve been experimenting with AI-assisted database work for months now, and while these tools are incredible for many coding tasks, SQL generation has some fascinating blind spots. After debugging dozens of AI-generated queries that should have worked but didn’t, I’ve identified three patterns that consistently trip up even the smartest models.

Let me share what I’ve learned about where AI database tools stumble—and more importantly, how to work around these limitations.

The Window Function Trap

Window functions are where AI gets overconfident. The syntax looks straightforward, so models confidently generate queries that compile but produce completely wrong results.

Here’s a query I asked GPT-4 to write: “Show me each employee’s salary rank within their department, but only for the top 3 in each department.”

-- AI's first attempt (wrong!)
SELECT 
  employee_name,
  department,
  salary,
  RANK() OVER (PARTITION BY department ORDER BY salary DESC) as dept_rank
FROM employees 
WHERE dept_rank <= 3;

This looks reasonable, right? Wrong. You can’t use window function aliases in the WHERE clause—they’re not available yet. The AI missed this fundamental SQL execution order concept.

The working version requires a subquery or CTE:

-- Correct approach
WITH ranked_employees AS (
  SELECT 
    employee_name,
    department,
    salary,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) as dept_rank
  FROM employees
)
SELECT * FROM ranked_employees WHERE dept_rank <= 3;

What I’ve found helpful is being explicit about execution order in my prompts: “Write a query using a CTE to first calculate window functions, then filter the results.” This usually steers the AI toward the correct pattern.

The Aggregation Context Confusion

AI models struggle with mixed aggregation levels—queries that need both row-level and aggregated data. They often generate queries that work in some databases but violate GROUP BY rules in others.

I asked Claude to help with this scenario: “Show me products where sales are above the category average, including the category average for comparison.”

-- AI's attempt (breaks in PostgreSQL)
SELECT 
  product_name,
  category,
  sales_amount,
  AVG(sales_amount) OVER (PARTITION BY category) as category_avg
FROM products 
GROUP BY category
HAVING sales_amount > AVG(sales_amount);

The AI mixed window functions with GROUP BY incorrectly. In strict SQL modes, this fails because product_name and sales_amount aren’t in the GROUP BY clause, but we’re also not properly aggregating them.

Here’s the corrected version:

-- Working solution
WITH category_averages AS (
  SELECT 
    category,
    AVG(sales_amount) as avg_sales
  FROM products 
  GROUP BY category
)
SELECT 
  p.product_name,
  p.category,
  p.sales_amount,
  ca.avg_sales as category_avg
FROM products p
JOIN category_averages ca ON p.category = ca.category
WHERE p.sales_amount > ca.avg_sales;

The key insight? AI models often try to solve everything in one query when breaking it into logical steps (like CTEs) produces clearer, more reliable results.

The Join Logic Labyrinth

Complex join scenarios—especially those involving multiple tables with similar relationships—consistently confuse AI models. They often generate joins that are technically valid but logically wrong.

Here’s where I see this most: “Find customers who bought products in multiple categories, showing their first purchase date in each category.”

-- AI's confusing attempt
SELECT DISTINCT
  c.customer_name,
  p.category,
  MIN(o.order_date) as first_purchase
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id  
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
GROUP BY c.customer_name, p.category
HAVING COUNT(DISTINCT p.category) > 1;

This query has a subtle but critical flaw. The HAVING clause checks categories within each group, but we’re already grouping BY category. So it’s checking if each customer-category combination has multiple categories, which is impossible.

The corrected approach requires thinking step-by-step:

-- Step 1: Find customers with multiple categories
WITH multi_category_customers AS (
  SELECT c.customer_id
  FROM customers c
  JOIN orders o ON c.customer_id = o.customer_id  
  JOIN order_items oi ON o.order_id = oi.order_id
  JOIN products p ON oi.product_id = p.product_id
  GROUP BY c.customer_id
  HAVING COUNT(DISTINCT p.category) > 1
),
-- Step 2: Get first purchase per category for those customers
customer_category_first AS (
  SELECT 
    c.customer_id,
    c.customer_name,
    p.category,
    MIN(o.order_date) as first_purchase
  FROM customers c
  JOIN orders o ON c.customer_id = o.customer_id  
  JOIN order_items oi ON o.order_id = oi.order_id
  JOIN products p ON oi.product_id = p.product_id
  WHERE c.customer_id IN (SELECT customer_id FROM multi_category_customers)
  GROUP BY c.customer_id, c.customer_name, p.category
)
SELECT * FROM customer_category_first ORDER BY customer_name, category;

When dealing with complex joins, I’ve learned to ask AI to “break this into multiple steps using CTEs” rather than requesting one massive query.

Working Better with AI SQL Generation

These patterns have taught me to adjust how I prompt for database queries. Instead of describing the end result, I often describe the logical steps:

  1. “First, calculate X using a CTE”
  2. “Then, join that with Y”
  3. “Finally, filter where Z”

I also explicitly mention my database system. “Write a PostgreSQL query that…” often produces better results than generic SQL requests, since the AI can apply database-specific best practices.

Most importantly, I’ve stopped expecting AI to nail complex queries on the first try. Instead, I use it for the initial structure, then debug and refine. AI is fantastic at generating boilerplate joins and basic aggregations—just keep an eye on those tricky edge cases.

The goal isn’t to replace your SQL knowledge, but to augment it. Understanding these common failure patterns helps you catch issues quickly and guide AI toward better solutions. After all, the best AI-assisted coding happens when you know enough to spot when something’s off.

What SQL patterns have you noticed AI struggling with? I’d love to hear about your experiences debugging AI-generated database queries.