Database Optimization: Keeping SQL Queries Fast

Most people think slow databases are a hardware problem. Add more CPU, throw in some SSDs, increase RAM, and the queries will magically fly. I learned the hard way that if your SQL is bad and your schema is worse, all the hardware in the world just lets you run slow queries faster.

The short version: keeping SQL queries fast means designing indexes for how your queries actually work, avoiding unnecessary work in the query plan, keeping result sets small, and watching query plans like a hawk. Start with good schema design and correct indexing on predicates and joins, keep statistics fresh, avoid anti-patterns such as `SELECT *`, leading-wildcard `LIKE` patterns, and functions on indexed columns, then validate all of it with `EXPLAIN` / `EXPLAIN ANALYZE`. Hardware comes last, not first.

If you are not routinely checking query plans and slow query logs, you are not doing database optimization. You are guessing.

Understand what “fast” actually means

Before changing a single query, decide what “fast” is for your system. Most teams skip this and then wonder why performance “feels bad.”

  • Target latency: For OLTP (web apps, dashboards), aim for p95 under 50-100 ms for common queries. Background jobs can be slower, but they should be predictable.
  • Throughput vs latency: A query that runs in 30 ms but locks huge ranges can kill concurrency and harm throughput.
  • Consistency under load: You want stable performance at peak, not only during development on an empty laptop database.

A query that is “fast on my machine” against 1,000 rows often becomes a disaster at 10 million rows. Always think about how performance scales with data size.

Know your workload type

Different workloads want different optimizations:

| Workload | Characteristics | Focus for speed |
| --- | --- | --- |
| OLTP (web apps, APIs) | Many small queries, random access, high concurrency | Index design, short transactions, lock contention, connection limits |
| Analytics / reporting | Fewer large queries, scans, aggregations | Columnar storage (if available), partitioning, materialized views |
| Mixed | Blend of both above | Physical separation (replicas), workload isolation |

If you run heavy analytical queries on your primary OLTP node, do not be surprised when simple user actions feel sluggish. That is not a mystery, that is contention.

Indexing: the first line of defense

Most SQL performance problems are index problems dressed up as “mysterious slowness.”

Index for your predicates, not your feelings

The database can only use indexes that match how you filter and join.

  • Filter columns: Add indexes on columns frequently used in `WHERE` clauses, especially high-selectivity columns (e.g., user_id, email, order_id).
  • Join columns: Columns used for joining tables usually deserve indexes on both sides, especially on the foreign key side.
  • Sort columns: If a query sorts a lot and you need that order, consider indexes that match the `ORDER BY` pattern.

Example bad pattern:

```sql
SELECT *
FROM users
WHERE LOWER(email) = LOWER($1);
```

If you only have a plain index on `email`, the database will often ignore it because of the `LOWER()` function. A better approach:

```sql
CREATE INDEX idx_users_email_lower ON users (LOWER(email));
```

Or store normalized emails in a separate column and index that.

If you wrap an indexed column in a function, you probably just disabled the index for that query.

Composite indexes: order matters

For multi-column indexes, the order defines what can be used.

```sql
CREATE INDEX idx_orders_user_created
ON orders (user_id, created_at);
```

This index can speed up:

```sql
SELECT *
FROM orders
WHERE user_id = $1
ORDER BY created_at DESC
LIMIT 20;
```

because the predicate starts with `user_id` and then sorts by `created_at`. The index does both.

That same index will not help much for:

```sql
SELECT *
FROM orders
WHERE created_at >= $1;
```

The leading column in the index is `user_id`, not `created_at`, so the engine cannot use the index effectively for that filter.

General rule:

  • Put highly selective filter columns first.
  • Then put columns used for ordering or grouping.

Avoid index bloat, not just index absence

Too many indexes hurt writes and sometimes reads. Each insert or update must maintain every index touching those columns.

Signs you have too many or bad indexes:

  • Write-heavy tables carrying many indexes that are rarely or never used.
  • Indexes on low-cardinality boolean columns (`is_active` only) without other predicates.
  • Redundant indexes, for example an index on `user_id` and another on `(user_id, created_at)` where the first is now useless.

In PostgreSQL you can check index usage:

```sql
SELECT
    relname AS table_name,
    indexrelname AS index_name,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;
```

Look for indexes with low or zero `idx_scan` over a long period. Those are candidates for removal, after careful review.
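
If review confirms an index really is dead weight, removal is straightforward. A minimal PostgreSQL sketch, with a hypothetical index name:

```sql
-- Drop an index that has shown zero scans over a long observation window.
-- CONCURRENTLY avoids blocking writes to the table, but cannot run inside a transaction block.
DROP INDEX CONCURRENTLY IF EXISTS idx_orders_legacy_status;
```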

Write queries that do less work

Bad queries often ask the database to do unnecessary work: scan too many rows, return too many columns, or perform pointless computations.

Stop using SELECT *

`SELECT *` is convenient during development and a problem later.

Problems:

  • Pulls unnecessary columns across the wire, bloating I/O.
  • Undermines covering indexes, because the engine has to hit the heap for columns the index does not contain.
  • Makes schema changes risky for app code that assumes column order or presence.

Instead, request only what you actually need:

```sql
SELECT id, name, email
FROM users
WHERE id = $1;
```

This might feel annoying, but it saves CPU, memory, and network traffic at scale.
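
A related trick: when a hot query needs only a couple of columns, a covering index can let the engine answer it without touching the table at all. A PostgreSQL sketch, assuming a hypothetical `total` column on `orders`:

```sql
-- user_id is the search key; created_at and total are carried as index payload,
-- so the query below can often be satisfied by an Index Only Scan.
CREATE INDEX idx_orders_user_cover
    ON orders (user_id) INCLUDE (created_at, total);

SELECT created_at, total
FROM orders
WHERE user_id = $1;
```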

Filter early, aggregate later

Push filters as close to the base tables as possible.

Compare:

```sql
-- Bad: aggregate everything, then filter
SELECT *
FROM (
    SELECT user_id, COUNT(*) AS cnt
    FROM logins
    GROUP BY user_id
) t
WHERE cnt > 100;
```

Better:

```sql
SELECT user_id, COUNT(*) AS cnt
FROM logins
WHERE created_at >= NOW() - INTERVAL '30 days'
GROUP BY user_id
HAVING COUNT(*) > 100;
```

The second version prunes rows early with a `WHERE` filter (here, a 30-day window the first query did not have) and expresses the aggregate condition with `HAVING`, which the planner handles cleanly.

Avoid wild leading patterns and unindexed searches

This pattern kills index usage:

```sql
SELECT *
FROM posts
WHERE title LIKE '%database%';
```

The leading `%` prevents index usage in many engines, turning it into a full scan. For full text search, do not force a relational database to pretend it is a search engine.

For PostgreSQL, consider full-text search:

```sql
CREATE INDEX idx_posts_search
ON posts USING GIN (to_tsvector('english', title || ' ' || body));

SELECT *
FROM posts
WHERE to_tsvector('english', title || ' ' || body)
      @@ plainto_tsquery('database');
```

Or use a dedicated search engine where appropriate.

If product insists on “search everything, everywhere, instantly” on a relational database, be honest about the cost. Fast search is not free.

Understand query plans: EXPLAIN is your friend

Reading query plans is the difference between tuning with intent and stabbing in the dark.

Basic usage

Common commands:

  • EXPLAIN <query>: Shows the estimated plan.
  • EXPLAIN ANALYZE <query>: Executes the query and shows actual timing and row counts.

For PostgreSQL:

```sql
EXPLAIN ANALYZE
SELECT *
FROM orders
WHERE user_id = 123
ORDER BY created_at DESC
LIMIT 20;
```

Look for:

  • Seq Scan (sequential scan): Full table scan. Acceptable for small tables or when most rows match; disastrous for large tables with selective filters.
  • Index Scan or Index Only Scan: Usually desired for selective queries.
  • Nested Loop vs Hash Join: Join strategy matters when tables grow.
  • Rows estimated vs rows actual: Large mismatches indicate poor statistics.

If you see a `Seq Scan` on a large table yet expect an index scan, check:

  • Is there an index that matches the predicate?
  • Are statistics up to date?
  • Is the predicate using functions, casts, or expressions that block index use?
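
One frequent blocker is a cast or function applied to the indexed column itself. Rewriting the predicate so the bare column is compared against a range usually restores index use; a sketch, reusing the `orders` table from earlier:

```sql
-- Blocks a plain b-tree index on created_at: the cast runs against every row.
SELECT * FROM orders WHERE created_at::date = '2024-06-01';

-- Sargable rewrite: the bare column is compared against a range,
-- so an index on created_at can be used.
SELECT *
FROM orders
WHERE created_at >= '2024-06-01'
  AND created_at <  '2024-06-02';
```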

Watch actual vs estimated rows

If the planner thinks a predicate will return 10 rows and it actually returns 1,000,000, you have a statistics problem. Bad statistics lead to bad join choices, bad scan choices, and slow queries.

For PostgreSQL:

```sql
VACUUM ANALYZE my_table;
```

For MySQL (InnoDB), statistics are usually updated automatically, but for very large tables you might need to tune configuration or force stats updates.
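
The manual equivalent for InnoDB is `ANALYZE TABLE`; a sketch, using the same placeholder table name:

```sql
-- MySQL / InnoDB: rebuild index statistics for the optimizer.
ANALYZE TABLE my_table;
```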

If the planner is blind, your indexes are not as helpful as you think. Keep statistics fresh.

Schema design that does not sabotage performance

No amount of query tweaking will compensate for a schema that ignores access patterns.

Normalize, then denormalize carefully

Over-normalization:

  • Too many joins for simple operations.
  • Complex queries for basic UI views.

Over-denormalization:

  • Data duplication across many tables.
  • Expensive writes and complicated consistency logic.

Aim for a reasonable normal form for OLTP tables, then introduce selective denormalized fields or materialized views for read-heavy paths that are expensive to compute online.

Example: A dashboard that shows total order count and total value per user. If this is hit often and the base `orders` table is large, precompute into a `user_order_stats` table and keep it updated by triggers or background jobs.
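
A minimal sketch of that summary table, refreshed by a scheduled job rather than triggers, and assuming a hypothetical `total` column on `orders`:

```sql
-- Precomputed per-user order statistics; the dashboard reads this table
-- instead of aggregating the large orders table on every page load.
CREATE TABLE user_order_stats (
    user_id      BIGINT PRIMARY KEY,
    order_count  BIGINT NOT NULL,
    total_value  NUMERIC NOT NULL,
    refreshed_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Periodic refresh (PostgreSQL upsert), run from a background job.
INSERT INTO user_order_stats (user_id, order_count, total_value)
SELECT user_id, COUNT(*), COALESCE(SUM(total), 0)
FROM orders
GROUP BY user_id
ON CONFLICT (user_id) DO UPDATE
SET order_count  = EXCLUDED.order_count,
    total_value  = EXCLUDED.total_value,
    refreshed_at = now();
```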

Choose appropriate data types

Data types affect index size, comparison speed, and query plans.

Some guidelines:

  • Use INT or BIGINT for numeric IDs, not strings.
  • Use BOOLEAN where appropriate, not tinyint or text flags, when your engine supports it.
  • For time, prefer UTC timestamps in a consistent type (TIMESTAMP WITH TIME ZONE in PostgreSQL).
  • Use appropriate length for strings (VARCHAR(255) is not a universal solution).

Large variable-length columns (like long text or JSON blobs) can harm cache efficiency and performance of queries that do not even need those columns. Keep them in separate tables if necessary.

Be careful with JSON and schemaless patterns

JSON columns are attractive to avoid schema migrations. They come with trade-offs:

  • Harder to index correctly.
  • Query plans can be worse, especially if many predicates use JSON operators.
  • Application logic often becomes messy when the structure drifts over time.

If a field is critical for queries or joins, promote it to a proper typed column.
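
In PostgreSQL, one way to do that promotion without immediately rewriting the application is a stored generated column plus a regular index. A sketch, assuming a hypothetical `events` table with a `payload` JSONB column:

```sql
-- Promote a frequently queried JSON field to a typed, indexable column (PostgreSQL 12+).
ALTER TABLE events
    ADD COLUMN customer_id BIGINT
    GENERATED ALWAYS AS ((payload ->> 'customer_id')::bigint) STORED;

CREATE INDEX idx_events_customer ON events (customer_id);

-- Filters now hit a plain typed column instead of a JSON operator.
SELECT id, created_at
FROM events
WHERE customer_id = $1;
```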

Limit what you fetch and how often you hit the database

Reducing database work is often more effective than micro-tuning single queries.

Use LIMIT correctly and thoughtfully

`LIMIT` helps only if the engine can stop early. Pair it with an index that supports the filter and order.

Bad example:

```sql
SELECT *
FROM events
ORDER BY created_at DESC
LIMIT 50;
```

If `events` is large and `created_at` is not indexed, the engine has to sort a huge set then take 50. Better:

```sql
CREATE INDEX idx_events_created ON events (created_at DESC);
```

Then the engine can read the index top entries directly.

N+1 queries: the repeated mistake

The N+1 query pattern is common in ORMs. Example in pseudo-code:

```python
users = db.query("SELECT * FROM users LIMIT 100")
for user in users:
    posts = db.query("SELECT * FROM posts WHERE user_id = ?", user.id)
```

That is 101 queries instead of 2.

Fix with a join or `IN` query:

```sql
SELECT *
FROM posts
WHERE user_id IN (
    SELECT id FROM users LIMIT 100
);
```

Or join and hydrate in application code:

```sql
SELECT u.id AS user_id, u.name, p.id AS post_id, p.title
FROM users u
LEFT JOIN posts p ON p.user_id = u.id
WHERE u.id IN ( … );
```

Most ORMs have “eager loading” or “includes” features to do this.

If your application logs show hundreds of small similar queries for one page load, stop tuning individual queries and fix the N+1 pattern.

Locking, concurrency, and transaction scope

Fast single queries are useless if they block each other into oblivion.

Keep transactions short

Transactions that stay open for a long time:

  • Hold locks longer, blocking others.
  • In MVCC systems like PostgreSQL, retain old row versions and bloat storage.

Bad pattern:

```sql
BEGIN;

-- long-running processing in application
-- user confirmation, external HTTP calls, etc.

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

COMMIT;
```

Business logic and network calls should not live inside DB transactions. Move them out, keep the critical read/write window minimal.
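
The same transfer with the slow parts pulled outside the transaction, as a sketch:

```sql
-- Validation, user confirmation, and any external calls happen before this point.
-- The transaction covers only the minimal read/write window.
BEGIN;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

COMMIT;
```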

Understand isolation levels

In many cases, default isolation levels (like READ COMMITTED in PostgreSQL, REPEATABLE READ in InnoDB) are enough. Raising isolation for “safety” without understanding the cost can bring heavy locking and deadlocks.

If you encounter deadlocks or long waits:

  • Inspect queries for lock order issues (updating tables in different orders in different code paths).
  • Use shorter transactions and less broad updates (`WHERE` clauses that touch fewer rows).
  • Break large maintenance jobs into smaller batches.
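
For the last point, a large purge can run as repeated small batches instead of one giant statement. A PostgreSQL sketch, with the table name and retention window as assumptions; each run commits on its own and is repeated until no rows remain:

```sql
-- Delete old rows in bounded batches so locks and WAL volume stay small;
-- rerun (e.g. from a scheduled job) until it affects 0 rows.
DELETE FROM audit_log
WHERE id IN (
    SELECT id
    FROM audit_log
    WHERE created_at < NOW() - INTERVAL '90 days'
    ORDER BY id
    LIMIT 5000
);
```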

Partitioning and sharding for large datasets

At some scale, single-table strategies break down. That does not mean you jump to sharding at the first sign of a slowdown.

Partitioning

Partitioning splits a large table into smaller pieces managed as one logical table.

Common strategies:

  • Range: e.g., by date (per day / per month partitions).
  • Hash: by key (like user_id).

Good for:

  • Time-series data where old partitions can be archived or dropped.
  • Queries that filter on the partition key (e.g., date range) so the engine can skip entire partitions.

Badly planned partitioning can worsen performance if most queries hit many partitions at once. Partition only when tables are genuinely large and access patterns justify it.
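
For reference, a minimal declarative range-partitioning sketch in PostgreSQL, with illustrative table and partition names:

```sql
-- Parent table partitioned by range on the event timestamp.
CREATE TABLE events (
    id         BIGINT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL,
    payload    JSONB
) PARTITION BY RANGE (created_at);

-- One partition per month; old months can be detached, archived, or dropped cheaply.
CREATE TABLE events_2024_06 PARTITION OF events
    FOR VALUES FROM ('2024-06-01') TO ('2024-07-01');
CREATE TABLE events_2024_07 PARTITION OF events
    FOR VALUES FROM ('2024-07-01') TO ('2024-08-01');
```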

Sharding

Sharding splits data across multiple database servers. This adds:

  • Complexity in queries (no global joins without extra logic).
  • Operational overhead in moving shards, handling rebalancing, and dealing with cross-shard transactions.

Sharding is not a tuning trick. It is an architectural decision. Exhaust better indexing, caching, and possibly read replicas before you jump to sharding.

Caching: relieve the database of repetitive work

SQL optimization and caching are partners. If you query the same expensive data repeatedly and it does not change very often, do not ask the database every time.

Levels of caching

  • Database-level caches: Query cache (in some systems), buffer cache, plan cache.
  • Application-level caching: In-memory caches, Redis, Memcached.
  • HTTP-level caching: For API responses where appropriate.

Practical patterns:

  • Cache expensive aggregated statistics with a TTL.
  • Cache configuration or reference data that rarely changes.
  • Use read replicas to offload read-only queries from primary.

What caching is not: a band-aid for terrible queries that run constantly with slightly different parameters. Fix the query and underlying schema first, then apply caching where it amplifies your gains.

Monitoring and slow query tracking

You cannot tune what you do not measure.

Enable slow query logs

Every production database should have slow query logging enabled with a sensible threshold (for example, queries longer than 200 ms for OLTP, maybe 1 second for heavier analytics).
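
In PostgreSQL, that threshold maps to `log_min_duration_statement`; a sketch:

```sql
-- Log every statement that runs longer than 200 ms.
ALTER SYSTEM SET log_min_duration_statement = '200ms';
SELECT pg_reload_conf();
```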

Use these to:

  • Identify top offenders by frequency and duration.
  • Spot unexpected full table scans or expensive joins.

Then, for each high-impact query:

  • Check its query plan.
  • Review indexes and schema.
  • Check query patterns in application code.

Metrics to watch

Track at minimum:

| Metric | Why it matters |
| --- | --- |
| Query latency (p50, p95, p99) | Shows how users experience the system and exposes tail-latency issues. |
| QPS / TPS | Combined with latency, shows saturation points. |
| Buffer/cache hit ratio | Indicates whether your working set fits in memory. |
| Lock wait times and deadlocks | Reveal concurrency issues. |
| Disk I/O throughput and IOPS | Shows whether you are I/O bound. |

If your cache hit ratio tanks and latency spikes, you might have queries scanning more data than memory can hold, or your working set grew beyond available RAM.

Hardware and configuration: support, not magic

At some point, you will hit hardware limits. That does not make hardware the first tuning step.

Memory, CPU, and disk

  • Memory: More RAM helps when it keeps your working set in cache. If queries are poorly indexed, more memory just delays the problem.
  • CPU: Helps with many concurrent queries and complex processing. Overly complex functions and unneeded JSON parsing can waste CPU.
  • Disk: SSDs are table stakes for serious workloads. Spinning disks and random access are a bad mix.

If query plans show huge sequential scans over multi-gigabyte tables for OLTP-style lookups, no realistic hardware will save that pattern.

Database configuration

Tuning knobs vary by engine, but general areas:

  • Memory allocations (shared buffers, work memory).
  • Checkpoint and WAL / redo log tuning.
  • Maximum connections and pool sizes.
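
As a rough illustration of what those knobs look like in PostgreSQL (the values below are placeholders, not recommendations, and some settings need a restart rather than a reload):

```sql
ALTER SYSTEM SET shared_buffers = '8GB';   -- main buffer cache (restart required)
ALTER SYSTEM SET work_mem = '64MB';        -- per-sort / per-hash memory, allocated per operation
ALTER SYSTEM SET max_wal_size = '4GB';     -- spreads out checkpoints
ALTER SYSTEM SET max_connections = 200;    -- pair with an application-side pool (restart required)
```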

Avoid cranking every knob to “high” without understanding trade-offs. An overloaded database with thousands of connections is slower than a well-pooled one with sane limits.

If your app opens a new database connection for every request without pooling, you do not have a SQL problem. You have a connection problem.

Anti-patterns to avoid in SQL optimization

Some patterns show up often and almost always represent a wrong path.

Premature micro-tuning

Examples:

  • Rewriting simple `IN` queries into complex unions “for speed” without evidence.
  • Hand-optimizing tiny lookups that run once a minute while ignoring a slow query that runs every second.
  • Obsessing over query hints instead of fixing schema and indexes.

Always start with data: slow logs, query plans, real production metrics.

Abusing ORM features

ORMs save time but come with traps:

  • Lazy loading collections in loops (N+1 query problem).
  • Automatic eager loading of huge graphs of related objects.
  • Letting the ORM generate unbounded queries without pagination.

Use ORMs, but keep an eye on actual SQL. For critical queries, write them by hand or at least inspect generated SQL and plans.

Ignoring maintenance

Maintenance tasks are not optional for performance:

  • Vacuuming and analyzing (PostgreSQL) to reclaim space and update stats.
  • Rebuilding very bloated indexes when necessary.
  • Archiving or pruning old data instead of letting tables grow forever.

Tables that grow without any archival strategy eventually punish every query, no matter how well indexed.

Practical tuning workflow

Putting it all together, a realistic approach looks like this:

1. Identify high-impact queries

  • Use slow query logs and APM tools.
  • Rank by total time consumed, not only individual execution time.

Sometimes a 20 ms query executed 10,000 times per minute is a bigger problem than a 3-second query executed once a day.
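
With the `pg_stat_statements` extension enabled in PostgreSQL, ranking by total time looks roughly like this (the timing columns are `total_exec_time` / `mean_exec_time` on recent versions, `total_time` / `mean_time` on older ones):

```sql
-- Top 10 statements by total time consumed across all executions.
SELECT
    calls,
    total_exec_time AS total_ms,
    mean_exec_time  AS mean_ms,
    query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```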

2. Inspect query and schema

  • Look for `SELECT *`, unnecessary joins, unbounded result sets.
  • Check existing indexes that relate to the predicates and joins.

3. Run EXPLAIN / EXPLAIN ANALYZE

  • Confirm whether indexes are used.
  • Check row estimates vs actuals.
  • Identify scans, sorts, and joins that dominate cost.

4. Apply minimal, targeted changes

Possible changes, in rough order of preference:

  • Adjust the query (drop `SELECT *`, filter earlier, reduce joins, add pagination).
  • Add or adjust indexes that match real predicates and orders.
  • Fix statistics or run maintenance commands.
  • Revisit schema for heavily used paths.

Measure again after each change. Do not stack multiple big changes at once or you will not know what helped.

5. Bake monitoring and review into routine

Performance tuning is not a one-time event. Data volume, usage patterns, and features evolve. Schedule periodic reviews:

  • Look at top slow queries monthly or after major releases.
  • Review indexing strategy when adding major features.
  • Plan archival and partitioning as data grows.

Fast SQL is not magic. It is the result of steady, boring habits: good schema design, sane indexing, careful queries, and continuous measurement.
