SQL for Finance - Querying Your Way Through Millions of Trades

2026-04-03 21 min read Computer Science

SQL for Finance: Querying Your Way Through Millions of Trades¶

Somewhere, right now, a database is holding 47 million trade records, 3 billion market data ticks, and the entire transaction history of a bank that has existed since 1832. That database does not care that it is Monday morning. It does not care that the CEO wants “a quick number” for the board meeting. It does not care that you just spilled coffee on your keyboard. It just sits there, waiting for a well-written query.

SQL (Structured Query Language) is how you talk to that database. It is the universal language of data retrieval, and in finance, it is everywhere. Trade databases, risk systems, regulatory reporting, market data warehouses, position management, P&L reconciliation, compliance monitoring. If money moved, SQL knows about it.

The thing about SQL in finance is that it is not particularly complicated. You do not need a PhD in database theory. You need to know about 20 patterns really well, and you need to understand the specific shapes of financial data. The rest is just practice and knowing which questions to ask.

SQL is the only language where you can earn a six-figure salary knowing fewer than 20 keywords. SELECT, FROM, WHERE, JOIN, GROUP BY, ORDER BY, and you are already dangerous.

The Financial Database: What You Are Working With¶

Before writing queries, let us set up the tables we will use throughout this article. These represent a simplified but realistic trading database.

Table: trades¶

The core of any trading system. Every executed trade gets a row.

CREATE TABLE trades (
    trade_id        SERIAL PRIMARY KEY,
    trade_date      DATE NOT NULL,
    trade_time      TIMESTAMP NOT NULL,
    trader          VARCHAR(50) NOT NULL,
    desk            VARCHAR(30) NOT NULL,
    ticker          VARCHAR(10) NOT NULL,
    side            VARCHAR(4) NOT NULL,    -- 'BUY' or 'SELL'
    quantity        INTEGER NOT NULL,
    price           DECIMAL(12, 4) NOT NULL,
    notional        DECIMAL(15, 2) GENERATED ALWAYS AS (quantity * price) STORED,
    currency        VARCHAR(3) DEFAULT 'USD',
    status          VARCHAR(10) DEFAULT 'FILLED'
);

Table: positions¶

The current inventory. How much of each stock does each desk hold right now?

CREATE TABLE positions (
    position_id     SERIAL PRIMARY KEY,
    desk            VARCHAR(30) NOT NULL,
    ticker          VARCHAR(10) NOT NULL,
    quantity        INTEGER NOT NULL,          -- Positive = long, negative = short
    avg_cost        DECIMAL(12, 4) NOT NULL,
    current_price   DECIMAL(12, 4),
    last_updated    TIMESTAMP DEFAULT NOW()
);

Table: market_data¶

End-of-day prices for all instruments.

CREATE TABLE market_data (
    ticker          VARCHAR(10) NOT NULL,
    trade_date      DATE NOT NULL,
    open_price      DECIMAL(12, 4),
    high_price      DECIMAL(12, 4),
    low_price       DECIMAL(12, 4),
    close_price     DECIMAL(12, 4),
    volume          BIGINT,
    PRIMARY KEY (ticker, trade_date)
);

Sample Data¶

Let us insert some realistic trades:

INSERT INTO trades (trade_date, trade_time, trader, desk, ticker, side, quantity, price)
VALUES
    ('2026-03-28', '2026-03-28 09:31:00', 'Alice', 'Equities', 'AAPL', 'BUY',  1000, 148.25),
    ('2026-03-28', '2026-03-28 09:45:00', 'Alice', 'Equities', 'MSFT', 'BUY',   500, 375.80),
    ('2026-03-28', '2026-03-28 10:12:00', 'Bob',   'Equities', 'GOOGL', 'SELL',  300, 152.40),
    ('2026-03-28', '2026-03-28 11:03:00', 'Alice', 'Equities', 'AAPL', 'BUY',   500, 149.10),
    ('2026-03-28', '2026-03-28 14:22:00', 'Charlie', 'Fixed Income', 'TLT', 'BUY', 2000, 95.50),
    ('2026-03-28', '2026-03-28 15:15:00', 'Bob',   'Equities', 'NVDA', 'BUY',   200, 825.30),
    ('2026-03-31', '2026-03-31 09:30:00', 'Alice', 'Equities', 'AAPL', 'SELL',  800, 151.20),
    ('2026-03-31', '2026-03-31 10:05:00', 'David', 'Derivatives', 'SPY', 'BUY', 1500, 520.45),
    ('2026-03-31', '2026-03-31 11:30:00', 'Bob',   'Equities', 'GOOGL', 'BUY',  400, 155.10),
    ('2026-03-31', '2026-03-31 14:00:00', 'Charlie', 'Fixed Income', 'TLT', 'SELL', 1000, 96.20);

This is a tiny dataset. In production, you would be looking at millions of rows per day across dozens of instruments. But the queries are the same. SQL does not care if the table has 10 rows or 10 billion.

The Basics: SELECT, WHERE, ORDER BY¶

Querying Trades¶

-- All trades for a specific date
SELECT trade_id, trade_time, trader, ticker, side, quantity, price, notional
FROM trades
WHERE trade_date = '2026-03-28'
ORDER BY trade_time;

trade_id	trade_time	trader	ticker	side	quantity	price	notional
1	09:31:00	Alice	AAPL	BUY	1000	148.25	148,250.00
2	09:45:00	Alice	MSFT	BUY	500	375.80	187,900.00
3	10:12:00	Bob	GOOGL	SELL	300	152.40	45,720.00
4	11:03:00	Alice	AAPL	BUY	500	149.10	74,550.00
5	14:22:00	Charlie	TLT	BUY	2000	95.50	191,000.00
6	15:15:00	Bob	NVDA	BUY	200	825.30	165,060.00

Filtering with Multiple Conditions¶

-- Alice's buys over $100K notional
SELECT ticker, side, quantity, price, notional
FROM trades
WHERE trader = 'Alice'
  AND side = 'BUY'
  AND notional > 100000
ORDER BY notional DESC;

ticker	side	quantity	price	notional
MSFT	BUY	500	375.80	187,900.00
AAPL	BUY	1000	148.25	148,250.00

Pattern Matching with LIKE and IN¶

-- All tech stock trades
SELECT trader, ticker, side, quantity, price
FROM trades
WHERE ticker IN ('AAPL', 'MSFT', 'GOOGL', 'NVDA', 'META', 'AMZN')
ORDER BY ticker, trade_time;

-- All traders whose name starts with 'A'
SELECT DISTINCT trader
FROM trades
WHERE trader LIKE 'A%';

Key Insight: IN is your best friend when filtering against a known list of values. It is cleaner than chaining OR conditions and easier to maintain. When the list comes from another table, use a subquery or JOIN instead.

Aggregations: GROUP BY¶

Aggregation is where SQL starts earning its keep. Summarizing millions of rows into a handful of meaningful numbers is what databases are built for.

Daily Trading Volume by Desk¶

SELECT
    trade_date,
    desk,
    COUNT(*) AS num_trades,
    SUM(notional) AS total_notional,
    ROUND(AVG(notional), 2) AS avg_trade_size
FROM trades
GROUP BY trade_date, desk
ORDER BY trade_date, total_notional DESC;

trade_date	desk	num_trades	total_notional	avg_trade_size
2026-03-28	Equities	4	435,730.00	108,932.50
2026-03-28	Fixed Income	1	191,000.00	191,000.00
2026-03-28	Equities	1	165,060.00	165,060.00
2026-03-31	Derivatives	1	780,675.00	780,675.00
2026-03-31	Equities	2	183,000.00	91,500.00
2026-03-31	Fixed Income	1	96,200.00	96,200.00

Top Traders by Volume¶

SELECT
    trader,
    COUNT(*) AS total_trades,
    SUM(quantity) AS total_shares,
    SUM(notional) AS total_notional,
    ROUND(AVG(price), 2) AS avg_price
FROM trades
GROUP BY trader
ORDER BY total_notional DESC;

trader	total_trades	total_shares	total_notional	avg_price
Alice	4	2800	531,660.00	206.09
David	1	1500	780,675.00	520.45
Charlie	2	3000	287,200.00	95.85
Bob	3	900	272,820.00	377.60

Filtering Aggregated Results with HAVING¶

-- Desks with more than $500K in daily notional
SELECT
    trade_date,
    desk,
    SUM(notional) AS total_notional
FROM trades
GROUP BY trade_date, desk
HAVING SUM(notional) > 500000
ORDER BY total_notional DESC;

WHERE filters rows before aggregation. HAVING filters groups after aggregation. Mixing them up is the SQL equivalent of putting on your shoes before your socks. Technically possible, but everyone will notice.

JOINs: Combining Tables¶

Real-world queries almost always involve multiple tables. JOINs are how you connect them.

INNER JOIN: Matching Rows Only¶

-- Trades with current market prices
SELECT
    t.trade_date,
    t.ticker,
    t.side,
    t.price AS trade_price,
    m.close_price AS market_close,
    ROUND((m.close_price - t.price) * t.quantity, 2) AS unrealized_pnl
FROM trades t
INNER JOIN market_data m
    ON t.ticker = m.ticker
    AND t.trade_date = m.trade_date
WHERE t.side = 'BUY'
ORDER BY unrealized_pnl DESC;

LEFT JOIN: Keep All Rows from the Left Table¶

-- All trades, with market data if available
SELECT
    t.trade_id,
    t.ticker,
    t.price AS trade_price,
    m.close_price,
    CASE
        WHEN m.close_price IS NULL THEN 'NO MARKET DATA'
        ELSE 'OK'
    END AS data_status
FROM trades t
LEFT JOIN market_data m
    ON t.ticker = m.ticker
    AND t.trade_date = m.trade_date;

Key Insight: LEFT JOIN is essential in finance for data quality checks. If a trade has no matching market data, you want to know about it (not silently drop it, which is what INNER JOIN would do). “Show me everything, and flag what’s missing” is a very common pattern.

Self-JOIN: Comparing a Table to Itself¶

-- Find pairs of trades in the same stock on the same day
-- (potential wash trades or paired activity)
SELECT
    t1.trade_id AS trade_1,
    t2.trade_id AS trade_2,
    t1.ticker,
    t1.trader AS trader_1,
    t2.trader AS trader_2,
    t1.side AS side_1,
    t2.side AS side_2,
    t1.quantity AS qty_1,
    t2.quantity AS qty_2
FROM trades t1
INNER JOIN trades t2
    ON t1.ticker = t2.ticker
    AND t1.trade_date = t2.trade_date
    AND t1.trade_id < t2.trade_id  -- Avoid duplicates and self-matches
    AND t1.side != t2.side          -- Opposite sides
ORDER BY t1.ticker, t1.trade_time;

Self-joins are how compliance teams find suspicious trading patterns. “Did anyone buy and sell the same stock on the same day?” is a query that compliance runs more often than you think.

Window Functions: The Power Tool¶

Window functions are where SQL goes from “useful” to “incredibly powerful.” They let you perform calculations across related rows without collapsing the result set (unlike GROUP BY, which reduces rows).

Running Total of Daily Notional¶

SELECT
    trade_date,
    trade_time,
    trader,
    ticker,
    notional,
    SUM(notional) OVER (
        PARTITION BY trade_date
        ORDER BY trade_time
    ) AS running_total
FROM trades
ORDER BY trade_date, trade_time;

trade_date	trade_time	trader	ticker	notional	running_total
2026-03-28	09:31:00	Alice	AAPL	148,250	148,250
2026-03-28	09:45:00	Alice	MSFT	187,900	336,150
2026-03-28	10:12:00	Bob	GOOGL	45,720	381,870
2026-03-28	11:03:00	Alice	AAPL	74,550	456,420
2026-03-28	14:22:00	Charlie	TLT	191,000	647,420
2026-03-28	15:15:00	Bob	NVDA	165,060	812,480

Ranking Traders¶

SELECT
    trader,
    SUM(notional) AS total_notional,
    RANK() OVER (ORDER BY SUM(notional) DESC) AS rank,
    ROUND(
        SUM(notional) * 100.0 / SUM(SUM(notional)) OVER (),
        1
    ) AS pct_of_total
FROM trades
GROUP BY trader
ORDER BY rank;

trader	total_notional	rank	pct_of_total
David	780,675	1	41.7%
Alice	531,660	2	28.4%
Charlie	287,200	3	15.3%
Bob	272,820	4	14.6%

Moving Average of Stock Prices¶

-- 5-day moving average of closing prices
SELECT
    ticker,
    trade_date,
    close_price,
    ROUND(
        AVG(close_price) OVER (
            PARTITION BY ticker
            ORDER BY trade_date
            ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
        ), 4
    ) AS ma_5day,
    ROUND(
        AVG(close_price) OVER (
            PARTITION BY ticker
            ORDER BY trade_date
            ROWS BETWEEN 19 PRECEDING AND CURRENT ROW
        ), 4
    ) AS ma_20day
FROM market_data
WHERE ticker = 'AAPL'
ORDER BY trade_date DESC
LIMIT 10;

LAG and LEAD: Looking at Adjacent Rows¶

-- Daily price change and percentage change
SELECT
    ticker,
    trade_date,
    close_price,
    LAG(close_price) OVER (
        PARTITION BY ticker ORDER BY trade_date
    ) AS prev_close,
    ROUND(
        close_price - LAG(close_price) OVER (
            PARTITION BY ticker ORDER BY trade_date
        ), 4
    ) AS daily_change,
    ROUND(
        (close_price - LAG(close_price) OVER (
            PARTITION BY ticker ORDER BY trade_date
        )) / LAG(close_price) OVER (
            PARTITION BY ticker ORDER BY trade_date
        ) * 100, 2
    ) AS pct_change
FROM market_data
WHERE ticker IN ('AAPL', 'MSFT')
ORDER BY ticker, trade_date DESC;

Key Insight: Window functions are the single most important SQL feature for financial analysis. They let you compute running totals, moving averages, rankings, percentiles, and row comparisons without subqueries or self-joins. If you learn one advanced SQL feature, make it window functions.

Before window functions existed, computing a running total required a self-join or a correlated subquery. Both were ugly and slow. Window functions turned a 15-line query into a 3-line query. The person who invented them deserves a Nobel Prize, or at least a really nice dinner.

CTEs: Common Table Expressions¶

CTEs (the WITH clause) let you name a subquery and reference it multiple times. They make complex queries readable and maintainable.

Daily P&L Report¶

WITH daily_buys AS (
    SELECT
        trade_date,
        ticker,
        SUM(quantity) AS bought_qty,
        SUM(notional) AS bought_notional,
        ROUND(SUM(notional) / NULLIF(SUM(quantity), 0), 4) AS avg_buy_price
    FROM trades
    WHERE side = 'BUY'
    GROUP BY trade_date, ticker
),
daily_sells AS (
    SELECT
        trade_date,
        ticker,
        SUM(quantity) AS sold_qty,
        SUM(notional) AS sold_notional,
        ROUND(SUM(notional) / NULLIF(SUM(quantity), 0), 4) AS avg_sell_price
    FROM trades
    WHERE side = 'SELL'
    GROUP BY trade_date, ticker
)
SELECT
    COALESCE(b.trade_date, s.trade_date) AS trade_date,
    COALESCE(b.ticker, s.ticker) AS ticker,
    COALESCE(b.bought_qty, 0) AS bought,
    COALESCE(s.sold_qty, 0) AS sold,
    COALESCE(b.bought_qty, 0) - COALESCE(s.sold_qty, 0) AS net_quantity,
    COALESCE(b.bought_notional, 0) AS buy_value,
    COALESCE(s.sold_notional, 0) AS sell_value,
    COALESCE(s.sold_notional, 0) - COALESCE(b.bought_notional, 0) AS realized_pnl
FROM daily_buys b
FULL OUTER JOIN daily_sells s
    ON b.trade_date = s.trade_date
    AND b.ticker = s.ticker
ORDER BY trade_date, ticker;

Position Builder¶

Build current positions from the trade history:

WITH trade_flows AS (
    SELECT
        ticker,
        SUM(CASE WHEN side = 'BUY' THEN quantity ELSE -quantity END) AS net_quantity,
        SUM(CASE WHEN side = 'BUY' THEN notional ELSE -notional END) AS net_cost
    FROM trades
    GROUP BY ticker
),
valued_positions AS (
    SELECT
        tf.ticker,
        tf.net_quantity,
        ROUND(tf.net_cost / NULLIF(tf.net_quantity, 0), 4) AS avg_cost,
        m.close_price AS current_price,
        ROUND(tf.net_quantity * m.close_price, 2) AS market_value,
        ROUND(tf.net_quantity * m.close_price - tf.net_cost, 2) AS unrealized_pnl
    FROM trade_flows tf
    LEFT JOIN (
        SELECT DISTINCT ON (ticker)
            ticker, close_price
        FROM market_data
        ORDER BY ticker, trade_date DESC
    ) m ON tf.ticker = m.ticker
    WHERE tf.net_quantity != 0
)
SELECT
    ticker,
    net_quantity,
    avg_cost,
    current_price,
    market_value,
    unrealized_pnl,
    ROUND(unrealized_pnl / NULLIF(ABS(net_quantity * avg_cost), 0) * 100, 2)
        AS pnl_pct
FROM valued_positions
ORDER BY ABS(market_value) DESC;

Key Insight: CTEs are not just syntactic sugar. They make your intent explicit. A reader can look at daily_buys and daily_sells and immediately understand the structure, instead of mentally parsing nested subqueries three levels deep. In finance, where queries get audited by risk and compliance, readability is not a luxury. It is a requirement.

If your query is so complex that you cannot explain it to a colleague in 30 seconds, break it into CTEs. Each CTE should answer one clear question. “What did we buy?” “What did we sell?” “What is the net?” Three simple questions instead of one incomprehensible query.

Real-World Queries: The Ones You Actually Need¶

Query 1: VWAP (Volume-Weighted Average Price)¶

VWAP is the gold standard benchmark for execution quality. It tells you the average price at which a stock traded, weighted by volume.

SELECT
    ticker,
    trade_date,
    ROUND(
        SUM(price * quantity) / NULLIF(SUM(quantity), 0),
        4
    ) AS vwap,
    SUM(quantity) AS total_volume,
    MIN(price) AS low_price,
    MAX(price) AS high_price
FROM trades
WHERE trade_date = '2026-03-28'
GROUP BY ticker, trade_date
ORDER BY total_volume DESC;

ticker	trade_date	vwap	total_volume	low_price	high_price
TLT	2026-03-28	95.5000	2000	95.50	95.50
AAPL	2026-03-28	148.5333	1500	148.25	149.10
MSFT	2026-03-28	375.8000	500	375.80	375.80
GOOGL	2026-03-28	152.4000	300	152.40	152.40
NVDA	2026-03-28	825.3000	200	825.30	825.30

If a trader bought AAPL at $149.10 and the VWAP was $148.53, they paid more than the market average. That is 57 cents of “execution slippage.” On 500 shares, that is $285 they left on the table. Execution desks live and die by these numbers.

Query 2: Largest Trades (Outlier Detection)¶

-- Find trades that are more than 3 standard deviations above the mean notional
WITH trade_stats AS (
    SELECT
        AVG(notional) AS mean_notional,
        STDDEV(notional) AS std_notional
    FROM trades
)
SELECT
    t.trade_id,
    t.trade_date,
    t.trader,
    t.ticker,
    t.notional,
    ROUND((t.notional - ts.mean_notional) / ts.std_notional, 2) AS z_score
FROM trades t
CROSS JOIN trade_stats ts
WHERE t.notional > ts.mean_notional + 3 * ts.std_notional
ORDER BY t.notional DESC;

Query 3: Trader Activity Heatmap Data¶

-- Hourly trading activity breakdown
SELECT
    trader,
    EXTRACT(HOUR FROM trade_time) AS hour,
    COUNT(*) AS num_trades,
    SUM(notional) AS total_notional
FROM trades
GROUP BY trader, EXTRACT(HOUR FROM trade_time)
ORDER BY trader, hour;

Query 4: Concentration Risk¶

-- What percentage of total exposure is in each stock?
WITH total AS (
    SELECT SUM(ABS(quantity * current_price)) AS total_exposure
    FROM positions
)
SELECT
    p.ticker,
    p.quantity,
    ABS(p.quantity * p.current_price) AS exposure,
    ROUND(
        ABS(p.quantity * p.current_price) / t.total_exposure * 100,
        2
    ) AS pct_of_total
FROM positions p
CROSS JOIN total t
ORDER BY exposure DESC;

If one stock is 40% of your total exposure, you do not have a portfolio. You have a stock and some friends. Concentration risk queries keep risk managers employed and traders annoyed.

Performance: Making Queries Not Terrible¶

When your trades table has 50 million rows, a badly written query can take minutes. Here is how to keep things fast.

Indexes¶

Indexes are the single biggest performance lever in SQL. They are like the index at the back of a textbook: instead of reading every page, you jump straight to the right one.

-- Essential indexes for a trades table
CREATE INDEX idx_trades_date ON trades (trade_date);
CREATE INDEX idx_trades_ticker ON trades (ticker);
CREATE INDEX idx_trades_trader ON trades (trader);
CREATE INDEX idx_trades_date_ticker ON trades (trade_date, ticker);

-- Essential indexes for market_data
CREATE INDEX idx_market_date ON market_data (trade_date);
CREATE INDEX idx_market_ticker_date ON market_data (ticker, trade_date);

EXPLAIN: Understanding Query Performance¶

EXPLAIN ANALYZE
SELECT ticker, SUM(notional)
FROM trades
WHERE trade_date = '2026-03-28'
GROUP BY ticker;

This shows you the query execution plan: which indexes are used, how many rows are scanned, and where the time is spent. If you see “Seq Scan” on a large table, you need an index.

Performance Rules of Thumb¶

Do This	Not This	Why
Filter with indexed columns in WHERE	Wrap indexed columns in functions	`WHERE trade_date = '2026-03-28'` uses the index. `WHERE DATE(trade_time) = '2026-03-28'` does not.
Use specific columns in SELECT	Use SELECT *	Fetching all columns reads more data from disk.
Use EXISTS for existence checks	Use COUNT(*) > 0	EXISTS stops at the first match. COUNT scans everything.
Join on indexed columns	Join on computed expressions	The optimizer can use indexes for direct column joins.
Use LIMIT for exploratory queries	Fetch millions of rows to “take a quick look”	Your query returns 47 million rows. Do you really need all of them?

Key Insight: The difference between a query that takes 50 milliseconds and one that takes 5 minutes is almost always an index. A composite index on (trade_date, ticker) can turn a full table scan into an index-only scan. Always index the columns you filter and join on.

Common Pitfalls¶

Dividing by zero. Financial data loves to have zero quantities, zero prices, and empty positions. Always use NULLIF(denominator, 0) to prevent division-by-zero errors. price / NULLIF(quantity, 0) returns NULL instead of crashing.
Forgetting NULL handling. NULL is not zero. NULL is not empty string. NULL is “unknown.” Any arithmetic with NULL returns NULL. Use COALESCE(value, 0) when you need a default.
Floating-point precision. Never use FLOAT for money. Use DECIMAL(12, 4) or NUMERIC. Floating-point arithmetic can turn $100.00 into $99.99999999 and cause P&L reconciliation nightmares.
Not using transactions. When updating positions or inserting related records, always wrap the operations in a transaction (BEGIN ... COMMIT). Otherwise a crash halfway through can leave your data inconsistent.
Ignoring time zones. A trade executed at “3 PM” could be New York, London, or Tokyo time. Always store timestamps in UTC and convert for display. Financial data across time zones without UTC is a debugging nightmare.
Using SELECT * in production queries. It fetches every column, breaks when schema changes, and makes the query optimizer’s life harder. Always list the specific columns you need.

Wrapping Up¶

SQL is the unsung hero of finance technology. Every trade, every position, every risk calculation, and every regulatory report ultimately comes from a SQL query hitting a database. The language itself is deceptively simple (you can learn the basics in an afternoon), but mastering the patterns that matter in finance (window functions, CTEs, aggregation, proper JOINs) is what separates “I can write a SELECT statement” from “I can build the query that tells the risk committee exactly how exposed we are.”

Start with the basics: SELECT, WHERE, GROUP BY, JOIN. Then learn window functions, because they change everything. Then learn CTEs, because they make complex queries readable. Then learn EXPLAIN, because performance matters when your table has 50 million rows.

And always remember the golden rule: use DECIMAL for money, NULLIF for division, and UTC for timestamps. Break any of these rules and you will spend your weekend debugging a reconciliation break instead of enjoying it.

SQL is not sexy. SQL is not trending. SQL has not changed much since 1986. But SQL is running every financial institution on the planet, and it will still be running them long after the trendy frameworks have come and gone. Learn it well.

Cheat Sheet¶

Key Questions & Answers¶

What SQL features matter most for financial data analysis?¶

Window functions (running totals, moving averages, ranking), CTEs (readable multi-step queries), GROUP BY with HAVING (aggregation and filtering), and JOINs (combining trade, market, and position data). These four features cover 90% of financial SQL work.

Why should I use DECIMAL instead of FLOAT for financial data?¶

FLOAT uses binary floating-point arithmetic, which cannot represent many decimal fractions exactly. This leads to rounding errors like $99.99999999 instead of $100.00. DECIMAL stores exact decimal values. In finance, a fraction of a cent times millions of trades adds up to real money.

How do I compute VWAP in SQL?¶

SUM(price * quantity) / SUM(quantity). This gives the volume-weighted average price. Always use NULLIF(SUM(quantity), 0) in the denominator to handle edge cases where total volume is zero.

What is the difference between WHERE and HAVING?¶

WHERE filters individual rows before aggregation. HAVING filters groups after aggregation. Use WHERE for conditions on raw data (e.g., WHERE trade_date = '2026-03-28'). Use HAVING for conditions on aggregated results (e.g., HAVING SUM(notional) > 1000000).

Key Concepts at a Glance¶

Concept	Summary
SELECT … FROM … WHERE	Basic query: choose columns, table, and filter conditions
GROUP BY + aggregate functions	Summarize data: COUNT, SUM, AVG, MIN, MAX
HAVING	Filter after aggregation (WHERE filters before)
INNER JOIN	Only rows that match in both tables
LEFT JOIN	All rows from left table, matching rows from right (NULL if no match)
FULL OUTER JOIN	All rows from both tables, NULLs where no match
Window functions	Calculations across related rows: SUM() OVER, RANK(), LAG(), LEAD()
PARTITION BY	Divides rows into groups for window function calculation
CTE (WITH clause)	Named subquery for readability and reuse
VWAP	SUM(price * quantity) / SUM(quantity)
NULLIF(x, 0)	Returns NULL instead of dividing by zero
COALESCE(x, default)	Returns the first non-NULL value
DECIMAL(12,4)	Exact decimal type for financial amounts
INDEX	Speeds up queries by creating lookup structures on columns
EXPLAIN ANALYZE	Shows the query execution plan and actual timings
UTC timestamps	Always store times in UTC, convert for display

Sources & Further Reading¶

Date, C.J., SQL and Relational Theory, O’Reilly
Molinaro, A., SQL Cookbook, O’Reilly
PostgreSQL Documentation, Window Functions
PostgreSQL Documentation, Queries with WITH (CTEs)
Mode Analytics, SQL Tutorial
SQLZoo, Interactive SQL Tutorial
Investopedia, Volume-Weighted Average Price (VWAP)
Use The Index, Luke, SQL Indexing and Tuning
Khan Academy, Intro to SQL

The Financial Database: What You Are Working With
The Basics: SELECT, WHERE, ORDER BY
Aggregations: GROUP BY
JOINs: Combining Tables
Window Functions: The Power Tool
CTEs: Common Table Expressions
- Daily P&L Report
- Position Builder
Real-World Queries: The Ones You Actually Need
Performance: Making Queries Not Terrible
Common Pitfalls
Wrapping Up
Cheat Sheet
- Key Questions & Answers
- Key Concepts at a Glance
Sources & Further Reading