

2026-04-13 · 17 min read · Computer Science

NumPy Crash Course: The Math Engine Behind Everything

If you have ever used Python for anything involving numbers, you have used NumPy, whether you knew it or not. Pandas is built on NumPy. Scikit-learn is built on NumPy. TensorFlow, PyTorch, SciPy, Matplotlib, and roughly 90% of the Python scientific ecosystem either depend on NumPy directly or reimplement parts of it. If you are doing quantitative finance in Python and you do not understand NumPy, you are writing code against the grain of the entire ecosystem.

NumPy started in 2005 as a merger of two older libraries (Numeric and Numarray) and quickly became the foundation of scientific Python. Its superpower is deceptively simple: it gives you an N-dimensional array type that is memory-contiguous, type-homogeneous, and fast. All the math happens in C code. Your Python is just a thin interface. The result is performance that rivals compiled languages for numerical work, with a syntax that stays readable.

This article covers the NumPy you actually use: array creation, indexing, broadcasting, vectorization, linear algebra, and random number generation. Once you see how it works, you will understand why “loops are slow in Python” and what to do about it.

Python has many libraries. NumPy is the one that makes the others possible.


Why NumPy Exists

Python lists are flexible. You can put anything in them: integers, strings, other lists, objects. That flexibility has a cost. Each element is a full Python object, stored somewhere in memory, with a pointer from the list.

For 1 million integers, a Python list stores:
- 1 million integer objects (each ~28 bytes)
- 1 million pointers (8 bytes each)
- Plus list overhead

Total: around 36 MB for a million 8-byte integers.

NumPy stores the same data as a contiguous C array of 8 bytes per integer: 8 MB total. Roughly four and a half times less memory. But the real win is speed.
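You can see the difference yourself with a rough measurement (object sizes vary by CPython version and platform, so treat the list figure as an estimate):

```python
import sys
import numpy as np

n = 1_000_000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

# List: the container's pointer array plus one Python int object per element.
# Only an estimate - per-object overhead varies across CPython versions.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
print(f"Python list: ~{list_bytes / 1e6:.0f} MB")

# Array: exactly 8 bytes per int64 element, nothing else
print(f"NumPy array: {arr.nbytes / 1e6:.0f} MB")
```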

The Loop vs Vectorization Example

import numpy as np
import time

n = 10_000_000
python_list = list(range(n))
numpy_array = np.arange(n)

# Python list comprehension
start = time.time()
result = [x * 2 + 1 for x in python_list]
print(f"Python list: {time.time() - start:.3f} seconds")

# NumPy vectorized
start = time.time()
result = numpy_array * 2 + 1
print(f"NumPy array: {time.time() - start:.3f} seconds")

Typical output:

Python list: 0.842 seconds
NumPy array: 0.015 seconds

56x faster. For 10 million elements. With shorter, more readable code.

Key Insight: NumPy’s speed advantage comes from two sources. First, it stores data as contiguous memory blocks, which is cache-friendly. Second, its operations are implemented in C and use CPU vector instructions (SIMD). When you write array * 2, you are dispatching to a highly optimized C loop, not iterating in Python.

Writing a Python for loop over 10 million numbers is like trying to empty a swimming pool with a teaspoon. NumPy hands you a firefighter’s hose.


Creating Arrays

From Python Lists

import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([[1, 2, 3], [4, 5, 6]])  # 2D array
c = np.array([1.0, 2.0, 3.0], dtype=np.float32)

Pre-Sized Arrays

np.zeros(10)              # Ten zeros
np.ones((3, 3))           # 3x3 matrix of ones
np.full((2, 4), 7)        # 2x4 matrix of sevens
np.empty(5)               # Uninitialized (fastest, but contents are garbage)
np.eye(4)                 # 4x4 identity matrix

Ranges

np.arange(0, 10, 2)       # [0, 2, 4, 6, 8]
np.linspace(0, 1, 11)     # 11 evenly spaced values from 0 to 1
np.logspace(0, 3, 4)      # [1, 10, 100, 1000]

Random Arrays

np.random.seed(42)
np.random.rand(3, 3)           # Uniform [0, 1), shape 3x3
np.random.randn(5)             # Standard normal
np.random.randint(0, 100, 10)  # 10 integers between 0 and 99
np.random.normal(loc=100, scale=20, size=1000)  # Custom normal

From Finance Data

# Monthly returns for 5 stocks over 2 years
returns = np.random.normal(0.008, 0.05, size=(24, 5))

# Or loaded from CSV via NumPy directly
data = np.loadtxt("prices.csv", delimiter=",", skiprows=1)

Array Attributes: Know Your Data

Every NumPy array has these essential attributes:

arr = np.random.randn(252, 10)  # 252 days, 10 stocks

arr.shape      # (252, 10) - dimensions
arr.ndim       # 2 - number of dimensions
arr.size       # 2520 - total number of elements
arr.dtype      # dtype('float64') - data type
arr.itemsize   # 8 - bytes per element
arr.nbytes     # 20160 - total memory

Data Types

Common dtypes for finance:

dtype          Description              Use Case
np.float64     64-bit float (default)   Almost everything
np.float32     32-bit float             ML models, saving memory
np.int64       64-bit integer           Counts, IDs
np.bool_       Boolean                  Masks, conditions
np.datetime64  Date/time                Time series

Changing dtype:

arr_float = arr.astype(np.float32)  # Convert to 32-bit float
arr_int = arr.astype(np.int64)      # Convert to integer (truncates)

Indexing and Slicing

Basic Indexing

NumPy extends Python’s slicing to N dimensions:

arr = np.arange(24).reshape(4, 6)
# [[ 0  1  2  3  4  5]
#  [ 6  7  8  9 10 11]
#  [12 13 14 15 16 17]
#  [18 19 20 21 22 23]]

arr[1]          # Row 1: [6, 7, 8, 9, 10, 11]
arr[1, 2]       # Element at row 1, column 2: 8
arr[:, 0]       # All rows, column 0: [0, 6, 12, 18]
arr[1:3, 2:5]   # Rows 1-2, columns 2-4
arr[::2, ::2]   # Every other row and column

Boolean Indexing

This is where NumPy really shines for data analysis:

returns = np.random.normal(0.001, 0.02, 1000)

# Positive returns
positive = returns[returns > 0]

# Large moves (> 2 standard deviations)
mean = returns.mean()
std = returns.std()
outliers = returns[np.abs(returns - mean) > 2 * std]

# Set all negative values to zero
returns[returns < 0] = 0

Fancy Indexing

Select specific elements using arrays of indices:

arr = np.arange(10) * 10  # [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]

indices = [0, 2, 4, 6]
arr[indices]  # [0, 20, 40, 60]

# 2D example
arr2d = np.arange(20).reshape(4, 5)
rows = [0, 1, 3]
cols = [1, 2, 4]
arr2d[rows, cols]  # Elements at (0,1), (1,2), (3,4)

Key Insight: Boolean indexing and fancy indexing return copies, not views. Modifying the result does not modify the original. Basic slicing returns views, so modifying the slice modifies the original. This distinction trips up new users constantly.
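A quick way to see the distinction is np.shares_memory, which tests whether two arrays use the same underlying buffer:

```python
import numpy as np

a = np.arange(10)

s = a[2:5]                      # basic slice: a view into a's buffer
s[0] = 99
print(a[2])                     # 99 - writing through the view changed a
print(np.shares_memory(a, s))   # True

c = a[a > 5]                    # boolean indexing: an independent copy
c[:] = 0
print(a.max())                  # still 99 - a is untouched
print(np.shares_memory(a, c))   # False
```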


Broadcasting: The Hidden Superpower

Broadcasting is how NumPy handles operations between arrays of different shapes. It is the feature that makes NumPy code so concise.

The Rules

Two arrays are “broadcastable” if, when you align their shapes from the right:
1. The dimensions are equal, OR
2. One of the dimensions is 1

NumPy “stretches” the dimension of size 1 to match the other.
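If you want to check the rules without allocating anything, np.broadcast_shapes (available since NumPy 1.20) applies them directly to shape tuples:

```python
import numpy as np

# Dimensions align from the right; size-1 dimensions stretch
print(np.broadcast_shapes((252, 10), (10,)))  # (252, 10)
print(np.broadcast_shapes((3, 1), (1, 4)))    # (3, 4)

# Trailing dimensions 2 and 4 are neither equal nor 1: not broadcastable
try:
    np.broadcast_shapes((3, 2), (4,))
except ValueError as e:
    print("not broadcastable:", e)
```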

Simple Examples

# Scalar + array: scalar broadcasts to every element
np.array([1, 2, 3]) + 10  # [11, 12, 13]

# Array + smaller array
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])  # shape (3, 3)
b = np.array([10, 20, 30])                         # shape (3,)
a + b
# [[11, 22, 33],
#  [14, 25, 36],
#  [17, 28, 39]]

Here, b is broadcast to each row of a.

Column vs Row Broadcasting

# Row vector (1, 3)
row = np.array([[10, 20, 30]])

# Column vector (3, 1)
col = np.array([[1], [2], [3]])

# Broadcast: outer product-like behavior
row + col
# [[11, 21, 31],
#  [12, 22, 32],
#  [13, 23, 33]]

Finance Example: Portfolio Returns

Suppose you have daily returns for 10 stocks (matrix of shape (252, 10)) and want weighted portfolio returns:

returns = np.random.normal(0.001, 0.02, size=(252, 10))
weights = np.array([0.15, 0.10, 0.10, 0.10, 0.05, 0.10, 0.10, 0.10, 0.10, 0.10])

# Broadcast: weights (10,) multiplied against each row of returns (252, 10)
weighted = returns * weights

# Portfolio daily return: sum across stocks
portfolio_returns = weighted.sum(axis=1)  # shape (252,)

Broadcasting is NumPy’s equivalent of Excel’s $ anchoring. Once you get used to it, you write calculations in your head, not on paper.


Vectorization: The Core Philosophy

“Vectorization” means replacing Python loops with array operations. This is the single most important performance technique in scientific Python.

The Wrong Way

# Bad: Python loop
returns = np.random.normal(0.001, 0.02, 1000000)
squared = np.empty_like(returns)
for i in range(len(returns)):
    squared[i] = returns[i] ** 2

The Right Way

# Good: vectorized
squared = returns ** 2

Universal Functions (ufuncs)

NumPy provides element-wise versions of every common math function:

np.sqrt(arr)        # Square root
np.exp(arr)         # e^x
np.log(arr)         # Natural log
np.sin(arr)         # Sine
np.abs(arr)         # Absolute value
np.maximum(a, b)    # Element-wise max
np.minimum(a, b)    # Element-wise min
np.round(arr, 2)    # Round to 2 decimal places
np.clip(arr, 0, 1)  # Clip to range

All of these operate in compiled C, on entire arrays, in one call.

Finance Example: Log Returns

prices = np.array([100, 101.5, 99.8, 102.3, 103.7])

# Simple returns
returns = prices[1:] / prices[:-1] - 1

# Log returns (vectorized)
log_returns = np.log(prices[1:] / prices[:-1])

# Cumulative wealth from log returns
cumulative_wealth = np.exp(log_returns.cumsum())

Three one-liners do what would take a dozen lines in a loop.


Reductions and Aggregations

Reduce an array along an axis to compute statistics.

Basic Reductions

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

arr.sum()           # 45 (all elements)
arr.sum(axis=0)     # [12, 15, 18] (sum of each column)
arr.sum(axis=1)     # [6, 15, 24] (sum of each row)

arr.mean()          # 5.0
arr.std()           # Standard deviation
arr.var()           # Variance
arr.min()           # 1
arr.max()           # 9
arr.argmin()        # Index of minimum: 0
arr.argmax()        # Index of maximum: 8
arr.cumsum()        # Cumulative sum
arr.cumprod()       # Cumulative product

Axis Matters

returns = np.random.normal(0.001, 0.02, size=(252, 10))  # 252 days, 10 stocks

# Average daily return per stock (across all days)
avg_per_stock = returns.mean(axis=0)       # shape (10,)

# Average daily return per day (across all stocks)
avg_per_day = returns.mean(axis=1)         # shape (252,)

# Grand average
grand_avg = returns.mean()                 # scalar

Finance Example: Annualized Volatility

daily_returns = np.random.normal(0.0008, 0.012, size=(252, 5))

# Per-stock annualized volatility
annual_vol = daily_returns.std(axis=0) * np.sqrt(252)
print(f"Annual volatilities: {annual_vol}")

# Cross-sectional mean (average across stocks, per day)
equal_weight_portfolio = daily_returns.mean(axis=1)

Linear Algebra

NumPy’s linalg module handles matrix operations needed for portfolio theory, regression, PCA, and more.

Matrix Multiplication

A = np.random.randn(3, 4)
B = np.random.randn(4, 2)

C = A @ B               # Matrix multiplication (Python 3.5+)
C = np.dot(A, B)        # Equivalent
# Shape: (3, 2)

Common Operations

A = np.random.randn(3, 3)   # square matrix for the examples below
b = np.random.randn(3)

# Transpose
A.T

# Inverse (A must be square and non-singular)
np.linalg.inv(A)

# Determinant
np.linalg.det(A)

# Eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

# Solve Ax = b
x = np.linalg.solve(A, b)

# Singular Value Decomposition
U, S, Vt = np.linalg.svd(A)

# Cholesky decomposition (for Monte Carlo; cov_matrix must be positive definite)
L = np.linalg.cholesky(cov_matrix)

Finance Example: Portfolio Optimization (Markowitz)

# 5 stocks, covariance matrix
cov = np.array([
    [0.04, 0.02, 0.01, 0.00, 0.01],
    [0.02, 0.05, 0.02, 0.01, 0.01],
    [0.01, 0.02, 0.03, 0.01, 0.01],
    [0.00, 0.01, 0.01, 0.02, 0.01],
    [0.01, 0.01, 0.01, 0.01, 0.03]
])

expected_returns = np.array([0.10, 0.12, 0.08, 0.06, 0.09])

# Minimum variance portfolio (analytical solution)
inv_cov = np.linalg.inv(cov)
ones = np.ones(5)
weights_min_var = inv_cov @ ones / (ones @ inv_cov @ ones)
print(f"Min variance weights: {weights_min_var}")
print(f"Sum of weights: {weights_min_var.sum():.4f}")

# Portfolio variance
portfolio_variance = weights_min_var @ cov @ weights_min_var
print(f"Portfolio std: {np.sqrt(portfolio_variance):.4f}")

That is Markowitz portfolio optimization in 5 lines. The math behind hundreds of research papers and trillions of dollars of portfolio allocations.


Random Number Generation

For Monte Carlo, simulations, and bootstrapping.

Modern API (preferred)

# Create a generator with a seed
rng = np.random.default_rng(seed=42)

# Various distributions
uniform = rng.uniform(0, 1, size=1000)
normal = rng.normal(loc=0, scale=1, size=1000)
lognormal = rng.lognormal(mean=0, sigma=1, size=1000)
exponential = rng.exponential(scale=1, size=1000)
binomial = rng.binomial(n=100, p=0.5, size=1000)

# Multivariate normal (for correlated returns)
mean = np.array([0.01, 0.02, 0.015])
cov = np.array([[0.04, 0.01, 0.02],
                [0.01, 0.09, 0.03],
                [0.02, 0.03, 0.06]])
samples = rng.multivariate_normal(mean, cov, size=10000)

Finance Example: Monte Carlo for Asian Option

rng = np.random.default_rng(42)
n_simulations = 100_000
n_steps = 252  # Daily steps in a year

S0 = 100        # Initial price
K = 100         # Strike
T = 1.0         # Time to maturity
r = 0.05        # Risk-free rate
sigma = 0.2     # Volatility

dt = T / n_steps

# Generate random shocks: shape (n_simulations, n_steps)
z = rng.standard_normal((n_simulations, n_steps))

# Build price paths using vectorized operations
log_returns = (r - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z
log_paths = np.cumsum(log_returns, axis=1)
paths = S0 * np.exp(log_paths)

# Average price along each path
avg_prices = paths.mean(axis=1)

# Asian call payoff
payoffs = np.maximum(avg_prices - K, 0)

# Discount and average
asian_call = np.exp(-r * T) * payoffs.mean()
print(f"Asian call price: ${asian_call:.4f}")

100,000 Monte Carlo simulations with 252 time steps each. 25.2 million random numbers. Entire calculation runs in under a second.

Key Insight: The modern np.random.default_rng() API is preferred over the legacy np.random.seed() + np.random.normal() style. It is faster, has better statistical properties, and allows multiple independent generators. Always use the modern API in new code.
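One practical payoff of the generator-based API: a single root seed can be split into statistically independent streams with SeedSequence.spawn, which is how you give each worker in a parallel Monte Carlo run its own generator. A minimal sketch:

```python
import numpy as np

# Split one root seed into three independent child seeds
children = np.random.SeedSequence(42).spawn(3)
rngs = [np.random.default_rng(seed) for seed in children]

# Each generator produces its own reproducible, non-overlapping stream
draws = [rng.normal(size=4) for rng in rngs]
for i, d in enumerate(draws):
    print(f"stream {i}: {np.round(d, 3)}")
```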


Reshaping and Stacking

Reshape

arr = np.arange(12)
arr.reshape(3, 4)        # 3 rows, 4 columns
arr.reshape(2, 2, 3)     # 3D array
arr.reshape(-1, 4)       # -1 means "figure it out"

Flatten

matrix = np.array([[1, 2, 3], [4, 5, 6]])
matrix.flatten()  # [1, 2, 3, 4, 5, 6]
matrix.ravel()    # Same, but returns a view when possible (faster)

Stacking

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

np.vstack([a, b])   # Vertical: shape (2, 3)
np.hstack([a, b])   # Horizontal: shape (6,)
np.column_stack([a, b])  # As columns: shape (3, 2)

# For 2D arrays
A = np.ones((3, 2))
B = np.zeros((3, 2))
np.hstack([A, B])   # shape (3, 4)
np.vstack([A, B])   # shape (6, 2)

Transposing

arr.T               # Transpose (for 2D)
arr.transpose()     # Same thing
arr.swapaxes(0, 1)  # Swap specific axes

Performance Tips

1. Preallocate, Do Not Append

# Bad: growing the array is O(n^2)
result = np.array([])
for i in range(1000):
    result = np.append(result, i)

# Good: preallocate
result = np.empty(1000)
for i in range(1000):
    result[i] = i

# Best: vectorize
result = np.arange(1000)

2. Use Appropriate Data Types

# If values fit, float32 is half the memory and often faster
arr_big = np.random.randn(1_000_000).astype(np.float32)

3. Avoid Creating Intermediate Arrays

# Creates intermediate array: x^2
y = np.sin(x ** 2)

# Avoid intermediate with out= parameter
y = np.empty_like(x)
np.multiply(x, x, out=y)
np.sin(y, out=y)

4. Use numexpr or numba for Complex Expressions

import numexpr as ne

# For large arrays, numexpr can be 2-10x faster
result = ne.evaluate("sin(x) * exp(-x**2)")

5. Know When to Leave NumPy

For GPU: use CuPy (drop-in NumPy replacement on NVIDIA GPUs).
For sparse arrays: use scipy.sparse.
For out-of-core: use Dask.
For SIMD: sometimes you need to write C extensions.


Common Pitfalls

  1. Views vs copies. Basic slicing returns a view (modifications affect the original). Boolean/fancy indexing returns a copy. Use .copy() explicitly when in doubt.

  2. Broadcasting errors. When two arrays have mismatched shapes, NumPy tries to broadcast. If it cannot, you get an error. Always check shapes before operations.

  3. Integer overflow. np.int8 holds values -128 to 127. Arithmetic can overflow silently. Use int64 for anything that might grow.

  4. NaN propagation. Any operation with NaN produces NaN. np.nan == np.nan is False. Use np.isnan() to detect.

  5. Axis confusion. For a 2D array, axis=0 collapses the rows (you get one result per column); axis=1 collapses the columns (one result per row). This confuses everyone. Always double-check with a small example.

  6. Using Python math on NumPy arrays. math.sqrt(arr) errors on arrays. Use np.sqrt(arr).
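Pitfalls 3-5 are easy to demonstrate in a few lines; running something like this once saves hours of debugging later:

```python
import numpy as np

# Pitfall 3: int8 arithmetic wraps around silently
small = np.array([120, 120], dtype=np.int8)
print(small + small)       # [-16 -16], not [240 240]

# Pitfall 4: NaN propagates; use the nan-aware reductions
x = np.array([1.0, np.nan, 3.0])
print(x.sum())             # nan
print(np.nansum(x))        # 4.0
print(np.nan == np.nan)    # False - detect with np.isnan instead

# Pitfall 5: check axis behavior on a tiny array first
m = np.array([[1, 2], [3, 4]])
print(m.sum(axis=0))       # [4 6] - one result per column
print(m.sum(axis=1))       # [3 7] - one result per row
```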


Wrapping Up

NumPy is the foundation. Pandas sits on it. Scikit-learn sits on it. Every serious numerical Python library leverages it. If you are writing Python for finance, machine learning, or scientific computing, your productivity is roughly proportional to your NumPy fluency.

The core ideas: arrays are memory-contiguous, operations are vectorized, broadcasting eliminates explicit loops, and reductions happen in C. Master those four concepts and you have 80% of what you need. The remaining 20% (linear algebra, random numbers, reshaping) is reference material you look up when needed.

Vectorized NumPy code is often shorter than equivalent loopy code, and usually 10-100x faster. It reads like mathematical notation. Once you start thinking in arrays, you will never go back.

NumPy makes Python fast enough for most things. Where it is not, you have bigger problems than language choice.


Cheat Sheet

Key Questions & Answers

Why is NumPy faster than Python lists?

NumPy arrays store data as contiguous C arrays (not Python objects) and operate on them in compiled C code. Python lists store pointers to Python objects scattered across memory. For numerical work, NumPy is typically 10-100x faster due to better memory layout, SIMD instructions, and elimination of Python interpreter overhead per element.

What is broadcasting?

A mechanism that allows NumPy to apply operations to arrays of different shapes. Dimensions of size 1 get “stretched” to match the other array’s size. It is what lets you write array + scalar or add a (3,) vector to a (252, 3) matrix without explicit loops.

When should I use @, *, or np.dot?

Use * for element-wise multiplication (Hadamard product). Use @ or np.dot for matrix multiplication. In modern code, @ is preferred. For 1D arrays, @ gives the inner product.
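A two-by-two example makes the distinction concrete:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print(A * B)   # element-wise: [[5, 12], [21, 32]]
print(A @ B)   # matrix product: [[19, 22], [43, 50]]

v = np.array([1, 2])
print(v @ v)   # inner product of two 1D arrays: 5
```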

How do I simulate correlated random variables?

Use rng.multivariate_normal(mean, cov, size=n). Alternatively, generate uncorrelated samples and apply a Cholesky decomposition of the covariance matrix: L = np.linalg.cholesky(cov); samples = uncorrelated @ L.T. Both approaches produce samples with the specified covariance structure.
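A sketch of the Cholesky route, verifying that the sample covariance lands close to the target (the check is loose because of sampling noise):

```python
import numpy as np

rng = np.random.default_rng(0)
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])

L = np.linalg.cholesky(cov)            # cov = L @ L.T
z = rng.standard_normal((100_000, 2))  # uncorrelated standard normals
samples = z @ L.T                      # correlated: covariance ~ cov

print(np.round(np.cov(samples, rowvar=False), 3))
```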

Key Concepts at a Glance

Concept                 Summary
ndarray                 NumPy’s N-dimensional array type
dtype                   Data type (float64, int32, bool, etc.)
shape                   Tuple of dimensions: (rows, cols) for 2D
Vectorization           Replacing loops with array operations
Broadcasting            Automatic expansion of smaller arrays for operations
ufunc                   Universal function (element-wise: sqrt, exp, sin, etc.)
axis                    0 = down columns; 1 = across rows; -1 = last axis
View vs copy            Basic slicing: view. Boolean/fancy: copy.
np.arange               Range as array
np.linspace             Evenly spaced values
np.random.default_rng   Modern random number generator
Matrix mult             A @ B (Python 3.5+)
Element-wise mult       A * B
Reductions              arr.sum(axis=...), arr.mean(), arr.std()
Cumulative              .cumsum(), .cumprod()
Reshape                 .reshape(new_shape)
Transpose               .T
Cholesky                np.linalg.cholesky(cov) for correlated samples
Solve                   np.linalg.solve(A, b) instead of inv(A) @ b

Sources & Further Reading

  • Harris, C.R. et al. (2020), Array Programming with NumPy, Nature
  • McKinney, W., Python for Data Analysis, O’Reilly
  • VanderPlas, J., Python Data Science Handbook, O’Reilly
  • NumPy Documentation, User Guide
  • NumPy Documentation, NumPy for Matlab Users
  • Oliphant, T.E., Guide to NumPy, Continuum Press
  • Python Data Science, NumPy Tutorials
  • Quantitative Economics, NumPy
  • Real Python, NumPy Tutorial
  • SciPy Documentation, Linear Algebra
