The anatomy of a QR code
A QR code is not just a random grid of black and white squares. Every region has a specific function. Zoom into any QR code and you will find the same structural elements:
- Finder patterns — the three large squares in the top-left, top-right, and bottom-left corners. These are how the scanner locates and orients the code, regardless of the angle or rotation it is scanned from. Their distinctive 1:1:3:1:1 ratio is detectable even in distorted or partially obscured images.
- Timing patterns — alternating black and white dots running horizontally and vertically between the finder patterns. They tell the decoder the size of each module (individual square) in the grid.
- Alignment patterns — smaller squares scattered throughout larger QR codes (version 2+). They help the decoder compensate for distortion when the code is printed on a curved surface or photographed at an angle.
- Format information — two strips of modules adjacent to the finder patterns that encode the error correction level and mask pattern used.
- Data and error correction region — everything else. The actual payload you encoded, plus redundant error correction codewords.
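The 1:1:3:1:1 finder ratio mentioned above can be tested on a single row of pixels. The following is a minimal sketch, not how any particular scanner library works: the function names, the 0/1 pixel representation (1 = dark), and the tolerance value are all assumptions made for illustration.

```python
from itertools import groupby

def run_lengths(row):
    """Collapse a row of 0/1 pixels into (value, length) runs."""
    return [(value, len(list(group))) for value, group in groupby(row)]

def looks_like_finder(runs, tolerance=0.5):
    """Check five consecutive runs against dark:light:dark:light:dark = 1:1:3:1:1.

    `tolerance` is a hypothetical allowance, expressed as a fraction of one module.
    """
    if len(runs) != 5:
        return False
    if [value for value, _ in runs] != [1, 0, 1, 0, 1]:
        return False
    lengths = [length for _, length in runs]
    module = sum(lengths) / 7  # the pattern spans 1+1+3+1+1 = 7 modules
    expected = [1, 1, 3, 1, 1]
    return all(abs(n - e * module) <= tolerance * module
               for n, e in zip(lengths, expected))

def find_finder_candidates(row):
    """Return the pixel offsets where a finder-like run sequence starts."""
    runs = run_lengths(row)
    starts, position = [], 0
    for _, length in runs:
        starts.append(position)
        position += length
    return [starts[i] for i in range(len(runs) - 4)
            if looks_like_finder(runs[i:i + 5])]

# A synthetic row: 4 light pixels, then a finder cross-section at 2 pixels per module.
row = [0] * 4 + [1] * 2 + [0] * 2 + [1] * 6 + [0] * 2 + [1] * 2 + [0] * 4
print(find_finder_candidates(row))
```

Because the check depends only on the ratio between runs, it gives the same answer whether a module covers 2 pixels or 20, which is part of why the pattern survives changes in scale.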
Versions and sizes
QR codes come in 40 "versions" — not software versions, but size variants. A Version 1 code is 21×21 modules. Each version adds 4 modules per side:
- Version 1: 21×21 modules, up to 41 numeric characters
- Version 5: 37×37 modules, up to 154 numeric characters
- Version 10: 57×57 modules, up to 652 numeric characters
- Version 20: 97×97 modules, up to 2,061 numeric characters
- Version 40: 177×177 modules, up to 7,089 numeric characters
These capacities are the maximums at the lowest error correction level (Level L).
The version is chosen automatically based on the amount of data and the error correction level. More data means a larger, denser code. Higher error correction means more of the module space is used for redundancy, so you need a higher version to fit the same payload.
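The "21×21 plus 4 modules per version" rule reduces to a one-line formula, side = 17 + 4 × version. A small sketch (the function name is an illustrative choice, not part of any library):

```python
def modules_per_side(version: int) -> int:
    """Side length in modules for a QR code version (1 to 40)."""
    if not 1 <= version <= 40:
        raise ValueError("QR code versions run from 1 to 40")
    return 17 + 4 * version

for version in (1, 5, 10, 20, 40):
    side = modules_per_side(version)
    print(f"Version {version}: {side}x{side} modules")
# Version 1: 21x21 modules
# Version 5: 37x37 modules
# Version 10: 57x57 modules
# Version 20: 97x97 modules
# Version 40: 177x177 modules
```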
Data encoding modes
QR codes do not encode all data the same way. The encoder picks the most efficient mode for the character set of your input:
- Numeric mode: digits 0-9 only. 3 characters are packed into 10 bits, making this the most efficient mode. Up to 7,089 characters in Version 40.
- Alphanumeric mode: 0-9, A-Z (uppercase only), space, and the symbols $ % * + - . / :. 2 characters are packed into 11 bits. Up to 4,296 characters in Version 40.
- Byte mode: any byte (UTF-8 URLs, text). 8 bits per byte. Up to 2,953 bytes in Version 40.
- Kanji mode: Shift JIS double-byte characters. 13 bits per character. Up to 1,817 characters in Version 40.
This is why URLs in QR codes should ideally use only uppercase letters and the other characters that alphanumeric mode allows: they can then be encoded in alphanumeric mode instead of the less efficient byte mode. In practice, most URLs contain lowercase letters, so byte mode (UTF-8) is used. A URL shortener whose links still resolve when written entirely in uppercase can meaningfully reduce QR code size.
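To make the uppercase-URL point concrete, here is a rough sketch of mode selection and payload size. It is a simplification under stated assumptions: it picks a single mode for the whole string, leaves out Kanji mode, and counts only the data bits, not the mode indicator or character-count header that a real encoder adds. The function names are hypothetical.

```python
ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")

def pick_mode(text: str) -> str:
    """Choose the densest single mode that can represent the whole string."""
    if all(ch in "0123456789" for ch in text):
        return "numeric"
    if all(ch in ALPHANUMERIC for ch in text):
        return "alphanumeric"
    return "byte"

def data_bits(text: str) -> int:
    """Payload bits for the chosen mode (headers and padding excluded)."""
    mode = pick_mode(text)
    n = len(text)
    if mode == "numeric":
        # 10 bits per group of 3 digits; a leftover of 2 digits takes 7 bits, 1 digit takes 4.
        return 10 * (n // 3) + {0: 0, 1: 4, 2: 7}[n % 3]
    if mode == "alphanumeric":
        # 11 bits per pair; a leftover single character takes 6 bits.
        return 11 * (n // 2) + 6 * (n % 2)
    return 8 * len(text.encode("utf-8"))

for url in ("https://example.com", "HTTPS://EXAMPLE.COM"):
    print(url, pick_mode(url), data_bits(url))
# https://example.com byte 152
# HTTPS://EXAMPLE.COM alphanumeric 105
```

The same 19-character URL needs about a third fewer data bits once it fits alphanumeric mode.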
Reed-Solomon error correction: why damaged QR codes still scan
The most impressive engineering in a QR code is its error correction. QR codes use Reed-Solomon error correction — the same mathematics used in CDs, DVDs, hard drives, and deep-space communications. There are four levels:
- Level L: 7% of data codewords can be restored
- Level M: 15% of data codewords can be restored
- Level Q: 25% of data codewords can be restored
- Level H: 30% of data codewords can be restored
Reed-Solomon works by treating the data as a polynomial and computing additional "check" values that allow reconstruction of any damaged portions. It is not simple redundancy (copying data twice) — it is algebraic coding that can identify which codewords are corrupted and reconstruct the originals from the undamaged portions.
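The "data as a polynomial" idea can be sketched on the encoding side: the check values are the remainder left after dividing the data polynomial by a generator polynomial, with all arithmetic done in the finite field GF(256). The sketch below assumes the field polynomial 0x11D used by QR codes, uses helper names chosen for illustration, and covers only the computation of the check codewords. Locating and repairing errors (the decoding side) is considerably longer and is not shown.

```python
def build_tables():
    """Exponent and log tables for GF(256) with primitive polynomial 0x11D."""
    exp, log = [0] * 512, [0] * 256
    x = 1
    for i in range(255):
        exp[i] = x
        log[x] = i
        x <<= 1
        if x & 0x100:
            x ^= 0x11D
    for i in range(255, 512):
        exp[i] = exp[i - 255]  # duplicated so gf_mul needs no modulo
    return exp, log

EXP, LOG = build_tables()

def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def generator_poly(n: int) -> list:
    """Product of (x + alpha^i) for i in 0..n-1, highest-degree coefficient first."""
    g = [1]
    for i in range(n):
        nxt = [0] * (len(g) + 1)
        for j, coef in enumerate(g):
            nxt[j] ^= coef                       # coef * x
            nxt[j + 1] ^= gf_mul(coef, EXP[i])   # coef * alpha^i
        g = nxt
    return g

def rs_check_codewords(data: list, n: int) -> list:
    """Remainder of data(x) * x^n divided by the generator polynomial."""
    gen = generator_poly(n)
    remainder = list(data) + [0] * n
    for i in range(len(data)):
        factor = remainder[i]
        if factor:
            for j, g in enumerate(gen):
                remainder[i + j] ^= gf_mul(g, factor)
    return remainder[len(data):]

# Data codewords for "HELLO WORLD" in a Version 1 code at Level M (16 data + 10 check).
data = [32, 91, 11, 120, 209, 114, 220, 77, 67, 64, 236, 17, 236, 17, 236, 17]
print(rs_check_codewords(data, 10))
```

Appending the check codewords to the data yields a polynomial that evaluates to zero at each root of the generator. A decoder relies on that property: non-zero evaluations reveal that codewords were corrupted and, within the level's limit, where.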
This is why you can print a QR code at Level Q or H, spill coffee on 20% of it, and still scan it successfully. It is also why QR codes with logos in the center work: the logo obscures some modules, but the error correction reconstructs them. At Level H, roughly 30% of the codewords can be destroyed, as long as the finder patterns themselves remain readable. This is engineered headroom, not a bug.
Masking patterns: the secret to reliable scanning
After encoding data, the QR generator applies one of eight masking patterns: XOR operations that flip specific modules in the data and error correction region (the finder, timing, and alignment patterns are left untouched). This might seem counterproductive, but it solves a real problem: if your data happens to encode as a large block of all-black or all-white modules, or as a pattern that looks like a finder pattern, the scanner gets confused. The masking pattern breaks up these problematic regularities.
The encoder tries all eight masks, scores each one for "badness" (large uniform regions, patterns resembling finder patterns), and picks the lowest-scoring result. The chosen mask ID is stored in the format information strip so the decoder knows which mask to reverse. This automatic optimization is why QR codes almost always look like a fairly balanced salt-and-pepper pattern rather than large uniform regions.
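The try-all-eight-and-score loop can be sketched as follows. The eight conditions are the standard mask formulas in terms of row and column. Everything else is a simplification: the scorer implements only the penalty for runs of five or more same-colored modules (a real encoder also penalizes 2×2 blocks, finder-like sequences, and dark/light imbalance), and the sketch treats the whole grid as data instead of skipping function patterns. The function names are illustrative.

```python
# Mask conditions indexed by mask ID; a module is flipped where the condition is true.
MASKS = [
    lambda r, c: (r + c) % 2 == 0,
    lambda r, c: r % 2 == 0,
    lambda r, c: c % 3 == 0,
    lambda r, c: (r + c) % 3 == 0,
    lambda r, c: (r // 2 + c // 3) % 2 == 0,
    lambda r, c: (r * c) % 2 + (r * c) % 3 == 0,
    lambda r, c: ((r * c) % 2 + (r * c) % 3) % 2 == 0,
    lambda r, c: ((r + c) % 2 + (r * c) % 3) % 2 == 0,
]

def apply_mask(grid, mask_id):
    """XOR every module with the mask condition; applying it twice restores the grid."""
    condition = MASKS[mask_id]
    return [[bit ^ int(condition(r, c)) for c, bit in enumerate(row)]
            for r, row in enumerate(grid)]

def run_penalty(grid):
    """Simplified score: 3 points per run of 5 same-colored modules, +1 per extra module."""
    def line_penalty(line):
        total, run = 0, 1
        for previous, current in zip(line, line[1:]):
            if current == previous:
                run += 1
            else:
                if run >= 5:
                    total += 3 + (run - 5)
                run = 1
        if run >= 5:
            total += 3 + (run - 5)
        return total

    columns = [list(column) for column in zip(*grid)]
    return sum(line_penalty(line) for line in grid + columns)

def pick_mask(grid):
    """Return the mask ID with the lowest simplified penalty (first one wins ties)."""
    return min(range(8), key=lambda m: run_penalty(apply_mask(grid, m)))

# Worst case for a scanner: a completely uniform block of light modules.
blank = [[0] * 8 for _ in range(8)]
best = pick_mask(blank)
print(best, run_penalty(blank), run_penalty(apply_mask(blank, best)))
```

Since masking is a plain XOR, the decoder undoes it by applying the same mask again, which is why storing the mask ID in the format information is sufficient.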
Why this engineering matters for developers
Understanding QR code structure matters for practical decisions. The tradeoff between error correction level and QR code size is real: choosing Level H for a logo overlay produces a significantly larger, denser code that requires a better camera and more careful framing to scan reliably. If your QR code will be printed large and scanned at close range, Level H makes sense. For a postage-stamp-sized code on a product label, Level M is the right compromise.
The quiet zone (the white border around the code, specified as at least four modules wide) is not decorative: scanners need it to locate the finder patterns. Crop it out and scans become unreliable or fail outright. These are not edge cases. They are the decisions that determine whether a printed QR code works in the field.
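One way to reason about the size tradeoff is to compute how large each module ends up once a code is printed at a fixed width. The sketch below assumes a four-module quiet zone on every side. The versions compared are arbitrary examples rather than the result of fitting a real payload, and the function name is hypothetical.

```python
QUIET_ZONE_MODULES = 4  # quiet zone width in modules on each side (assumed minimum)

def module_size_mm(version: int, printed_width_mm: float,
                   quiet_zone: int = QUIET_ZONE_MODULES) -> float:
    """Width of one module when the code plus its quiet zone fills printed_width_mm."""
    if not 1 <= version <= 40:
        raise ValueError("QR code versions run from 1 to 40")
    modules = 17 + 4 * version
    return printed_width_mm / (modules + 2 * quiet_zone)

# The same 20 mm label, with a payload that happens to need a higher version
# once a stronger error correction level is chosen (versions are illustrative).
for version in (3, 6):
    print(f"Version {version}: {module_size_mm(version, 20):.2f} mm per module")
```

Smaller modules demand more camera resolution and steadier framing, which is the practical cost of picking a higher error correction level on a small label.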