What Levenshtein distance measures
Levenshtein distance is the minimum number of single-character edits needed to transform one string into another. It answers questions like, "How many edits turn cat into cut?" and gives a numeric measure of similarity.
Allowed operations and costs
- Insert a character into the first string (cost 1).
- Delete a character from the first string (cost 1).
- Substitute one character for another (cost 0 if they match, else 1).
If you allow only these moves, the distance is the cheapest sequence of edits that turns string A into string B.
Why dynamic programming is used
Trying every possible edit sequence is expensive. Dynamic programming (DP) stores answers to smaller subproblems so each prefix pair is computed once, then reused.
DP matrix definition
Let dp[i][j] be the edit distance between the first i characters of a
and the first j characters of b. The final answer is dp[m][n] where
m = a.length and n = b.length.
| a \\ b | '' | c | u | t |
|---|---|---|---|---|
| '' | 0 | 1 | 2 | 3 |
| c | 1 | 0 | 1 | 2 |
| a | 2 | 1 | 1 | 2 |
| t | 3 | 2 | 2 | 1 |
Initialization rules
The first row and column compare a string with the empty prefix. It takes i deletions to turn a
length-i prefix into empty, and j insertions to build a length-j
prefix from empty.
Ready to see it? Jump to the simulator and press Start.
Recurrence formula
For each cell dp[i][j], pick the cheapest of deleting, inserting, or substituting the last
character. That minimum becomes the new value in the matrix.
dp[i][j] = min(
dp[i-1][j] + 1, // delete
dp[i][j-1] + 1, // insert
dp[i-1][j-1] + (a[i-1] == b[j-1] ? 0 : 1) // substitute
)
Run the simulator to see each dp[i][j] computed and highlighted in the grid.
Animated intuition
The next cell depends on three neighbors. The small animation below highlights the delete, insert, and substitute candidates feeding the minimum.
Debugger-style simulator
Change the inputs, then step through the fixed algorithm. The code listing does not change; only the data does.
Advanced controls
Developer checks
Run a small set of correctness checks in your browser console or by clicking the button below.
Simulator code (JavaScript)
Line highlights map to the steps shown in the matrix.
DP matrix
Current cell and dependencies are highlighted.
Watch + explanation
Key variables for the current step.
The simulator executes the JavaScript above while following the same algorithm as the C99 reference listing below.
Time and space complexity
The DP algorithm fills an (m + 1) x (n + 1) matrix. That is O(mn) time and
O(mn) space. There are faster variants for special cases, but the classic method is reliable and
easy to reason about.
Common pitfalls
- Off-by-one errors when mapping indices to characters.
- Forgetting to initialize the first row or column.
- Mixing up
i/jwith character positions. - Ignoring how empty strings behave (they are valid inputs).
Examples
cat -> cut has distance 1 (one substitution: a -> u).
kitten -> sitting has distance 3 (substitute k -> s, substitute e -> i,
insert g).
Use the Examples dropdown in the simulator to replay these cases.
Optional edit script
The simulator can show one valid sequence of edits by tracing backward through the matrix. Click "Show edits" once the final answer is reached.
Note: JavaScript strings are indexed by UTF-16 code units. Some emoji or accented characters count as two positions. This simulator follows that standard for consistency with the reference code.
C99 reference implementation
This listing is a clean C99 version of the same algorithm. The simulator uses JavaScript for execution so it can run in the browser, but the structure mirrors this reference.
C99 reference code (read-only)
#include#include size_t levenshtein(const char *a, const char *b) { size_t m = 0; size_t n = 0; while (a[m] != '\0') m++; while (b[n] != '\0') n++; size_t **dp = malloc((m + 1) * sizeof(size_t *)); for (size_t i = 0; i <= m; i++) { dp[i] = malloc((n + 1) * sizeof(size_t)); } dp[0][0] = 0; for (size_t i = 1; i <= m; i++) dp[i][0] = i; for (size_t j = 1; j <= n; j++) dp[0][j] = j; for (size_t i = 1; i <= m; i++) { for (size_t j = 1; j <= n; j++) { size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1; size_t del = dp[i - 1][j] + 1; size_t ins = dp[i][j - 1] + 1; size_t sub = dp[i - 1][j - 1] + cost; size_t min = del < ins ? del : ins; if (sub < min) min = sub; dp[i][j] = min; } } size_t result = dp[m][n]; for (size_t i = 0; i <= m; i++) free(dp[i]); free(dp); return result; }