Nine Chapter
  • Introduction
    • Summary
  • 1.Binary Search
    • Introduction
    • 458.Last position of target
    • 600.Smallest Rectangle Enclosing Black Pixels
    • 585.Maximum Number in Mountain Sequence
    • 183.Wood Cut
    • 62.Search in Rotated Sorted Array
    • 63.Search in Rotated Sorted Array II
    • 159.Find Minimum in Rotated Sorted Array
    • 160.Find Minimum in Rotated Sorted Array II
    • 75.Find Peak Element
    • 60.Search Insert Position
    • 28.Search a 2D Matrix
    • 240. Search a 2D Matrix II
    • 14.First Position of Target
    • 74.First Bad Version
    • 875. Koko Eating Bananas
    • 1011. Capacity To Ship Packages Within D Days (M)
    • 410. Split Array Largest Sum (H)
    • 475. Heaters (M)
    • 1044. Longest Duplicate Substring (H)
  • 2.Binary Tree
    • Summary
      • 二叉树八股文:递归改迭代
      • BST
      • Frame
    • 66.Binary Tree Preorder Traversal
    • 67.🌟Binary Tree Inorder Traversal
    • 145. Binary Tree Postorder Traversal (E)
    • 98.Validate Binary Search Tree(M)
    • 85.Insert Node in a Binary Search Tree
    • 104. Maximum Depth of Binary Tree(E)
    • 235. Lowest Common Ancestor of a Binary Search Tree (E)
    • 236.Lowest Common Ancestor of Binary Tree(M)
    • 578.Lowest Common Ancestor III
    • 1120.Subtree with Maximum Average
    • 596.Minimum Subtree
    • 480.Binary Tree Paths
    • 453.Flatten Binary Tree to Linked List
    • 110.Balanced Binary Tree
    • 376.Binary Tree Path Sum
    • 246.Binary Tree Path Sum II
    • 475.Binary Tree Maximum Path Sum II
    • 124.Binary Tree Maximum Path Sum (H)
    • Path Sum (*)
      • 112. Path Sum
      • 113. Path Sum II
      • 437. Path Sum III
    • 177.Convert Sorted Array to Binary Search Tree With Minimal Height
    • 7.Binary Tree Serialization
    • 72,73.Construct Binary Tree
    • Binary Search Tree Path
    • 245.Subtree
    • 469.Identical Binary Tree
    • 87.Remove Node in Binary Search Tree
    • 116.Populating Next Right Pointers in Each Node (M)
    • 114. Flatten Binary Tree to Linked List(M)
    • 654.Maximum Binary Tree (M)
    • 105. 🌟Construct Binary Tree from Preorder and Inorder Traversal (M)
    • 106. Construct Binary Tree from Inorder and Postorder Traversal (M)
    • 652. Find Duplicate Subtrees(M)
    • 230. Kth Smallest Element in a BST (M)
    • 538&1038. Convert BST to Greater Tree
    • 450. Delete Node in a BST (M)
    • 701. Insert into a Binary Search Tree (M)
    • 96. Unique Binary Search Trees
    • 95. Unique Binary Search Trees II (M)
    • 1373. Maximum Sum BST in Binary Tree (H)
    • 297. Serialize and Deserialize Binary Tree (H)
    • 222. Count Complete Tree Nodes (M)
    • 1120. Maximum Average Subtree
    • 341. Flatten Nested List Iterator
    • 333. Largest BST Subtree (M)
    • 543. Diameter of Binary Tree
    • Binary Tree Longest Consecutive Sequence(*)
      • 298.Binary Tree Longest Consecutive Sequence
      • 549. Binary Tree Longest Consecutive Sequence II (M)
  • 3.Breadth First Search
    • Introduction
      • BFS 算法解题套路框架
      • 双向 BFS 优化
    • 102.Binary Tree Level Order Traversal (M)
    • 103. Binary Tree Zigzag Level Order Traversal (M)
    • 107.Binary Tree Level Order Traversal II(M)
    • 618.Search Graph Nodes
    • 207.Course Schedule (M)
    • 210.Course Schedule II (M)
    • 611.Knight Shortest Path
    • 598.Zombie in Matrix
    • 133.Clone Graph (M)
    • 178.Graph Valid Tree
    • 7.Binary Tree Serialization
    • 574.Build Post Office
    • 573.Build Post Office II
    • 127.Topological Sorting
    • 127.Word Ladder
    • 126. Word Ladder II
    • (LeetCode)515.Find Largest Value in Each Tree Row
    • 111. Minimum Depth of Binary Tree (E)
    • 752. Open the Lock
    • 542. 01 Matrix (M)
    • 1306. Jump Game III (M)
  • 4.Depth First Search+BackTracking
    • Summary
      • FloodFill 算法
    • 136.Palindrome Partitioning
    • 39.Combination Sum
    • 40.Combination Sum II
    • 377. Combination Sum IV
    • 77.Combinations (M)
    • 78.Subsets (M)
    • 90.Subsets II (M)
    • 46.🌟Permutations
    • 47.Permutations II
    • 582.Word Break II
    • 490.The Maze (M)
    • 51.N-Queens (H)
    • 52. N-Queens II (H)
    • 698. Partition to K Equal Sum Subsets (M)
    • 22. Generate Parentheses (M)
    • 岛屿问题
      • 200.Number of Islands (M)
      • 1254. Number of Closed Islands (M)
      • 1020. Number of Enclaves (M)
      • 695. Max Area of Island (M)
      • 1905. Count Sub Islands (M)
      • 694. Number of Distinct Islands
    • 131. Palindrome Partitioning (M)
    • 967. Numbers With Same Consecutive Differences (M)
    • 79. Word Search (M)
    • 212. Word Search II (M)
    • 472. Concatenated Words (H)
    • Page 2
    • 291. Word Pattern II
    • 17. Letter Combinations of a Phone Number (M)
  • 5.LinkedList
    • Summary
      • 单链表的倒数第 k 个节点
      • Merge two/k sorted LinkedList
      • Middle of the Linked List
      • 判断链表是否包含环
      • 两个链表是否相交 Intersection of Two Linked Lists
      • 递归反转链表
      • 如何判断回文链表
    • 599.Insert into a Cyclic Sorted List
    • 21.Merge Two Sorted Lists (E)
    • 23.Merge k Sorted Lists (H)
    • 105.Copy List with Random Pointer
    • 141.Linked List Cycle (E)
    • 142.Linked List Cycle II (M)
    • 148.Sort List (M)
    • 86.Partition List (M)
    • 83.Remove Duplicates from Sorted List(E)
    • 82.Remove Duplicates from Sorted List II (M)
    • 206.Reverse Linked List (E)
    • 92.Reverse Linked List II (M)
    • 143.Reorder List (M)
    • 19.Remove Nth Node From End of List (E)
    • 170.Rotate List
    • 🤔25.Reverse Nodes in k-Group (H)
    • 452.Remove Linked List Elements
    • 167.Add Two Numbers
    • 221.Add Two Numbers II
    • 876. Middle of the Linked List (E)
    • 160. Intersection of Two Linked Lists (E)
    • 234. Palindrome Linked List (E)
    • 2130. Maximum Twin Sum of a Linked List (M)
  • 6.Array
    • Summary
      • 前缀和思路PrefixSum
      • 差分数组 Difference Array
      • 双指针Two Pointers
      • 滑动窗口算法算法
      • Sliding windows II
      • 二分搜索Binary Search
      • 排序算法
      • 快速选择算法
    • 604.Window Sum
    • 138.Subarray Sum
    • 41.Maximum Subarray
    • 42.Maximum Subarray II
    • 43.Maximum Subarray III
    • 620.Maximum Subarray IV
    • 621.Maximum Subarray V
    • 6.Merge Two Sorted Arrays
    • 88.Merge Sorted Array
    • 547.Intersection of Two Arrays
    • 548.Intersection of Two Arrays II
    • 139.Subarray Sum Closest
    • 65.Median of two Sorted Arrays
    • 636.132 Pattern
    • 402.Continuous Subarray Sum
    • 303. Range Sum Query - Immutable (E)
    • 304.Range Sum Query 2D - Immutable (M)
    • 560. Subarray Sum Equals K (M)
    • 370. Range Addition(M)
    • 1109. Corporate Flight Bookings(M)
    • 1094. Car Pooling (M)
    • 76. Minimum Window Substring(H)
    • 567. Permutation in String (M)
    • 438. Find All Anagrams in a String(M)
    • 3. Longest Substring Without Repeating Characters (M)
    • 380. Insert Delete GetRandom O(1) (M)
    • 710. Random Pick with Blacklist (H)
    • 528. Random Pick with Weight (M)
    • 26. Remove Duplicates from Sorted Array (E)
    • 27. Remove Element (E)
    • 283. Move Zeroes (E)
    • 659. Split Array into Consecutive Subsequences (M)
    • 4. Median of Two Sorted Arrays (H)
    • 48. Rotate Image (M)
    • 54. Spiral Matrix (M)
    • 59. Spiral Matrix II (M)
    • 918. Maximum Sum Circular Subarray
    • 128. Longest Consecutive Sequence (M)
    • 238. Product of Array Except Self (M)
    • 1438. Longest Continuous Subarray With Absolute Diff Less Than or Equal to Limit (M)
    • 1151. Minimum Swaps to Group All 1's Together (M)
    • 2134. Minimum Swaps to Group All 1's Together II
    • 2133. Check if Every Row and Column Contains All Numbers
    • 632. Smallest Range Covering Elements from K Lists (H)
    • 36. Valid Sudoku (M)
    • 383. Ransom Note
    • 228. Summary Ranges
  • 7.Two pointers
    • Summary
      • Two Sum
      • 2Sum 3Sum 4Sum 问题
    • 1.Two Sum I
    • 170.Two Sum III - Data structure design
    • 167.Two Sum II- Input array is sorted
    • 609.Two Sum - Less than or equal to target
    • 610.Two Sum - Difference equals to targe
    • 587.Two Sum - Unique pairs
    • 533.Two Sum - Closest to target
    • 443.Two Sum - Greater than target
    • 653. Two Sum IV - Input is a BST (M)
    • 57.3Sum
    • 59.3Sum Closest
    • 58.4Sum
    • 148.Sort Colors
    • 143.Sort Colors II
    • 31.Partition Array
    • 625.Partition Array II
    • 382.Triangle Count
      • 611. Valid Triangle Number
    • 521.Remove Duplicate Numbers in Array
    • 167. Two Sum II - Input Array Is Sorted (E)
    • 870. Advantage Shuffle (M)
    • 9. Palindrome Number (E)
    • 125. Valid Palindrome(E)
    • 5. Longest Palindromic Substring (M)
    • 42. Trapping Rain Water
    • 11. Container With Most Water (M)
    • 658. Find K Closest Elements (M)
    • 392. Is Subsequence
  • 8.Data Structure
    • Summary
      • 数据结构的存储方式
      • 单调栈
      • 单调队列
      • 二叉堆 Binary Heap
      • TreeMap
      • TreeSet
      • 🌟Trie
      • Trie Application
    • 155. Min Stack (E)
    • 716. Max Stack (E)
    • 1648. Sell Diminishing-Valued Colored Balls
    • 232. Implement Queue using Stacks (E)
    • 225. Implement Stack using Queues(E)
    • 84.Largest Rectangle in Histogram
    • 128.Hash Function
    • Max Tree
    • 544.Top k Largest Numbers
    • 545.Top k Largest Numbers II
    • 613.High Five
    • 606.Kth Largest Element II
    • 5.Kth Largest Element
    • 129.Rehashing
    • 4.Ugly Number II
    • 517.Ugly Number
    • 28. Implement strStr()
    • 594.strStr II
    • 146.LRU Cache
    • 460.LFU Cache
    • 486.Merge k Sorted Arrays
    • 130.Heapify
    • 215. Kth Largest Element in an Array (M)
    • 612.K Closest Points
    • 692. Top K Frequent Words
    • 347.Top K Frequent Elements
    • 601.Flatten 2D Vector
    • 540.Zigzag Iterator
    • 541.Zigzag Iterator II
    • 423.Valid Parentheses
    • 488.Happy Number
    • 547.Intersection of Two Arrays
    • 548.Intersection of Two Arrays II
    • 627.Longest Palindrome
    • 638.Strings Homomorphism
    • 138.Subarray Sum
    • 647.Substring Anagrams
    • 171.Anagrams
    • 739. Daily Temperatures(M)
    • 496. Next Greater Element I (E)
    • 503. Next Greater Element II(M)
    • 316. Remove Duplicate Letters(M) & 1081. Smallest Subsequence of Distinct Characters
    • 239. Sliding Window Maximum (H)
    • 355. Design Twitter (M)
    • 895. Maximum Frequency Stack (H)
    • 20. Valid Parentheses (E)
    • 921. Minimum Add to Make Parentheses Valid (M)
    • 1541. Minimum Insertions to Balance a Parentheses String (M)
    • 32. Longest Valid Parentheses (H)
    • Basic Calculator (*)
      • 224. Basic Calculator
      • 227. Basic Calculator II (M)
    • 844. Backspace String Compare
    • 295. Find Median from Data Stream
    • 208. Implement Trie (Prefix Tree)
    • 461.Kth Smallest Numbers in Unsorted Array
    • 1152.Analyze user website visit pattern
    • 811. Subdomain Visit Count (M)
    • 71. Simplify Path (M)
    • 362. Design Hit Counter
  • 9.Dynamic Programming
    • Summary
      • 最优子结构 Optimal Sustructure
      • 子序列解题模板
      • 空间压缩
      • 背包问题
        • Untitled
      • 股票买卖问题
      • KMP
    • 109.Triangle
    • 110.Minimum Path Sum
    • 114.Unique Paths
    • 115.Unique Paths II
    • 70.Climbing Stairs
    • 272.Climbing StairsII
    • 116.Jump Game
    • 117.Jump Game II
    • 322.Coin Change
    • 518. Coin Change 2 ()
    • Backpack I~VI
      • LintCode 563.Backpack V (M)
    • Best Time to Buy and Sell Stock(*)
      • 121. Best Time to Buy and Sell Stock
      • 122. Best Time to Buy and Sell Stock II (M)
      • 123. Best Time to Buy and Sell Stock III (H)
      • 188. Best Time to Buy and Sell Stock IV (H)
      • 309. Best Time to Buy and Sell Stock with Cooldown (M)
      • 714. Best Time to Buy and Sell Stock with Transaction Fee (M)
    • 394.Coins in a line
    • 395.Coins in a Line II
    • 509. Fibonacci Number (E)
    • 931. Minimum Falling Path Sum (M)
    • 494. Target Sum (M)
    • 72. Edit Distance (H)
    • 300.Longest Increasing Subsequence
    • 1143. Longest Common Subsequence (M)
    • 718. Maximum Length of Repeated Subarray
    • 583. Delete Operation for Two Strings (M)
    • 712. Minimum ASCII Delete Sum for Two Strings(M)
    • 53. Maximum Subarray (E)
    • 516. Longest Palindromic Subsequence (M)
    • 1312. Minimum Insertion Steps to Make a String Palindrome (H)
    • 416. Partition Equal Subset Sum (M)
    • 64. Minimum Path Sum(M)
    • 651. 4 Keys Keyboards (M)
    • House Robber (*)
      • 198. House Robber (M)
      • 213. House Robbber II
      • 337. House Robber III (M)
    • Word Break (*)
      • 139.Word Break (M)
    • 140. Word Break II (H)
    • 828. Count Unique Characters of All Substrings of a Given String (H)
    • 174. Dungeon Game (H)
    • 1567. Maximum Length of Subarray With Positive Product (M)
  • 10. Graph
    • Introduction
      • 有向图的环检测
      • 拓扑排序
      • 二分图判定
      • Union-Find
      • 最小生成树(Minimum Spanning Tree)算法
        • KRUSKAL 最小生成树算法
        • Prim 最小生成树算法
      • Dijkstra 最短路径算法
      • BFS vs DFS
    • 797. All Paths From Source to Target (M)
    • 785. Is Graph Bipartite? (M)
    • 886. Possible Bipartition (M)
    • 130. Surrounded Regions (M)
    • 990. Satisfiability of Equality Equations (M)
    • 721. Accounts Merge (M)
    • 323. Number of Connected Components in an Undirected Graph (M)
    • 261. Graph Valid Tree
    • 1135. Connecting Cities With Minimum Cost
    • 1584. Min Cost to Connect All Points (M)
    • 277. Find the Celebrity (M)
    • 743. Network Delay Time (M)
    • 1631. Path With Minimum Effort (M)
    • 1514. Path with Maximum Probability (M)
    • 589.Connecting Graph
    • 🌟787. Cheapest Flights Within K Stops (M)
    • 2050. Parallel Courses III (H)
    • 1293. Shortest Path in a Grid with Obstacles Elimination (H)
    • 864. Shortest Path to Get All Keys (H)
    • 269. Alien Dictionary (H)
    • 1192. Critical Connections in a Network (H)
    • 529. Minesweeper (M)
  • 11.Math
    • Page 1
Powered by GitBook
On this page

Was this helpful?

  1. 9.Dynamic Programming

72. Edit Distance (H)

https://leetcode.com/problems/edit-distance/

Given two strings word1 and word2, return the minimum number of operations required to convert word1 to word2.

You have the following three operations permitted on a word:

  • Insert a character

  • Delete a character

  • Replace a character

Example 1:

Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation: 
horse -> rorse (replace 'h' with 'r')
rorse -> rose (remove 'r')
rose -> ros (remove 'e')

Example 2:

Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation: 
intention -> inention (remove 't')
inention -> enention (replace 'i' with 'e')
enention -> exention (replace 'n' with 'x')
exention -> exection (replace 'n' with 'c')
exection -> execution (insert 'u')

Constraints:

  • 0 <= word1.length, word2.length <= 500

  • word1 and word2 consist of lowercase English letters.

Solution:

前几天看了一份鹅厂的面试题,算法部分大半是动态规划,最后一题就是写一个计算编辑距离的函数,今天就专门写一篇文章来探讨一下这个问题。

我个人很喜欢编辑距离这个问题,因为它看起来十分困难,解法却出奇得简单漂亮,而且它是少有的比较实用的算法(是的,我承认很多算法问题都不太实用)。下面先来看下题目:

为什么说这个问题难呢,因为显而易见,它就是难,让人手足无措,望而生畏。

为什么说它实用呢,因为前几天我就在日常生活中用到了这个算法。之前有一篇公众号文章由于疏忽,写错位了一段内容,我决定修改这部分内容让逻辑通顺。但是公众号文章最多只能修改 20 个字,且只支持增、删、替换操作(跟编辑距离问题一模一样),于是我就用算法求出了一个最优方案,只用了 16 步就完成了修改。

再比如高大上一点的应用,DNA 序列是由 A,G,C,T 组成的序列,可以类比成字符串。编辑距离可以衡量两个 DNA 序列的相似度,编辑距离越小,说明这两段 DNA 越相似,说不定这俩 DNA 的主人是远古近亲啥的。

下面言归正传,详细讲解一下编辑距离该怎么算,相信本文会让你有收获。

一、思路

编辑距离问题就是给我们两个字符串 s1 和 s2,只能用三种操作,让我们把 s1 变成 s2,求最少的操作数。需要明确的是,不管是把 s1 变成 s2 还是反过来,结果都是一样的,所以后文就以 s1 变成 s2 举例。

设两个字符串分别为 "rad" 和 "apple",为了把 s1 变成 s2,算法会这样进行:

请记住这个 GIF 过程,这样就能算出编辑距离。关键在于如何做出正确的操作,稍后会讲。

根据上面的 GIF,可以发现操作不只有三个,其实还有第四个操作,就是什么都不要做(skip)。比如这个情况:

因为这两个字符本来就相同,为了使编辑距离最小,显然不应该对它们有任何操作,直接往前移动 i,j 即可。

还有一个很容易处理的情况,就是 j 走完 s2 时,如果 i 还没走完 s1,那么只能用删除操作把 s1 缩短为 s2。比如这个情况:

类似的,如果 i 走完 s1 时 j 还没走完了 s2,那就只能用插入操作把 s2 剩下的字符全部插入 s1。等会会看到,这两种情况就是算法的 base case。

下面详解一下如何将思路转换成代码,坐稳,要发车了。

二、代码详解

先梳理一下之前的思路:

base case 是 i 走完 s1 或 j 走完 s2,可以直接返回另一个字符串剩下的长度。

对于每对儿字符 s1[i] 和 s2[j],可以有四种操作:

if s1[i] == s2[j]:
    啥都别做(skip)
    i, j 同时向前移动
else:
    三选一:
        插入(insert)
        删除(delete)
        替换(replace)

有这个框架,问题就已经解决了。读者也许会问,这个「三选一」到底该怎么选择呢?很简单,全试一遍,哪个操作最后得到的编辑距离最小,就选谁。这里需要递归技巧,理解需要点技巧,先看下代码:

def minDistance(s1, s2) -> int:
    # 定义:dp(i, j) 返回 s1[0..i] 和 s2[0..j] 的最小编辑距离
    def dp(i, j):
        # base case
        if i == -1: return j + 1
        if j == -1: return i + 1
        
        if s1[i] == s2[j]:
            return dp(i - 1, j - 1)  # 啥都不做
        else:
            return min(
                dp(i, j - 1) + 1,    # 插入
                dp(i - 1, j) + 1,    # 删除
                dp(i - 1, j - 1) + 1 # 替换
            )
    
    # i,j 初始化指向最后一个索引
    return dp(len(s1) - 1, len(s2) - 1)

下面来详细解释一下这段递归代码,base case 应该不用解释了,主要解释一下递归部分。

都说递归代码的可解释性很好,这是有道理的,只要理解函数的定义,就能很清楚地理解算法的逻辑。我们这里 dp(i, j) 函数的定义是这样的:

def dp(i, j) -> int
# 返回 s1[0..i] 和 s2[0..j] 的最小编辑距离

记住这个定义之后,先来看这段代码:

if s1[i] == s2[j]:
    return dp(i - 1, j - 1)  # 啥都不做
# 解释:
# 本来就相等,不需要任何操作
# s1[0..i] 和 s2[0..j] 的最小编辑距离等于
# s1[0..i-1] 和 s2[0..j-1] 的最小编辑距离
# 也就是说 dp(i, j) 等于 dp(i-1, j-1)

如果 s1[i]!=s2[j],就要对三个操作递归了,稍微需要点思考:

dp(i, j - 1) + 1,    # 插入
# 解释:
# 我直接在 s1[i] 插入一个和 s2[j] 一样的字符
# 那么 s2[j] 就被匹配了,前移 j,继续跟 i 对比
# 别忘了操作数加一
dp(i - 1, j) + 1,    # 删除
# 解释:
# 我直接把 s[i] 这个字符删掉
# 前移 i,继续跟 j 对比
# 操作数加一
dp(i - 1, j - 1) + 1 # 替换
# 解释:
# 我直接把 s1[i] 替换成 s2[j],这样它俩就匹配了
# 同时前移 i,j 继续对比
# 操作数加一

现在,你应该完全理解这段短小精悍的代码了。还有点小问题就是,这个解法是暴力解法,存在重叠子问题,需要用动态规划技巧来优化。

def dp(i, j):
    dp(i - 1, j - 1) #1
    dp(i, j - 1)     #2
    dp(i - 1, j)     #3

对于子问题 dp(i-1, j-1),如何通过原问题 dp(i, j) 得到呢?有不止一条路径,比如 dp(i, j) -> #1 和 dp(i, j) -> #2 -> #3。一旦发现一条重复路径,就说明存在巨量重复路径,也就是重叠子问题。

三、动态规划优化

备忘录很好加,原来的代码稍加修改即可:

def minDistance(s1, s2) -> int:

    memo = dict() # 备忘录
    def dp(i, j):
        if (i, j) in memo: 
            return memo[(i, j)]
        ...
        
        if s1[i] == s2[j]:
            memo[(i, j)] = ...  
        else:
            memo[(i, j)] = ...
        return memo[(i, j)]
    
    return dp(len(s1) - 1, len(s2) - 1)
    
    JAVA Version:
    class Solution {
    int[][] memo;
    public int minDistance(String word1, String word2) {
        memo = new int[word1.length()][word2.length()];
        return dp(word1, word2, word1.length()-1, word2.length()-1);
    }
    private int dp(String s1, String s2, int i, int j){
        if(i == -1) return j+1;
        if(j == -1) return i+1;
        if(memo[i][j] != 0) return memo[i][j];
        if(s1.charAt(i) == s2.charAt(j)){
            memo[i][j] = dp(s1,s2,i-1,j-1);
        }else{
            int a = dp(s1,s2,i,j-1) + 1;
            int b = dp(s1,s2,i-1,j) + 1;
            int c = dp(s1,s2,i-1,j-1) + 1;
            memo[i][j] = Math.min(a, Math.min(b, c));
        }
        return memo[i][j];
    }
}

主要说下 DP table 的解法:

首先明确 dp 数组的含义,dp 数组是一个二维数组,长这样:

有了之前递归解法的铺垫,应该很容易理解。dp[..][0] 和 dp[0][..] 对应 base case,dp[i][j] 的含义和之前的 dp 函数类似:

def dp(i, j) -> int
# 返回 s1[0..i] 和 s2[0..j] 的最小编辑距离

dp[i-1][j-1]
# 存储 s1[0..i] 和 s2[0..j] 的最小编辑距离

dp 函数的 base case 是 i, j 等于 -1,而数组索引至少是 0,所以 dp 数组会偏移一位。

既然 dp 数组和递归 dp 函数含义一样,也就可以直接套用之前的思路写代码,唯一不同的是,DP table 是自底向上求解,递归解法是自顶向下求解:

int minDistance(String s1, String s2) {
    int m = s1.length(), n = s2.length();
    // 定义:s1[0..i] 和 s2[0..j] 的最小编辑距离是 dp[i-1][j-1]
    int[][] dp = new int[m + 1][n + 1];
    // base case 
    for (int i = 1; i <= m; i++)
        dp[i][0] = i;
    for (int j = 1; j <= n; j++)
        dp[0][j] = j;
    // 自底向上求解
    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (s1.charAt(i-1) == s2.charAt(j-1)) {
                dp[i][j] = dp[i - 1][j - 1];
            } else {
                dp[i][j] = min(
                    dp[i - 1][j] + 1,
                    dp[i][j - 1] + 1,
                    dp[i - 1][j - 1] + 1
                );
            }
        }
    }
    // 储存着整个 s1 和 s2 的最小编辑距离
    return dp[m][n];
}

int min(int a, int b, int c) {
    return Math.min(a, Math.min(b, c));
}

三、扩展延伸

一般来说,处理两个字符串的动态规划问题,都是按本文的思路处理,建立 DP table。为什么呢,因为易于找出状态转移的关系,比如编辑距离的 DP table:

还有一个细节,既然每个 dp[i][j] 只和它附近的三个状态有关,空间复杂度是可以压缩成 O(min(M, N)) 的(M,N 是两个字符串的长度)。不难,但是可解释性大大降低,读者可以自己尝试优化一下。

你可能还会问,这里只求出了最小的编辑距离,那具体的操作是什么?你之前举的修改公众号文章的例子,只有一个最小编辑距离肯定不够,还得知道具体怎么修改才行。

这个其实很简单,代码稍加修改,给 dp 数组增加额外的信息即可:

// int[][] dp;
Node[][] dp;

class Node {
    int val;
    int choice;
    // 0 代表啥都不做
    // 1 代表插入
    // 2 代表删除
    // 3 代表替换
}

val 属性就是之前的 dp 数组的数值,choice 属性代表操作。在做最优选择时,顺便把操作记录下来,然后就从结果反推具体操作。

我们的最终结果不是 dp[m][n] 吗,这里的 val 存着最小编辑距离,choice 存着最后一个操作,比如说是插入操作,那么就可以左移一格:

重复此过程,可以一步步回到起点 dp[0][0],形成一条路径,按这条路径上的操作进行编辑,就是最佳方案。

_____________

Previous494. Target Sum (M)Next300.Longest Increasing Subsequence

Last updated 3 years ago

Was this helpful?

本文有视频版:

前文 说过,解决两个字符串的动态规划问题,一般都是用两个指针 i,j 分别指向两个字符串的最后,然后一步步往前走,缩小问题的规模。

怎么能一眼看出存在重叠子问题呢?前文 有提过,这里再简单提一下,需要抽象出本文算法的递归框架:

对于重叠子问题呢,前文 详细介绍过,优化方法无非是备忘录或者 DP table。

编辑距离详解动态规划
最长公共子序列
动态规划之正则表达式
动态规划详解