2060 - Check if an Original String Exists Given Two Encoded Strings

Original link: https://leetcode-cn.com/problems/check-if-an-original-string-exists-given-two-encoded-strings

English original

An original string, consisting of lowercase English letters, can be encoded by the following steps:

Arbitrarily split it into a sequence of some number of non-empty substrings.
Arbitrarily choose some elements (possibly none) of the sequence, and replace each with its length (as a numeric string).
Concatenate the sequence as the encoded string.

For example, one way to encode an original string "abcdefghijklmnop" might be:

Split it as a sequence: ["ab", "cdefghijklmn", "o", "p"].
Choose the second and third elements to be replaced by their lengths, respectively. The sequence becomes ["ab", "12", "1", "p"].
Concatenate the elements of the sequence to get the encoded string: "ab121p".
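
The three encoding steps above can be sketched as a small C++ function. This is only an illustration: the name `encode` and its two parameters (the chosen split and a flag per piece saying whether to replace it by its length) are assumptions made for this example, not part of the problem's interface.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the problem's API): concatenates the pieces
// of a split, writing piece k as its decimal length when replaceWithLength[k]
// is true and verbatim otherwise.
std::string encode(const std::vector<std::string>& parts,
                   const std::vector<bool>& replaceWithLength) {
    std::string encoded;
    for (std::size_t k = 0; k < parts.size(); ++k) {
        if (replaceWithLength[k]) {
            encoded += std::to_string(parts[k].size());
        } else {
            encoded += parts[k];
        }
    }
    return encoded;
}
```

With the split `{"ab", "cdefghijklmn", "o", "p"}` and flags `{false, true, true, false}`, this returns `"ab121p"`, matching the example.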

Given two encoded strings s1 and s2, consisting of lowercase English letters and digits 1-9 (inclusive), return true if there exists an original string that could be encoded as both s1 and s2. Otherwise, return false.

Note: The test cases are generated such that the number of consecutive digits in s1 and s2 does not exceed 3.

Example 1:

Input: s1 = "internationalization", s2 = "i18n"
Output: true
Explanation: It is possible that "internationalization" was the original string.
- "internationalization" 
  -> Split:       ["internationalization"]
  -> Do not replace any element
  -> Concatenate:  "internationalization", which is s1.
- "internationalization"
  -> Split:       ["i", "nternationalizatio", "n"]
  -> Replace:     ["i", "18",                 "n"]
  -> Concatenate:  "i18n", which is s2.

Example 2:

Input: s1 = "l123e", s2 = "44"
Output: true
Explanation: It is possible that "leetcode" was the original string.
- "leetcode" 
  -> Split:      ["l", "e", "et", "cod", "e"]
  -> Replace:    ["l", "1", "2",  "3",   "e"]
  -> Concatenate: "l123e", which is s1.
- "leetcode" 
  -> Split:      ["leet", "code"]
  -> Replace:    ["4",    "4"]
  -> Concatenate: "44", which is s2.
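
Example 2 relies on the fact that a run of digits is ambiguous: "123" may be one number or several adjacent ones. A minimal sketch of that ambiguity, assuming a hypothetical helper named `possibleLengths` (introduced here for illustration only), enumerates every total length a digit run can stand for:

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper: returns every total length that a run of digits can
// represent, over all ways of cutting the run into consecutive numbers.
std::set<int> possibleLengths(const std::string& digits) {
    std::set<int> totals;
    if (digits.empty()) {
        totals.insert(0);  // nothing left to read contributes length 0
        return totals;
    }
    int value = 0;
    for (std::size_t cut = 1; cut <= digits.size(); ++cut) {
        // value is the number formed by the first `cut` digits.
        value = value * 10 + (digits[cut - 1] - '0');
        for (int rest : possibleLengths(digits.substr(cut))) {
            totals.insert(value + rest);
        }
    }
    return totals;
}
```

For "123" this yields {6, 15, 24, 123}, and for "44" it yields {8, 44}. In Example 2, s1 uses 1 + 2 + 3 = 6 plus its two letters, and s2 uses 4 + 4, so both describe a string of length 8.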

Example 3:

Input: s1 = "a5b", s2 = "c5b"
Output: false
Explanation: It is impossible.
- The original string encoded as s1 must start with the letter 'a'.
- The original string encoded as s2 must start with the letter 'c'.

Example 4:

Input: s1 = "112s", s2 = "g841"
Output: true
Explanation: It is possible that "gaaaaaaaaaaaas" was the original string.
- "gaaaaaaaaaaaas"
  -> Split:      ["g", "aaaaaaaaaaaa", "s"]
  -> Replace:    ["1", "12",           "s"]
  -> Concatenate: "112s", which is s1.
- "gaaaaaaaaaaaas"
  -> Split:      ["g", "aaaaaaaa", "aaaa", "s"]
  -> Replace:    ["g", "8",        "4",    "1"]
  -> Concatenate: "g841", which is s2.

Example 5:

Input: s1 = "ab", s2 = "a2"
Output: false
Explanation: It is impossible.
- The original string encoded as s1 has two letters.
- The original string encoded as s2 has three letters.

Constraints:

1 <= s1.length, s2.length <= 40
s1 and s2 consist of digits 1-9 (inclusive), and lowercase English letters only.
The number of consecutive digits in s1 and s2 does not exceed 3.

Top-voted solution

Method 1: Dynamic programming

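Let $dp[i][j]$ be the set of possible length differences $\Delta$ once the first $i$ characters of $s_1$ and the first $j$ characters of $s_2$ have been matched without conflict. Here $\Delta$ is the length decoded from the $s_1$ prefix minus the length decoded from the $s_2$ prefix.

The transitions are as follows:

$dp[i][j]\rightarrow dp[p][j]$, $\Delta\rightarrow\Delta+s_1[i][p]$, where $s_1[i][p]$ denotes the substring of $s_1$ from index $i$ up to but not including $p$, which must be a number. We additionally require $\Delta\le0$ to avoid exploring the same case twice.
$dp[i][j]\rightarrow dp[i][q]$, $\Delta\rightarrow\Delta-s_2[j][q]$, where $s_2[j][q]$ is defined in the same way and must be a number. We additionally require $\Delta\ge0$ to avoid exploring the same case twice.
$dp[i][j]\rightarrow dp[i+1][j]$, $\Delta\rightarrow\Delta+1$, which requires $s_1[i]$ to be a letter and $\Delta<0$, so that this letter can be absorbed by the surplus length already decoded from $s_2$.
$dp[i][j]\rightarrow dp[i][j+1]$, $\Delta\rightarrow\Delta-1$, which requires $s_2[j]$ to be a letter and $\Delta>0$, so that this letter can be absorbed by the surplus length already decoded from $s_1$.
$dp[i][j]\rightarrow dp[i+1][j+1]$, $\Delta\rightarrow\Delta$, which requires $s_1[i]=s_2[j]$ with both being letters, and $\Delta=0$.

Finally, we check whether $dp[N][M]$ contains $0$, where $N$ and $M$ are the lengths of $s_1$ and $s_2$.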

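Time complexity: $\mathcal{O}(NMD\cdot 10^D)$, where $D$ is the maximum length of a run of consecutive digits; in this problem $D=3$. $D$ bounds the length difference to the range $(-10^D, 10^D)$: a digit run is always preceded by a letter (or sits at the start of the string), and by the transition rules a letter is consumed only when the decoded length of its own string is less than or equal to that of the other string, so $D$ consecutive digits can make the current string at most $10^D-1$ longer than the other.
Space complexity: $\mathcal{O}(NM\cdot 10^D)$.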
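
Since $\Delta$ always stays strictly between $-10^D$ and $10^D$, the set stored in each state can also be kept in a fixed-size bit array, which makes the $\mathcal{O}(NM\cdot 10^D)$ space bound explicit. The following is a sketch under the assumption $D=3$; the names `possiblyEqualsBounded`, `kOffset` and `DeltaSet` are made up for this example and are not from the original solution.

```cpp
#include <algorithm>
#include <bitset>
#include <cctype>
#include <string>
#include <vector>

// Sketch assuming every digit run has at most 3 digits, so that the length
// difference delta always satisfies -1000 < delta < 1000.
bool possiblyEqualsBounded(const std::string& s1, const std::string& s2) {
    constexpr int kOffset = 1000;  // bit (delta + kOffset) represents delta
    using DeltaSet = std::bitset<2 * kOffset + 1>;
    const int n = static_cast<int>(s1.size());
    const int m = static_cast<int>(s2.size());
    auto isDigit = [](char ch) {
        return std::isdigit(static_cast<unsigned char>(ch)) != 0;
    };

    std::vector<std::vector<DeltaSet>> dp(n + 1, std::vector<DeltaSet>(m + 1));
    dp[0][0].set(kOffset);  // delta = 0 before anything is consumed

    for (int i = 0; i <= n; ++i) {
        for (int j = 0; j <= m; ++j) {
            for (int delta = -kOffset + 1; delta < kOffset; ++delta) {
                if (!dp[i][j].test(delta + kOffset)) continue;
                if (delta <= 0) {  // read a number from s1
                    int num = 0;
                    for (int p = i; p < std::min(i + 3, n) && isDigit(s1[p]); ++p) {
                        num = num * 10 + (s1[p] - '0');
                        dp[p + 1][j].set(delta + num + kOffset);
                    }
                }
                if (delta >= 0) {  // read a number from s2
                    int num = 0;
                    for (int q = j; q < std::min(j + 3, m) && isDigit(s2[q]); ++q) {
                        num = num * 10 + (s2[q] - '0');
                        dp[i][q + 1].set(delta - num + kOffset);
                    }
                }
                // A letter of s1 is absorbed by surplus length from s2.
                if (i < n && delta < 0 && !isDigit(s1[i]))
                    dp[i + 1][j].set(delta + 1 + kOffset);
                // A letter of s2 is absorbed by surplus length from s1.
                if (j < m && delta > 0 && !isDigit(s2[j]))
                    dp[i][j + 1].set(delta - 1 + kOffset);
                // Equal letters are matched directly when the lengths agree.
                if (i < n && j < m && delta == 0 && !isDigit(s1[i]) && s1[i] == s2[j])
                    dp[i + 1][j + 1].set(kOffset);
            }
        }
    }
    return dp[n][m].test(kOffset);
}
```

The trade-off is that every state scans all 1999 candidate values of $\Delta$ even when only a few are reachable, in exchange for avoiding hash-set overhead.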

Reference code (C++)

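#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>
using namespace std;

class Solution {
    // True if ch is a decimal digit.
    bool isdigit(char ch) {
        return ch >= '0' && ch <= '9';
    }
public:
    bool possiblyEquals(string s1, string s2) {
        int n = s1.size(), m = s2.size();
        // dp[i][j]: reachable values of delta after consuming s1[0..i) and s2[0..j),
        // where delta = (length decoded from s1) - (length decoded from s2).
        vector<vector<unordered_set<int>>> dp(n + 1, vector<unordered_set<int>>(m + 1));
        dp[0][0].emplace(0);

        for (int i = 0; i <= n; ++i) {
            for (int j = 0; j <= m; ++j) {
                for (int delta : dp[i][j]) {
                    // Read a number of up to 3 digits from s1 (only when delta <= 0).
                    int num = 0;
                    if (delta <= 0) {
                        for (int p = i; p < min(i + 3, n); ++p) {
                            if (isdigit(s1[p])) {
                                num = num * 10 + s1[p] - '0';
                                dp[p + 1][j].emplace(delta + num);
                            } else {
                                break;
                            }
                        }
                    }

                    // Read a number of up to 3 digits from s2 (only when delta >= 0).
                    num = 0;
                    if (delta >= 0) {
                        for (int q = j; q < min(j + 3, m); ++q) {
                            if (isdigit(s2[q])) {
                                num = num * 10 + s2[q] - '0';
                                dp[i][q + 1].emplace(delta - num);
                            } else {
                                break;
                            }
                        }
                    }

                    // A letter of s1 is absorbed by the surplus length from s2.
                    if (i < n && delta < 0 && !isdigit(s1[i]))
                        dp[i + 1][j].emplace(delta + 1);

                    // A letter of s2 is absorbed by the surplus length from s1.
                    if (j < m && delta > 0 && !isdigit(s2[j]))
                        dp[i][j + 1].emplace(delta - 1);

                    // Equal letters are matched directly when the lengths agree.
                    if (i < n && j < m && delta == 0 && !isdigit(s1[i]) && s1[i] == s2[j])
                        dp[i + 1][j + 1].emplace(0);
                }
            }
        }

        // Both strings are fully consumed with equal decoded lengths.
        return dp[n][m].count(0) > 0;
    }
};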

Statistics

Accepted	Submissions	Acceptance rate
936	2615	35.8%

凤起丶宛南
