加载中...
2060-同源字符串检测(Check if an Original String Exists Given Two Encoded Strings)
发表于:2021-12-03 | 分类: 困难
字数统计: 2k | 阅读时长: 9分钟 | 阅读量:

原文链接: https://leetcode-cn.com/problems/check-if-an-original-string-exists-given-two-encoded-strings

英文原文

An original string, consisting of lowercase English letters, can be encoded by the following steps:

  • Arbitrarily split it into a sequence of some number of non-empty substrings.
  • Arbitrarily choose some elements (possibly none) of the sequence, and replace each with its length (as a numeric string).
  • Concatenate the sequence as the encoded string.

For example, one way to encode an original string "abcdefghijklmnop" might be:

  • Split it as a sequence: ["ab", "cdefghijklmn", "o", "p"].
  • Choose the second and third elements to be replaced by their lengths, respectively. The sequence becomes ["ab", "12", "1", "p"].
  • Concatenate the elements of the sequence to get the encoded string: "ab121p".

Given two encoded strings s1 and s2, consisting of lowercase English letters and digits 1-9 (inclusive), return true if there exists an original string that could be encoded as both s1 and s2. Otherwise, return false.

Note: The test cases are generated such that the number of consecutive digits in s1 and s2 does not exceed 3.

 

Example 1:

Input: s1 = "internationalization", s2 = "i18n"
Output: true
Explanation: It is possible that "internationalization" was the original string.
- "internationalization" 
  -> Split:       ["internationalization"]
  -> Do not replace any element
  -> Concatenate:  "internationalization", which is s1.
- "internationalization"
  -> Split:       ["i", "nternationalizatio", "n"]
  -> Replace:     ["i", "18",                 "n"]
  -> Concatenate:  "i18n", which is s2

Example 2:

Input: s1 = "l123e", s2 = "44"
Output: true
Explanation: It is possible that "leetcode" was the original string.
- "leetcode" 
  -> Split:      ["l", "e", "et", "cod", "e"]
  -> Replace:    ["l", "1", "2",  "3",   "e"]
  -> Concatenate: "l123e", which is s1.
- "leetcode" 
  -> Split:      ["leet", "code"]
  -> Replace:    ["4",    "4"]
  -> Concatenate: "44", which is s2.

Example 3:

Input: s1 = "a5b", s2 = "c5b"
Output: false
Explanation: It is impossible.
- The original string encoded as s1 must start with the letter 'a'.
- The original string encoded as s2 must start with the letter 'c'.

Example 4:

Input: s1 = "112s", s2 = "g841"
Output: true
Explanation: It is possible that "gaaaaaaaaaaaas" was the original string
- "gaaaaaaaaaaaas"
  -> Split:      ["g", "aaaaaaaaaaaa", "s"]
  -> Replace:    ["1", "12",           "s"]
  -> Concatenate: "112s", which is s1.
- "gaaaaaaaaaaaas"
  -> Split:      ["g", "aaaaaaaa", "aaaa", "s"]
  -> Replace:    ["g", "8",        "4",    "1"]
  -> Concatenate: "g841", which is s2.

Example 5:

Input: s1 = "ab", s2 = "a2"
Output: false
Explanation: It is impossible.
- The original string encoded as s1 has two letters.
- The original string encoded as s2 has three letters.

 

Constraints:

  • 1 <= s1.length, s2.length <= 40
  • s1 and s2 consist of digits 1-9 (inclusive), and lowercase English letters only.
  • The number of consecutive digits in s1 and s2 does not exceed 3.

中文题目

原字符串由小写字母组成,可以按下述步骤编码:

  • 任意将其 分割 为由若干 非空 子字符串组成的一个 序列
  • 任意选择序列中的一些元素(也可能不选择),然后将这些元素替换为元素各自的长度(作为一个数字型的字符串)。
  • 重新 顺次连接 序列,得到编码后的字符串。

例如,编码 "abcdefghijklmnop" 的一种方法可以描述为:

  • 将原字符串分割得到一个序列:["ab", "cdefghijklmn", "o", "p"]
  • 选出其中第二个和第三个元素并分别替换为它们自身的长度。序列变为 ["ab", "12", "1", "p"]
  • 重新顺次连接序列中的元素,得到编码后的字符串:"ab121p"

给你两个编码后的字符串 s1s2 ,由小写英文字母和数字 1-9 组成。如果存在能够同时编码得到 s1s2 原字符串,返回 true ;否则,返回 false

注意:生成的测试用例满足 s1s2 中连续数字数不超过 3

 

示例 1:

输入:s1 = "internationalization", s2 = "i18n"
输出:true
解释:"internationalization" 可以作为原字符串
- "internationalization" 
  -> 分割:      ["internationalization"]
  -> 不替换任何元素
  -> 连接:      "internationalization",得到 s1
- "internationalization"
  -> 分割:      ["i", "nternationalizatio", "n"]
  -> 替换:      ["i", "18",                 "n"]
  -> 连接:      "i18n",得到 s2

示例 2:

输入:s1 = "l123e", s2 = "44"
输出:true
解释:"leetcode" 可以作为原字符串
- "leetcode" 
  -> 分割:       ["l", "e", "et", "cod", "e"]
  -> 替换:       ["l", "1", "2",  "3",   "e"]
  -> 连接:       "l123e",得到 s1
- "leetcode" 
  -> 分割:       ["leet", "code"]
  -> 替换:       ["4",    "4"]
  -> 连接:       "44",得到 s2

示例 3:

输入:s1 = "a5b", s2 = "c5b"
输出:false
解释:不存在这样的原字符串
- 编码为 s1 的字符串必须以字母 'a' 开头
- 编码为 s2 的字符串必须以字母 'c' 开头

示例 4:

输入:s1 = "112s", s2 = "g841"
输出:true
解释:"gaaaaaaaaaaaas" 可以作为原字符串
- "gaaaaaaaaaaaas"
  -> 分割:       ["g", "aaaaaaaaaaaa", "s"]
  -> 替换:       ["1", "12",           "s"]
  -> 连接:       "112s",得到 s1
- "gaaaaaaaaaaaas"
  -> 分割:       ["g", "aaaaaaaa", "aaaa", "s"]
  -> 替换:       ["g", "8",        "4",    "1"]
  -> 连接         "g841",得到 s2

示例 5:

输入:s1 = "ab", s2 = "a2"
输出:false
解释:不存在这样的原字符串
- 编码为 s1 的字符串由两个字母组成
- 编码为 s2 的字符串由三个字母组成

 

提示:

  • 1 <= s1.length, s2.length <= 40
  • s1s2 仅由数字 1-9 和小写英文字母组成
  • s1s2 中连续数字数不超过 3

通过代码

高赞题解

方法一:动态规划

我们用$dp[i][j]$表示将$s_1$的前$i$个字母和$s_2$的前$j$个字母匹配且不发生冲突时,可能的长度差值。

可以看到,存在以下的转移:

  • $dp[i][j]\rightarrow dp[p][j]$,$\Delta\rightarrow\Delta+s_1[i][p]$,这要求$s_1[i][p]$是一个数字。这里我们额外限定$\Delta\le0$,以避免重复讨论。
  • $dp[i][j]\rightarrow dp[i][q]$,$\Delta\rightarrow\Delta-s_2[j][q]$,这要求$s_2[j][q]$是一个数字。这里我们额外限定$\Delta\ge0$,以避免重复讨论。
  • $dp[i][j]\rightarrow dp[i+1][j]$,$\Delta\rightarrow\Delta+1$,这要求$s_1[i]$是一个字母,并且$\Delta<0$,从而保证这个字母可以被$s_2$的剩余长度匹配掉。
  • $dp[i][j]\rightarrow dp[i][j+1]$,$\Delta\rightarrow\Delta-1$,这要求$s_2[j]$是一个字母,并且$\Delta>0$,从而保证这个字母可以被$s_1$的剩余长度匹配掉。
  • $dp[i][j]\rightarrow dp[i+1][j+1]$,$\Delta\rightarrow\Delta$,这要求$s_1[i]=s_2[j]$且都为字母,并且$\Delta=0$。

最后,我们检查$dp[N][M]$是否包含$0$即可。

  • 时间复杂度$\mathcal{O}(NMD\cdot 10^D)$。其中$D$表示连续数字串的最长长度,本题中$D=3$。$D$决定了长度差的取值范围为$(-10^D, 10^D)$,这是因为连续的数字串前面至少有一个字母(或为字符串串首),而由我们的转移规则可知,字母只有在串的长度小于等于另一个串时才会被用于匹配,因此连续$D$个数字至多使得当前字符串比另一字符串长$10^D-1$。
  • 空间复杂度$\mathcal{O}(NM\cdot 10^D)$。

参考代码(C++)

class Solution {
    bool isdigit(char ch) {
        return ch >= '0' && ch <= '9';
    }
public:
    bool possiblyEquals(string s1, string s2) {
        int n = s1.size(), m = s2.size();
        vector<vector<unordered_set<int>>> dp(n + 1, vector<unordered_set<int>>(m + 1));
        dp[0][0].emplace(0);
                
        for (int i = 0; i <= n; ++i) {
            for (int j = 0; j <= m; ++j) {
                for (int delta : dp[i][j]) {
                    int num = 0;
                    if (delta <= 0) {
                        for (int p = i; p < min(i + 3, n); ++p) {
                            if (isdigit(s1[p])) {
                                num = num * 10 + s1[p] - '0';
                                dp[p + 1][j].emplace(delta + num);
                            } else {
                                break;
                            }
                        }
                    }
                    
                    num = 0;
                    if (delta >= 0) {
                        for (int q = j; q < min(j + 3, m); ++q) {
                            if (isdigit(s2[q])) {
                                num = num * 10 + s2[q] - '0';
                                dp[i][q + 1].emplace(delta - num);
                            } else {
                                break;
                            }
                        }
                    }
                    
                    if (i < n && delta < 0 && !isdigit(s1[i])) 
                        dp[i + 1][j].emplace(delta + 1);
                            
                    if (j < m && delta > 0 && !isdigit(s2[j])) 
                        dp[i][j + 1].emplace(delta - 1);
                            
                    if (i < n && j < m && delta == 0 && s1[i] == s2[j])
                        dp[i + 1][j + 1].emplace(0);
                }
            }
        }
        
        return dp[n][m].count(0);
    }
};

统计信息

通过次数 提交次数 AC比率
936 2615 35.8%

提交历史

提交时间 提交结果 执行时间 内存消耗 语言
上一篇:
2059-转化数字的最小运算数(Minimum Operations to Convert Number)
下一篇:
2062-统计字符串中的元音子字符串(Count Vowel Substrings of a String)
本文目录
本文目录