Algorithm
Problem Name: 591. Tag Validator
Problem Link: https://leetcode.com/problems/tag-validator/
Given a string representing a code snippet, implement a tag validator to parse the code and return whether it is valid.
A code snippet is valid if all the following rules hold:
- The code must be wrapped in a valid closed tag. Otherwise, the code is invalid.
- A closed tag (not necessarily valid) has exactly the following format: <TAG_NAME>TAG_CONTENT</TAG_NAME>. Here, <TAG_NAME> is the start tag and </TAG_NAME> is the end tag. The TAG_NAME in the start and end tags must be the same. A closed tag is valid if and only if both the TAG_NAME and the TAG_CONTENT are valid.
- A valid TAG_NAME contains only upper-case letters and has a length in the range [1,9]. Otherwise, the TAG_NAME is invalid.
- A valid TAG_CONTENT may contain other valid closed tags, cdata, and any characters (see note1) EXCEPT an unmatched <, unmatched start and end tags, and unmatched or closed tags with an invalid TAG_NAME. Otherwise, the TAG_CONTENT is invalid.
- A start tag is unmatched if no end tag exists with the same TAG_NAME, and vice versa. However, you also need to consider unbalanced tags when tags are nested.
- A < is unmatched if you cannot find a subsequent >. When you find a < or </, all the subsequent characters until the next > should be parsed as the TAG_NAME (not necessarily valid).
- The cdata has the following format: <![CDATA[CDATA_CONTENT]]>. The range of CDATA_CONTENT is defined as the characters between <![CDATA[ and the first subsequent ]]>.
- CDATA_CONTENT may contain any characters. The purpose of cdata is to stop the validator from parsing CDATA_CONTENT, so even if it contains characters that could be parsed as a tag (valid or invalid), you should treat them as regular characters.
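The TAG_NAME rule can be checked in isolation. The following is a minimal sketch, assuming a hypothetical helper named is_valid_tag_name (the name is not part of the problem statement) and assuming that "upper-case letters" means the ASCII range A to Z:

```python
import re

# Assumption: "upper-case letters" means ASCII 'A'..'Z'; length must be 1 to 9.
_TAG_NAME_PATTERN = re.compile(r"[A-Z]{1,9}")


def is_valid_tag_name(name: str) -> bool:
    """Hypothetical helper: True if name is 1 to 9 upper-case ASCII letters."""
    return _TAG_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_tag_name("DIV"))         # a valid name
print(is_valid_tag_name("div"))         # lower-case letters are rejected
print(is_valid_tag_name("ABCDEFGHIJ"))  # 10 letters is too long
```

Using fullmatch rather than match means the whole candidate name has to satisfy the pattern, so names with trailing digits or extra characters are rejected.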
Example 1:
Input: code = "<DIV>This is the first line <![CDATA[<div>]]></DIV>"
Output: true
Explanation:
The code is wrapped in a closed tag: <DIV> and </DIV>.
The TAG_NAME is valid, and the TAG_CONTENT consists of some characters and cdata.
Although CDATA_CONTENT has an unmatched start tag with an invalid TAG_NAME, it should be considered plain text, not parsed as a tag.
So the TAG_CONTENT is valid, and therefore the code is valid. Thus return true.
Example 2:
Input: code = "<DIV>>> ![cdata[]] <![CDATA[<div>]>]]>]]>>]</DIV>"
Output: true
Explanation:
We first separate the code into: start_tag|tag_content|end_tag.
start_tag -> "<DIV>"
end_tag -> "</DIV>"
tag_content can also be separated into: text1|cdata|text2.
text1 -> ">> ![cdata[]] "
cdata -> "<![CDATA[<div>]>]]>", where the CDATA_CONTENT is "<div>]>"
text2 -> "]]>>]"
The start_tag is NOT "<DIV>>>" because of rule 6.
The cdata is NOT "<![CDATA[<div>]>]]>]]>" because of rule 7.
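The text1|cdata|text2 split in this example follows from searching for the first ]]> after the CDATA opener. Here is a small sketch of that idea; the function name split_first_cdata and the constants are hypothetical and only meant to illustrate the rule, not a complete validator:

```python
CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"


def split_first_cdata(content: str):
    """Hypothetical helper: split content around its first complete CDATA section.

    Returns (text_before, cdata_content, text_after), or None if there is
    no CDATA opener or no closing ]]> after it.
    """
    start = content.find(CDATA_OPEN)
    if start < 0:
        return None
    body_start = start + len(CDATA_OPEN)
    # The CDATA section ends at the FIRST ]]> after the opener.
    end = content.find(CDATA_CLOSE, body_start)
    if end < 0:
        return None
    return (content[:start], content[body_start:end], content[end + len(CDATA_CLOSE):])


tag_content = ">> ![cdata[]] <![CDATA[<div>]>]]>]]>>]"
text1, cdata_content, text2 = split_first_cdata(tag_content)
print(repr(text1))          # '>> ![cdata[]] '
print(repr(cdata_content))  # '<div>]>'
print(repr(text2))          # ']]>>]'
```

The lower-case ![cdata[ is not preceded by < and does not match the opener, so it stays in text1 as ordinary characters.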
Example 3:
Input: code = "<A> <B> </A> </B>"
Output: false
Explanation: Unbalanced. If "<A>" is closed, then "<B>" must be unmatched, and vice versa.
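The unbalanced case is the usual reason to track open tags with a stack. As a rough sketch, assuming the tags have already been extracted into a list (the function name tags_balanced and this input shape are illustrative assumptions, and name validity is not checked here):

```python
def tags_balanced(tags):
    """Hypothetical helper: check nesting of already-extracted tags.

    tags is a list such as ["<A>", "<B>", "</B>", "</A>"].
    """
    stack = []  # names of the tags that are currently open
    for tag in tags:
        if tag.startswith("</"):
            name = tag[2:-1]
            # An end tag must close the most recently opened tag.
            if not stack or stack.pop() != name:
                return False
        else:
            stack.append(tag[1:-1])
    # Every start tag must have been closed.
    return not stack


print(tags_balanced(["<A>", "<B>", "</A>", "</B>"]))  # False
print(tags_balanced(["<A>", "<B>", "</B>", "</A>"]))  # True
```

In the first call, </A> arrives while B is on top of the stack, so the mismatch is detected immediately.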
Constraints:
1 <= code.length <= 500
code consists of English letters, digits, '<', '>', '/', '!', '[', ']', '.', and ' '.
Code Examples
#1 Code Example with Javascript Programming
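const isValid = function (code) {
  const stack = [] // names of the tags that are currently open
  const [A, Z] = ['A', 'Z'].map((e) => e.charCodeAt(0))
  for (let i = 0; i < code.length; ) {
    // Past the first character, an empty stack means the outermost tag has already closed
    if (i > 0 && stack.length === 0) return false
    if (code.startsWith('<![CDATA[', i)) {
      // cdata is only allowed inside a tag; skip to just past the first ']]>'
      if (stack.length === 0) return false
      let j = i + 9
      i = code.indexOf(']]>', j)
      if (i < 0) return false
      i += 3
    } else if (code.startsWith('</', i)) {
      // End tag: the name must be 1 to 9 upper-case letters and match the latest open tag
      let j = i + 2
      i = code.indexOf('>', j)
      if (i < 0 || i === j || i - j > 9) return false
      for (let k = j; k < i; k++) {
        if (
          code.charAt(k) !== code[k].toUpperCase() ||
          !(code.charCodeAt(k) >= A && code.charCodeAt(k) <= Z)
        )
          return false
      }
      let s = code.slice(j, i++)
      if (stack.length === 0 || stack.pop() !== s) return false
    } else if (code.startsWith('<', i)) {
      // Start tag: validate the name, then push it onto the stack
      let j = i + 1
      i = code.indexOf('>', j)
      if (i < 0 || i === j || i - j > 9) return false
      for (let k = j; k < i; k++) {
        if (
          code.charAt(k) !== code[k].toUpperCase() ||
          !(code.charCodeAt(k) >= A && code.charCodeAt(k) <= Z)
        )
          return false
      }
      let s = code.slice(j, i++)
      stack.push(s)
    } else {
      // Plain text is only allowed inside a tag
      if (stack.length === 0) return false
      i++
    }
  }
  return stack.length === 0
}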
Input
code = "<DIV>This is the first line <![CDATA[<div>]]></DIV>"
Output
true
#2 Code Example with Python Programming
class Solution:
    def isValid(self, S):
        CDATA_BEGIN = '![CDATA['
        CDATA_END = ']]>'

        def collect_tag(i):
            # Return the text between the '<' at index i and the next '>', or None
            for j in range(i, len(S)):
                if S[j] == '>': break
            else:
                return None
            return S[i+1:j]

        def valid_tag(tag):
            return 1 <= len(tag) <= 9 and all('A' <= c <= 'Z' for c in tag)

        # The code must be wrapped in one valid closed tag
        if not S or S[0] != '<': return False
        tag = collect_tag(0)
        if (tag is None or
                not S.startswith('<{}>'.format(tag)) or
                not S.endswith('</{}>'.format(tag)) or
                not valid_tag(tag)):
            return False

        # Strip the outer start tag and end tag, then validate the content
        S = S[len(tag) + 2: -len(tag) - 3]

        i = 0
        stack = []
        while i < len(S):
            if S[i] == '<':
                tag = collect_tag(i)
                if tag is None: return False
                if tag.startswith(CDATA_BEGIN):
                    # Skip ahead to the first ']]>' that closes the cdata
                    while i < len(S) and S[i:i+3] != CDATA_END:
                        i += 1
                    if not S[i:i+3] == CDATA_END:
                        return False
                    i += 2
                elif tag.startswith('/'):
                    # End tag: must match the most recently opened tag
                    tag = tag[1:]
                    if not valid_tag(tag) or not stack or stack.pop() != tag:
                        return False
                else:
                    if not valid_tag(tag):
                        return False
                    stack.append(tag)
            i += 1

        return not stack
Input
code = "<DIV>This is the first line <![CDATA[<div>]]></DIV>"
Output
true