Commit 91f29ea

HTML Tidy now parses HTML non-recursively.
Instead of making a recursive call for each nested level of HTML, the parser now pushes the next level onto a stack on the heap and returns to the main loop. This prevents stack overflow at nesting depth _n_ (where _n_ is operating-system dependent). It is probably still possible to exhaust heap memory, but Tidy's allocators already fail gracefully in that circumstance.

Please report any regressions with your own HTML!

NOTE: the XML parser is not affected, and is probably still highly recursive.
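To illustrate the general technique described in the commit message, here is a minimal sketch of walking a nested structure with an explicit heap-allocated stack instead of recursion. It is not Tidy's actual code: `Node`, `NodeStack`, `stack_push`, and `count_nodes` are hypothetical names invented for this example, and it only counts nodes rather than parsing HTML.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type for illustration; not Tidy's real Node struct. */
typedef struct Node {
    struct Node *first_child;
    struct Node *next_sibling;
} Node;

/* Growable stack of pending nodes, stored on the heap. */
typedef struct {
    Node **items;
    size_t count;
    size_t capacity;
} NodeStack;

/* Pushes a node, growing the stack as needed.
   Returns 1 on success, 0 if the heap is exhausted. */
static int stack_push(NodeStack *s, Node *n)
{
    if (s->count == s->capacity) {
        size_t new_cap = s->capacity ? s->capacity * 2 : 16;
        Node **grown = realloc(s->items, new_cap * sizeof *grown);
        if (!grown)
            return 0; /* report failure instead of crashing */
        s->items = grown;
        s->capacity = new_cap;
    }
    s->items[s->count++] = n;
    return 1;
}

/* Counts every node reachable from root without recursion.
   Returns -1 if memory runs out. */
long count_nodes(Node *root)
{
    NodeStack stack = {0};
    long total = 0;

    if (root && !stack_push(&stack, root))
        return -1;

    /* Main loop: each pending level is popped from the heap stack,
       so call-stack depth stays constant however deep the nesting is. */
    while (stack.count > 0) {
        Node *n = stack.items[--stack.count];
        total++;
        if ((n->next_sibling && !stack_push(&stack, n->next_sibling)) ||
            (n->first_child && !stack_push(&stack, n->first_child))) {
            free(stack.items);
            return -1;
        }
    }

    free(stack.items);
    return total;
}
```

With this shape, nesting depth is bounded by available heap rather than by the operating system's call-stack limit, and an allocation failure surfaces as an ordinary error return.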
1 parent b6f7e43 commit 91f29ea

22 files changed: +4088 −4234 lines changed
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+# Config for test case.
+tidy-mark: no
+indent: yes
+wrap: 999
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+<!--
+This test case represents HTML…
+-->
+<!DOCTYPE html>
+<html>
+<head>
+<title>This is a title</title>
+</head>
+
+<body>
+<div>
+<p>This is the first paragraph.</p>
+<p>Now now, second paragraph?</p>
+<div>
+<p>I'm nested in a div.</p>
+<ul>
+<li>List item one.
+<li>List item two. There isn't a third. Hahaha.</li>
+</ul>
+<p>Because, you know, lists should have a minimum of three items.</p>
+</div>
+<p>Penultimate paragraphs are sometimes the best.</p>
+</div>
+<p>Don't Cray; Buy Amiga!</p>
+</body>
+</html>
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+# Config for test case.
+tidy-mark: no
+indent: yes
+wrap: 999
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+<!--
+This test case tests the datalist element and the datalist parser.
+Oddly, there's not an existing test case that has the datalist element.
+-->
+<!DOCTYPE html>
+<html>
+<head>
+<title>This is a title</title>
+</head>
+
+<body>
+<label for="ice-cream-choice">Choose a flavor:</label>
+<input list="ice-cream-flavors" id="ice-cream-choice" name="ice-cream-choice" />
+
+<datalist id="ice-cream-flavors">
+<option value="Chocolate">
+<option value="Coconut">
+<option value="Mint">
+<option value="Strawberry">
+<option value="Vanilla">
+</datalist>
+
+<label for="myBrowser">Choose a browser from this list:</label>
+<input list="browsers" id="myBrowser" name="myBrowser" />
+<datalist id="browsers">
+<option value="Chrome">
+<option value="Firefox">
+<option value="Internet Explorer">
+<option value="Opera">
+<option value="Safari">
+<option value="Microsoft Edge">
+</body>
+</html>
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+# Config for test case.
+tidy-mark: no
+indent: yes
+wrap: 999
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+<!--
+This test case tests the definition list element and parser.
+-->
+<!DOCTYPE html>
+<html>
+<head><title>case-003</title></head>
+<body>
+
+<dl>
+<dd>
+<div>
+<table summary="">
+<tr>
+<center>
+<td>What is up?</td>
+</tr>
+</table>
+</div>
+<dd>
+</dd>
+<center>Hello</center>
+</dl>
+
+</body>
+</html>
+
+
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+# Config for test case.
+tidy-mark: no
+indent: yes
+wrap: 999
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+<!--
+This test case tests the optgroup element and parser.
+-->
+<!DOCTYPE html>
+<html>
+<head><title>case-004</title></head>
+<body>
+
+<label for="dino-select">Choose a dinosaur:</label>
+<select id="dino-select">
+<optgroup label="Theropods">
+<option>Tyrannosaurus</option>
+<option>Velociraptor</option>
+<option>Deinonychus</option>
+</optgroup>
+<optgroup label="Sauropods">
+<option>Diplodocus</option>
+<option>Saltasaurus</option>
+<option>Apatosaurus</option>
+</optgroup>
+</select>
+
+<optgroup label="Body Parts">
+<option>Claws</option>
+<option>Teeth</option>
+<option>Tail Spikes</option>
+</optgroup>
+
+<optgroup label="Movies">
+<optgroup label="Scifi">
+<option>Jurassic Park</option>
+</optgroup>
+<option>The Good Dinosaur</option>
+<option>The Land Before Time</option>
+</optgroup>
+
+
+</body>
+</html>
+
+
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+<!--
+This test case represents HTML…
+-->
+<!DOCTYPE html>
+<html>
+<head>
+<title>
+This is a title
+</title>
+</head>
+<body>
+<div>
+<p>
+This is the first paragraph.
+</p>
+<p>
+Now now, second paragraph?
+</p>
+<div>
+<p>
+I'm nested in a div.
+</p>
+<ul>
+<li>List item one.
+</li>
+<li>List item two. There isn't a third. Hahaha.
+</li>
+</ul>
+<p>
+Because, you know, lists should have a minimum of three items.
+</p>
+</div>
+<p>
+Penultimate paragraphs are sometimes the best.
+</p>
+</div>
+<p>
+Don't Cray; Buy Amiga!
+</p>
+</body>
+</html>
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+line 17 column 13 - Info: missing optional end tag </li>
+Info: Document content looks like HTML5
+No warnings or errors were found.
+
+About HTML Tidy: https://github.com/htacg/tidy-html5
+Bug reports and comments: https://github.com/htacg/tidy-html5/issues
+Official mailing list: https://lists.w3.org/Archives/Public/public-htacg/
+Latest HTML specification: https://html.spec.whatwg.org/multipage/
+Validate your HTML documents: https://validator.w3.org/nu/
+Lobby your company to join the W3C: https://www.w3.org/Consortium
+
+Do you speak a language other than English, or a different variant of
+English? Consider helping us to localize HTML Tidy. For details please see
+https://github.com/htacg/tidy-html5/blob/master/README/LOCALIZE.md
