parsing web pages with tDOM
a short note on how to parse web pages with tDOM
most web pages, especially now with html5, are rather "flexible" in
their adherence to standards. one can't just parse them as something
XML-like anymore. reading the manual suggested that the -html5 and
-ignorexmlns switches might help:
package require tdom
set f [ open "foobar.html" r ]
set dom [ dom parse -html5 -ignorexmlns [ read $f ] ]
close $f
they did, no parsing errors anymore :)
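
once the page parses, the tree can be queried with plain XPath. as far as i understand the manual, -html5 only exists when tDOM was built with html5 (gumbo) support, and -ignorexmlns keeps the nodes out of any XML namespace, so unprefixed expressions like //a match. below is a minimal sketch under those assumptions; the sample markup, the example.org URL and the variable names are made up for illustration, and the fallback to the older -html parser is my own addition, not something from the manual:

```tcl
package require tdom

# hypothetical sample input; in practice this would be the contents of the page
set html {<html><body><p>first</p><p>second <a href="https://example.org/">link</a></p></body></html>}

# try the html5 parser first; if this tDOM build lacks html5 support
# (or the parse fails for any other reason), fall back to the older -html parser
if {[catch {dom parse -html5 -ignorexmlns $html} doc]} {
    set doc [dom parse -html $html]
}
set root [$doc documentElement]

# collect the href of every link in the document
set links {}
foreach a [$root selectNodes {//a[@href]}] {
    lappend links [$a getAttribute href]
}
puts $links

# DOM trees are not freed automatically
$doc delete
```

note that the catch also hides genuine parse errors from the html5 parser, so for real use it may be better to check the error message before falling back.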

