try using regex to parse HTML
@007 You ever tried the Python HTML/XHTML parser? https://docs.python.org/3.7/library/html.parser.html?highlight=html%20parser#module-html.parser
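For what it's worth, here is a rough sketch of what using `html.parser` looks like instead of regex. It pulls the `href` values out of `<a>` tags. `LinkCollector` and `extract_links` are made-up names for illustration, not part of the standard library; only `HTMLParser`, `feed`, `close`, and `handle_starttag` come from the module itself.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser passes the tag name lowercased, and attrs as a list
        # of (name, value) pairs with lowercased attribute names.
        if tag == "a":
            for name, value in attrs:
                # value is None for attributes written without a value
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Hypothetical wrapper: feed the markup and return the collected links."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links
```

Something like `extract_links(page_source)` then gives you a plain list of strings, and it copes with mixed-case tags and attributes, which is one of the places a regex tends to fall over.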