Cos (cos) wrote in perl,

simple html parsing

Is there a simple, straightforward way to parse HTML text (from a string) into a Perl data tree, using only the modules provided with a standard Perl distribution? HTML::Parser is the closest I've found, but it seems designed for entirely different purposes, and it looks like it would take some tricky gymnastics to make it simply take a string of HTML and give me back a data structure of all the contents. I've also found HTML::Parser::Simple, which seems to be exactly what I want, but it's a nonstandard module that I don't even see in the normal Linux repos, so having a script depend on it would be a major pain. (Sure, I can find and install the module on one host, but I need to write something that works on many hosts, which I don't necessarily admin myself.)