After trying out BeautifulSoup in Python, I became curious about what PHP offers for accessing the DOM nodes of an HTML document. After exploring for a while, I found PHP's DOM extension and XPath.
Here’s a little example:
masnun.html – The HTML file
<html>
<head>
<title> DOM Test </title>
</head>
<body>
<font size="3">maSnun <a href="https://masnun.com">masnun.com</a></font>
</body>
</html>
Now we use PHP to access the document:
#!/usr/bin/php
<?php
// Create an empty DOMDocument object
$dom = new DOMDocument;

// Load the contents of the HTML file into it
$dom->loadHTML(file_get_contents("masnun.html"));

// Build a DOMXPath object bound to the loaded document
$xpath = new DOMXPath($dom);

// Define the search pattern:
// every <a> tag inside a <font> tag whose size attribute is 3,
// where the <font> tag sits directly under the <body> tag.
// The pattern is easy to read and looks a lot like
// a Linux filesystem path.
$pattern = "/html/body/font[@size=3]/a";

// For a location path like this one, evaluate() returns
// a DOMNodeList of the matching DOMElement objects
$res = $xpath->evaluate($pattern);

// Now, let's play with the objects.
// See the DOMElement section of the PHP Manual
// for details on the methods used here.
for ($i = 0; $i < $res->length; $i++) {
    echo $res->item($i)->getAttribute("href"), PHP_EOL;
}
?>
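As a variation on the script above, here is a minimal sketch that runs without any file on disk. It assumes the same markup is held in an inline string (the `$html` and `$links` variables are my own names, not part of the original script), uses a `//` pattern so the full path from the root is not needed, and also shows that `evaluate()` can return a plain number when the expression is a function such as `count()`.

```php
<?php
// Assumption: the markup from masnun.html, inlined so the sketch is self-contained.
$html = '<html><head><title>DOM Test</title></head><body>'
      . '<font size="3">maSnun <a href="https://masnun.com">masnun.com</a></font>'
      . '</body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// "//" matches at any depth, so /html/body does not have to be spelled out.
$links = [];
foreach ($xpath->query('//font[@size="3"]/a') as $a) {
    $links[] = [
        'href' => $a->getAttribute('href'),
        'text' => trim($a->textContent),
    ];
}

// For a scalar expression, evaluate() returns a number rather than a node list.
$count = $xpath->evaluate('count(//a)');

print_r($links);
echo "Number of <a> tags: ", $count, PHP_EOL;
```

Iterating the node list with `foreach` avoids the manual index counter, and `textContent` gives the link text alongside the `href` attribute.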
That’s it, pretty simple and easy! 😀