Some additional repos to consider:
xpath — A lightweight package for XPath manipulation.
php-content-extractor-with-xpath — A library for extracting content from websites with XPath.
ibrahimgunduz34/web-crawler — WebCrawler is a simple XPath-based crawler library for PHP developers.