Parsing Tags From an HTML Page

Nov 4, 2008 | Tags: PHP, Regex

This tutorial shows how to parse the most useful tags from an HTML page with PHP regular expressions. For example, you can use the title and meta tags to analyze a page, generate an RSS feed entry, or build a simple search engine.

1. Get the page's title:

Listing 1: listing-1.php

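  <?php
  // Download the page source (requires allow_url_fopen to be enabled).
  $html = file_get_contents("http://www.nashruddin.com");

  // s: dot also matches newlines, i: case-insensitive, U: quantifiers are ungreedy.
  preg_match("/<title>(.+)<\/title>/siU", $html, $t);

  // $t[1] is only set when the pattern matched, so fall back to an empty string.
  $title = isset($t[1]) ? trim($t[1]) : "";
  ?>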
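The pattern in Listing 1 expects a bare <title> tag and a page that actually has one. The sketch below is one possible way to make the same idea a little more tolerant: it accepts attributes on the opening tag, collapses whitespace, and decodes entities. The function name extract_title and the sample markup are made up for illustration, and it works on a string so it can be tried without a network connection.

```php
<?php
// Hypothetical helper (not part of the original listing):
// returns the cleaned-up page title, or null when no <title> is found.
function extract_title(string $html): ?string
{
    // [^>]* tolerates attributes on the opening tag.
    // s: dot matches newlines, i: case-insensitive; .*? stops at the first </title>.
    if (!preg_match('/<title[^>]*>(.*?)<\/title>/si', $html, $t)) {
        return null;
    }

    // Collapse runs of whitespace (including newlines) into single spaces.
    $title = preg_replace('/\s+/', ' ', $t[1]);

    // Trim the ends and turn entities such as &amp; back into characters.
    return html_entity_decode(trim($title), ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Sample markup, assumed for this example only.
$sample = "<html><head><TITLE lang=\"en\">\n  PHP &amp; Regex  \n</TITLE></head></html>";

echo extract_title($sample), "\n";                  // PHP & Regex
var_dump(extract_title('<p>no title here</p>'));    // NULL
```

Returning null instead of an empty string lets the caller tell "no title tag" apart from "empty title", which may or may not matter for your use case.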

2. Get the META tags:

Listing 2: listing-2.php

  <?php
  // $html holds the page source downloaded in Listing 1.
  // Group 1 captures the name attribute, group 2 the content attribute.
  // [^>]+ keeps a match from running past the end of the current tag.
  $re = "<meta\s+name=['\"]??([^>]+)['\"]??\s+content=['\"]??([^>]+)['\"]??\s*\/?>";
  preg_match_all("/$re/siU", $html, $m);

  // Build an associative array: name => content.
  $meta = array_combine($m[1], $m[2]);

  print_r($meta);
  /*
  outputs something like this:
  Array
  (
      [keywords] => PHP scripts, PHP classes, PHP programming, code snippets
      [description] => Free PHP scripts, classes & code snippets
      [robots] => index, follow
      [author] => Nashruddin Amin
  )
  */
  ?>
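The regular expression in Listing 2 assumes that name comes directly before content. Many pages write the attributes in the other order, or put other attributes in between. A rough sketch of a two-step alternative follows: first grab each <meta> tag, then split its attributes into pairs. The function name extract_meta and the sample markup are assumptions made for this example, and the approach still fails if an attribute value contains a ">" character.

```php
<?php
// Hypothetical helper (not part of the original listing):
// returns [name => content] for every <meta> tag that has both attributes.
function extract_meta(string $html): array
{
    $meta = [];

    // Step 1: capture the attribute string of each <meta ...> tag.
    preg_match_all('/<meta\b([^>]*)>/i', $html, $tags);

    foreach ($tags[1] as $attrString) {
        // Step 2: split it into key="value" or key='value' pairs.
        // \2 requires the closing quote to be the same as the opening one.
        preg_match_all('/([\w:-]+)\s*=\s*(["\'])(.*?)\2/s', $attrString, $pairs, PREG_SET_ORDER);

        $attrs = [];
        foreach ($pairs as $pair) {
            $attrs[strtolower($pair[1])] = $pair[3];
        }

        // Skip tags such as <meta http-equiv=...> that carry no name attribute.
        if (isset($attrs['name'], $attrs['content'])) {
            $meta[strtolower($attrs['name'])] = $attrs['content'];
        }
    }

    return $meta;
}

// Sample markup, assumed for this example only.
$sample = <<<HTML
<meta name="keywords" content="PHP scripts, code snippets">
<meta content='index, follow' name='robots' />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
HTML;

print_r(extract_meta($sample));
```

With the sample above, the result contains the keywords and robots entries; the http-equiv tag is ignored because it has no name attribute.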

3. Get links:

Listing 3: listing-3.php

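  <?php
  // $html holds the page source downloaded in Listing 1.
  // Group 1: optional quote character, group 2: the URL, group 3: the anchor text.
  // \\1 makes the closing quote match the opening one.
  $re = "<a\s[^>]*href\s*=\s*(['\"]??)([^'\" >]*?)\\1[^>]*>(.*)<\/a>";
  preg_match_all("/$re/siU", $html, $m);

  // Keep only the URLs.
  $links = $m[2];

  print_r($links);
  ?>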
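Listing 3 keeps only the URLs, but the same pattern also captures the anchor text in its third group. As an illustration, the sketch below pairs each URL with its text. The function name extract_links and the sample markup are hypothetical; the regular expression is the one from Listing 3.

```php
<?php
// Hypothetical helper (not part of the original listing):
// returns a list of ['href' => ..., 'text' => ...] entries.
function extract_links(string $html): array
{
    // Same pattern as Listing 3.
    $re = "<a\s[^>]*href\s*=\s*(['\"]??)([^'\" >]*?)\\1[^>]*>(.*)<\/a>";

    // PREG_SET_ORDER groups the results per link instead of per capture group.
    preg_match_all("/$re/siU", $html, $matches, PREG_SET_ORDER);

    $links = [];
    foreach ($matches as $match) {
        $links[] = [
            // URLs in attributes often contain &amp;, so decode entities.
            'href' => html_entity_decode($match[2], ENT_QUOTES, 'UTF-8'),
            // Drop nested markup such as <b> or <img> from the anchor text.
            'text' => trim(strip_tags($match[3])),
        ];
    }

    return $links;
}

// Sample markup, assumed for this example only.
$sample = '<a href="http://example.com/a">First <b>link</b></a> and '
        . "<a class='x' href='/b'>Second</a>";

print_r(extract_links($sample));
```

Relative URLs such as /b are returned as written; resolving them against the page address is left out of this sketch.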
