Developers often need to verify the URLs of external sites, for example in affiliate programs or link building. We can create a simple function named verifyURL() which accepts the URL as a string parameter and returns true if the URL really points to an HTML page, and false otherwise.
function verifyURL($url) { // function body }
$url will carry the URL string.
In the function body, we first verify that the passed string is a properly formatted URL.
if (!preg_match('%^(https?://)?([\da-z.-]+)\.([a-z.]{2,6})([/\w .-]*)*/?$%', $url)) { return false; }
We use preg_match() to check the URL format against a regular expression. If the URL does not match the pattern, the function returns false.
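As an alternative to a hand-written pattern, the format check could be sketched with PHP's built-in filter_var() and parse_url() functions. The helper name isWellFormedHttpUrl() is hypothetical and not part of the original function. Unlike the pattern above, this sketch assumes the URL must carry an explicit http or https scheme.

```php
<?php
// Hypothetical helper (not from the original function): a format check built on
// PHP's filter extension instead of a hand-written regular expression.
function isWellFormedHttpUrl(string $url): bool
{
    // FILTER_VALIDATE_URL makes filter_var() return false for invalid URLs.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // Accept only web URLs; this rejects schemes such as ftp:// or file://.
    $scheme = parse_url($url, PHP_URL_SCHEME);
    return in_array(strtolower((string) $scheme), ['http', 'https'], true);
}
```

Requiring the scheme is a design choice for this sketch: a string without a scheme would otherwise be handed to the fetching step in a form it cannot treat as a web address.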
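If the URL format is correct, the code then tries to fetch the content of the URL using the file_get_contents() function. Note that the pattern treats the scheme as optional, but file_get_contents() interprets a string without http:// or https:// as a local file path, so pass URLs with an explicit scheme.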
$content = @file_get_contents($url); if (!$content) { return false; }
file_get_contents() emits a warning (E_WARNING) and returns false if the URL cannot be fetched. We prefix the call with @ to suppress that warning. On the next line we check the value returned by file_get_contents(); if it is false (or empty), the function stops executing and returns false.
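The fetching step can be made a little more defensive. The following is a minimal sketch, not the original implementation: it assumes a 5-second timeout is acceptable and that only 2xx responses should count as success. The helper names finalHttpStatus() and fetchBody() are hypothetical.

```php
<?php
// Hypothetical helper: extract the final HTTP status code from the header
// lines that PHP's HTTP stream wrapper collects for a request.
function finalHttpStatus(array $headerLines): ?int
{
    $status = null;
    foreach ($headerLines as $line) {
        // Status lines look like "HTTP/1.1 200 OK"; each redirect hop adds one,
        // so the last match is the status of the final response.
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

// Hypothetical helper: fetch a page with a timeout; returns null on failure.
function fetchBody(string $url, float $timeoutSeconds = 5.0): ?string
{
    $context = stream_context_create([
        // ignore_errors lets us read the status of 4xx/5xx responses ourselves.
        'http' => ['timeout' => $timeoutSeconds, 'ignore_errors' => true],
    ]);
    $body = @file_get_contents($url, false, $context);
    if ($body === false) {
        return null;
    }
    // The HTTP wrapper fills $http_response_header in the calling scope.
    $status = finalHttpStatus($http_response_header ?? []);
    return ($status !== null && $status >= 200 && $status < 300) ? $body : null;
}
```

Comparing the result with === false (rather than !$content) distinguishes a failed request from a page whose body happens to be empty.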
After fetching the content successfully we verify the content using regular expression.
if(!preg_match("/<[^<]+>/",$content,$m)) { return false; } return true;
The regular expression above checks whether the content contains at least one HTML-style tag; if it does, the function returns true. If the content does not match the regular expression, verification fails and the function returns false.
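Because the pattern /<[^<]+>/ matches any tag-like text, it also accepts XML documents such as RSS feeds. A stricter check might look for markers that are specific to HTML. The sketch below is one possible heuristic under that assumption; the helper name looksLikeHtml() and the 2048-byte window are illustrative choices, not part of the original function.

```php
<?php
// Hypothetical helper: a stricter heuristic for "is this content HTML?".
function looksLikeHtml(string $content): bool
{
    // Inspect only the beginning of the document to keep the check cheap.
    $head = substr($content, 0, 2048);
    // Match a doctype or an opening <html>, <head>, or <body> tag, ignoring case.
    return preg_match('/<!doctype\s+html|<(html|head|body)[\s>]/i', $head) === 1;
}
```

This is still a heuristic: an HTML fragment without any of these markers would be rejected, so the looser pattern may be preferable when fragments should pass.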
Following is the complete implementation of the function verifyURL().
function verifyURL($url) {
    // Step 1: reject strings that are not formatted like a URL.
    if (!preg_match('%^(https?://)?([\da-z.-]+)\.([a-z.]{2,6})([/\w .-]*)*/?$%', $url)) {
        return false;
    }
    // Step 2: fetch the content; @ suppresses the warning raised on failure.
    $content = @file_get_contents($url);
    if (!$content) {
        return false;
    }
    // Step 3: require at least one HTML-style tag in the content.
    if (!preg_match("/<[^<]+>/", $content, $m)) {
        return false;
    }
    return true;
}

$url = "https://2014.discretelogix.com";
$response = verifyURL($url);
if ($response == true) {
    echo 'URL '.$url.' is verified';
} else {
    echo 'URL '.$url.' is not verified';
}