Unredir

Script to replace redirected URLs on Web pages.
It runs from the command line on a static site, preferably on a local copy that is then uploaded to the server.

Prerequisites

PHP 7 required.
Curl must be enabled in the php.ini configuration file.

This script scans the pages of a website, validates each URL, and replaces the URL with a new one when redirected.

It is also suitable for sites that switch from HTTP to HTTPS: it updates links both to the site itself and to any other site that has moved.

It also reports broken links and, for static sites, can replace a link-checking tool such as Link Checker.
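
The scanning step described above could be sketched as follows. This is only an illustration, not the script's actual code: the function name listPages and the list of file extensions are assumptions made for the example.

```php
<?php
// Hypothetical helper (not taken from unredir.php): list the page files
// found under a directory, including its subdirectories.
function listPages(string $dir, array $extensions = ['html', 'htm', 'php']): array
{
    $pages = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        // Keep regular files whose extension is in the (assumed) list.
        if ($file->isFile()
            && in_array(strtolower($file->getExtension()), $extensions, true)) {
            $pages[] = $file->getPathname();
        }
    }
    sort($pages);
    return $pages;
}

// Usage: print every page found under the current directory.
foreach (listPages('.') as $page) {
    echo $page, PHP_EOL;
}
```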

Code

The program uses the PHP DOMDocument class to find links in <a> and image tags. It also uses the file_get_contents() function to load the file as raw text.
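
As a rough illustration of the link search, here is a minimal sketch based on DOMDocument. The function name extractLinks is hypothetical, and keeping only absolute http/https URLs is an assumption of this example (relative links cannot be tested with Curl from a local copy); the real script may proceed differently.

```php
<?php
// Hypothetical helper: collect absolute URLs from <a href> and <img src>.
function extractLinks(string $html): array
{
    $doc = new DOMDocument();
    // Real-world markup is rarely perfect: collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach (['a' => 'href', 'img' => 'src'] as $tag => $attribute) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $url = $node->getAttribute($attribute);
            // Assumption: only absolute http(s) URLs are worth checking.
            if (preg_match('#^https?://#i', $url)) {
                $urls[] = $url;
            }
        }
    }
    return array_values(array_unique($urls));
}

// Usage on an in-memory page; a real run would pass file_get_contents($file).
$html = '<p><a href="http://example.com/old">old</a>'
      . '<img src="https://example.com/pic.png">'
      . '<a href="page.html">local</a></p>';
print_r(extractLinks($html));
```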

The program uses Curl to check whether a link is redirected and, if so, to find the final address of the redirection.

The str_replace function is used to replace redirected URLs (not setAttribute). The content is then saved with the file_put_contents() function.
Using these functions avoids the saveHTMLFile method, which tries to repair the HTML content before saving the file and, in doing so, adds tags that may already be provided elsewhere by the PHP file.
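
The raw-text replacement could look like the sketch below. The function name replaceUrls and the old => new map are hypothetical. Matching the URL together with its surrounding quotes is an assumption of this example, made so that a longer URL starting with the same characters is left alone; the script itself may simply call str_replace on the bare URL.

```php
<?php
// Hypothetical helper: replace redirected URLs in the raw text of a page.
function replaceUrls(string $content, array $map): string
{
    foreach ($map as $old => $new) {
        // Assumption: URLs appear as quoted attribute values.
        $content = str_replace(
            ['"' . $old . '"', "'" . $old . "'"],
            ['"' . $new . '"', "'" . $new . "'"],
            $content
        );
    }
    return $content;
}

// Usage on a temporary file, so that no real page is modified.
$file = tempnam(sys_get_temp_dir(), 'unr');
file_put_contents($file, '<a href="http://example.com/old">link</a>' . "\n");

$raw = file_get_contents($file);
$updated = replaceUrls($raw, ['http://example.com/old' => 'https://example.com/new']);
if ($updated !== $raw) {
    // Write back only when something changed.
    file_put_contents($file, $updated);
}
echo file_get_contents($file);
unlink($file);
```

Because the file is handled as plain text, any PHP code or partial HTML in the page is written back untouched.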

Redirection test PHP code:

function redirected($url)
{
   $hcurl=curl_init();
   
   // First request: headers only (no body), without following redirects.
   curl_setopt($hcurl, CURLOPT_CONNECTTIMEOUT, 300);
   curl_setopt($hcurl, CURLOPT_RETURNTRANSFER, true);
   curl_setopt($hcurl, CURLOPT_VERBOSE, false);
   curl_setopt($hcurl, CURLOPT_URL, $url);
   curl_setopt($hcurl, CURLOPT_HEADER, true);
   curl_setopt($hcurl, CURLOPT_NOBODY, true);
   curl_setopt($hcurl, CURLOPT_FOLLOWLOCATION, false);
   // Certificate verification is disabled for this check.
   curl_setopt($hcurl, CURLOPT_SSL_VERIFYPEER, false);
   $headers = curl_exec($hcurl);
   $code = curl_getinfo($hcurl, CURLINFO_HTTP_CODE);

   // Only a permanent redirect (301) is handled; anything else returns "".
   if($code!=301)
   {
      curl_close($hcurl);
      return "";
   }
   
   // Second request: follow the redirect chain to get the final address.
   curl_setopt($hcurl, CURLOPT_FOLLOWLOCATION, true);
   $headers = curl_exec($hcurl);
   $newurl = curl_getinfo($hcurl, CURLINFO_EFFECTIVE_URL);
   $code = curl_getinfo($hcurl, CURLINFO_HTTP_CODE);

   curl_close($hcurl);
   // Keep the new URL only if the final page answers with 200.
   if($code!=200)
   {
      return "";
   }
   return $newurl;
}

Operation manual

Open a command-line console, navigate to the directory containing the site pages to be updated, and enter:

php c:/unredir/unredir.php [options]

In the command, replace the directory above with the one where you installed unredir.

There are two options:

-t: test mode, shows the results without changing the files.

-v: verbose mode, displays all scanned pages.
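
For illustration, the two options could be read from the command line as sketched below. The function name parseOptions and the returned array keys are hypothetical; this is not the script's actual parsing code.

```php
<?php
// Hypothetical sketch: detect the -t and -v options among the arguments.
function parseOptions(array $argv): array
{
    $args = array_slice($argv, 1); // skip the script name
    return [
        'test'    => in_array('-t', $args, true),
        'verbose' => in_array('-v', $args, true),
    ];
}

// $argv is filled in by the PHP command-line interpreter.
$options = parseOptions($argv ?? []);
if ($options['test']) {
    echo "Test mode: files will not be modified.", PHP_EOL;
}
if ($options['verbose']) {
    echo "Verbose mode: every scanned page will be listed.", PHP_EOL;
}
```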

See also...

Converting HTTP to HTTPS. This script replaces http with https for a specific domain. It complements this script because it also changes references in the text, but it only handles redirects for the specified domain.