Duplicating a WordPress Blog: “The Hasin Vai” Method

Hasin Hayder, a ZCE and open source enthusiast, shared a nice trick for duplicating your WordPress blog using a simple combination of PHP and Apache's mod_rewrite. Read his original blog post here. I am going to explain the mechanism step by step.

The PHP Code:
Put this code in index.php, in the root directory of the target (new) URL.

<?php
$dataurl = $primaryurl = "http://hasin.wordpress.com"; // old domain
$secondaryurl = "http://blog.ofhas.in"; // new domain
$path = array_keys($_GET);
if (!empty($path[0])) $dataurl = "{$primaryurl}/{$path[0]}";
$data = file_get_contents($dataurl);
$pattern = "~{$primaryurl}/([\d\S/]+)~";
$data = preg_replace($pattern, "{$secondaryurl}/$1", $data);
$data = str_replace(array("<a href=\"{$primaryurl}", "<form action=\"{$secondaryurl}"), array("<a href=\"{$secondaryurl}", "<form action=\"{$primaryurl}"), $data);
echo $data;
?>

The URL REWRITING:
Put the following rules in a .htaccess file or in your Apache httpd.conf file.

RewriteEngine on

# Requests that carry a query string: pass the path first, then the query.
RewriteCond $1 !^(index\.php|images|robots\.txt)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{QUERY_STRING} (.+)
RewriteRule ^(.*)$ index.php?$1&%{QUERY_STRING} [L]

# Requests without a query string: pass only the path.
RewriteCond $1 !^(index\.php|images|robots\.txt)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?$1 [L]
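
To make the effect of these rules concrete, here is a small sketch that mimics the substitution in plain PHP and shows what index.php ends up seeing. The function name rewritten_query and the sample path are hypothetical; the function only models the "index.php?$1" substitution and says nothing about how Apache works internally.

```php
<?php
// Hypothetical helper: models the substitution "index.php?$1" or
// "index.php?$1&%{QUERY_STRING}" performed by the two rewrite rules.
function rewritten_query(string $path, string $query = ''): string
{
    return $query === '' ? $path : $path . '&' . $query;
}

// parse_str() parses a query string into an array, much as PHP fills $_GET.
parse_str(rewritten_query('2008/05/hello-world/', 's=php'), $get);
$keys = array_keys($get);

echo $keys[0], "\n";   // the requested path arrives as the first key
echo $get['s'], "\n";  // the visitor's own query parameter is still there
```

This is why the PHP code can recover the requested path with array_keys($_GET): the path is passed as a parameter name with an empty value.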

Explanation:
······# The URL rewriting rules send every request (with or without a GET query string) to index.php, passing the requested path as the first query parameter. Requests for index.php itself, the images directory, robots.txt, and any existing file or directory are left alone.
······# index.php then handles each request handed to it by the rewrite rules.
······# First we define the two URLs involved.

<?php
$dataurl = $primaryurl = "http://hasin.wordpress.com"; // old domain
$secondaryurl = "http://blog.ofhas.in"; // new domain
These two lines define the URLs involved. The primary URL is where your WordPress blog is actually hosted. The secondary URL is the new location where the blog will show up. $dataurl starts out as the primary URL, so a request with no path fetches the blog's home page.

······# We work out the path of the page to display and then fetch that page as a string.

$path = array_keys($_GET);
if (!empty($path[0])) $dataurl = "{$primaryurl}/{$path[0]}";
$data = file_get_contents($dataurl);
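
One caveat with reading the path from array_keys($_GET): PHP converts dots and spaces in parameter names to underscores, so a path such as files/photo.jpg would arrive as files/photo_jpg. The sketch below is one possible workaround, not part of the original trick. It reads the path from the raw query string instead; the function name resolve_data_url is hypothetical.

```php
<?php
// Hypothetical helper, not from the original code: derive the URL to fetch
// from the raw query string rather than from the keys of $_GET.
function resolve_data_url(string $primaryurl, string $queryString): string
{
    // Everything before the first "&" is the path added by the rewrite rule.
    $path = explode('&', $queryString, 2)[0];

    // An empty string or a "key=value" pair means no path was passed.
    if ($path === '' || strpos($path, '=') !== false) {
        return $primaryurl;
    }
    return $primaryurl . '/' . ltrim($path, '/');
}

echo resolve_data_url('http://hasin.wordpress.com', '2008/05/hello-world/&s=php'), "\n";
echo resolve_data_url('http://hasin.wordpress.com', 'files/photo.jpg'), "\n";
echo resolve_data_url('http://hasin.wordpress.com', ''), "\n";
```

In index.php this could be called as resolve_data_url($primaryurl, $_SERVER['QUERY_STRING']). Note also that file_get_contents() needs allow_url_fopen enabled to fetch a URL and returns false on failure, so it is worth checking the result before echoing it.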

······# We then use a regular expression to change all references to our primary URL so that they point to our new location.

$pattern = "~{$primaryurl}/([\d\S/]+)~";
$data = preg_replace($pattern, "{$secondaryurl}/$1", $data);
$data = str_replace(array("<a href=\"{$primaryurl}", "<form action=\"{$secondaryurl}"), array("<a href=\"{$secondaryurl}", "<form action=\"{$primaryurl}"), $data);
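Here we rewrite the links so they point at the new domain, and point the form targets back at the old one, presumably so that forms still submit to the real blog. But what about images with relative URLs? These replacements do not touch them, so the browser would request them from the new domain. I think there’s something wrong with that. I will ask Hasin Vai.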
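
As a possible direction for the relative image question, here is a minimal sketch under my own assumptions; it is not Hasin Vai's method. It swaps in a narrower technique: only anchor links are moved to the new domain (so form targets keep pointing at the old one without a second pass), the old domain is escaped with preg_quote() so its dots match literally, and root-relative image paths are prefixed with the old domain. The function name rewrite_html and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper, not from the original post.
function rewrite_html(string $html, string $primaryurl, string $secondaryurl): string
{
    // preg_quote() escapes the dots in the domain so they match literally.
    $primary = preg_quote($primaryurl, '~');

    // Point absolute anchor links at the new domain.
    $html = preg_replace(
        '~(<a\s[^>]*href=["\'])' . $primary . '~i',
        '${1}' . $secondaryurl,
        $html
    );

    // Assumption: root-relative image paths (src="/...") should keep loading
    // from the old domain. Document-relative paths are not handled here.
    $html = preg_replace(
        '~(<img\s[^>]*src=["\'])/(?!/)~i',
        '${1}' . $primaryurl . '/',
        $html
    );

    return $html;
}

$sample = '<a href="http://hasin.wordpress.com/about/">About</a>'
        . '<img src="/files/a.jpg">'
        . '<form action="http://hasin.wordpress.com/">';
$result = rewrite_html($sample, 'http://hasin.wordpress.com', 'http://blog.ofhas.in');
echo $result, "\n";
```

Regular expressions are a blunt tool for HTML, so this is only meant to show the idea; I have not tried it against a real WordPress page.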

······# Let’s print out the modified data and close the PHP block.

echo $data;
?>

That’s all. Put the PHP code in index.php and the rewrite rules in a .htaccess file, then place both files in your public_html directory to get it off the ground.

PS: I haven’t tested the code myself yet, but I believe it will work, except for relative image URLs.