Do not use, republish, in whole or in part, without the consent of the Author. TheTAZZone policy is that Authors retain the rights to the work they submit and/or post…we do not sell, publish, transmit, or have the right to give permission for such…TheTAZZone merely retains the right to use, retain, and publish submitted work within its Network.

Soda_Popinsky has very kindly allowed this tutorial of his to be hosted on the TAZ.

Custom Web Based Honeypots with GHH
by Soda_Popinsky


GHDB operated by johnny.ihackstuff.com


What is GHH?
GHH is the reaction to a new type of malicious web traffic: search engine hackers. GHH is a “Google Hack” honeypot. It is designed to provide reconnaissance against attackers who use search engines as a hacking tool against your resources. GHH implements honeypot theory to provide additional security to your web presence.

What is a honeypot?
A honeypot is, to quote Lance Spitzner, founder of the Honeynet Project:

“An information system resource whose value lies in unauthorized or illicit use of that resource.”

Simply put, a honeypot is something that appears to be vulnerable but in reality records illicit use by malicious attackers.

GHH allows administrators to track malicious hosts and to observe, via the log, who is perpetrating an attack and how it is being executed. The data generated by this or any other honeypot can be used to deny future access to attackers, notify service providers of attacks originating from their networks, or act as input for statistical analysis.

What are search engine hackers and why should I care?
Google has developed a powerful tool. The search engine that Google has implemented allows for searching on an immense amount of information. The Google index has swelled past 8 billion pages [February 2005] and continues to grow daily. Mirroring the growth of the Google index, the spread of web-based applications such as message boards and remote administrative tools has resulted in an increase in the number of misconfigured and vulnerable web apps available on the Internet.

These insecure tools, combined with the power of the search engine and index that Google provides, result in a convenient attack vector for malicious users. It is in your best interest to be knowledgeable about this threat and to protect yourself from it.

This threat is amplified by tools like Foundstone’s Sitedigger, and Wikto, which automate this technique.

Building the Honeypot

(It’s recommended that you have used the pre-built honeypots or are familiar with how they work, and that you have a working knowledge of PHP.)

At the GHH project page, you will find several pre-built honeypots. Their file layouts vary, but each contains at least these three files:

-The Honeypot
-The Config File
-The Log file

A README.txt is also included; it isn’t required for the honeypot to function but contains instructions for other users.

At the GHH project page, download the “Custom GHH Template”. Extract the contents and view config.php in a text editor. Each section is clearly marked with a header and footer, with a brief note on what the section does. Find the configuration section, and change the $Filename variable to contain the filename or path of a text file (which you will have to create) that isn’t in the document root of your webserver. There is deliberately no default filename or logfile, so that file locations are not predictable. The logs for your custom honeypot will go into this file, as will the logs of any new honeypots that use this configuration file.
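
Before pointing $Filename at a file, it can help to confirm that the path really does sit outside the document root. The following is a minimal sketch, not part of GHH: the helper name isOutsideDocumentRoot and both example paths are made up for illustration. It only compares path strings, so it does not resolve symlinks or ".." segments.

```php
<?php
// Hypothetical helper, not part of GHH: returns true when the directory that
// holds $logPath is not the document root or one of its subdirectories.
function isOutsideDocumentRoot(string $logPath, string $docRoot): bool
{
    // Give both directories exactly one trailing slash so that
    // "/var/www/html2" is not mistaken for a child of "/var/www/html".
    $root = rtrim($docRoot, '/') . '/';
    $dir  = rtrim(dirname($logPath), '/') . '/';

    // Outside means the log directory does not start with the root path.
    return strncmp($dir, $root, strlen($root)) !== 0;
}

// Example values are assumptions; substitute your own paths.
$Filename = '/var/log/ghh/honeypot.csv';
$docRoot  = '/var/www/html';

var_dump(isOutsideDocumentRoot($Filename, $docRoot));
```

If the check returns false, the log could be fetched over the web like any other file, which would hand an attacker the honeypot's own records.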

View template.php in a text editor and find the configuration section. Enter the filepath or filename for your configuration file (config.php doesn’t have to be in your document root).

Scroll down to where it says “Begin Custom Honeypot Section”.


Code: Select all
//Begin Custom Honeypot Section
//GHH Honeypot by Ryan McGeehan for GHDB Signature #365 (intitle:"PHP Shell *" "Enable stderr" filetype:php)
$HoneypotName = "PHPSHELL";

//Trick PHP Shell page
echo "<html>\n<head>\n<title>PHP Shell 1.7</title>\n</head>\n<body>\n<h1>PHP Shell 1.7</h1>\n\n\n<form name=\"myform\" action=\"/mysidia/main/mid/uploaded/p-s.mid.php\" method=\"post\">\n<p>Current working directory: <b>\n<a href=\"/mysidia/main/mid/uploaded/p-s.mid.php?work_dir=/\">Root</a>/</b></p>\n\n<p>Choose new working directory:\n<select name=\"work_dir\" onChange=\"this.form.submit()\">\n<br />\n<br /><html>\n<head>\n<title>PHP Shell 1.7</title>\n</head>\n<body>\n<h1>PHP Shell 1.7</h1>\n\n\n<form name=\"myform\" action=\"/mysidia/main/mid/uploaded/p-s.mid.php\" method=\"post\">\n<p>Current working directory: <b>\n<a href=\"/mysidia/main/mid/uploaded/p-s.mid.php?work_dir=/\">Root</a>/</b></p>\n\n<p>Choose new working directory:\n<select name=\"work_dir\" onChange=\"this.form.submit()\">\n<br />\n<br />\n<br />\n<br />\n\n</select></p>\n\n<p>Command: <input type=\"text\" name=\"command\" size=\"60\">\n<input name=\"submit_btn\" type=\"submit\" value=\"Execute Command\"></p>\n\n<p>Enable <code>stderr</code>-trapping? <input type=\"checkbox\" name=\"stderr\"></p>\n<textarea cols=\"80\" rows=\"20\" readonly>\n\n\n</textarea>\n</form>\n\n<script language=\"JavaScript\" type=\"text/javascript\">\ndocument.forms[0].command.focus();\n</script>\n\n<hr>\n<i>Copyright &copy; 2000&ndash;2002, <a\nhref=\"mailto:gimpster@gimpster.com\">Martin Geisler</a>. Get the latest\nversion at <a href=\"http://www.gimpster.com\">www.gimpster.com</a>.</i>\n</body>\n</html>\n";

//Find our PHP shell target in the referer site
if (strstr($Attack['referer'], "Shell")){
    $Signature[] = "Target in URL";
}

//Finds if exact GHDB signature was used (search string shown truncated; a prefix still matches the full query)
if (strstr($Attack['referer'], "intitle%3A%22PHP+Shell+*%22+%22Enable+stderr%22+fi")){
    $Signature[] = "GHDB Signature!";
}

//End Custom Honeypot Section

The header shows that this template uses the honeypot code for Google Hacking Database (GHDB) signature #365, which is a PHP Shell honeypot. PHP Shell is a web page that runs commands on the server, so an exposed copy on a misconfigured webserver is a vulnerability, and GHH is emulating it in this example.

Change $HoneypotName to a string that will describe the honeypot in the logs. GHH honeypots use quick and dirty names here as a standard.

The echo statement that appears is what imitates the vulnerable page. You will need the HTML output of the vulnerable website to place into this line. You can find HTML to use from the GHDB at the link provided above, maintained by johnny.ihackstuff.com. This brings up an important point called fingerprinting, which will be covered later.

The next line is a signature. It is a quick statement that searches for “Shell” in the referrer URL. Many search engines include the search query in the referrer that is sent when a result link is followed, so it’s possible to determine which search engine and what query an attacker used to reach the honeypot. In this case, “Shell” is being searched for, and if it is found, “Target in URL” is written to the logs to do some of the investigation for us.
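
To illustrate how a query can be recovered from a referrer, here is a rough sketch. It is not GHH code: the helper name searchQueryFromReferer is hypothetical, the example referrer is made up, and it assumes the search engine passes the terms in a "q" parameter, which not every engine does.

```php
<?php
// Hypothetical helper, not part of GHH: pulls the search terms out of a
// referrer URL. Assumes the engine passes the query in a "q" parameter.
function searchQueryFromReferer(string $referer): ?string
{
    $queryString = parse_url($referer, PHP_URL_QUERY);
    if (!is_string($queryString)) {
        return null; // no query string, or an unparseable URL
    }

    parse_str($queryString, $params); // also URL-decodes the values

    if (isset($params['q']) && is_string($params['q'])) {
        return $params['q'];
    }
    return null;
}

// Made-up referrer for illustration only.
$referer = 'http://www.google.com/search?q=intitle%3A%22PHP+Shell+*%22+%22Enable+stderr%22';
echo searchQueryFromReferer($referer), "\n";
```

Working on the decoded query rather than the raw referrer means a signature does not have to anticipate every way a search engine might encode quotes and spaces.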

The next signature searches the referrer for the exact GHDB signature. A match tells us that the attacker either
A) used a hacking tool,
B) used the GHDB database, or
C) got lucky and crafted the same search as the GHDB.

That’s how this section operates. Here are your tools to work with.

The $Attacker array contains these indexes:


Code: Select all
$Attacker['IP'] //Contains the host's IP
$Attacker['request'] //Contains the host's request to get to the honeypot
$Attacker['referer'] //Contains the referrer if one exists
$Attacker['agent'] //Contains the host's user agent
$Attacker['accept'] //The rest describe the browser and connection between host and server

You can use these indexes with PHP logic similar to the sample given above (or the exact same logic, searching for different strings). When your code decides it has found something malicious in the $Attacker array, append a description to the end of the $Signature array:


Code: Select all
if (strstr($Attack['referer'], "Shell")){
    $Signature[] = "Target in URL"; //Append it like this; $array[] = "whatever"; adds to the end of the array
}

and it will appear in the logfile along with any other signatures found.
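
As a sketch of what additional checks might look like, the function below gathers several signatures in one place. It is an illustration rather than GHH code: the name collectSignatures and the labels “No referrer” and “Followed work_dir link” are made up, and the array uses the 'referer' and 'request' index names from the list above. The sample code spells the array $Attack while the list spells it $Attacker, so use whichever name your copy of the template actually defines.

```php
<?php
// Hypothetical helper, not part of GHH: returns the signature labels that
// apply to one visit. $attacker uses the index names listed above.
function collectSignatures(array $attacker): array
{
    $signatures = [];
    $referer = $attacker['referer'] ?? '';
    $request = $attacker['request'] ?? '';

    if ($referer === '') {
        // Without a referrer the search query cannot be recovered.
        $signatures[] = 'No referrer';
    } elseif (stripos($referer, 'Shell') !== false) {
        // Case-insensitive variant of the "Shell" check in the sample.
        $signatures[] = 'Target in URL';
    }

    if (strpos($request, 'work_dir=') !== false) {
        // The fake page links to ?work_dir=/, so this visitor followed it.
        $signatures[] = 'Followed work_dir link';
    }

    return $signatures;
}

// Made-up visit for illustration only.
$Signature = collectSignatures([
    'referer' => '',
    'request' => '/p-s.mid.php?work_dir=/',
]);
print_r($Signature);
```

Note that strstr returns a string or false, which is fine inside a plain if as in the sample; the explicit !== false comparison here only avoids surprises when a match is found at position 0 with strpos and stripos.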

The Logs

Here’s an example log from testing:

The logs are written in CSV (Comma Separated Values) format: each field is separated by a comma. The fields include:

Tripped: The honeypot was accessed / tripped. (If you have multiple honeypots, this will tell you which one was accessed)
Time of Attack: The time the honeypot was viewed
Host: The IP address of the attacker
Requested URI: The Uniform Resource Identifier made to reach your site
Referrer: In most cases this contains the query used in the search engine, alerting you to what the attacker attempted to find and how they tried to find it. This is the most important detail of the log.
Accepts: Contents of the Accept: header if there is one.
Accepts Charset: Contents of the Accept-Charset: header if there is one.
Accept Language: Contents of the Accept-Language: header if there is one.
Connection: Contents of the Connection: header from the current request
User Agent: The user agent of the attacker
Signatures: The attack signatures the honeypot was able to determine from a combination of browser headers.

Making Sense of It All

Don’t panic if your log file ($Filename) has a large number of requests in it. Honeypots are meant to be accessed. This log is a potent source of information to see how attackers are reaching your site. By looking at the referrer field in the log, you will be able to determine where the attacker came from, the query they used, or if your honeypot has been discovered and linked.
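
One way to get an overview of a busy log is to count entries per host. The sketch below is only an illustration: the helper name hitsPerHost is made up, and the column order is an assumption taken from the field list above, so check it against your own log before relying on it. The sample rows are invented and trimmed to three columns; they are not real GHH output.

```php
<?php
// Assumed column order, copied from the field list above; verify it
// against your own log file before relying on it.
const GHH_COLUMNS = ['tripped', 'time', 'host', 'uri', 'referrer', 'accept',
    'accept_charset', 'accept_language', 'connection', 'user_agent', 'signatures'];

// Hypothetical helper, not part of GHH: counts log rows per attacking host,
// busiest host first.
function hitsPerHost(array $csvLines): array
{
    $hostIndex = array_search('host', GHH_COLUMNS, true);
    $counts = [];

    foreach ($csvLines as $line) {
        $fields = str_getcsv($line, ',', '"', '\\');
        if (!isset($fields[$hostIndex]) || $fields[$hostIndex] === '') {
            continue; // blank or short line
        }
        $host = $fields[$hostIndex];
        $counts[$host] = ($counts[$host] ?? 0) + 1;
    }

    arsort($counts); // highest count first, keys preserved
    return $counts;
}

// Invented rows for illustration only.
$sample = [
    'PHPSHELL,first visit,192.0.2.10',
    'PHPSHELL,second visit,192.0.2.20',
    'PHPSHELL,third visit,192.0.2.10',
];
print_r(hitsPerHost($sample));
```

To run it against a real log, the lines could be loaded with file($Filename, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) and passed in the same way.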

Fingerprinting and Policy


Fingerprinting is an issue with all honeypots. Fingerprinting is the process of distinguishing a honeypot from a legitimate application or host. In GHH’s case, the HTML has to be nearly identical to the vulnerable application’s HTML in most cases. This way Google indexes the honeypots and the vulnerable applications the same way, and the honeypots cannot be fingerprinted without browsing the server or viewing the page. There are some cases where the HTML differs from honeypot to honeypot, which would require you to break up the string with randomized codes, numbers, or strings.
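
Breaking up the string with a randomized value could look something like the sketch below. It is an assumption about one possible approach, not something taken from GHH: the helper name renderWithRandomToken, the {TOKEN} placeholder, and the "sid" field are all hypothetical.

```php
<?php
// Hypothetical helper, not part of GHH: fills a {TOKEN} placeholder in the
// fake page with a fresh random hex string each time the page is rendered.
function renderWithRandomToken(string $template, int $bytes = 8): string
{
    $token = bin2hex(random_bytes($bytes)); // 2 hex characters per byte
    return str_replace('{TOKEN}', $token, $template);
}

// Made-up page fragment for illustration only.
$page = "<html>\n<body>\n<input type=\"hidden\" name=\"sid\" value=\"{TOKEN}\">\n</body>\n</html>\n";
echo renderWithRandomToken($page);
```

Only the parts of the page that would also vary in the real application should be randomized; the text that the search signature matches has to stay intact, or the honeypot will not be indexed like the vulnerable application.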


Be extremely careful when creating any policies around a GHH logfile. There can be false positives, so a policy that acts on the logs automatically, such as blocking every logged host, carries a risk of its own.


If anyone needs help trying out GHH, or has any suggestions for it, shoot me a PM. But seeing how I am “Under Investigation” now, I don’t know how long that offer will stand.

So give it a roll and let me know how it goes.
