The information-gathering steps of footprinting and scanning are the most important steps before an attack. Good information gathering can make the difference between a successful penetration test and one that fails to provide maximum benefit to the client.
Information is a weapon: both a successful penetration test and a real attack need a lot of relevant information, which is why information gathering, also called footprinting, is the first step of hacking. Gathering valid login names and email addresses is therefore one of the most important parts of a penetration test.
theHarvester was developed in Python by Christian Martorella. It is a tool that gathers e-mail accounts, user names and hostnames/subdomains from public sources such as search engines and PGP key servers.
The tool is designed to help the penetration tester in the early stages of a test; it is effective, simple and easy to use. The supported sources are:
Google – emails, subdomains/hostnames
Google profiles – Employee names
Bing search – emails, subdomains/hostnames, virtual hosts
PGP key servers – emails, subdomains/hostnames
LinkedIn – Employee names
Exalead – emails, subdomains/hostnames
New features:
Time delays between requests
XML results export
Search a domain in all sources
Virtual host verifier
Getting Started:
If you are using Kali Linux, open a terminal and run the command theharvester.
If it is not available in your distribution, you can download it from http://code.google.com/p/theharvester/downlaod and extract the archive.
Give theHarvester.py execute permission with [chmod 755 theHarvester.py].
Then change into the extracted directory and run ./theHarvester.py; it will display the version and the options that can be used with the tool, each with a detailed description.
#theHarvester -d [url] -l 300 -b [search engine name]
#theHarvester -d sixthstartech.com -l 300 -b google
-d [url] is the target domain from which you want to fetch the juicy information.
-l limits the search to the specified number of results.
-b specifies the search engine (data source) to query.
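The three options above can also be assembled programmatically, which helps when the same search has to be repeated for several domains. The following is only a minimal sketch: build_command is a hypothetical helper, it assumes the tool is reachable as theHarvester on your PATH, and it uses only the -d, -l and -b options described above.

```python
import shlex


def build_command(domain, limit=300, source="google", executable="theHarvester"):
    """Return the argument list for one theHarvester run.

    build_command is a hypothetical helper, not part of theHarvester.
    It only uses the -d, -l and -b options described in this article.
    """
    if not domain:
        raise ValueError("domain must not be empty")
    if limit <= 0:
        raise ValueError("limit must be a positive number")
    return [executable, "-d", domain, "-l", str(limit), "-b", source]


if __name__ == "__main__":
    cmd = build_command("sixthstartech.com", limit=300, source="google")
    # Print the command line instead of running it.
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd) if you want to launch the tool from a script; passing a list rather than a single string avoids shell quoting problems.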
From the email addresses returned above, we can identify the pattern the organization uses when assigning email addresses to its employees.
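Spotting the address pattern can be done by eye, but a small script makes it repeatable. The sketch below is an illustration only: the addresses are made-up example.com values, the function names are hypothetical, and the pattern labels are a coarse heuristic based on the shape of the part before the @ sign, not on real employee names.

```python
import re
from collections import Counter

# Coarse, structure-only patterns; order matters (most specific first).
PATTERNS = [
    ("initial.word", re.compile(r"[a-z]\.[a-z]{2,}")),
    ("word.word", re.compile(r"[a-z]{2,}\.[a-z]{2,}")),
    ("word_word", re.compile(r"[a-z]{2,}_[a-z]{2,}")),
    ("single word", re.compile(r"[a-z]+")),
]


def classify_local_part(local):
    """Return a label describing the shape of the text before '@'."""
    for label, pattern in PATTERNS:
        if pattern.fullmatch(local):
            return label
    return "other"


def dominant_pattern(emails, domain):
    """Return (label, count) for the most common shape, or None if no match."""
    suffix = "@" + domain.lower()
    counts = Counter(
        classify_local_part(e.lower()[: -len(suffix)])
        for e in emails
        if e.lower().endswith(suffix)
    )
    return counts.most_common(1)[0] if counts else None


if __name__ == "__main__":
    # Hypothetical sample data, not real harvested results.
    sample = [
        "john.smith@example.com",
        "mary.jones@example.com",
        "info@example.com",
        "j.doe@example.com",
        "bob@other.org",
    ]
    print(dominant_pattern(sample, "example.com"))
```

A dominant "word.word" shape suggests a first.last convention, which a tester can then use to guess addresses for known employee names.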
#theHarvester -d sixthstartech.com -l 300 -b all
This command gathers information from all the search engines supported by the installed version of theHarvester.
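When several sources are queried, the same address is likely to turn up more than once. A minimal sketch for tidying such results follows; it assumes you have the tool's output available as plain text (for example, copied from the terminal), it makes no assumption about theHarvester's own output format, and unique_emails is a hypothetical helper name.

```python
import re

# Simple, deliberately permissive e-mail matcher; good enough for triage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def unique_emails(text, domain):
    """Return the sorted, de-duplicated addresses in text that belong to domain."""
    suffix = "@" + domain.lower()
    found = {match.lower() for match in EMAIL_RE.findall(text)}
    return sorted(email for email in found if email.endswith(suffix))


if __name__ == "__main__":
    # Hypothetical captured output, not real results.
    captured = "Info@example.com, info@example.com; sales@example.com and x@other.org."
    for email in unique_emails(captured, "example.com"):
        print(email)
```

Addresses are lower-cased before comparison, so variants that differ only in capitalization count as one entry.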
Save the results to an HTML file. Command:
#theHarvester.py -d sixthstartech.com -l 300 -b all -f pentest
The -f parameter saves the results to an HTML file, as shown in this example.

What about a proxy? I need to launch theHarvester behind a network proxy! theHarvester exits immediately and does not return any results!
You can use Tor to bypass the proxy when scanning with theHarvester.
I don't find any hosts when trying with google, and I tried with all but still didn't get any emails for the same site you used.
I cannot find the file generated by #theHarvester, and the result does not show up in Iceweasel. Sorry if this may sound stupid, but I'm kinda new to Kali Linux, so I'm just experimenting.
Thank You
Sir, can you please demonstrate how to compromise a system outside the LAN?
It seems this program is blocked by Google. Has anyone managed to retrieve results through proxies for a large number of domains, with something like SOCKS? Thanks
It is impossible to save any file with -f.
I've been trying to use it, but it does not show any results. It starts the search, but the results are always zero. How can I configure it? Is a proxy or something like that necessary?