Recon Tool: Metagoofil
Reading Time: 2 Minutes
Metagoofil
Introduction
Every penetration test starts with reconnaissance and OSINT. These first steps matter most, because they determine how you proceed with attacks and exploitation. Sensitive documents are often publicly available and can contain information that affects your company. They frequently come from employees who shared files by mistake, or who left the company and left these traces behind for attackers to harvest during recon.
Metagoofil is an information-gathering tool for finding public documents hosted on websites and the metadata they contain; the version covered here is maintained by opsdisk. The original tool uses two libraries, Hachoir and PdfMiner, to extract the metadata. It then generates a report listing usernames, software versions, and server or machine names, which helps penetration testers in the information-gathering phase. It can also extract MAC addresses from Microsoft Office documents and reveal details about the hardware of the systems used to create them.
It searches Google for specific types of files being publicly hosted on a web site and optionally downloads them to your local box. This is useful for Open Source Intelligence gathering, penetration tests, or determining what files your organization is leaking to search indexers like Google. As an example, it uses the Google query below to find all the .pdf files being hosted on example.com and optionally downloads a local copy.
site:example.com filetype:pdf
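As a rough illustration of how such queries are formed, the sketch below builds one site:/filetype: query per file type. The function name build_queries is hypothetical and not part of Metagoofil; it only mirrors the query format shown above.

```python
def build_queries(domain, file_types):
    """Return one Google query per file type, e.g. 'site:example.com filetype:pdf'."""
    queries = []
    for file_type in file_types:
        # Strip a leading dot so both 'pdf' and '.pdf' are accepted.
        queries.append(f"site:{domain} filetype:{file_type.lstrip('.')}")
    return queries


if __name__ == "__main__":
    for query in build_queries("example.com", ["pdf", "doc", "xls"]):
        print(query)
```

Each string is what you would type into the Google search box by hand; the tool automates the search and, optionally, the download.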
The opsdisk version is a maintained fork of the original (https://github.com/laramies/metagoofil) and is currently installed by default on Kali Linux (https://gitlab.com/kalilinux/packages/metagoofil). Unlike the original, it deliberately does not perform metadata analysis and instead defers to other tools such as exiftool.
exiftool -r *.doc | egrep -i "Author|Creator|Email|Producer|Template" | sort -u
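If you would rather post-process saved exiftool output in Python, the same filtering can be sketched as below. This is a minimal approximation of the egrep and sort -u steps. The function name filter_metadata is hypothetical, and Python's sort order can differ from your shell's locale-aware sort.

```python
import re

# Same keywords as the egrep pattern above, matched case-insensitively.
PATTERN = re.compile(r"Author|Creator|Email|Producer|Template", re.IGNORECASE)


def filter_metadata(exiftool_output):
    """Keep lines matching the keywords, drop duplicates, and sort them."""
    matches = {line for line in exiftool_output.splitlines() if PATTERN.search(line)}
    return sorted(matches)


if __name__ == "__main__":
    # Made-up sample text for illustration only, not real exiftool output.
    sample = "Author : Jane Doe\nTitle : Q3 Report\nAuthor : Jane Doe\nCreator : Word\n"
    for line in filter_metadata(sample):
        print(line)
```

Like the egrep command, this matches anywhere on the line, so a keyword in a field value is kept as well as one in a field name.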
Installation
Clone the git repository and install the requirements
git clone https://github.com/opsdisk/metagoofil
cd metagoofil
virtualenv -p python3 .venv # If using a virtual environment.
source .venv/bin/activate # If using a virtual environment.
pip install -r requirements.txt
Docker Installation & Usage
git clone https://github.com/opsdisk/metagoofil
cd metagoofil
docker build -t metagoofil .
# This will save the files in your current directory.
docker run -v $PWD:/data metagoofil -d kali.org -t pdf
Google is blocking me!
If you start getting HTTP 429 errors, Google has rightfully detected you as a bot and will block your IP for a set period of time. One solution is to use proxychains and a bank of proxies to round robin the lookups.
Install proxychains4
apt install proxychains4 -y
Edit the /etc/proxychains4.conf configuration file to round robin the lookups through different proxy servers. In the example below, two dynamic SOCKS proxies have been set up on different local listening ports (9050 and 9051). If you don't know how to use SSH and dynamic SOCKS proxies, you can pick up a copy of Cyber Plumber's Handbook and interactive lab to learn all about Secure Shell (SSH) tunneling, port redirection, and bending traffic like a boss.
vim /etc/proxychains4.conf
round_robin
chain_len = 1
proxy_dns
remote_dns_subnet 224
tcp_read_time_out 15000
tcp_connect_time_out 8000
[ProxyList]
socks4 127.0.0.1 9050
socks4 127.0.0.1 9051
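To make the effect of round_robin with chain_len = 1 concrete, the sketch below pairs consecutive lookups with alternating proxies. It only models the selection order and does not reproduce how proxychains works internally. The names PROXIES and assign_proxies are hypothetical.

```python
from itertools import cycle

# The two local SOCKS listeners from the [ProxyList] section above.
PROXIES = [("socks4", "127.0.0.1", 9050), ("socks4", "127.0.0.1", 9051)]


def assign_proxies(lookups, proxies):
    """Pair each lookup with the next proxy in round-robin order."""
    rotation = cycle(proxies)
    return [(lookup, next(rotation)) for lookup in lookups]


if __name__ == "__main__":
    queries = ["filetype:pdf", "filetype:doc", "filetype:xls"]
    for query, (scheme, host, port) in assign_proxies(queries, PROXIES):
        print(f"{query} -> {scheme}://{host}:{port}")
```

With two proxies, every other lookup leaves from the same source IP, so adding more proxies to the list spreads the requests further.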
Throw proxychains4 in front of the Python script and each lookup will go through a different proxy (and thus source from a different IP). You could even tune down the -e delay time because you will be leveraging different proxy boxes.
proxychains4 python metagoofil.py -d https://github.com -f -t pdf,doc,xls