Recon Tool: Metagoofil

by | Feb 24, 2022 | Tools

Reading Time: 2 Minutes

Introduction

Every penetration test starts with reconnaissance and OSINT. These steps matter most because what you find determines how you proceed with attacks and exploitation. Sensitive documents are often publicly available and can contain information that affects your company. They frequently come from employees who shared files by mistake, or who left the company and left these traces behind for attackers to harvest during recon.

Metagoofil is an information-gathering tool built around the metadata in public documents hosted on websites. The original version used two libraries, Hachoir and PdfMiner, to extract that metadata and then generated a report of usernames, software versions, and server or machine names to help penetration testers in the information-gathering phase. It could also extract MAC addresses from Microsoft Office documents, which hints at the hardware of the systems that produced them. The opsdisk fork covered here handles only the search and download steps and leaves the metadata extraction to other tools, as described below.

Metagoofil searches Google for specific types of files publicly hosted on a website and optionally downloads them to your local machine. This is useful for Open Source Intelligence gathering, penetration tests, or determining what files your organization is leaking to search indexers like Google. For example, it uses the Google query below to find all the .pdf files hosted on example.com.

site:example.com filetype:pdf
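The query pattern above extends naturally to several file types. The following is a minimal sketch of how such queries could be assembled; the function name `build_queries` is hypothetical and this is not Metagoofil's internal code, only an illustration of the `site:` and `filetype:` operators shown above.

```python
def build_queries(domain, file_types):
    """Return one Google query per file type, e.g. 'site:example.com filetype:pdf'."""
    queries = []
    for file_type in file_types:
        # Strip whitespace and a leading dot so '.pdf' and ' pdf ' both work.
        cleaned = file_type.strip().lstrip(".").lower()
        if cleaned:
            queries.append(f"site:{domain} filetype:{cleaned}")
    return queries


if __name__ == "__main__":
    # Same comma-separated style as the -t option used later in this post.
    for query in build_queries("example.com", "pdf,doc,xls".split(",")):
        print(query)
```

Each file type gets its own query, which is why a long list of types produces many lookups against Google.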

 

The opsdisk version is a maintained fork of the original (https://github.com/laramies/metagoofil) and is currently installed by default on Kali Linux (https://gitlab.com/kalilinux/packages/metagoofil). Unlike the original, the fork deliberately does not do metadata analysis and instead defers to other tools like exiftool.

exiftool -r *.doc | grep -Ei "Author|Creator|Email|Producer|Template" | sort -u
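If you would rather post-process the results in a script, the same filtering can be sketched in Python. This is a rough equivalent of the pipeline above, not part of Metagoofil: the helper names `filter_metadata` and `run_exiftool` are hypothetical, the sketch assumes the exiftool binary is on your PATH, and it scans a whole directory rather than only `*.doc` files.

```python
import re
import subprocess

# Same field names as the pattern in the shell pipeline, matched case-insensitively.
FIELD_PATTERN = re.compile(r"Author|Creator|Email|Producer|Template", re.IGNORECASE)


def filter_metadata(exiftool_output):
    """Return the sorted, de-duplicated lines that mention an interesting field."""
    matches = {
        line.strip()
        for line in exiftool_output.splitlines()
        if FIELD_PATTERN.search(line)
    }
    return sorted(matches)


def run_exiftool(directory):
    """Run 'exiftool -r' on a directory and return its text output.

    Assumes exiftool is installed; a non-zero exit status is not treated as fatal.
    """
    result = subprocess.run(
        ["exiftool", "-r", directory],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

A typical call would be `filter_metadata(run_exiftool("."))` from the directory holding the downloaded files.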

 

Installation

Clone the Git repository and install the requirements:

git clone https://github.com/opsdisk/metagoofil

cd metagoofil

virtualenv -p python3 .venv # If using a virtual environment.

source .venv/bin/activate # If using a virtual environment.

pip install -r requirements.txt

 

Docker Installation & Usage

 

git clone https://github.com/opsdisk/metagoofil

cd metagoofil

docker build -t metagoofil .

# This will save the files in your current directory.

docker run -v $PWD:/data metagoofil -d kali.org -t pdf

 

Google is blocking me!

If you start getting HTTP 429 errors, Google has rightfully detected you as a bot and will block your IP for a set period of time. One solution is to use proxychains and a bank of proxies to round robin the lookups.

Install proxychains4

apt install proxychains4 -y

 

Edit the /etc/proxychains4.conf configuration file to round robin the lookups through different proxy servers. In the example below, two dynamic SOCKS proxies have been set up on different local listening ports (9050 and 9051). If you don’t know how to use SSH and dynamic SOCKS proxies, you can pick up a copy of Cyber Plumber’s Handbook and interactive lab to learn all about Secure Shell (SSH) tunneling, port redirection, and bending traffic like a boss.

vim /etc/proxychains4.conf

round_robin

chain_len = 1

proxy_dns

remote_dns_subnet 224

tcp_read_time_out 15000

tcp_connect_time_out 8000

[ProxyList]

socks4 127.0.0.1 9050

socks4 127.0.0.1 9051
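Before launching a scan it can help to confirm that both local proxy ports are actually listening. The sketch below is an optional sanity check and not part of proxychains or Metagoofil; the function name `proxy_is_listening` is hypothetical, and it only verifies that a TCP connection succeeds, not that the service behind the port speaks SOCKS.

```python
import socket


def proxy_is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the [ProxyList] example above.
    for port in (9050, 9051):
        state = "up" if proxy_is_listening("127.0.0.1", port) else "DOWN"
        print(f"127.0.0.1:{port} is {state}")
```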

 

Throw proxychains4 in front of the Python script and each lookup will go through a different proxy (and thus source from a different IP). You could even tune down the -e delay time because you will be leveraging different proxy boxes.

proxychains4 python metagoofil.py -d github.com -f -t pdf,doc,xls
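To picture what the round_robin setting with a chain length of 1 means for the lookups, here is a small conceptual model. It is only an illustration of the rotation idea, not how proxychains is implemented, and the names `PROXIES` and `assign_proxies` are hypothetical.

```python
from itertools import cycle

# Proxy entries mirroring the [ProxyList] example; adjust to your own setup.
PROXIES = ["socks4 127.0.0.1 9050", "socks4 127.0.0.1 9051"]


def assign_proxies(lookups, proxies):
    """Pair each lookup with the next proxy in rotation (round robin, chain length 1)."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    rotation = cycle(proxies)
    return [(lookup, next(rotation)) for lookup in lookups]


if __name__ == "__main__":
    queries = [f"site:example.com filetype:{t}" for t in ("pdf", "doc", "xls")]
    for query, proxy in assign_proxies(queries, PROXIES):
        print(f"{query} -> {proxy}")
```

With two proxies, consecutive lookups alternate between them, so each source IP sends only half of the requests.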

 
