A vast amount of data on the Internet is open source, meaning it is available for public access. Anything from public databases to mass media to images and videos can be considered open source. However, this data is far more diverse and scattered than a Google search suggests. Many databases, files, and web pages go under the radar because search engines cannot index them. Given the sheer volume of data available, it is only logical to use it for analysis. This is where open source intelligence, often abbreviated as OSINT, comes into the picture. Open source intelligence refers to the process of legally collecting raw data from numerous resources on the Internet and then analyzing it to support decision-making, forecasting, and understanding public perception.
The volume of data available on the Internet is far too large to scour in full. Even if you narrow the scope to a single social media application, collecting data manually is hard and time-consuming, to say the least. Once the data is collected, analyzing it is another ball game altogether. Analysts therefore need open source intelligence tools and techniques that make this job easier. These tools dig deeper into the Internet than a simple query on a search engine. They collect data from numerous resources in a matter of minutes, which makes analyzing scattered open source data far more convenient.
Let’s look at some of the top open source intelligence tools that have managed to make a splash recently.
1. Shodan
Shodan is a search engine for Internet-connected devices. Regular search engines index web pages. Shodan instead indexes the services that devices expose to the Internet, so it can catalog virtually anything that is online. With the help of Shodan, you can find information about webcams, smart TVs, smartphones, and medical devices, among others. Basically, everything that is or can be connected to the Internet can serve as a source of information, and Shodan helps users collect that information quickly and efficiently.
Shodan provides information that is useful for security professionals, including details about networks and the assets on them. When a service runs on an open port, it typically announces itself with a banner. Shodan collects these banners, which reveal important information about the service and the device running it. This helps analysts fingerprint a particular entity on the network. Shodan can collect FTP, Telnet, SSH, and HTTP server banners, among others. Results can be filtered by parameters such as country, network, OS, and port.
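To make the idea of a banner concrete, here is a minimal sketch in Python. It is not Shodan's implementation or API. The function names `grab_banner` and `parse_ssh_banner` are made up for this illustration, and the parser assumes the standard SSH identification string of the form `SSH-<protocol>-<software> <comment>`.

```python
import socket


def grab_banner(host: str, port: int, timeout: float = 3.0) -> str:
    """Connect to host:port and return whatever the service sends first.

    Many services (SSH, FTP, SMTP) greet the client right after the TCP
    handshake. Services that wait for a request (HTTP) return nothing here.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.settimeout(timeout)
        try:
            data = conn.recv(1024)
        except socket.timeout:
            data = b""
    return data.decode("utf-8", errors="replace").strip()


def parse_ssh_banner(banner: str) -> dict:
    """Split an SSH identification string into its parts."""
    first_line = banner.splitlines()[0] if banner else ""
    if not first_line.startswith("SSH-"):
        raise ValueError("not an SSH identification string")
    # The identification part ends at the first space; the rest is a comment.
    ident, _, comment = first_line.partition(" ")
    _, protocol, software = ident.split("-", 2)
    return {"protocol": protocol, "software": software, "comment": comment}


if __name__ == "__main__":
    sample = "SSH-2.0-OpenSSH_8.9p1 Ubuntu-3ubuntu0.1"
    print(parse_ssh_banner(sample))
```

Calling `grab_banner("192.0.2.10", 22)` against a host you are authorized to test would return a string like the sample above (the address is a documentation placeholder). A single line of banner text is often enough to identify the software and operating system behind a port.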
2. TheHarvester
Built into Kali Linux, TheHarvester is an open source intelligence tool that collects information about specific targets. It mostly deals with emails and domain information, and gathering information with it is quick and simple. The tool helps security professionals in the early stages of penetration testing. TheHarvester is written in Python and collects valuable information such as employee names, banners, open ports, subdomains, and virtual hosts from search engines like Bing and Yahoo as well as from PGP key servers. It also collects data from social networks like LinkedIn. It's an ideal choice for organizations looking to perform penetration testing on their own network.
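The core idea of harvesting emails and subdomains for a target domain can be sketched in a few lines of Python. This is only an illustration of the technique and not TheHarvester's actual code. The function name `harvest` and the sample text are assumptions made for the example. A real tool would feed in pages fetched from search engines rather than a hard-coded string.

```python
import re


def harvest(text: str, domain: str) -> dict:
    """Extract email addresses and subdomains of `domain` from raw text."""
    escaped = re.escape(domain)
    # Stop the match at the real end of the domain (not "example.community"
    # and not "example.com.au").
    end = r"(?![\w-])(?!\.\w)"
    email_re = re.compile(
        r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*" + escaped + end,
        re.IGNORECASE,
    )
    # The lookbehind keeps us from matching the host part of an email
    # address or the tail of a longer label.
    host_re = re.compile(
        r"(?<![A-Za-z0-9@._-])(?:[A-Za-z0-9-]+\.)+" + escaped + end,
        re.IGNORECASE,
    )
    emails = sorted({m.lower() for m in email_re.findall(text)})
    hosts = sorted({m.lower() for m in host_re.findall(text)})
    return {"emails": emails, "subdomains": hosts}


if __name__ == "__main__":
    page = (
        "Contact alice@example.com or Bob@Example.com. "
        "Staging lives at dev.example.com and https://vpn.corp.example.com/login; "
        "ignore carol@other.org and example.community."
    )
    print(harvest(page, "example.com"))
```

Deduplicating and lowercasing the matches matters in practice, because the same address tends to appear many times across search results with different capitalization.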
3. Google Dorks
Google is the most popular search engine of all. Yet even though it returns a huge quantity of data, the results are often not specific enough to be helpful from an analytics point of view. Google dorking, a technique that has been around since 2002, lets you make far more targeted searches. Search engines index a lot of information about the entities connected to the Internet, and that information comes in handy for analytics and insights. Dorking relies on a number of search operators:
filetype: Restricts results to a specific file type, for example filetype:pdf.
ext: Restricts results to a specific file extension, for example ext:log.
intext: Finds pages that contain certain text in their body.
intitle: Retrieves web pages that have certain text in their title.
inurl: Retrieves web pages that have certain text in their URLs.
Search engines also index log files, and these can be found with Google dorks, which makes dorking useful for uncovering vulnerabilities and exposed information.
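The operators above can be combined into a single query. The following Python sketch shows one way to assemble such a query string. The helper names `build_dork` and `search_url` are hypothetical, and the code only builds the query and its URL without sending any request.

```python
from urllib.parse import urlencode

# Operators covered in the list above, in the order they will be emitted.
ALLOWED_OPERATORS = ("filetype", "ext", "intext", "intitle", "inurl")


def build_dork(terms: str = "", **operators: str) -> str:
    """Combine free-text terms and search operators into one query string."""
    unknown = set(operators) - set(ALLOWED_OPERATORS)
    if unknown:
        raise ValueError(f"unsupported operators: {sorted(unknown)}")
    parts = [terms] if terms else []
    for name in ALLOWED_OPERATORS:
        value = operators.get(name)
        if not value:
            continue
        # Multi-word values must be quoted so they stay attached to the operator.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)


def search_url(query: str) -> str:
    """Return a Google search URL for the given query."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_dork(intitle="index of", filetype="log")
    print(query)
    print(search_url(query))
```

Building the query programmatically keeps quoting consistent, which matters because an unquoted multi-word value would apply the operator to the first word only.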
4. Maltego
Written in Java, Maltego is also part of the Kali Linux bundle. It is efficient at tracking down the footprint of a target on the Internet. Data is collected from various sources and displayed graphically. Law enforcement, forensics, and security professionals use Maltego for its quick and efficient data collection and visualization. It is available in a community version and a commercial version. The community version can't be used commercially and returns only a limited number of entities. Maltego helps find connections between entities on the Internet, and its graphical layout makes it easy to see the relationship between two entities that may or may not be directly linked to each other.
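The idea of finding a connection between two entities that are not directly linked can be illustrated with a small graph search. The sketch below is not Maltego's API or algorithm. It is a plain breadth-first search over made-up entities (all names and addresses are placeholders), and the function name `find_connection` is an assumption for this example.

```python
from collections import deque


def find_connection(links, start, goal):
    """Return the shortest chain of entities linking start to goal, or None."""
    # Build an undirected graph: each link connects two entities.
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    if start not in graph or goal not in graph:
        return None

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


if __name__ == "__main__":
    links = [
        ("alice@example.com", "example.com"),
        ("example.com", "203.0.113.10"),
        ("203.0.113.10", "shop.example.net"),
        ("bob@example.org", "example.org"),
    ]
    print(find_connection(links, "alice@example.com", "shop.example.net"))
    print(find_connection(links, "alice@example.com", "bob@example.org"))
```

In this toy data set, an email address and an unrelated-looking shop turn out to be linked through a shared IP address, which is the kind of indirect relationship a graph view makes visible.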
5. Recon-ng
Recon-ng is another tool that comes with the Kali Linux bundle. It performs swift reconnaissance on remote targets. Written in Python, it has a simple command-line interface for fetching information about obscure targets. Recon-ng contains several modules, such as google_site_web and bing_domain_web, that can be used to find hosts in a domain that the respective search engines have indexed. Another module, bing_linkedin_cache, gathers contact details tied to a target from LinkedIn profiles cached by Bing, which can be used in social engineering.
6. TinEye
TinEye is a reverse image search tool that helps you check whether an image is available online and where. TinEye uses neural networks, machine learning, and pattern/watermark recognition to look for similar images on the web. Instead of keywords, the search uses the picture itself and the parameters related to it. TinEye is quite effective, as it finds matches even for images that have been heavily altered. You can search by uploading an image or by providing an image URL. An API and browser extensions are available, so you can look up an image directly without returning to the web application each time. The search can be narrowed down using the filters TinEye provides.
7. CheckUsernames and KnowEm
Social media is home to an enormous amount of open source data, so looking for a username across all the major social networks is like looking for a needle in a haystack. With the help of CheckUsernames, users can search for a username on many social networks at the same time. CheckUsernames covers over 150 social networks. KnowEm, a much broader version of the same service, covers over 500 websites.
Open source intelligence: New tools for a new world
All these open source intelligence tools are part of a new trend that seems to have a promising future. With data snowballing every day, we have all the data we need to perform analysis and forecasts. What we still need is the right framework and tools to curate this data in a manageable way, so that we can get the most out of it.