What Your OSINT Says About You

At the onset of any engagement, Triaxiom Security engineers begin with research, often called Open Source Intelligence Gathering, or OSINT for short. Engineers comb the internet, both manually and with automated tools and scripts, to better understand the client, their underlying technologies, and where weaknesses may lie. In general, any data obtained during this phase may prove valuable, even if it is not immediately useful.

The good news and the bad news are the same: anyone can conduct their own OSINT campaign, with no significant technical barrier to entry. This means malicious users can find this data, but so can stakeholders and employees within an organization. Those responsible for security at their organization can easily discover just how large their online footprint is with some basic knowledge of where to look and what to look for. Outlined below are a few steps that anyone with an internet connection can complete. Tools exist for conducting research campaigns, but they are touched on only briefly, as the goal of this blog is to make OSINT accessible to anyone in an organization.

Passive Reconnaissance

Most OSINT activities begin with Passive Reconnaissance, that is, gathering whatever information can be discovered without actually interacting with the target organization’s infrastructure. This may include a review of the following:

  • Social Media Platforms: LinkedIn, Facebook, Instagram, etc. These platforms can provide insight into an organization. For example, LinkedIn is useful for gathering employee names (from which usernames for password attacks can be derived) and for discovering technologies in use based on job postings. Facebook and Instagram can reveal the key terms and language patterns an organization uses, which is helpful for social engineering. Further, employees may take photos inside their office, allowing an attacker to map out the office layout and, if a badge is visible, see exactly what an employee badge looks like so it can be cloned, which may aid in a physical penetration test.
  • Certificate Transparency Logs: A quick way to get a handle on the size of an organization’s online footprint is to review certificate transparency logs. These logs are meant to provide a way to track and audit the certificates issued for websites. Since they are public, they can be reviewed by anyone. Because sites are generally named after their function, the logs can quickly yield a vast amount of information about an organization, all without ever interacting with any of its actual infrastructure.
  • Search Engine Reviews: The ubiquitous use of search engines and search engine optimization techniques can make reviewing an organization’s footprint both trivial and fruitful. It is not uncommon to find confidential data with a simple search. If that does not provide data, a more targeted search can be built with search operators (sometimes referred to as “Google Dorks”). This query syntax can be combined with search terms to create a specific query targeting everything from documentation to actual vulnerabilities.
  • Paste Sites/Cached Documents/Internet Archives: There is a reason “the Internet never forgets” is such a cliché: it’s true. Between paste sites that allow anonymous posting of nearly anything and archived copies of web pages, data effectively becomes permanent once it becomes public. This data may provide the keys to the kingdom or be a waste of time, but it is always worth reviewing.
  • Licensing and Regulation Requirements: It is not uncommon for certain types of businesses, and even individual employees, to be required to hold special licenses. Often, this information is public and easily found if one knows where to look. The information here can vary, but it likely includes contact information and, at a minimum, provides a pretext for building a sophisticated social engineering campaign.
  • Publicly Posted Code: If a developer has posted an organization’s code base on sites like GitHub, it may contain sensitive information such as API keys, usernames, or passwords. Even if no sensitive data is found, the code may help inform active testing by revealing how an application functions.
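
To illustrate the certificate transparency item above, here is a minimal Python sketch of how hostnames could be pulled out of a saved log export to gauge footprint size. It assumes the export is a list of records, each with a `name_value` field holding one or more newline-separated names; that field name, the function name, and the sample data are assumptions made for illustration, and a real export may be shaped differently.

```python
def hostnames_from_ct_records(records, domain):
    """Return the sorted, unique hostnames at or under `domain`.

    Assumption: each record is a dict whose "name_value" string may hold
    several newline-separated names. Adjust this to match the real export.
    """
    domain = domain.lower()
    suffix = "." + domain
    found = set()
    for record in records:
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                # A wildcard entry still reveals the name it sits under.
                name = name[2:]
            if name == domain or name.endswith(suffix):
                found.add(name)
    return sorted(found)


# Made-up sample records standing in for a saved export.
SAMPLE_RECORDS = [
    {"name_value": "www.example.com\nexample.com"},
    {"name_value": "*.dev.example.com"},
    {"name_value": "vpn.example.com"},
    {"name_value": "VPN.example.com"},        # duplicate, different case
    {"name_value": "unrelated.example.org"},  # other domain, ignored
]

hosts = hostnames_from_ct_records(SAMPLE_RECORDS, "example.com")
for host in hosts:
    print(host)
```

Even this small sample shows why the logs are informative: names such as `vpn` and `dev` hint at what a host is for before anyone has touched it.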

Active Reconnaissance

Active reconnaissance is the next step in OSINT review. This will include interacting with an organization’s infrastructure. This phase can be extremely fruitful and provide significant assistance in furthering an attack chain. It should be noted that anyone planning to interact directly with an organization’s online presence should have a signed document granting them permission. Triaxiom Security will never perform active reconnaissance on a client without explicit permission to do so.

  • Websites Owned by the Organization: Reviewing the sites discovered during passive reconnaissance is a common first step. The technology in use on the sites, the page source, and the products/services offered all feed into a comprehensive plan of attack, and all can be ascertained simply by looking at a site in a web browser.
  • Subdomain Discovery: It may be worthwhile to brute force subdomains to uncover even more attack surface. Tooling is beyond the scope of this blog, but many tools exist that make brute forcing subdomains an efficient way to map an organization’s infrastructure.
  • Crawling Websites: Either manually or with tools, it is possible to gather keywords, usernames, phone numbers, and anything else that could be useful for different types of follow-on attacks. Keywords could be gathered to make a targeted password spraying attempt more likely to succeed. Names of leadership or just employees in general could be used to make a social engineering attempt more realistic.
  • Confidential Information Leaks: As noted in the passive recon portion above, it is not uncommon to find confidential or private information on an organization’s website. This may include customer data, financial information, or information that could damage an organization’s ability to maintain a competitive advantage.
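
As a rough illustration of the crawling item above, the Python sketch below pulls email addresses and phone numbers out of text that has already been saved from an organization’s own pages, which is one way to see what contact details a site exposes. The patterns are deliberately loose, the function name is hypothetical, and the sample text is made up; a real review would need patterns tuned to the formats the organization actually uses.

```python
import re

# Loose patterns for illustration only; neither is a full validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Matches numbers written like 555-010-0199 or 555.010.0142.
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def exposed_contacts(page_text):
    """Return the unique emails (lowercased) and phone numbers in the text."""
    emails = sorted({match.lower() for match in EMAIL_RE.findall(page_text)})
    phones = sorted(set(PHONE_RE.findall(page_text)))
    return {"emails": emails, "phones": phones}


# Made-up page text standing in for a saved copy of a public web page.
SAMPLE_PAGE = """
Contact Jane Doe (CFO) at Jane.Doe@example.com or 555-010-0199.
Press inquiries: press@example.com, 555.010.0142
"""

result = exposed_contacts(SAMPLE_PAGE)
print(result["emails"])
print(result["phones"])
```

Running something like this against pages the organization owns gives the security team a concrete list of names and contact points that an outsider could collect just as easily.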

Reducing Your OSINT Footprint

With so many places to find data, it may seem like a daunting task not only to review it all, but also to somehow combat the glut of information available. The following is not a comprehensive list of every step for limiting the size of an organization’s online presence, but it serves as a jumping-off point that can be tailored to specific needs:

  • The first step is knowing what exists in the public information sphere and, to the best of one’s ability, where it exists. For data found during passive reconnaissance, there is not a lot one can do about information held by a third party. Requesting that it be removed may or may not be successful. But knowing what data might be used against an organization gives the security team advance notice of what to look for.
  • Providing user training based on the information that can be publicly accessed is imperative. For example, if staff members hold publicly accessible licensing, providing training regarding how those licensing boards will contact employees and what information they may request is important to help mitigate the risk of targeted social engineering campaigns.
  • Review, and where appropriate remove or obfuscate, the data that the organization controls on its own web properties. This includes things like:
    • Ensuring company social media doesn’t disclose information it shouldn’t.
    • Comparing application logs to the data found publicly available, which may provide insight into how attackers use that information. For example, if a company posts an opening for a Salesforce administrator and then suddenly sees an uptick in attacks on its public-facing Salesforce infrastructure, it may indicate that an attacker saw the posting and concluded the company was using this service.
    • Verifying through crafted searches that confidential data is not publicly accessible.
  • Make sure employees receive training on what is acceptable to post on third party platforms and what is not.
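
The crafted searches mentioned above can be sketched in a few lines of Python. This example only builds query strings from the widely used `site:`, `filetype:`, and `intitle:` search operators so that someone can paste them into a search engine and check their own domain. The function name, the default file types, and the default phrases are assumptions chosen for illustration, and results will vary by search engine.

```python
def footprint_queries(domain,
                      filetypes=("pdf", "xlsx", "docx"),
                      phrases=("confidential", "internal use only")):
    """Build search queries for reviewing what is indexed under `domain`."""
    queries = []
    # Documents of common office types indexed under the domain.
    for ext in filetypes:
        queries.append(f"site:{domain} filetype:{ext}")
    # Pages containing phrases that usually mark non-public material.
    for phrase in phrases:
        queries.append(f'site:{domain} "{phrase}"')
    # Open directory listings.
    queries.append(f'site:{domain} intitle:"index of"')
    return queries


queries = footprint_queries("example.com")
for query in queries:
    print(query)
```

Keeping the queries in a script makes it easy to repeat the same review on a schedule and to add phrases that are specific to the organization, such as project code names.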

In summary, the Internet is teeming with data, and a lot of it is accessible through OSINT techniques to anyone who knows where to look. Organizations need to be mindful of the data they put out there and the data that is already publicly available. The best defense is knowing what’s out there and how it could be used against the organization. Once the breadth of the online footprint is known, the organization can remove particularly sensitive information and tailor staff training to help employees recognize suspicious activity.