How To Identify Sensitive Data Flows In The Enterprise

One of the most helpful things an organization can do for its security is to understand what needs to be protected. An asset inventory is a great starting point, as it should include all of your hardware and the software you’re running. But perhaps more importantly, you need to know where your sensitive data lives among all those assets. Sensitive data flows and resting points are critically important to secure, but without knowing where these locations are and how sensitive the data stored there is, you’re really just guessing. This information will help you make sound risk assessments and more accurately communicate risk, and the need for certain controls, to executive management. We’ve talked in the past about why identifying these flows is important, so today let’s dig into how you should go about it.

Identifying Sensitive Data Flows and Resting Spots

Everyone in tech/security should have a baseline understanding of data in transit and data at rest. Basically, we’re concerned about the data moving through your network from entry to exit (in transit) and all the locations where you store sensitive data (at rest). But how are you supposed to identify all those locations? Well, if this is the part where you think I’ve got some awesome tool to recommend that will do it all for you, sorry!

In our experience, there’s really no technical tool that provides much value in this process. There are certainly great Data Loss Prevention (DLP) tools out there that can do some of this, like identifying Social Security numbers at your network ingress/egress points or catching credit card numbers being sent through email. But DLP solutions can ultimately only catch what can be identified with a signature, so they are prone to false positives when it comes to credit card and Social Security numbers, and basically helpless when it comes to identifying confidential information or intellectual property. If your organization has a great data classification and labeling policy, these tools may be more useful, but enterprises that have implemented one are few and far between.
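To make the false positive problem concrete, here is a minimal sketch of signature-style matching for card numbers. It is not how any particular DLP product works; the pattern, the function names, and the sample strings are all assumptions made for illustration. It pairs a digit-run pattern with the Luhn checksum, which is the check digit scheme used by payment card numbers.

```python
import re

# Hypothetical signature: 13 to 16 digits, optionally separated by
# single spaces or hyphens. Real products use richer rules than this.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_candidates(text: str) -> list[tuple[str, bool]]:
    """Return (digits, passes_luhn) for every run that matches the pattern."""
    results = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        results.append((digits, luhn_valid(digits)))
    return results


# Illustrative input: a well-known test card number and a made-up tracking ID.
sample = "Card 4111 1111 1111 1111, tracking id 1234567890123456"
candidates = find_card_candidates(sample)
```

Both strings match the pattern, and only the checksum separates them. Even then, roughly one in ten arbitrary digit runs will pass a mod-10 check by chance, so an order number or tracking ID can still be flagged as a card. And nothing here helps at all with confidential documents or intellectual property, which have no signature to match.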

So without tools, what are we left with? A time-consuming, resource-intensive, manual process? Yep! But it’s not as bad as it sounds. To really dig into the guts of the sensitive data flows in your organization, you need to start with business processes. Data is sent and received to facilitate the business, so by having conversations and figuring out what each department in your organization does day to day, you can start to understand the data behind those operations. Let’s look at some high-level steps to guide you through this process:

  1. Set up meetings with all of the different departments/silos within the enterprise. These meetings should include the department head or someone who is knowledgeable about ALL the things that department does, and the employees who actually perform these functions. You need a variety of perspectives from each department so you can understand both the known, approved processes and the back-channel, “Bob is the only one who knows about this” processes. In some organizations, it may be helpful to meet with the department lead and the employees separately so employees feel they can speak more freely, or to have a third party facilitate who can put employees at ease.
  2. In these meetings, you should aim to understand and map a few things:
    1. What types of sensitive data does each department work with? You can rank/prioritize these as well.
    2. Where do they receive that data from? When is it first created or where does it enter the organization?
    3. What happens to it? Why is it necessary, how is it handled, what security controls are in place? You want to know everything about their handling of the data.
    4. Where does it go internally? Track the sensitive data and all potential/confirmed storage locations until it stops, using network diagrams and asset inventories where possible.
    5. When/where does it leave the organization? Who is it sent to and why?
    6. What third parties are involved in this process and what types of sensitive data do they create/send/receive?
  3. Copious notes and a solid understanding of the business processes, both obtained during these meetings, should help you create a few things:
    1. A list of all the sensitive data types you touch/store and the criticality the business places on each of them. This information should help inform the level of security applied and the risk assessments for the organization.
    2. A list of all sensitive data storage locations, as specific as you can be. These should be considered your critical assets.
    3. Data flow diagrams showing the systems/network devices sensitive data traverses on your network. These should also be considered critical assets.
    4. A list of third-parties/vendors that touch sensitive data for your organization and could pose a risk to that data.
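As a rough sketch of how the notes from these meetings could be captured so that all four deliverables come from the same records, the example below stores one record per data flow and derives each list from it. Every name, field, and sample value is hypothetical, and the criticality scale (1 = most critical) is an assumption; adapt it to whatever your organization already uses.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DataFlow:
    """One sensitive data flow as described by a department (hypothetical schema)."""
    department: str
    data_type: str
    criticality: int            # assumed scale: 1 = most critical
    source: str                 # where the data is created or enters
    hops: list[str]             # systems it traverses, in order
    storage_locations: list[str] = field(default_factory=list)
    third_parties: list[str] = field(default_factory=list)


def data_type_ranking(flows: list[DataFlow]) -> list[tuple[str, int]]:
    """Deliverable 1: data types with the highest criticality reported for each."""
    ranked: dict[str, int] = {}
    for f in flows:
        ranked[f.data_type] = min(f.criticality, ranked.get(f.data_type, f.criticality))
    return sorted(ranked.items(), key=lambda kv: (kv[1], kv[0]))


def storage_by_location(flows: list[DataFlow]) -> dict[str, set[str]]:
    """Deliverable 2: each storage location and the data types resting there."""
    out: dict[str, set[str]] = defaultdict(set)
    for f in flows:
        for loc in f.storage_locations:
            out[loc].add(f.data_type)
    return dict(out)


def flow_edges(flows: list[DataFlow]) -> set[tuple[str, str, str]]:
    """Deliverable 3: (from, to, data_type) edges for a data flow diagram."""
    edges = set()
    for f in flows:
        path = [f.source, *f.hops]
        for a, b in zip(path, path[1:]):
            edges.add((a, b, f.data_type))
    return edges


def third_party_exposure(flows: list[DataFlow]) -> dict[str, set[str]]:
    """Deliverable 4: each third party and the data types it touches."""
    out: dict[str, set[str]] = defaultdict(set)
    for f in flows:
        for vendor in f.third_parties:
            out[vendor].add(f.data_type)
    return dict(out)


# Made-up records, as they might be transcribed from meeting notes.
flows = [
    DataFlow("HR", "SSN", 1, "applicant portal", ["HRIS", "payroll provider"],
             storage_locations=["HRIS database"], third_parties=["payroll provider"]),
    DataFlow("Sales", "credit card number", 1, "web checkout", ["payment gateway"],
             third_parties=["payment gateway"]),
    DataFlow("Sales", "customer email", 3, "web checkout", ["CRM"],
             storage_locations=["CRM database"]),
]
```

Keeping a single record per flow means the lists cannot drift apart: when a department reports a new vendor or storage location, updating that one record updates every derived view. A spreadsheet with the same columns would serve just as well; the point is the structure, not the tooling.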

All of this documentation is incredibly useful when baselined and maintained. It can be used in a variety of security processes to better prioritize and apply new security controls, reduce organizational exposure by removing unnecessary sensitive data flows/storage, and improve third-party security verification by placing more stringent requirements on the vendors that deal with your most sensitive data. Overall, the process of mapping your sensitive data flows is difficult and time-consuming, but it can pay huge dividends when done right. Whether you are doing it yourself or engaging a third party to help, you’ll have a much better understanding of your security risks when you’re done.