Using Science to Combat Data Loss: Analyzing Breaches by Type and Industry

Introducing a taxonomy for classifying data loss incidents from public information, Interhack examined publicized data breaches by type and industry and found significant results for Finance, Education, Public Administration, and Health Care. A firm understanding of the rates at which types of breaches occur relative to one another helps security managers allocate limited budgets, directing capital to where it will have the greatest impact.

C. Matthew Curtin and Lee T. Ayres
Interhack Corporation

Where should defenses be deployed? Security managers can answer that question by knowing what types of breaches occur and the rates at which they occur relative to one another. A number of methods for determining such rates have been proposed to help with this decision making. Unfortunately, such methods sometimes tend toward anecdote, may be part of a marketing campaign, or lack the context needed to drive informed decisions.

We propose a taxonomy to classify incidents in which control over sensitive information is lost. The taxonomy is hierarchical, allowing incidents to be classified to a level of precision appropriate to the amount of information available. Analysis using the taxonomy can likewise be carried out at whatever level of precision suits the question at hand and the data available. We then explore the proportion of breach types in a subset of data losses accumulated by the Identity Theft Resource Center (ITRC). Using the 2002 North American Industry Classification System (NAICS), we classify breach events according to the industry sector in which they occurred.
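
To make the idea of a hierarchical classification concrete, here is a minimal sketch in Python. It assumes each incident is labeled with a path from the most general category down to the most specific one the public report supports. The category names, the `incidents` list, and the `truncate` and `tally` helpers are all hypothetical illustrations and are not taken from the paper's actual taxonomy.

```python
from collections import Counter

# Hypothetical incident labels. Each label is a path through the hierarchy,
# as deep as the available public information allows. These category names
# are illustrative assumptions, not the paper's taxonomy.
incidents = [
    ("lost_or_stolen_hardware", "laptop"),
    ("lost_or_stolen_hardware",),          # report gave no further detail
    ("compromised_host", "web_server"),
    ("insider_conduct",),
    ("lost_or_stolen_hardware", "backup_tape"),
]

def truncate(path, depth):
    """Cut a classification path to at most `depth` levels."""
    return path[:depth]

def tally(paths, depth):
    """Count incidents per category at the chosen depth of the hierarchy."""
    return Counter(truncate(p, depth) for p in paths)

# Coarse analysis: every incident is comparable at the top level.
top_level = tally(incidents, 1)

# Finer analysis: incidents lacking detail stay at their shallower label.
second_level = tally(incidents, 2)
```

Storing the full path and truncating at analysis time means a sparsely reported incident can still be counted in coarse comparisons, while richer reports remain available for finer ones.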


We conclude that the taxonomy is useful and that analysis of incidents by type and industry yields results that can be instructive to practitioners who need to understand how and where breaches are actually occurring. For example, the Health Care and Social Assistance sector reported a larger-than-average proportion of lost and stolen computing hardware, but an unusually low proportion of compromised hosts. Educational Services reported a disproportionately large number of compromised hosts, while insider conduct and lost and stolen hardware were well below the proportion common to the set as a whole. Public Administration's proportion of compromised host reports was below average, but its share of processing errors was well above the norm. The Finance and Insurance sector experienced the smallest overall proportion of processing errors, but the highest proportion of insider misconduct. Other sectors showed no statistically significant difference from the average, either because there is truly no variance or because there were too few samples for the statistical tests being used.
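
This summary does not name the statistical tests used, so the following is only an illustrative sketch of one common way to ask whether a sector's share of a breach type differs from that of all other sectors: a Pearson chi-square test on a 2x2 table, without continuity correction. The function name `chi_square_2x2` and all counts are hypothetical and do not come from the ITRC data.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table

                       this type   other types
        this sector        a            b
        other sectors      c            d

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # With 1 degree of freedom the chi-square tail probability equals
    # erfc(sqrt(x / 2)), so no statistics library is needed.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts, for illustration only: 40 of 100 incidents in one
# sector are of a given type, versus 60 of 300 in all other sectors.
statistic, p_value = chi_square_2x2(40, 60, 60, 240)
significant = p_value < 0.05
```

When a sector has only a handful of reported incidents, the expected cell counts become small and a test like this loses power, which is one way a real difference can fail to reach significance.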