Dark Data: Business Asset or Business Liability?
George Firican, Director of Data Governance & BI at the University of British Columbia, argues that dark data increases costs and the risk of reputational damage if it is not managed properly.
In the era of cheap storage, many businesses hold unused information, produced as part of normal business operations, whose value is unknown. This is dark data.
In 2020, the worldwide cost of storing dark data may reach US$3.3 trillion, according to data protection firm Veritas Technologies.
In this episode of Data Conversations Over Coffee, University of British Columbia Director of Data Governance & BI George Firican argues that businesses must develop a robust policy for managing their dark data.
“Even beyond the cost of storage, just managing that dark data eats a lot of valuable resources,” Firican says.
Dark data can be generated by everything from log files to old emails and even surveillance footage. In many cases, this data is forgotten about or is not classified as data because of low levels of data maturity across the business.
However, dark data is not simply a case of wasted storage; it can also present a business risk.
“Old emails present a risk especially if they contain the names of past employees,” Firican notes. “And everything is vulnerable to legal discovery if any potential litigation emerges.”
To manage the risks presented by dark data, and to begin extracting value from it, Firican suggests a four-pillar policy approach: be aware that dark data exists, classify it, assign metadata tags, and perform a cost-benefit analysis.
Firican goes on to explain that extracting value from dark data can be a real challenge, especially where the data is unstructured or semi-structured.
“We are also constricted by the technology that we have access to,” continues Firican.
“Even if certain employees do understand the potential value [of dark data] they may not have the tools or the skills to be able to mine that data,” he concludes.
- Discover your dark data. Make sure you have a process to identify and classify dark data.
- Delete the data you do not need. Make sure that dark data is in your regular disposition schedule.
- Look at your competitors. How are your peers making the most of their dark data?
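
As a rough illustration of the "discover" step, the following Python sketch walks a directory tree, treats files that have not been modified for a given period as candidate dark data, and groups them by a coarse class. It is a minimal sketch, not a method described by Firican: the function name `find_dark_data`, the extension-to-class mapping, and the use of modification time as a proxy for "unused" are all assumptions made for the example.

```python
import time
from collections import defaultdict
from pathlib import Path

# Hypothetical mapping from file extension to a coarse data class.
# A real classification scheme would come from the data governance policy.
CATEGORIES = {
    ".log": "log file",
    ".eml": "email",
    ".msg": "email",
    ".mp4": "surveillance footage",
    ".avi": "surveillance footage",
}


def find_dark_data(root, stale_days=365, now=None):
    """Summarise stale files under `root` as {class: {"files": n, "bytes": total}}.

    Assumption: a file untouched for `stale_days` counts as candidate dark data.
    """
    now = time.time() if now is None else now
    cutoff = now - stale_days * 86400  # seconds in a day
    summary = defaultdict(lambda: {"files": 0, "bytes": 0})
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        info = path.stat()
        if info.st_mtime >= cutoff:
            continue  # modified recently, so not treated as dark
        category = CATEGORIES.get(path.suffix.lower(), "unclassified")
        summary[category]["files"] += 1
        summary[category]["bytes"] += info.st_size
    return dict(summary)
```

Calling `find_dark_data` on a shared drive would return file counts and byte totals per class. Those totals could feed a cost-benefit analysis or a disposition schedule, while a large "unclassified" total suggests where classification work is still needed.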