Barnaby Davies


September 2018

Data Retention

Data Retention: Work out which files and documents have value and which do not

There was a man from Skegness,
Whose data retention was in a mess;
A report from the regulator,
And too late for the incinerator,
He wished he'd kept rather less.

People almost universally go looking for what they can (or should) delete when they undertake a data housekeeping exercise or some other retention management activity. I'd like to promote the opposite approach: identify the data that you can (and want to) keep, and delete the rest. Most folks are going to be a bit uncomfortable about that.

Those feelings of discomfort usually arise when an approach to deleting files and documents is not fully informed by an understanding of the business value, the business owners or the content. And I get that.
Let us imagine you have 25,000 boxes of archived paper records sitting with an external provider (as many businesses do).

Unfortunately, the contents, the business area and the date the content was archived are only known for 20,000 of the boxes. And, for our purposes here, let's say the other 5,000 boxes have no information whatsoever. You could go and take the lids off and have a look, but really? Like that's ever going to happen.

You could inventory those 5,000 boxes. You could scan them, OCR them and do all sorts of things with them, but actually, you're not going to: they don't belong to anyone, no one knows what's in them, you can't make a business case for spending the money and you are never going to get anyone to take ownership of them. So they are going to sit there. And none of those boxes will ever be retrieved from archive (because no one knows what's in them).

So a pretty strong argument can be made that they have no business value whatsoever.
But that is not the whole story once you start looking at the archive boxes from a risk / cost perspective. It is costing you money to store them, that much you know. You'd have to assume they contain personal data which is almost certainly beyond its retention period. If they contain personally identifiable information (PII), you also have a duty to ensure it is accurate, which it probably won't be. And if someone ever left a box on the doorstep, it would also present a data breach risk.
So just to recap - this data has no identifiable value whatsoever, some known costs and quite a bit of risk and uncertainty. So should we feel more uncomfortable about deleting it or keeping it? I know what my instincts are telling me.
And of course, why is the situation any different in the world of electronic documents and files? It's not. You don't know what it is, you don't know what it's for and the stuff has not been accessed for 10 years. What exactly do you think is going to change in the next 5 years?
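If you want a rough feel for how much of this untouched material is sitting on a file share, a few lines of Python will do it. This is only a minimal sketch under some loud assumptions: the function name `find_stale_files` and the 10-year default are made up for illustration, a "year" is treated as 365 days, and last-access times are taken at face value even though many file systems are configured not to update them reliably (which is why the sketch also looks at the last-modified time).

```python
import os
import time
from pathlib import Path

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: ignore leap days


def find_stale_files(root, years=10, now=None):
    """Return files under `root` neither accessed nor modified in `years` years.

    Hypothetical helper for illustration only; it reports candidates
    and never deletes anything.
    """
    now = time.time() if now is None else now
    cutoff = now - years * SECONDS_PER_YEAR
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # unreadable or vanished file: skip it
            # Use the most recent of "last accessed" and "last modified".
            last_touched = max(info.st_atime, info.st_mtime)
            if last_touched < cutoff:
                stale.append(path)
    return sorted(stale)
```

Calling `find_stale_files("/some/share")` returns a sorted list of paths. A list like that is a starting point for a conversation about ownership and value, not a deletion list in itself.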
I've started thinking of all those stray files on the network as nothing better than litter. In fact, in this scenario, it is quite difficult to differentiate between this:
[image]
And this:
[image]


So how did it come to pass that you're sitting on a 20-year-old mountain of digital (and paper) records that is no one's job to fix, yet you are accountable for its compliance, its retention, or some part of either?

First, no one does retention well (yet). The most obvious reason for this is that retention compliance, on the whole, seems to be interpreted as not getting caught by the regulator rather than actually being compliant. But on this front, at least, attitudes are changing. Second, you might find that it's a people, equipment and policy thing. Does your retention policy state unequivocally what should happen at the end of a document's retention period? Does it say who should do it? My guess is probably not.

DocAuthority are helping businesses work out which files and documents have value and which do not. We use machine learning to do this. It's then a very simple and quick job to implement retention against policies and check who has got access to sensitive data in your estate.

If you have it, you have to safeguard it, maintain its accuracy and ensure data retention periods are respected. Storage might be cheap but managing data most certainly is not.


If you would like to know more or sign up for a no-fee trial deployment of DocAuthority in your organisation, get in touch with the team.

