People and the companies they work for hoard data – it’s a fact brought out in survey after survey. Hoarders are not always proud of their habit and are often curious about the options available. Contrary to popular belief, in many cases it is OK to hoard data. Sometimes it is necessary, and in many cases the data being saved can be of great value to the company. Having clarity on the purpose and requirements in your own organization will provide insight into best practices for maximizing the value of the content you keep with the greatest efficiency.
The Four Hoarder Personas
There are four hoarder personas: Pacifist, Captive, Opportunist and Capitalist. Take a look below to decide which of these best describes your situation and to get ideas on the best practices and technologies that fit it.
Pacifist
This persona describes an individual or an organization espousing the policy that it is OK to keep everything, even when there are no requirements to retain data. There are no formal data deletion policies or guidelines for deciding what to delete. These users don’t take the time to delete their content, and IT is not empowered to delete it for them. The risk and cost of doing nothing different are tolerable on all fronts. Storage and protection costs are acceptable; backup windows are satisfactory; there is no legal exposure resulting from keeping all that content lying around; and there is no motivation to shave costs of storage or infrastructure. If this describes your situation, congratulations on finding a rare nirvana.
Captive
Regulations and corporate policies are driving the need to hoard data for years or even decades. The day-to-day business value of the preserved content is negligible. Time-to-data and performance metrics, if they exist, will help decide between the likely technology choices below. Organizations involved in finance and healthcare are well represented in this persona.
Opportunist
This group generates and acquires valuable content. They have made substantial investments to develop the content, and it would be sinful not to have it available when a perfect use arises in the future. They often want to contrast with, or build upon, historical snapshots or perhaps take advantage of an opportunity to monetize the content. The use of the Opportunist’s hoarded content is generally unplanned. An opportunity will surface, and if it is not easy to get to the relevant content, the opportunity to leverage it may quickly disappear. The organization that can be nimble and regularly draw from the past can gain tremendous advantage. Those who can draw on more than just their current content will be the star performers.
Capitalist
Content is king. Capitalists are in the content business and generate or capture content that is difficult if not impossible to reproduce. They market, sell and otherwise monetize their content. Their data and content are core to their business strategy, and success is measured by how quickly they can deliver the content, how economically they can store it until it is needed, and even by the volume of the repository from which they draw.
Use Case Requirements and Technologies
The personas above each carry a set of requirements for data storage architectures. Longer time to access data is acceptable to some while completely unacceptable to others. However, in almost all cases, when hoarding large amounts of content, the most important thing to avoid is using expensive high-performance storage for the hoarded content.
There are many great tools available to help understand how much of a company’s content is not active (typically 50% to 80%) and to demonstrate that inactive content should be stored on a less expensive tier (LTFS tape, object storage disk or cloud). Cheap NAS is not a good option once the cost to protect content is considered: protection software and replication hardware will be added, raising cost of ownership and burdening infrastructure.
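For a rough first look before reaching for a dedicated tool, the share of inactive content on a file share can be estimated with a short script. The sketch below is only an illustration under stated assumptions: the function name inactive_fraction and the 180-day threshold are hypothetical, and it uses each file's modification time as a stand-in for "last active" because access times are often not updated on production file systems.

```python
import os
import time


def inactive_fraction(root, days=180, now=None):
    """Return (inactive_bytes, total_bytes, fraction) for files under root.

    Assumption: a file is "inactive" if it has not been modified in `days`
    days. The 180-day default is an illustrative threshold, not a standard.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # seconds in a day
    inactive = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                # Skip files that vanish or cannot be read during the scan.
                continue
            total += st.st_size
            if st.st_mtime < cutoff:
                inactive += st.st_size
    fraction = inactive / total if total else 0.0
    return inactive, total, fraction
```

Calling inactive_fraction on the root of a share returns the inactive bytes, the total bytes and their ratio. A ratio in the 50% to 80% range mentioned above would suggest that most of the capacity is a candidate for a less expensive tier.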
When discussing best practices, referring to specific storage technology choices is unavoidable. Two key areas must be understood to have a complete view of best hoarding practices: data movers and storage technologies.
Best Practices Based on Persona
Pacifists and Captives: Leverage Your Backup Process
Retained data is not strategic for you, so investments should be focused on protecting the currently active data and leveraging that process for long-term retention. Disk with deduplication and tape backup are both acceptable alternatives. Speedy access to retained content is not critical, so it is acceptable to leverage backup jobs for retention by copying tape backup to deep archive, or sending a copy of backup data to be archived in a cloud.
Opportunists: Deploy a Cost-Effective Active Archive
You want to take advantage of content when it’s needed, and you cannot predict when that will be. LTFS tape or object storage disk are very cost-effective means of hoarding content. These technology choices enable ready access (active archive) to content. Where high growth, larger scale and global access are important, object storage is the obvious choice, though LTFS tape behind a global access infrastructure is still worth considering.
Capitalists: Integrate Active Access and Content Protection
Disk backup is critical when practical, but backing up very large content sets is not always practical. Some content sets are tens to hundreds of terabytes or more. For these environments, archive and protection need to be one and the same. Data-dispersed object storage is well suited to this use case: data can be stored and protected simultaneously and cost-effectively. Smaller environments (i.e., less than 200TB of data) may do well with LTFS tape, but larger environments still need to consider object storage for their hoard.
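The persona-based recommendations above can be condensed into a small lookup. This is a sketch for illustration only: the Persona enum and the recommend function are hypothetical names, and the only figure it relies on is the 200TB guideline quoted above for Capitalists.

```python
from enum import Enum


class Persona(Enum):
    PACIFIST = "pacifist"
    CAPTIVE = "captive"
    OPPORTUNIST = "opportunist"
    CAPITALIST = "capitalist"


# Guideline from the text: below roughly 200TB, LTFS tape may be enough.
SMALL_HOARD_TB = 200


def recommend(persona, hoard_tb=0.0):
    """Return a one-line storage recommendation for a hoarder persona."""
    if persona in (Persona.PACIFIST, Persona.CAPTIVE):
        return ("Leverage the backup process: deduplicated disk or tape "
                "backup, with copies sent to deep archive or cloud")
    if persona is Persona.OPPORTUNIST:
        return "Active archive on LTFS tape or object storage"
    if persona is Persona.CAPITALIST:
        if hoard_tb < SMALL_HOARD_TB:
            return "LTFS tape may be sufficient; object storage as it grows"
        return "Data-dispersed object storage for combined archive and protection"
    raise ValueError(f"unknown persona: {persona!r}")
```

For example, recommend(Persona.CAPITALIST, hoard_tb=500) points to data-dispersed object storage, while the same persona with a 50TB hoard is steered toward LTFS tape.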
As you can see there are many good reasons for hoarding data, and as the motivations for hoarding become clear, so does the best way to manage it.
So which type of hoarder are you?