ESG asks: Why so many copies?

September 13, 2018

Data growth isn’t the hot topic it was a few years back, but nothing has really changed: organizations generate and store more data than ever. A lot of it is secondary (copy) data, according to the 2018 Enterprise Strategy Group Master Survey on Data Protection.

The majority of respondents said their organization requires three to five times more storage for copy data than for production data. Many others require up to 10 times more.
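To put those multipliers in concrete terms, here is a minimal sketch of the arithmetic. The 10 TB production footprint is a made-up figure for illustration, not a number from the survey, and the function name is hypothetical. It also assumes "times more storage" means copy data equals the multiplier times production data.

```python
def copy_data_range_tb(production_tb, low_multiplier=3, high_multiplier=5):
    """Return the (low, high) copy-data storage estimate in TB.

    The default multipliers reflect the "three to five times" range
    reported by most survey respondents.
    """
    return production_tb * low_multiplier, production_tb * high_multiplier


# Hypothetical example: an organization with 10 TB of production data.
production_tb = 10

low, high = copy_data_range_tb(production_tb)
print(f"Typical copy data: {low} to {high} TB")

# The upper end reported by many other respondents (up to 10 times).
_, worst_case = copy_data_range_tb(production_tb, high_multiplier=10)
print(f"Upper end: up to {worst_case} TB")
```

Under these assumptions, 10 TB of production data implies roughly 30 to 50 TB of copy data, and as much as 100 TB at the high end.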

Obviously, backup represents a good chunk of the duplicate data that organizations create. However, you might be surprised by how much data is copied for other business activities. “It’s actually a pretty even split between data protection and other business uses,” said Edwin Yuen, senior analyst with Enterprise Strategy Group.

According to the ESG survey, 44% of secondary data is created for data protection purposes (snapshots, backup, and disaster recovery). The other 56% includes copies created for:

  • Application development
  • Long-term retention/compliance
  • Data mining
  • Testing
  • Sales demonstrations

According to Yuen, these numbers indicate that companies are simply looking to get more out of their data.

“Growing interest in CI/CD (continuous integration/continuous delivery) for application development is driving secondary data growth as companies strive for greater agility,” he said. “In the past, dev teams had to work with test data that wasn’t up to date [because copying data was so time-consuming]. That’s no longer an issue because hardware is so fast.”

In other words, developers create more database copies, more frequently.

What’s next?

This trend doesn’t appear to be slowing down, at least not in the near term. Gartner’s latest Magic Quadrant for Data Center Backup and Recovery Solutions predicts that 30% of large enterprises will create secondary data for business activities other than data protection by 2020, up from 15% at the beginning of 2017.

“Data mining for statistical analysis and machine learning is probably the next big secondary data growth trend,” said Yuen.

While these studies were enterprise-focused, SMBs are likely seeing similar copy data growth, or will soon. However, storing and managing massive amounts of data is cost-prohibitive for most SMBs, so it is important to develop an effective secondary data management strategy. Choosing data protection tools with granular, policy-based retention settings is one way to help rein in secondary data growth. These products automatically delete older, unnecessary data, freeing up capacity and reducing storage costs.
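As a rough illustration of what policy-based retention means in practice, the sketch below flags copies that have outlived a per-purpose retention window. It does not represent any particular product: the `Copy` record, the `RETENTION` table, the retention periods, and the `expired_copies` function are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Copy:
    """A hypothetical record describing one secondary copy of a data set."""
    name: str
    purpose: str       # e.g. "backup", "test", "demo"
    created: datetime


# Hypothetical policy: the maximum age allowed for each kind of copy.
RETENTION = {
    "backup": timedelta(days=30),
    "test": timedelta(days=7),
    "demo": timedelta(days=14),
}
DEFAULT_RETENTION = timedelta(days=30)  # assumed fallback for other purposes


def expired_copies(copies, now, retention=RETENTION, default=DEFAULT_RETENTION):
    """Return the copies that are older than their purpose's retention window."""
    return [
        c for c in copies
        if now - c.created > retention.get(c.purpose, default)
    ]


now = datetime(2018, 9, 13)
copies = [
    Copy("nightly-2018-09-10", "backup", datetime(2018, 9, 10)),  # 3 days old
    Copy("nightly-2018-07-01", "backup", datetime(2018, 7, 1)),   # 74 days old
    Copy("qa-refresh", "test", datetime(2018, 9, 1)),             # 12 days old
]

expired = expired_copies(copies, now)
print([c.name for c in expired])  # ['nightly-2018-07-01', 'qa-refresh']
```

A real tool would go on to delete or archive the flagged copies. The sketch only identifies them, which keeps the policy decision separate from the destructive step.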