Metadata is information about information: a description of what a piece of data is about, when and how it was made, and how it is connected to other information. In a storage device or system, metadata is the data held in a file or sector header. Metadata is also the information used for indexing and searching content in a search engine, and it is an important component of big data analytics. Fast access to metadata is a key to faster response in a storage system.
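As a concrete illustration of metadata at the file level, the short Python sketch below reads a few fields that a file system keeps about a file (size, modification time, permissions) without reading the file's contents. The helper name `describe_file` and the sample file are assumptions made for this example; only the standard library is used.

```python
import os
import tempfile
import time


def describe_file(path):
    """Return a few metadata fields for the file at `path` (illustrative helper)."""
    info = os.stat(path)  # asks the file system for metadata, not for the contents
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": time.strftime(
            "%Y-%m-%d %H:%M:%S", time.localtime(info.st_mtime)
        ),
        "mode": oct(info.st_mode & 0o777),
    }


# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as handle:
    handle.write(b"hello metadata")
    sample_path = handle.name

sample_metadata = describe_file(sample_path)
print(sample_metadata)
os.remove(sample_path)
```

The point of the sketch is that questions such as "how big is it?" or "when did it change?" are answered from metadata alone, which is why the speed of the metadata store matters so much.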
According to a recent survey from SanDisk, 83% of IT decision makers are looking to upgrade their enterprise environment with some form of flash memory in the next five years, and this trend is being driven at least in part by metadata-dependent applications such as big data analytics.
For several years, a number of companies have been developing methods for storing the metadata associated with content on flash memory, so that the referenced data can be found quickly on slower storage devices (such as hard disk drives or magnetic tape). Keeping metadata on high-speed flash memory is an important tool for finding information in modern data centers. Many storage systems companies are using a flash layer or even a flash appliance for metadata storage.
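The idea of a fast metadata layer in front of slower storage can be sketched in a few lines of Python. This is a toy model under stated assumptions, not any vendor's design: an in-memory dictionary stands in for the flash-resident metadata store, and the names `Location`, `MetadataIndex`, the device labels and the sample entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Where an object's bytes live on the slow tier (hypothetical layout)."""
    device: str
    offset: int
    length: int


class MetadataIndex:
    """Toy index; plain dicts stand in for metadata held on flash."""

    def __init__(self):
        self._by_name = {}  # object name -> Location on HDD or tape
        self._by_tag = {}   # tag -> set of object names

    def add(self, name, location, tags=()):
        self._by_name[name] = location
        for tag in tags:
            self._by_tag.setdefault(tag, set()).add(name)

    def locate(self, name):
        """Return the slow-tier location, or None if the name is unknown."""
        return self._by_name.get(name)

    def search(self, tag):
        """Return matching object names without touching the slow tier."""
        return sorted(self._by_tag.get(tag, set()))


index = MetadataIndex()
index.add("q3-report.pdf", Location("hdd-07", 4096, 120_000),
          tags=("finance", "archive"))
index.add("logs-0901.tar", Location("tape-02", 0, 9_000_000),
          tags=("logs", "archive"))

print(index.search("archive"))
print(index.locate("q3-report.pdf"))
```

In this sketch, both the search and the lookup are served entirely from the index; the hard disk or tape is touched only once, at the location the index returns.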
The DataDirect Networks (DDN) EXAScaler Lustre storage system includes the company's Infinite Memory Engine, with a flash-based NVM buffer cache that can store metadata and other frequently used content. HDS also offers storage systems that combine flash and HDDs, with the flash memory used for performance acceleration. Increasingly, companies are introducing mixed storage systems in which flash memory holds metadata and other frequently accessed data. Flash memory storage appliances, produced by many companies, may also be used for metadata storage and for other ways of boosting system performance.
A start-up company called Primary Data, started by founders of Fusion-io (sold earlier this year to SanDisk), is offering a platform that virtualizes data across a single global namespace. This data virtualization uncouples application data from the underlying storage hardware. The company says that the platform allows direct-attached, network-attached, and private and public cloud storage to be managed as a single namespace, providing greater efficiency and performance, and that it will change digital storage as much as virtualization has changed the use of servers.
One of the keys to the Primary Data approach is keeping content metadata on server-based flash storage while the underlying data is likely to be kept on more cost-effective (in terms of $/GB) HDD storage. The company says it will also improve performance through additional storage intelligence that doesn’t try to save temporary and scratch files and instead focuses on protecting valuable primary data, as the company’s name indicates.
It is clear that metadata access is a key to many business applications, and that flash memory allows speedy access to metadata and thus to the data it refers to. Optimizing metadata is indeed the key to future digital storage solutions.