A Death Knell for the File

Why the Old Metaphors Don’t Work Anymore

I’ve already talked about how we need to change the way we think about storage. That is partly because we are creating disconnected silos of information on SAN and NAS systems distributed around the enterprise, and partly because increasing disk capacity is creating performance, redundancy and backup headaches. But this rethink is also being driven by another factor: the way we access data is changing.

A Death Knell for the File

One of the most common objections I hear to storage metaphors like object storage is that users really need files, with their familiar presentation of modification times and permissions. But the way we interact with data today means that the ‘file’ is an increasingly outdated metaphor. We now interact with data through applications, not as files. The file paradigm is an increasingly unnecessary intermediate step, a legacy of how computer technology has evolved. Today’s generation of apps is focused on making our data meaningful before presenting it to us. We are far more engaged with visual than with numerical information, and users in the future will interact with pre-processed information through analytic systems or other applications.

Not a Bucket but a Pipeline

The possibility of pre-processing information for users is waking people up to the idea that there is a lot of unused processing power in a storage system. Object stores are ideally placed for converting or processing raw data because they are built out of general-purpose servers. This blurs the line between storage and computation and presents a different paradigm. Storage is no longer a bucket into which we dump stuff; it can be seen as a pipeline through which data moves. That is a more functional and user-centric vision of storage than the traditional bucket metaphor. The open-source object-storage system Ceph is architected with this kind of pre-processing in mind: it offers many hooks in the code that let users write custom plugins to execute on write or read. This kind of pre-processing is already happening in many industries, for example adding watermarks to images on ingest in the media industry, or first-stage analysis of survey data in the oil and gas industries.
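To make the pipeline idea concrete, here is a minimal sketch of a store that runs user-supplied hooks on write and on read. It is a toy in-memory model, not Ceph’s plugin interface: the class `PipelineStore`, its methods, and the `watermark` and `uppercase` hooks are all hypothetical names invented for illustration, and the "watermark" simply appends a tag to the bytes rather than touching a real image.

```python
from typing import Callable, Dict, List

# A hook takes the object key and its bytes and returns transformed bytes.
Transform = Callable[[str, bytes], bytes]


class PipelineStore:
    """Toy in-memory object store that runs hooks on write and on read.

    Hypothetical illustration only; this does not mirror any real
    object store's API.
    """

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self._on_write: List[Transform] = []
        self._on_read: List[Transform] = []

    def add_write_hook(self, hook: Transform) -> None:
        self._on_write.append(hook)

    def add_read_hook(self, hook: Transform) -> None:
        self._on_read.append(hook)

    def put(self, key: str, data: bytes) -> None:
        # Data flows through every write hook, in order, before it is stored.
        for hook in self._on_write:
            data = hook(key, data)
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        # Stored data flows through every read hook before it reaches the caller.
        data = self._objects[key]
        for hook in self._on_read:
            data = hook(key, data)
        return data


def watermark(key: str, data: bytes) -> bytes:
    # Stand-in for watermarking an image on ingest: append a marker.
    return data + b" [watermarked]"


def uppercase(key: str, data: bytes) -> bytes:
    # Stand-in for a presentation step applied when the object is read.
    return data.upper()


store = PipelineStore()
store.add_write_hook(watermark)
store.add_read_hook(uppercase)

store.put("photo-001", b"raw image bytes")
result = store.get("photo-001")
print(result)
```

The point of the sketch is that the caller only ever calls `put` and `get`; the processing happens inside the store, which is what distinguishes a pipeline from a bucket.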

Creating New Metaphors

If information silos within the enterprise and the performance, redundancy and backup headaches of ever-larger disks are the negative drivers forcing us to rethink the way we approach storage, then the new ways we access data and the opportunities for pre-processing stored data are the positive forces shaping how we should think about it. And in response to these positive forces, too, object storage is the most obvious and commonsense answer.