

Archiving E-mail
E-mail archiving is mostly driven by compliance and usually is funded
by legal departments, not by IT. That suits IT, because e-mail
archiving does not necessarily save money.
File system archiving saves money and is self-funding: it takes only
12 to 14 months to recover the upfront cost of putting the system in.
E-mail archiving, by contrast, drives costs up a little, because
retention is no longer handled by users on the desktop, subject to quotas.
Quotas force users to make daily decisions about which e-mails to keep
and which to delete.
The absence of quotas, combined with legal requirements
to keep e-mails for a specified number of years, amounts to having an unlimited
inbox. Hence the expense and the legal department’s role in paying
the cost of storage. Because e-mail archiving is seen as a corporate liability,
it is taken out of the hands of users and put under central control.
By Richard M. James
Most organizations think they need more storage, and for good reasons: There is more information, and more of it is digital. In addition to gathering more information than ever before about customers, products, and services, organizations now have to find room for content that used to be in non-digital media such as paper, tape, and film. On top of that, Sarbanes-Oxley and similar regulations now require that data be stored for a fixed number of years, or even indefinitely.
For many CIOs, buying more storage looks like the only option.
But that is a short-term fix that does not address long-term information
management needs. Meeting those long-term needs means doing one or
both of two things: storing less information and storing it more
efficiently.
Thinking inside the box
Traditionally, the storage industry was not about information but
about containers to put it in — providing servers with disk
drives. Then came storage area networks (SANs), which centralized
disks and other storage devices by linking them to data servers.
The trend was always upward, toward bigger and bigger boxes for
more and more data.
Now we have moved from thinking about the box to thinking about
what’s inside it, about content. We have had to change our
thinking because of changes in technology. For years, there were
few options in the market other than expensive, high-performance
products called enterprise storage. But the rising cost of storage
led to market demand for alternatives to the gold-plated enterprise
environment.
The industry responded to this initially by creating network-attached
storage (NAS) for users’ file data. That helped the situation, but
created challenges similar to those of storage area networks. More recently,
the industry has responded to the demand by developing lower-performance,
lower-cost storage options. These alternatives have driven down the total
cost of storage, but they also require a different way of thinking
about content. By giving us different kinds of boxes, the new technology
requires us to think about the kinds of content that can be appropriately
stored in each box.
Storing less and storing it more efficiently
One of the first steps in deciding which kind of content goes in
which kind of storage is identifying data that should not be stored
at all. On average, about 10 to 15 percent of the data in most organizations’
systems is junk — personal files of people who have left the
company, copies of copies of files — and can be deleted.
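One way to find the "copies of copies" is to group files by a hash of their contents. The following is a minimal sketch, not a description of any particular product: the function names are hypothetical, and it assumes the files are readable and that byte-identical content is what counts as a duplicate.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_groups(root):
    """Group files under root by content; return groups of two or more paths."""
    by_digest = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            by_digest[file_digest(path)].append(path)
    # Files whose content is unique are not duplicates, so drop those groups.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Each returned group holds paths with identical content. Deciding which copy is the one to keep remains a business decision, not something the hash can answer.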
The next step is to classify active files into high-, medium-,
and low-value content. There won’t be universal agreement
on which files belong in which category, but nearly everyone will
agree that high-end ERP systems should go in high-end storage. Mid-level
storage is for files that do not need the highest level of protection
but are nonetheless business-critical and frequently used for production.
Work-group files and similar business content can go in the cheapest
storage.
Then there is the least valuable data, which usually is also the
oldest. For example, files that are more than 90 days old are not
accessed very often, and files that are a year or more old are almost
never accessed. It is likely that these low-value files make up
a large percentage of what is in an organization’s system,
and they should be archived.
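The age rule of thumb above can be written down as a simple check. This is a sketch under stated assumptions: the 90-day and one-year thresholds come from the text, while the tier labels and the function name are illustrative only. A real policy would also weigh business value, not just age.

```python
from datetime import datetime, timedelta

# Thresholds taken from the text; the tier labels are illustrative assumptions.
RARELY_ACCESSED_AFTER = timedelta(days=90)
ALMOST_NEVER_ACCESSED_AFTER = timedelta(days=365)


def age_tier(last_accessed, now):
    """Map a file's last-access time to a rough storage tier."""
    age = now - last_accessed
    if age >= ALMOST_NEVER_ACCESSED_AFTER:
        return "archive"            # a year or more old: almost never accessed
    if age > RARELY_ACCESSED_AFTER:
        return "archive-candidate"  # more than 90 days old: rarely accessed
    return "active"
```

On a live file system, `last_accessed` could come from `os.stat(path).st_atime`, although many systems update access times lazily or not at all, so modification time may be the more reliable signal.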
Figure 1. File System Classification
Some files are more easily managed than others. Start with the
easiest decisions: Leave system data alone, delete nonbusiness data,
and do the same with redundant application data once the reference
source copy has been located. This leaves valid business data, which
must be classified. The easiest to identify is e-mail, which often
finds its way onto file servers and gets treated as enterprise data.
This should be archived. As you move down the stack, cultural
issues and complex business systems make the choices more difficult.
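The order of those easy decisions can be sketched as a small triage function. The category strings, the `Action` names, and the `HOLD` outcome for redundant data whose reference copy has not yet been found are all assumptions made for illustration; the text specifies only the decisions themselves.

```python
from enum import Enum


class Action(Enum):
    LEAVE_ALONE = "leave alone"
    DELETE = "delete"
    HOLD = "hold until the reference copy is located"  # assumed outcome
    ARCHIVE = "archive"
    CLASSIFY = "classify by business value"


def triage(category, reference_copy_located=False):
    """Apply the easy decisions first; everything else needs classification."""
    if category == "system":
        return Action.LEAVE_ALONE
    if category == "nonbusiness":
        return Action.DELETE
    if category == "redundant_application":
        # Delete only once the reference source copy has been located.
        return Action.DELETE if reference_copy_located else Action.HOLD
    if category == "email":
        return Action.ARCHIVE
    # Remaining valid business data: the harder, value-based decisions.
    return Action.CLASSIFY
```

The final branch is where the cultural and business questions begin, which is why it returns a task for people, not an automatic action.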
Archiving technology has also changed
The biggest barrier to archiving is cultural, not technical. For
many CIOs, the word archiving does not yet mean online data and
remote replication. Instead, it calls up memories of yesterday.
Archiving data used to mean putting it on
tapes that were sent to an off-site vault. The problem with tapes
is that they wear down after repeated use and are inherently not
as robust as disk drives, so they have a higher failure rate during
restore. It was often the case that attempts to retrieve data years
after it had been archived would fail because the old tapes were
no longer usable. Archiving in years past was too often no more
than an expensive way of throwing information away.
Archiving today is different. There are now less expensive archiving
platforms that no longer have to be backed up. Files archived on
those platforms still show up on users’ screens, but with
different icons. These platforms keep archived files available for
searches and also maintain their integrity, keeping the data in
them usable even after file formats become obsolete. Today’s
archiving technology focuses on content management to achieve a
number of business goals, including lower total cost of ownership
for data storage and seamless user access to archived data. Another
goal is to meet compliance requirements, ensuring the archived content
is retained per market regulations and is deleted when it is supposed
to be.
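The compliance goal comes down to a date comparison: keep an item until its retention period ends, then delete it. Here is a minimal sketch, assuming retention is expressed in whole years from the archive date and that `None` stands for indefinite retention. The function names are hypothetical, and real regulations may count the period from a different event.

```python
from datetime import date


def retention_expiry(archived_on, retention_years):
    """Return the first date on which the item may be deleted."""
    try:
        return archived_on.replace(year=archived_on.year + retention_years)
    except ValueError:
        # Archived on Feb 29 and the target year is not a leap year:
        # roll forward to Mar 1 so the item is never deleted early.
        return date(archived_on.year + retention_years, 3, 1)


def retention_action(archived_on, retention_years, today):
    """Return 'retain' or 'delete' for an archived item."""
    if retention_years is None:
        return "retain"  # indefinite retention (an assumed convention)
    if today >= retention_expiry(archived_on, retention_years):
        return "delete"
    return "retain"
```

Rolling forward on leap days errs on the side of keeping data slightly longer, which is usually the safer mistake where retention rules apply.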
Backups are one of the major pain points in today’s IT. We
know we should back up our data because we may not be able to recover
it if we don’t. But if we’re using tape, there’s
a chance we won’t be able to recover our data even if we do
back it up.
That does not mean tape is going away. Contracts are still being
written that specify tape. But offsite storage is expensive and
so is tape itself. Tapes are getting bigger and tape drives are
getting faster, but they are not getting cheaper. Disk drives are
getting cheaper as well as getting faster. Within a year or so,
the cost of disk-based virtual tape will drop below the cost of
tape.
Disks have other advantages. It is quicker to retrieve data from
them than from tapes, because disks allow random access while tapes
must be read sequentially. Data on disks can also be replicated remotely,
sending copies to a system at another location. Instead of shipping
a physical tape off-site, the organization simply buys a network connection.
That means there is no need to hang tapes or remove cartridges,
and there is no need to worry about tapes falling off a truck and
getting into the wrong hands.
The barriers are still cultural
The main reason more organizations do not use tiered storage now
is cultural: They think any data that is not in the most expensive
storage is somehow in jeopardy. This is why so many organizations
still have only one kind of storage, usually expensive, that holds
everything from SAP data to employees’ MP3 files. Even organizations
that have realized that tiered storage is both secure and less expensive
may still resist archiving data.
A trend that is countering these cultural barriers is the move
toward information risk management. Instead of the tactical and
technical focus that is typical of most information security programs,
information risk management seeks to develop a business data policy
for protecting information. This trend is creating new positions
that are often outside IT, in corporate security, legal, or financial
departments.
The people in these positions soon realize that it is not
cost-effective to give the highest level of protection to all data. From
there, it is a small step to tiered storage — placing content
with different levels of business value in different kinds of containers
— and to archiving.
Richard M. James is the director
of CSC’s Storage Market Solutions.