Dec 31 2009

When One Won't Do — try two or three

Learn how to make the leap to a multitier storage architecture.

Until recently, agencies had no choice but to back up their critical data, applications and servers on costly Fibre Channel-attached RAID (redundant array of independent disks) or direct-attached storage systems, regardless of the relative importance of the information or how often users accessed it.

But now, more efficient, less expensive options mean agencies can rein in costs by turning to tiered storage, which can classify data based on use patterns and relative importance. The benefit? The system reserves the most efficient, high-cost storage for the most mission-critical data while relegating less important, infrequently accessed data to less expensive storage options.

Although there are myriad options for building a tiered storage architecture, most systems consist of three layers:

  • First tier: the most mission-critical data; generally relies on high-performance, high-speed Fibre Channel disk storage.
  • Second tier: mid-level data that may still be used frequently; often composed of a lower-performance Fibre Channel configuration or of lower-cost media such as serial ATA (SATA) drives or optical jukeboxes.
  • Third tier: less important or less frequently accessed data; could include SATA drives, serial-attached SCSI drives, a disk library system, a virtual tape library or even old-fashioned magnetic tape reels and cartridges.

The tiers are not static; typically these systems are set to automatically move data to the appropriate tier as use increases or decreases.
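
The automatic movement between tiers can be pictured as a simple policy that compares recent use against a cutoff for each tier. The sketch below is only an illustration, not any vendor's implementation: the thresholds, the function names `choose_tier` and `plan_moves`, and the idea of counting accesses per month are all assumptions, and a real system would also weigh the data's relative importance.

```python
# Hypothetical cutoffs: minimum accesses per month to qualify for a tier.
# These numbers are illustrative assumptions, not taken from any product.
TIER_THRESHOLDS = [
    (1, 1000),  # first tier: high-performance Fibre Channel disk
    (2, 50),    # second tier: lower-cost disk
]
LOWEST_TIER = 3  # third tier: disk library, virtual tape library or tape


def choose_tier(accesses_per_month: int) -> int:
    """Return the tier a data set belongs on, given its recent access count."""
    for tier, minimum in TIER_THRESHOLDS:
        if accesses_per_month >= minimum:
            return tier
    return LOWEST_TIER


def plan_moves(current: dict, usage: dict) -> dict:
    """Map each data set whose tier should change to (old_tier, new_tier).

    `current` maps data set names to their present tier; `usage` maps
    names to accesses per month. Names missing from `usage` count as unused.
    """
    moves = {}
    for name, old_tier in current.items():
        new_tier = choose_tier(usage.get(name, 0))
        if new_tier != old_tier:
            moves[name] = (old_tier, new_tier)
    return moves
```

Under these assumed cutoffs, a data set whose use climbs past 1,000 accesses a month is promoted to the first tier, while one that falls below 50 drops to the third.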

Restoring Health

With the Medline database receiving billions of hits each year, the systems team at the National Institutes of Health's National Library of Medicine knew a few years ago that the library needed to move from a cumbersome, labor-intensive and costly direct-attached storage (DAS) approach to scalable tiered storage. NLM is responsible for many large databases of Web-accessible health-care information.

"We needed a storage architecture that could accommodate our ever-growing data," says Simon Liu, NLM's computer and communications systems director. The consolidation afforded by the tiered infrastructure has simplified maintenance of the agency's storage systems and also improved security enforcement, he says.

To determine which databases to assign to the different tiers, Liu's team worked with the agency's stakeholders to identify the relative importance of each database, considering factors such as usage, end users and data customers, and cost. Armed with that information, NLM assigned each type of data to one of four categories: HOT for data that must be available at all times, and WARM, COLD and ICE for data of progressively lower relative importance.
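
A four-category scheme like NLM's could be expressed as a small rule set. The sketch below is a guess at the general shape only: the article does not say how NLM scored its databases, so the `Database` fields, the 0-100 importance score and the cutoffs of 60 and 30 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Database:
    name: str
    always_available: bool  # must be available at all times
    importance: int         # hypothetical 0-100 score agreed with stakeholders


def categorize(db: Database) -> str:
    """Assign HOT, WARM, COLD or ICE; the score cutoffs are assumptions."""
    if db.always_available:
        return "HOT"
    if db.importance >= 60:
        return "WARM"
    if db.importance >= 30:
        return "COLD"
    return "ICE"
```

The point of the sketch is the order of the checks: the availability requirement decides HOT on its own, and only then does the stakeholder-assigned score sort the remaining data.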

Determining where within the tiers that data should reside is critical, and the best way to do that is by understanding the requirements of those who own, create, generate or use the information, says Ken Steinhardt, chief technology officer for customer operations at EMC of Hopkinton, Mass.

"If that initial assessment is done right, an agency will be able to help weather weaker choices of technology," he says.

For NLM, choosing the technology and the vendor consisted of three steps. First, Liu and his team conducted an industry survey, relying on analyst reports and other information services to identify appropriate technologies. Next, after identifying which vendors offered those technologies, the team invited the vendors to conduct briefings for the group. Finally, the agency evaluated all the assembled information and chose the most appropriate technology and vendors. NLM ultimately implemented systems from EMC and from a handful of other storage vendors.

Calculating Excellence

Although there certainly are hard costs — anywhere from $10,000 to more than $100,000 — associated with buying and implementing a tiered storage architecture, other factors far outweigh those costs when justifying the move to this approach.

"By consolidating both servers and storage, you're reducing redundancy significantly. By reducing 50 servers down to 25, for example, you're now managing half the number of servers and backing up only half the amount of content," says Gary Lyng, worldwide marketing director for ILM and StorageWorks Solutions at Hewlett-Packard of Palo Alto, Calif. "Not only does that mean that you purchase and maintain half the number of servers, but you've reduced the backup window significantly, reducing operational expenses and allowing you to redeploy servers and storage to other projects. And IT is freed up to focus on proactive projects versus constantly working on backups."

To ensure the most bang for the buck from a tiered storage setup, agencies should make sure to identify the strategic value of information and identify the type of return they are seeking, says Charles King, principal analyst at Pund-IT, a Hayward, Calif., storage consultancy. "But sometimes it's not so easy to quantify in dollars. You're more likely to quantify it in terms of improved storage access or performance."

Liu prefers to look at it as a total cost of ownership issue.

"You have to take into account not only the initial cost, but procurement, evaluation and management costs, as well as the business value perspective and meantime to recovery," he says. To measure the value NLM has received from its tiered storage architecture, the IT team looks at a variety of variables, including the difference in the amount of time it takes to back up data, how long it takes to recover from a failure and how long it takes to add new storage.

Cost is important, but it shouldn't be the primary factor in moving to tiered storage, King says. "Ideally, the primary benefit is that at the end of the day, the organization ends up with the appropriate support for the appropriate application."

Photo: Randall Scott