Friday, April 25, 2008

Tips for ‘green’ storage

By Alex Young

The storage industry has received a lot of the “green” limelight recently. Hard disk drives consume power both to run and to be kept cool, so making the most of them is important. With organisations around the world generating extremely large volumes of data every day, the need to store, access, and protect this information is paramount. So how can you ensure data is suitably stored without unnecessarily harming the environment?

The efficiency of the product is the invisible, yet key, factor, as poor efficiency leads to higher power usage. If you improve the efficiency of your storage infrastructure, your data centre will become greener and your total cost of ownership (TCO) will fall.

Although some vendors focus on the power consumption of storage products when they are idle, much higher savings can be realised by reducing the power requirements of those products when they are operational (which is most of the time).

The following considerations will help put IT managers on the right path to greener data centres.


When should disks be idle?

Disk arrays are often deployed in 24×7 environments such as transactional databases for online retailers, Web sites, and e-mail servers. In these cases, the drives provide round-the-clock service and rarely have a chance to stop spinning. Most users also require full performance from the array, non-stop, which makes some heavily hyped power-saving features, such as idle disks, disappointing in real-life applications.

Arrays are often idle, however, when they are part of a backup solution in which the drives are written once and only occasionally accessed. In some disk-to-disk or disk-to-disk-to-tape backup configurations, data is written to the array for just a few hours a day and is rarely read. In such circumstances, power-saving features such as idle disks are practical and effective.
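To make the contrast concrete, here is a minimal sketch of the daily energy used by an array that is active for only part of the day. The wattages and hours are hypothetical values chosen for illustration, not measurements of any product, and the function name is invented for this example.

```python
def daily_energy_kwh(active_watts, idle_watts, active_hours):
    """Energy used in one day by an array that is active for
    active_hours and idle (drives spun down) for the rest."""
    if not 0 <= active_hours <= 24:
        raise ValueError("active_hours must be between 0 and 24")
    idle_hours = 24 - active_hours
    return (active_watts * active_hours + idle_watts * idle_hours) / 1000

# Hypothetical figures: 500 W when active, 100 W when idle.
always_on = daily_energy_kwh(500, 500, 24)       # no idle feature at all
backup_target = daily_energy_kwh(500, 100, 4)    # written 4 hours a day
busy_database = daily_energy_kwh(500, 100, 24)   # never gets to idle

print(f"Always on:     {always_on} kWh/day")
print(f"Backup target: {backup_target} kWh/day")
print(f"24x7 database: {busy_database} kWh/day")
```

Under these assumed figures the backup target uses a third of the energy, while the 24×7 database gains nothing from the idle feature because its drives never stop.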


Boost efficiency

Efficiency is closely linked to maximisation of resources. Scheduled backups and other maintenance tasks should take place outside peak working hours, using the scheduler function available on disk arrays, so that they do not affect service performance. In addition, analysing the applications’ usage patterns over time and scheduling automated tasks accordingly enables arrays to be used more efficiently around the clock.

Another way to make the infrastructure more efficient is to minimise the backup window by using the “snapshot backup” function in RAID arrays rather than host software. The data transfer is performed without the host software intervening, which avoids unnecessary host CPU utilisation and bandwidth consumption. As a result, disk backups require seconds rather than hours. Moreover, snapshot-based backups allow full backups and archiving to run less frequently, which saves energy because the devices can be turned off when not in use.


Sharing storage

In a typical office with 50 users, each PC has a built-in disk drive, and many of them store important data that needs to be backed up. It doesn’t make sense to give each user a USB disk for backup; if each USB disk has 300GB of capacity and uses 60W of power, the 50 disks together add up to ~15TB of capacity and 3,000W of power. Besides the management issue, some users may need more capacity, while others may only need to back up small amounts of data.

By backing up to a central disk array over the network, all the unused storage is consolidated and the array can serve more users, while making better use of available capacity. If 50 users share a 15TB, 500W disk array, then the power consumption per user is just 10W.
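The arithmetic behind this comparison can be checked with a short sketch. The figures are the ones quoted above; the function and variable names are illustrative only.

```python
# Figures quoted in the article.
USERS = 50
USB_DISK_GB = 300
USB_DISK_WATTS = 60
ARRAY_WATTS = 500

def usb_totals(users, disk_gb, disk_watts):
    """Total capacity (decimal TB) and power (W) with one USB disk per user."""
    return users * disk_gb / 1000, users * disk_watts

def array_watts_per_user(users, array_watts):
    """Share of a central array's power attributable to each user."""
    return array_watts / users

usb_tb, usb_watts = usb_totals(USERS, USB_DISK_GB, USB_DISK_WATTS)
per_user_watts = array_watts_per_user(USERS, ARRAY_WATTS)

print(f"USB disks:    {usb_tb} TB, {usb_watts} W in total")
print(f"Shared array: {per_user_watts} W per user")
print(f"Power ratio:  {usb_watts / ARRAY_WATTS}x")
```

With these numbers the shared array draws one sixth of the power of the 50 USB disks for the same nominal capacity.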


Different apps and needs

Some RAID arrays are purpose-built for data archiving, some are designed for 24×7 high-availability environments, and others can be tuned for use in different application environments. These arrays offer different levels of availability, security, performance, and capacity. Due to the different types of drives used in high-availability (as opposed to archiving) arrays, configurations and power requirements differ. Putting them side-by-side to compare their environmental impact can lead to misleading results.

For example, demanding environments such as database applications should rely on drives with a high spindle speed (10,000rpm or higher) and high-density arrays so that more transactions per second can be performed. With the help of the RAID controller’s cache, the RAID system can process vast numbers of small transactions. On the other hand, disk drives used in data archiving systems need to offer high capacity and low cost. The performance requirement here is not high, thanks to the snapshot backup functionality, and reliability is not critical for data archiving because most of the drives are in idle mode.

This is why you cannot use a RAID system designed or configured for data archiving in a high-availability environment. Can you imagine an online banking user having to wait for 30 seconds for the drives to “wake up” before being able to access their account details? When random transactions keep coming, all the drives are active, so the ability to have idle disks holds no appeal.


Alex Young is director of technical marketing for Infortrend Europe (www.infortrend.com).

InfoStor Europe March, 2008
