Agencies need to stretch resources, such as physical space and energy, while reducing costs. Data center infrastructure management (DCIM) tools give them a means to improve the efficiency of computing resources in the data center, as well as power and cooling equipment.
IT departments are finding that DCIM and associated tools are useful beyond facilities infrastructure. They can also be used to manage data center technologies, tools and processes, including the interactions and dependencies among IT resources (such as servers, storage and networking equipment, other hardware, software and services) and facility resources such as power, cooling and floor space.
Among the benefits of an effective DCIM deployment are:
- Improved reliability, availability and serviceability (RAS) and quality of service (QoS)
- Elimination of pockets of underutilized hardware and software resources
- Establishment of metrics for service planning, budgeting and compliance audits
- Removal of complexity and waste, which can lower costs
- Performance and capacity planning
Removing Cost and Complexity from DCIM
Simply cutting costs can have an adverse effect on RAS, QoS and performance. Instead, getting rid of waste and unnecessary complexity can reduce costs without affecting these factors.
IT administrators should look for signs of complexity created by people or processes, as well as complexity related to hardware, software or even facilities. Knowing where cooling needs to be directed, applying different thermal zones and adjusting temperature settings accordingly can reduce costs without hampering performance. Smart cooling and intelligent power management techniques tied to workload and service levels can accomplish this objective.
DCIM information can be stored in performance management and configuration management databases. Whether these are extensive solutions or small and simple ones, they should aim to reduce complexity and improve understanding of data center operations.
Data Center Metrics That Matter
How can an agency effectively manage what it has if it does not know how it is being used?
Data centers are information factories, and administrators can use this information to gain insight into how resources are being used. This store of information includes data regarding the facility, energy usage and costs, equipment health and status, tooling, available resources, scheduling, workflows, service delivery quality and waste.
Establishing key performance indicators provides timely insight and awareness, both in real time and from a historical perspective. KPIs should include coverage for data protection (backup, continuity of operations, disaster recovery and archiving) to ensure that federal and agency requirements are met.
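As an illustration of a data protection KPI, the sketch below computes the share of systems whose most recent backup falls within a recovery point objective (RPO). It is a minimal example under stated assumptions, not a feature of any particular DCIM product: the `System` record, its field names and the `backup_coverage` function are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class System:
    # Hypothetical inventory record; field names are assumptions.
    name: str
    last_backup: Optional[datetime]  # None means no backup on record
    rpo: timedelta                   # maximum acceptable age of the last backup


def backup_coverage(systems: List[System], now: datetime) -> Tuple[float, List[str]]:
    """Return (fraction of systems within their RPO, names of systems outside it)."""
    late = [
        s.name
        for s in systems
        if s.last_backup is None or now - s.last_backup > s.rpo
    ]
    if not systems:
        return 0.0, late
    return (len(systems) - len(late)) / len(systems), late
```

Running a calculation like this on a schedule gives the real-time view, and storing each result gives the historical one.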
Improving RAS, Performance and QoS
Agencies should determine the level of RAS or resiliency that they need from their data centers, being careful to distinguish this from what users merely want.
For example, many users request the highest level of availability and performance, along with the lowest recovery time objective, while demanding service from a cloud provider because they believe it to be less expensive. However, administrators should determine the levels of service, availability, durability, performance and security that are provided for a given price. Administrators frequently find that users who say they want a higher service level are willing to pay only for a lower one. The level they are willing to pay for is the true level of need.
Some applications can benefit from higher levels of service involving performance, RAS and security, which can improve productivity and resiliency. For data storage performance, agencies can consider using fewer yet faster NAND flash solid-state drives for input-output consolidation.
For storage capacity, agencies should look at using fewer yet higher capacity hard-disk drives, as well as tape and cloud resources. Also, IT shops should implement technologies to reduce the data footprint, including archiving, compression, deduplication, tiering and thin provisioning.
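The arithmetic behind "fewer yet faster" and "fewer yet higher capacity" drives can be sketched as follows. The drive count is set by whichever requirement, performance or capacity, needs more devices, and data footprint reduction shrinks the capacity requirement before drives are counted. The function names and any figures passed to them are hypothetical, and real sizing would also account for redundancy and growth.

```python
import math


def drives_needed(required_iops: float, required_tb: float,
                  iops_per_drive: float, tb_per_drive: float) -> int:
    """Number of drives needed to meet both the I/O and the capacity requirement."""
    for_iops = math.ceil(required_iops / iops_per_drive)
    for_capacity = math.ceil(required_tb / tb_per_drive)
    # The larger of the two counts is the binding constraint.
    return max(for_iops, for_capacity)


def footprint_after_reduction(raw_tb: float, reduction_ratio: float) -> float:
    """Stored size after compression or deduplication; a ratio of 4.0 means 4:1."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_tb / reduction_ratio
```

With illustrative numbers, a workload bound by I/O needs far fewer fast drives than slow ones, while a workload bound by capacity benefits more from larger drives and a smaller data footprint.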
Agencies should strive to find an acceptable balance between RAS and performance. What appears to be a performance problem or bottleneck may be due to a lack of RAS or related issues. On the other hand, what appears to be a RAS problem may in fact be a performance bottleneck.
Performance and Capacity Planning
Agencies can take a number of steps in planning their data center operations to improve service while reducing costs:
- Establish a process to review the performance of IT resources against the capacity of those resources to ensure that they are sufficient and being used effectively to meet the demands of the data center workload.
- Review the operation of specific hardware and software, as well as the performance of the overall data center against specific applications. This practice should be tied to IT service management, which aligns service delivery with the needs of the agency.
- Establish planning and forecasting for both physical facilities and IT equipment. Even where these efforts are carried out separately, the plans should be integrated, and planners should share their information.
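A review of performance against capacity, as described in the steps above, can start with something as simple as the sketch below. It classifies each resource by its peak observed utilization. The function name, the input shape (resource name mapped to utilization samples between 0 and 1) and the 20 percent and 80 percent thresholds are assumptions chosen for illustration; an agency would set thresholds that match its own service levels.

```python
from typing import Dict, List


def review_utilization(samples: Dict[str, List[float]],
                       low: float = 0.20, high: float = 0.80) -> Dict[str, str]:
    """Label each resource as underutilized, constrained, ok or no data."""
    findings = {}
    for name, values in samples.items():
        if not values:
            findings[name] = "no data"
            continue
        peak = max(values)
        if peak < low:
            # Even at its busiest, the resource is barely used.
            findings[name] = "underutilized"
        elif peak > high:
            # The resource approaches its capacity at peak load.
            findings[name] = "constrained"
        else:
            findings[name] = "ok"
    return findings
```

Resources labeled underutilized are candidates for consolidation, and those labeled constrained are candidates for added capacity or workload rebalancing.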
Small changes or improvements carried out on a large scale can have a significant positive impact on data center resources such as power, cooling, floor space, hardware and software usage and service delivery.
Enabling growth without compromise while stretching budgets further requires working smarter. Using a comprehensive DCIM system along with metrics that matter enables intelligent and timely decision-making.