Cloud Computing for Real

Moving to the cloud may be new territory for your agency, so learn how to craft an SLA that aligns with your needs and mitigates risk.

Karen Mercedes Goertzel, Holly Lynne Schmidt, Theodore Winograd and Kristy Mosteller

Karen Mercedes Goertzel, Holly Lynne Schmidt, Theodore Winograd and Kristy Mosteller are subject-matter experts in software assurance, cross-domain information sharing, information assurance and cybersecurity technologies and trends for <a target="_blank" href="http://boozallen.com">Booz Allen Hamilton</a>.

As federal organizations evaluate the potential of migrating services to public cloud computing environments (CCEs), the decision-making process requires a strong understanding of the technology’s capabilities and limitations, evolving government security policies, and how an organization’s security policies can be implemented and enforced while using a public CCE — in addition to a comprehensive transition methodology.

Although a comprehensive evaluation of the cloud-computing environment is an undoubtedly complex undertaking, no single element is more important than selecting the right vendor. The primary concern associated with cloud offerings is that customer data is stored offsite at the vendor’s data centers and therefore must be protected by the vendor’s security controls. An additional concern with cloud offerings is that data from multiple customers is potentially co-located in one facility — increasing the value of the data stored at the center.

Although many vendors provide customers thorough descriptions of their existing security controls, few — if any — allow customers to perform a detailed audit of their security controls and standards. In addition, few cloud vendors are willing to modify their security controls for cloud offerings at a customer’s request.

Even so, there are a number of factors that can be evaluated in fine detail. Let’s consider six of them:

Number 1: Service-Level Agreements

SLAs represent both a source of security risk in public CCEs and a means of addressing security risk. As providers of a commodity service, many public CCE providers tend to standardize their SLAs for all of their customers. SLA standardization can introduce risk by failing to address and accommodate the specific security requirements of individual customers.

In the absence of an SLA that satisfies all of a customer’s requirements, that customer will need to negotiate with the provider to tailor the SLA to satisfy the organization’s needs; otherwise, the organization will need to find alternative means by which to minimize residual risk.

With this in mind, some of the leading public CCE providers recognize a need to meet their customers halfway for widespread public CCE adoption to occur. For example, Amazon has begun offering “virtual private cloud” services that combine the outsourcing advantages of the public cloud with increased customer visibility, control and service tailoring (at a cost higher than that of Amazon’s fully public Elastic Compute Cloud service). But for many customers, the perceived cost savings associated with using public CCEs offered under less-than-ideal standard SLA terms will outweigh the potential losses associated with the increased risk that accompanies the acceptance of such terms.

The key to success is setting realistic expectations from the outset, so that all stakeholders understand the exact level of service to expect from the cloud provider. If the SLA meets the performance/availability/reliability (PAR) needs of the organization, the cost-benefit analysis is usually clear (keeping in mind that PAR must explicitly include all of the customer’s security considerations). If the SLA falls short, agencies should factor the quantified level of risk associated with the shortfall into their cost-benefit equation.
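One way to picture how a quantified shortfall might enter the cost-benefit equation is the minimal sketch below. It assumes, purely for illustration, that each SLA shortfall can be expressed as an expected number of incidents per year and a cost per incident; the function names and all dollar figures are hypothetical and do not come from any provider or agency.

```python
def expected_annual_loss(incidents_per_year, cost_per_incident):
    """Expected yearly loss from one SLA shortfall (frequency times impact)."""
    return incidents_per_year * cost_per_incident


def risk_adjusted_savings(in_house_cost, cloud_cost, shortfall_risks):
    """Annual savings from moving to the cloud, minus expected losses.

    shortfall_risks is a list of (incidents_per_year, cost_per_incident)
    pairs, one per requirement the standard SLA does not cover.
    """
    expected_loss = sum(
        expected_annual_loss(rate, cost) for rate, cost in shortfall_risks
    )
    return in_house_cost - cloud_cost - expected_loss


# Hypothetical figures for illustration only.
risks = [
    (0.5, 200_000),  # e.g., an outage beyond the SLA terms every other year
    (0.1, 500_000),  # e.g., a data-exposure event once a decade
]
savings = risk_adjusted_savings(1_000_000, 700_000, risks)
print(f"Risk-adjusted annual savings: ${savings:,.0f}")
```

A positive result suggests the cloud option still comes out ahead once the shortfall is priced in; a negative result suggests the standard terms cost more in risk than they save.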

Number 2: Standard Vendor SLAs and Negotiability

SLAs are not a new concept. Agencies should apply the same standards in a CCE agreement that they would in any SLA. Given the less certain nature of the cloud, however, organizations must be especially vigilant in assessing their needs, what they are or are not willing to negotiate, and the price they are willing to pay for guarantees and assurances.

The key is to annotate nonnegotiable needs as the baseline price point and proceed from there. At a minimum, the SLA should include details about asset ownership, downtime, customer service and pricing. Organizations should be familiar with common language used in the CCE, as well as standard SLA language. Combined, this language creates an agreement that may differ from an organization’s understanding of typical SLA terminology.

Large-scale providers typically use more generic language to encompass broader capabilities and to relieve the agency’s (i.e., the customer’s) responsibility. It is the organization’s responsibility to ensure the SLA expressly identifies and defines specific issues and desired services.

Number 3: Data Ownership and Control

Agencies should ensure the SLA clearly defines who has access to the data and the protections that are in place. The data and IT managers will need to understand how the provider’s infrastructure and services are used to provide persistent access to needed applications and data sets. Continuity is important.

In a perfect world, a vendor could guarantee access 100 percent of the time, but, in reality, a guarantee like that is impossible. The agency’s legal department needs to understand the differences between common SLA terms such as “average configuration downtime” or “network downtime” versus “systems downtime.” Organizations should also have a clear definition of who owns the data and should consider self-protecting data options as necessary.

Number 4: Auditing and Accountability

Although most cloud providers will record access to the system in specified log files, gaining access to audit logs can be a difficult process. In some instances, the cloud provider’s logs may be insufficient for a particular agency’s needs. As a result, organizations are often forced to run their production applications on an implicit SLA that is usually interpreted as a simple equation of receiving X units of service for N price. Unfortunately, this implicit SLA does not address the scalability of an organization’s resources based on changes in demand.

Auditing becomes another crucial factor in assessing the agency’s true needs and being able to meet ever-changing demands in service. Instead of accepting what the CCE provider sends the organization at the end of the month as a bill, an organization should understand that cloud computing is complex enough that a reasonable set of runtime information must be made available to substantiate the provider’s claim for compensation.

This point is particularly true in developing an SLA. If the agency’s infrastructure regularly adjusts to meet demand, it is essential to be able to verify that the infrastructure is responding as contracted. To complicate matters further, CCEs, which are innately self-service and on demand, create a plethora of data to be maintained and filtered. For this reason, SLAs with providers should explicitly state that real-time auditing or logging (for accountability) will be performed and resulting reports will be made accessible. A tailored audit can provide the agency a clear understanding of where responsibilities lie.
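As a rough illustration of substantiating a monthly bill with runtime information, the sketch below recomputes charges from usage records and compares the result with the invoiced amount. The record layout (resource ID, hours, hourly rate), the function name and the figures are assumptions made for this example; real provider logs and billing models will differ.

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    """One hypothetical line of provider-supplied runtime data."""
    resource_id: str
    hours: float
    hourly_rate: float


def reconcile_invoice(records, invoiced_total, tolerance=0.01):
    """Recompute charges from usage records and compare with the invoice.

    Returns (computed_total, difference, matches), where difference is
    the invoiced amount minus the amount the records support.
    """
    computed = sum(r.hours * r.hourly_rate for r in records)
    difference = invoiced_total - computed
    return computed, difference, abs(difference) <= tolerance


# Hypothetical month of usage and a hypothetical invoice.
records = [
    UsageRecord("vm-1", hours=720, hourly_rate=0.10),
    UsageRecord("vm-2", hours=100, hourly_rate=0.25),
]
computed, difference, matches = reconcile_invoice(records, invoiced_total=110.00)
print(f"Supported by logs: ${computed:.2f}; unexplained: ${difference:.2f}")
```

Any unexplained difference is a prompt to ask the provider for more detailed records, which is only possible if the SLA guarantees access to them.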

Number 5: Compliance Environments

Compliance environments cited by experts as important for cloud computing include Statement on Auditing Standards (SAS) No. 70, the Payment Card Industry Data Security Standard (PCI DSS) and the Health Insurance Portability and Accountability Act (HIPAA).

In particular, experts cite the section in SAS 70 on service organizations, issued by the Auditing Standards Board of the American Institute of Certified Public Accountants. Put simply, the vendor managing the cloud must be able to describe what is happening, where the information comes in, what the vendor does when it gets the information, how it gets back to the users, the controls over the processing of the data and, most important, what happens to the data when it gets to the cloud.

Although many public CCE providers have SAS 70 or even PCI compliance, many do not fully address NIST Special Publication 800-53, Recommended Security Controls for Federal Information Systems and Organizations. SP 800-53 outlines the security controls expected of federal organizations. Although many cloud providers lack a full understanding of recommended security controls, some adequately comply with SP 800-53.

When assessing a provider, organizations should consider the following:

Is the provider familiar with federal requirements? Are the security controls the CCE provider is responsible for compliant? How does the CCE provider display evidence of compliance? How is compliance maintained?

Number 6: Quality of Service and Quality of Protection Concerns

Public cloud service SLAs must account for at least two vendors: the cloud service provider and the communications vendor that provides the circuit by which users access the cloud service.

This dual-provider scenario adds complexity not only when crafting the SLA but also for holding vendors accountable (and legally liable) for failures. In addition, agencies must acknowledge that all service providers will experience downtime at some point because of situations beyond their control, including natural disasters and interruptions in the public infrastructure. Most service providers offer an assurance of 99.5 percent uptime.
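To put the 99.5 percent figure in concrete terms, the short sketch below converts an uptime percentage into permitted downtime and estimates the combined availability when the cloud provider and the circuit vendor are both needed. It assumes a 30-day month and independent failures; the 99.5 percent figure for the circuit vendor is a hypothetical value chosen for illustration.

```python
HOURS_PER_MONTH = 30 * 24   # assumes a 30-day month
HOURS_PER_YEAR = 365 * 24   # ignores leap years


def allowed_downtime_hours(uptime_percent, period_hours):
    """Hours of downtime an uptime guarantee permits over a period."""
    return period_hours * (1 - uptime_percent / 100)


def combined_availability(*uptime_percents):
    """Availability of services used in series, assuming independent failures."""
    combined = 1.0
    for percent in uptime_percents:
        combined *= percent / 100
    return combined * 100


monthly = allowed_downtime_hours(99.5, HOURS_PER_MONTH)
yearly = allowed_downtime_hours(99.5, HOURS_PER_YEAR)
print(f"99.5% uptime allows about {monthly:.1f} hours down per month")
print(f"99.5% uptime allows about {yearly:.1f} hours down per year")

# Cloud provider at 99.5% plus a hypothetical circuit vendor at 99.5%.
end_to_end = combined_availability(99.5, 99.5)
print(f"End-to-end availability: about {end_to_end:.2f}%")
```

Under these assumptions, a 99.5 percent guarantee permits roughly 3.6 hours of downtime a month, and chaining two such vendors lowers the end-to-end figure to roughly 99 percent.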

Even with expected legalese, a provider can make a reasonable attempt to guarantee an acceptable level of service. With that in mind, the real question becomes the following: What happens when service is interrupted?

Some providers have established mechanisms to assist organizations in assessing the quality of service they are receiving and to inform organizations of any potential downtime or service disruptions. Although these mechanisms do not mitigate the actual loss of time or data, they can be used to prepare preventative and contingency measures after an SLA has been established.

The bottom line in addressing sustainment issues is an agency’s ownership of risk. Using available resources to monitor the health of a provider’s service capabilities (as available) is one viable option. If an agency is clear in its expectations early on and develops appropriate SLA language, it will enjoy greater assurance of continuity of operations.