Data Consolidation Phase 4: Control and Administer the Consolidated Environment

With the increased use of high-bandwidth fiber-optic networks and cloud computing, more and more companies are looking to consolidate their data center operations. In his book Administering Data Centers: Servers, Storage, and Voice Over IP, Kailash Jayaswal describes best practices for consolidating data center operations. Phase 4 of data center consolidation, discussed in this section, consists of controlling and administering the newly consolidated environment.

Fitting a new environment into an otherwise smooth operation is disruptive. A consolidated set of applications sharing the same servers, resources, and infrastructure adds further strain. Limiting the impact requires diligent planning and rigorous operational processes.

The Sarbanes-Oxley Act of 2002 requires publicly traded U.S. corporations to set up and follow strict access controls and change-tracking policies, especially for servers containing financial data.

Formal policies and tight change management are far removed from the world of distributed client-server computing. Application developers and administrators do not like to work by someone else's rules. Policies, though used by some within data centers, have no following among system and data administrators in the UNIX and Windows world. Tools are created, and servers and storage are provided, as and when necessary. Such behavior gravely impairs smooth operations within a centralized environment.

It is crucial to develop a set of robust policies and operational procedures to achieve the following:

  • Increase accountability — Each person's activities on important servers can be easily tracked. This makes it possible to undo commands when needed and holds each individual accountable for his or her work. At one client site, all UNIX administrators were required to log in as themselves and use sudo for root-level commands so that their work was logged.
  • Increase availability — A recent Gartner report has shown that only 20 percent of service downtime can be attributed to hardware failures, 40 percent is caused by application and software errors, and 40 percent is caused by operational errors made by humans. To increase service uptime, it is therefore necessary to make the staff follow preset policies and a set of well-documented, technical procedures.
  • Increase security — When a set of rules is accepted and followed by all levels of IT management, there is less temptation to ignore the rules and do tasks that would compromise security.
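To make the downtime percentages above concrete, the following minimal sketch apportions a year's downtime across the three causes quoted from the Gartner report. The 50-hour total and the function name split_downtime are hypothetical, chosen only for illustration; only the 20/40/40 split comes from the text.

```python
# Shares quoted in the text: 20% hardware, 40% software, 40% human error.
DOWNTIME_SHARES = {
    "hardware failures": 0.20,
    "application and software errors": 0.40,
    "operational (human) errors": 0.40,
}


def split_downtime(total_hours):
    """Apportion total downtime hours across causes using the shares above."""
    return {cause: total_hours * share for cause, share in DOWNTIME_SHARES.items()}


# Hypothetical figure, not from the source: 50 hours of downtime in one year.
breakdown = split_downtime(50.0)
for cause, hours in breakdown.items():
    print(f"{cause}: {hours:.1f} h")
```

Under that assumed total, operational errors account for as many hours as software errors and twice as many as hardware failures, which is why preset policies and documented procedures are singled out as a lever for uptime.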

To maintain an acceptable level of accountability, service availability, and security, it is necessary to identify and prioritize the areas that can most jeopardize required service levels. The following are important categories for managing an environment shared by many applications and users:

  • Execution control — This covers policies for scheduling various data center resources among the users of the consolidated environment. The resources include staff time and hardware such as servers, network, and storage. It also includes formulating and enforcing business processes that impact several user groups.
  • Problem management — This is about identifying and resolving problems, ideally before they flare up. Most corporations have a 24x7 network operations control room or a help desk that serves as a first step in problem resolution. They have a list of internal staff members that they can call on for problems outside their technical capabilities. They also have access to people who can manage the relationship with customers during the sensitive period of "why can't anyone get the server up right now?" Problem management is crucial to delivering promised service levels.
  • Change control — This includes scheduling activities or changes in the environment, performing the work, and documenting the results of the change. These changes include hardware changes, reboots, software upgrades, configuration changes, and user account changes. Being able to perform such tasks effectively and quickly, while adhering to all the required policies, is a critical factor in delivering the promised service levels.
    Many corporations have a change-control board that must review and approve all scheduled changes. The board must ensure that no stakeholders in the consolidated environment are negatively impacted by the proposed change.
  • Asset control — In an environment that is not centralized, there is a one-application-to-one-server relationship. Each user group knows of the servers that it uses. In a consolidated environment, the asset control team (or person) must set up a computing model to determine the resource usage by each user group. This is necessary for hardware and IT staff charges that must be based on usage.
    Another important task for the asset control team is to acquire and track all hardware in use and to decommission hardware that is not being used. Every piece of equipment must have an asset tag identification (that is, a unique corporate-wide number assigned to it). This makes it easier to document details for a wide range of devices in a database.
  • Personnel management — This includes staff recruitment, training, and proper use of the staff at hand. Staff consolidation is an important goal for centralizing services. You must ensure that the right number of employees, with the correct skill sets, are assigned to each task.
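The usage-based charging described under asset control can be sketched as a simple proportional allocation. This is only one possible model, not the one the book prescribes: the function name allocate_charges, the choice of CPU-hours as the usage metric, the group names, and all figures are hypothetical.

```python
def allocate_charges(shared_cost, usage_by_group):
    """Split a shared cost across user groups in proportion to measured usage."""
    total_usage = sum(usage_by_group.values())
    if total_usage <= 0:
        raise ValueError("no recorded usage to allocate against")
    return {
        group: shared_cost * usage / total_usage
        for group, usage in usage_by_group.items()
    }


# Hypothetical monthly CPU-hours per user group on the consolidated servers.
usage = {"finance": 600.0, "engineering": 300.0, "marketing": 100.0}

# Hypothetical monthly hardware and staff cost to recover.
charges = allocate_charges(10_000.0, usage)
for group, amount in charges.items():
    print(f"{group}: {amount:.2f}")
```

A real model would likely weigh several resources (storage, network, staff time) rather than a single metric, but the principle is the same: each group pays according to its measured share rather than for a dedicated server.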

From Administering Data Centers: Servers, Storage, and Voice Over IP by Kailash Jayaswal. Copyright 2006 John Wiley & Sons, Inc. All Rights Reserved. Used by arrangement with John Wiley & Sons, Inc.