Most large and midsize law firms today have disaster recovery (DR) plans in place. But are the plans realistic? Do they truly meet the needs of the organization? Do they draw on the full range of options available to sophisticated providers of legal services? These questions are of paramount importance for those managing a disaster recovery plan, since natural disasters and terrorist threats are a growing concern for businesses.
Two key components of a modern DR plan are backup and replication. Replication used to depend on resources unavailable to many firms, such as backup or secondary data centers. However, such a project becomes more feasible for the small or midsize firm that rents space for its hardware at a network services provider. Advances in software also enable smaller and midsize firms to leverage their internal resources to create far more robust plans than previously possible.
One common mistake in DR strategy is that it is seen as purely an IT responsibility. In actuality, the subject should be treated as a business continuity measure, which means that while very much a technology concern, it is an issue that affects the whole firm’s livelihood.
The budgetary realities of DR are often greatly underestimated. Nonetheless, there are ways — with sophisticated information technology leadership and the right management of vendors — to execute an appropriate backup and replication plan and disaster recovery strategy.
What’s at Stake
Some firms make the mistake of placing DR strategy behind other “mission-critical” deployments. But one can effectively argue that law firms have a fiduciary duty to their clients to be able to protect clients’ data and ensure its accuracy and integrity. The passage and enforcement of Sarbanes-Oxley and the Health Insurance Portability and Accountability Act make this all the more a reality. (You can see already how this is not merely an IT concern.)
Aside from pride of reputation, the threat of a malpractice suit should, in and of itself, pose enough concern to ensure that a firm acts responsibly in the representation of a client. An interruption in client service would be seen as a profligate waste of a company’s time and money. And, as statistics show, in the end, clients don’t care about firms’ excuses for interruptions in service — if their needs are not being met, they will take their business to another firm. Just as bad press can tarnish a company’s reputation, an interruption or failure to perform due to natural disaster or an attack on the firm’s system creates a cost that will, in the end, be borne by the firm. This, then, becomes a significant business strategy — clients need to be assured that protection and accessibility of their data is part of the package of professional services that they will be receiving.
Recently, I led a disaster recovery plan implementation for my former firm, one with 200 attorneys and 500 total personnel. This was a full-service firm in the Southeast, which was directly in the line of hurricane activity.
An effective disaster recovery strategy requires that data be stored redundantly, at multiple physical locations. An organization needs to start by asking some basic questions that may not have simple answers. How will it protect its data? How does it back up that data? How does it replicate the data or host it off-site? How will it ensure availability of data? (Will it need to rebuild servers and restore from tapes?) Where will it source hardware? Where will it set up shop?
There are several options and providers to consider:
• Traditional tape. This provides the simplest data protection, but it is not always the most reliable and offers lower availability than the other options, as servers would need to be rebuilt and data restored.
• Disk to disk (VTL). With the advent of cheaper hard disks and “virtual tape libraries,” this provides a slight improvement over simple tape backup, as backup and restoration can be performed more quickly.
• Online backup (AmeriVault, eVault, Mozy). This is a simple solution and, for smaller shops with less data, often a less expensive one. Numerous vendors offer packages priced by the total volume of data being protected. This provides “always available” access to data.
• Off-site to another office. For firms with multiple offices, this makes better use of resources at the “remote” site and makes it easier for staff to manage redundant data and equipment.
• Co-location. This is for single-office firms or those firms that don’t have the resources, or the desire, to invest in the infrastructure required to build a redundant data center. Numerous vendors can provide cost-competitive proposals for “hosting” space.
• Fully outsourced (IBM, SunGard). A “high-end” and often expensive solution wherein a third-party provider offers a “one-stop shop” for hosting and replication/backup services.
EXPLORING OUR NEEDS
At the crux of the matter was the question: “What must lawyers have to meet client needs?” The answers could, for the sake of this article, be boiled down to: e-mail and calendaring, documents, and Internet access.
The next question becomes: “What do we need to run the firm?” Financial systems and payroll were the main concerns, from an administrative perspective. There were other functions that could be lined up under this administrative hierarchy, but the maxim, “You can protect anything — for a price” rang true during this process. A firm’s CIO or chief knowledge counsel must prioritize what data needs to be protected the most.
There were several options appropriate for us that were discussed: Backup (on-site, off-site, hosted); replication (real-time, set interval, etc.); continuous data protection (CDP) products; outsourcing the full process; or “a little bit of everything,” wherein we would utilize a collection of various strategies.
Disaster preparedness was clearly one priority that needed to be addressed — in 2004 there were four hurricanes in our region. Power was lost in some cities for days. Firms caught unprepared had no access to their data. Many attorneys lost physical access to their offices.
At my previous firm, where we tackled this project early on, disaster recovery and business continuity were, at the time, as yet unrefined. Under our “old” system of data protection, backup took more than 30 hours to complete. Recovery could have taken days.
Before our redesign of the system, our network consisted of 50 Windows 2003 servers, distributed across our 10 offices, and approximately 2.5 terabytes of active data. The wide area network (WAN) topology was a dual hub-and-spoke, with the two largest offices acting as the hubs; in the event of a failure at either office, at least half of the firm would be impacted. This design left the network in a precarious position. The firm used Microsoft Exchange for e-mail (with approximately 600 mailboxes and total mail stores of more than 500 GB), split between two servers, one in each of the two “hub” offices. We had an average “change rate” (the amount of data added or modified on the servers) of around 10 GB/day. The firm’s document management system (Interwoven) stored roughly 3 million documents, also distributed on servers across the network. We also relied on many other typical systems, such as Elite.
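A change rate like the one above gives a rough floor for the WAN bandwidth that replication consumes. The following is a minimal sketch, not the firm's actual sizing method: the function name `min_replication_mbps` is hypothetical, it assumes decimal units (1 GB = 10^9 bytes), and it assumes changes arrive evenly across the chosen window. Real traffic is bursty, and compression or WAN optimization will shift the numbers.

```python
def min_replication_mbps(change_gb_per_day: float, window_hours: float = 24.0) -> float:
    """Average bandwidth, in megabits per second, needed to ship one day's
    changed data within the given window.

    Assumes decimal units (1 GB = 10**9 bytes) and an even spread of changes.
    """
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    bits = change_gb_per_day * 1e9 * 8
    seconds = window_hours * 3600
    return bits / seconds / 1e6


# Figure from the article: roughly 10 GB/day of changed data.
steady = min_replication_mbps(10)        # changes spread over 24 hours
workday = min_replication_mbps(10, 8)    # assumption: all changes land in an 8-hour day

print(f"{steady:.2f} Mbps averaged over 24 hours")
print(f"{workday:.2f} Mbps if concentrated in 8 hours")
```

Treat the result as a lower bound for planning conversations with a WAN provider; peak periods can require several times the average.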
With servers at each of our 10 offices and substantial amounts of data to protect, the existing tape backup was proving inadequate; long backup and restore windows and lack of real-time protection made it a poor solution. We considered our options and then settled on a hybrid approach — disk-to-disk-to-tape backups in conjunction with real-time replication and “fail-over” (the ability to switch from a primary server at one site to a secondary, replica server at another site) for high availability.
OUR CHOSEN STRATEGY
An effective disaster recovery solution requires a data replication solution capable of transporting large amounts of data over the long distances required for geographical diversity of the data storage sites. One strategic step we took was to consolidate our systems; doing so helped simplify management of servers and data by centralizing them into one location. This drastically simplified and improved the backup and recovery process while reducing costs and complexity. It also enabled a simpler replication strategy (one-to-one, as opposed to many-to-many or many-to-one).
One of the major strengths of the redesigned network is that the MPLS “cloud” links each location directly to every other location, rather than having to route all traffic through a primary “hub” location. This helped reduce the dependence on a single site for all communications and access to data. Further, the consolidation of all systems into a primary site vastly reduced the complexity of establishing a secondary, replica site.
One of the key components in the consolidation effort was the utilization of WAN optimization devices. The firm utilized Riverbed Technology to make this happen. By virtue of centralizing all servers into one location, the WAN became a critical component in providing services and functionality to the remaining nine “remote” sites. Because of the large volumes of data and numerous applications running across the WAN, personnel at those remote sites would be reliant upon more robust connectivity. The Riverbed devices helped provide that.
WAN optimization enabled centralization and consolidation; further, it vastly reduced the complexity of our replication scenario. After evaluating several products, we settled on a host-based replication product from CA called XOSoft High Availability.
The software enabled the firm to replicate all of the data from our critical servers at the main data center in Florida to a secondary data center — a co-location facility hosted by our WAN provider, Global Crossing. We chose the Chicago site to minimize the likelihood of any one single event affecting both the primary and secondary sites (the most common event in Florida is hurricanes, though other factors such as power grids and threats of terrorism were also considered).
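One simple way to sanity-check geographic diversity is to compute the straight-line separation between candidate sites. The sketch below uses the standard haversine formula. The coordinates are approximate city-center values for Fort Lauderdale and Chicago, chosen here as an assumption, since the article does not give the exact facility locations. Distance is only one factor; shared power grids or carriers are not captured.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two points (haversine formula).

    Inputs are latitude and longitude in decimal degrees.
    """
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Approximate city-center coordinates (assumption, not facility addresses).
FORT_LAUDERDALE = (26.12, -80.14)
CHICAGO = (41.88, -87.63)

separation = great_circle_km(*FORT_LAUDERDALE, *CHICAGO)
print(f"Primary and secondary sites are roughly {separation:.0f} km apart")
```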
In executing our plan, in addition to the Riverbed and CA products that we purchased, we utilized as much pre-existing software and hardware as possible. By relocating servers from each of the “remote” sites into the primary and secondary data centers, we were able to reduce costs and minimize the financial impact to the firm.
Once all of the servers were consolidated, backup performance immediately improved. Because we were no longer moving vast quantities of data across the WAN, we were able to perform complete backups of all systems in less than six hours. We employed a disk-to-disk backup scenario (performed at times of lower network utilization) followed by a further backup from disk-to-tape, which was performed immediately following. Additionally, all data was being replicated in “real time” from the primary site in Florida to the secondary site in Chicago.
In addition to the dramatic reduction in backup time (from 30 hours down to six), the firm is now able to provide true “high availability” of all systems; in the event of a failure of one of the systems at the primary site (or of all systems there), the secondary servers can be “failed-over” to provide immediate access to all critical functionality (e-mail, documents, financials, etc.).
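The backup windows translate into a sustained throughput figure, which is useful when comparing backup targets. This is a rough sketch under stated assumptions: that each run was a full backup of the roughly 2.5 TB of active data mentioned earlier (the article does not state the exact backup set size), and that units are decimal (1 TB = 10^6 MB). The function name `required_throughput_mb_s` is hypothetical.

```python
def required_throughput_mb_s(data_tb: float, window_hours: float) -> float:
    """Sustained throughput, in megabytes per second, needed to copy
    data_tb terabytes within window_hours (decimal units)."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    megabytes = data_tb * 1e6
    return megabytes / (window_hours * 3600)


# Assumption: the full backup set is about 2.5 TB in both cases.
before = required_throughput_mb_s(2.5, 30)  # old 30-hour window
after = required_throughput_mb_s(2.5, 6)    # new 6-hour window

print(f"30-hour window: about {before:.1f} MB/s sustained")
print(f"6-hour window:  about {after:.1f} MB/s sustained")
print(f"Speedup: {after / before:.0f}x")
```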
Further, the software performing the replication and fail-over also had other features that provided the firm with true continuous data protection by enabling almost instantaneous recovery of deleted or corrupted files. The combined system, in backup and recovery terms, enabled an almost zero recovery point objective (RPO), meaning almost no data would be lost in a failure; we could also “roll back” or return the systems to any point in time to restore data. Also, for all practical purposes, we achieved an almost zero recovery time objective (RTO): the switch over to the secondary servers took only moments, and systems were unavailable for only a negligible period of time.
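RPO and RTO are targets; after a test or a real event, the achieved figures can be checked against them. The sketch below is illustrative only. The timestamps and the target values are hypothetical, and the function names (`data_loss_window`, `downtime`) are not taken from any product.

```python
from datetime import datetime, timedelta


def data_loss_window(failure_at: datetime, last_replicated_at: datetime) -> timedelta:
    """Time between the last replicated change and the failure.
    This is the achieved recovery point, to compare against the RPO."""
    if last_replicated_at > failure_at:
        raise ValueError("last replication cannot be after the failure")
    return failure_at - last_replicated_at


def downtime(failure_at: datetime, restored_at: datetime) -> timedelta:
    """Time between the failure and service restoration.
    This is the achieved recovery time, to compare against the RTO."""
    if restored_at < failure_at:
        raise ValueError("restoration cannot precede the failure")
    return restored_at - failure_at


# Hypothetical event and targets, for illustration only.
failure = datetime(2000, 1, 1, 9, 0, 0)
last_replicated = failure - timedelta(seconds=5)
restored = failure + timedelta(minutes=2)

rpo_target = timedelta(minutes=1)
rto_target = timedelta(minutes=15)

loss = data_loss_window(failure, last_replicated)
outage = downtime(failure, restored)
print("RPO met:", loss <= rpo_target, f"(lost {loss})")
print("RTO met:", outage <= rto_target, f"(down {outage})")
```

Recording these two figures after each test makes it easy to see whether the plan still meets its objectives as data volumes grow.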
Our timing for orchestrating this couldn’t have been better. In 2005, there were 27 named storms, including 15 hurricanes. Four reached Category 5 status.
That year, there were two direct hits to the Fort Lauderdale area. Hurricane Katrina was a Category 1 at impact, while Hurricane Wilma was a Category 3 at impact. Offices in the storm path were routinely closed, beforehand, for storms of this magnitude.
Prior to the storms’ arrival each time, the firm activated the Chicago data center (by switching those servers to “active” status). Because we had the luxury of performing the switch before the pending disaster, we were able to slowly and methodically bring the servers online. The full process required less than 15 minutes.
On the day of Katrina’s impact on Fort Lauderdale, the firm’s primary data center lost power for less than 12 hours. Regardless, all systems remained online in Chicago and fully operational. Hurricane Wilma directly impacted Fort Lauderdale as well as four other offices. Each of those offices remained closed for one to five days. During that time, all systems remained operational from the Chicago data center. Attorneys worked from open offices or via remote access.
Though we had the benefit of knowing when the “disaster” would arrive, our plan was designed to be just as effective if activated after the fact.
For events where we were able to put the plan into action beforehand, IT staff was called upon to activate it. However, because we couldn’t always rely upon the availability of those same staff members during an unannounced event, we ensured that someone outside the department would also be well versed in the plan and capable of activating it.
The plan needs to be documented and carefully explained so that anyone can carry it to the finish line. Monthly partial tests of the plan and annual full testing are a must.
And it’s important to remember that DR plans are part of an overall business continuity plan, and they can be done on any budget. The right plan is determined by the size and particular needs of the organization.
A longer version of this piece originally appeared in LJN’s Legal Tech Newsletter, an affiliate of the Recorder.
Ben Weinberger is chief information officer of Lathrop & Gage L.C. The Kansas City, Mo.-based law firm, with approximately 300 attorneys in 10 offices nationwide, recently announced it will merge with a Los Angeles trial boutique in January. Weinberger’s previous experience includes work for the city attorney’s office in Los Angeles as director of information technology, as well as for IT consulting firms in London and Chicago.