Tech

A look at how Mass General Brigham recovered from the CrowdStrike outage

The tech team at Mass General Brigham put code on a thumb drive to bring thousands of devices back from the blue screen of death.

Drs Producoes/Getty Images

Just like any other Friday at 1:45am, Adam Landman was fast asleep on July 19.

But unlike any other Friday morning, Mass General Brigham (MGB)’s chief information officer was woken up by a phone call about the last thing someone with his title wants to hear: a cyber outage.

Landman said at first he thought the issue was a malicious attack after cartoon characters popped up on displays. But he soon discovered it was the CrowdStrike outage, a major cyber shutdown caused by a faulty CrowdStrike software update that crashed Microsoft Windows machines across many different industries.

Healthcare Brew reported that day how MGB had canceled elective procedures and ambulatory care at its facilities. The hospital’s electronic medical records system, Epic, could also only be used in read-only mode, forcing staff to manually document care on paper. By Sunday, most of MGB’s systems were back up and running.

Landman spoke with Healthcare Brew about MGB’s weekend recovering from the CrowdStrike outage.

Middle of the night. After breathing a sigh of relief that the incident wasn’t “something sinister,” Landman said he quickly began meeting with his team, including Paul Biddinger, chief preparedness and continuity officer.

Meanwhile, on-site staff were sent into “downtime procedures,” which meant resorting to handwriting notes.

“The majority of our healthcare workforce now has not really worked on paper, and so we know that there are a lot of challenges in making that switch, especially in the middle of the night,” Biddinger said.

Landman and other leaders opened a physical command center at around 2:30am at MGB’s headquarters in Somerville, Massachusetts, and asked each hospital to open its own command center to coordinate a system-wide recovery.

The team gave itself a deadline of 5am to decide if the hospital could go forward with its elective surgical and outpatient procedures that would have started at 6am that day, but ultimately, all nonessential cases were rescheduled.

“We really, really hate postponing care. But if there’s a safety and throughput issue, we have to do it,” Biddinger said.

Landman recalled that the downtime lasted from the morning until 11pm on July 19. Chief Nurse Debra Burke said she couldn’t remember Mass General Hospital’s system being down for that many hours since it moved to electronic medical records over 30 years ago.

An Epic recovery. According to Landman and Biddinger, MGB had 45,000 computers with the blue screen of death it needed to bring back online.

When the outage took the computers and servers down, it also triggered MGB’s encryption protection tool, Microsoft BitLocker, which locks a computer in case it goes missing or is stolen.

In order to unlock the security feature, Landman said, staff had to follow the procedures that CrowdStrike provided, which involved inputting a 48-character recovery code into every machine. This manual, multistep process took about 15 to 20 minutes per computer.
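To give a sense of what typing that code by hand involves, here is a minimal Python sketch that checks the shape of a BitLocker recovery password before someone walks it over to a machine. It assumes the commonly documented format (48 digits in eight groups of six, each group divisible by 11 with a quotient below 65,536). The function name is hypothetical, and this is not part of MGB’s or CrowdStrike’s procedure.

```python
import re


def normalize_recovery_password(raw: str) -> str:
    """Return the password as eight hyphen-separated groups, or raise ValueError.

    Assumption: each six-digit group must be divisible by 11 and its
    quotient must fit in 16 bits, per the commonly documented format.
    """
    # Ignore spaces and hyphens so pasted or hand-typed input both work.
    digits = re.sub(r"[\s-]", "", raw)
    if not re.fullmatch(r"\d{48}", digits):
        raise ValueError("expected exactly 48 digits")

    groups = [digits[i:i + 6] for i in range(0, 48, 6)]
    for index, group in enumerate(groups, start=1):
        value = int(group)
        if value % 11 != 0 or value // 11 >= 2 ** 16:
            raise ValueError(f"group {index} ({group}) fails the format check")
    return "-".join(groups)
```

A check like this only catches mistyped digits. It cannot tell whether a well-formed code actually belongs to the computer in front of you.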

Once Epic was back up at around 11pm on Friday, he said MGB was able to resume ambulatory, inpatient, and emergency care.

Clinicians later had to digitally document everything they had written on paper that day, which often didn’t match the layout in Epic, Burke said. She added that the hospital plans to print out templates that look like the digital interface to make the transition smoother in case of future outages.

Fighting the blue screen. Overriding security measures proved to be a time-consuming process, with tens of thousands of computers still down on Friday night across the system, Landman said. At up to 20 minutes apiece for 45,000 computers, complete recovery would take as much as 15,000 hours of hands-on work.
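The arithmetic behind that estimate is simple enough to check. The short Python sketch below uses only the figures reported here (45,000 computers, 15 to 20 minutes each); the function name is illustrative.

```python
COMPUTERS = 45_000  # devices reported as stuck on the blue screen


def total_hours(computers: int, minutes_per_computer: float) -> float:
    """Hands-on hours needed if every machine is fixed manually."""
    return computers * minutes_per_computer / 60


low = total_hours(COMPUTERS, 15)   # 11250.0 hours at the fast end
high = total_hours(COMPUTERS, 20)  # 15000.0 hours at the slow end
print(f"Manual recovery: {low:,.0f} to {high:,.0f} hours")
```

These are total labor hours, not elapsed time. The more people working in parallel, the shorter the calendar time, which is why a fix that anyone on staff could apply mattered so much.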

That’s when members of the tech team, including Mike Ricci and Shawn Martineau, had a major breakthrough: They programmed a thumb drive that could be plugged into any computer and access BitLocker with just the click of a few buttons.

“This was really transformational because, all of a sudden, we were able to hand a USB stick with this script on it to any workforce member across MGB, and they were able to fix devices,” Landman said, adding that at the peak of the recovery process, hospital employees were regaining access to about 2,000 computers per hour.
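The article does not describe what was inside MGB’s script, so the following is not a reconstruction of it. It is a minimal Python sketch of one step from CrowdStrike’s publicly documented manual workaround: once a machine has been booted into Safe Mode or the Windows recovery environment (the point at which BitLocker asks for the recovery code), delete the faulty channel file matching `C-00000291*.sys` from the CrowdStrike drivers folder. The function name and the `dry_run` flag are hypothetical.

```python
from pathlib import Path

# File pattern and folder from CrowdStrike's public remediation guidance.
CHANNEL_FILE_PATTERN = "C-00000291*.sys"
DEFAULT_DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")


def remove_faulty_channel_files(
    driver_dir: Path = DEFAULT_DRIVER_DIR, dry_run: bool = True
) -> list[Path]:
    """Find (and optionally delete) channel files matching the faulty pattern.

    With dry_run=True (the default) nothing is deleted; the matches are
    only reported so a technician can review them first.
    """
    matches = sorted(driver_dir.glob(CHANNEL_FILE_PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

Defaulting to a dry run is a deliberate choice for a sketch like this: a script handed to non-technical staff should show what it is about to remove before it removes anything.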

On July 21, MGB’s tech team shared a version of the thumb drive code in a repository on GitHub, a collaborative platform for software developers, so other businesses could use it in their recovery. “We did hear that a couple of organizations were able to use the script in their own organization to be able to help remediate devices,” Landman said.

Overall, Landman said his biggest takeaway is that it’s not a matter of if cyber issues will happen, but when, and that they won’t always be malicious incidents.

“Digital threats are one of the biggest threats to the delivery of healthcare that we have across the country right now because computers are so integral to how we deliver medical care,” Biddinger said. “Hospitals and healthcare systems have to really prioritize the resilience of their system, the security of the system, so that they’re as protected as they can be. But no system is infallible.”
