AlgoBuzz Blog

Everything you ever wanted to know about security policy management, and much more.


Halloween Horror: How One Bad Decision Brought Down an Enterprise E-Commerce Site in Minutes


As it's Halloween, I thought I would share an e-commerce horror story, one in which a single bad decision brought a large organization to its knees, and offer some of my insights on how it could have been avoided.

I had the invaluable opportunity to work on a project for an e-commerce service provider – a company that handles most of the transactions of its type across the U.S. One day, some members of the firewall team made a few untested and out-of-band changes to their core security policy. Suddenly network communication between the e-commerce application and the Internet was blocked, and the entire revenue-generating portion of the business was offline for a few hours. Ouch!

So, how and why did this happen? Looking back, I believe seven underlying business-process failures led to the outage:

  1. Executive management was mostly disconnected from IT, which, as with most security challenges, was a key contributor to this event.
  2. IT management (ironically) decided they were going to implement the ITIL framework to enhance change management functions within the department. This created a lot of procedural complexity (bureaucracy) that actually inhibited collaboration.
  3. Cross-functional IT team members did not communicate well, and failed to keep each other in the loop on what was being done at any given time.
  4. The staff and technology resources needed to proactively manage the firewall rules and the events that followed from them were not in place.
  5. Minimal standards (at best) existed across firewall rule configurations, which created unnecessary complexity in the network environment.
  6. The proper steps were not taken in IT audits and information security assessments that would have uncovered these broken firewall rules and the organization’s susceptibility to this risk.
  7. No metrics were in place for determining the ongoing resiliency of the firewall environment and network as a whole.

In reality, it didn’t take mere minutes for this outage to occur. It was an accumulation of bad choices across the enterprise over the months, and likely years, leading up to the actual outage.

So make sure that you and the people on your team see the bigger picture, consider the long-term impact of your choices, have reasonable processes in place, and most importantly – think before you act.

Happy Halloween!
