Global IT Outage: Causes, Solutions, and Prevention

On July 19, 2024, a major global IT outage disrupted computer-reliant services around the world, grounding flights, interrupting television broadcasts, and affecting telecommunications. In the Philippines, the House of Representatives' website (congress.gov.ph) was among the many sites affected, displaying an error page generated by Cloudflare, a cloud-based cybersecurity company.

SSL handshake failed, cloudflare

The outage had a broad impact, including on the Business Process Outsourcing (BPO) industry, which relies heavily on uninterrupted IT services. For instance, many of the Top 14 BPO Companies in Davao City also experienced disruptions, affecting their operations and service delivery.

The Cause

Initial reports indicate that the outage originated from an update pushed out by cybersecurity firm CrowdStrike. The specific details of the update have yet to be confirmed, but a flaw in it likely caused the widespread system failures. The Department of Information and Communications Technology (DICT) in the Philippines stated that none of its systems were affected, but companies using the affected cybersecurity provider's software experienced disruptions.

Fixing the Issue

To address and fix such a massive cyber outage, the following steps are essential:

  1. Identify the Source: Quickly determine the root cause of the problem. In this case, cybersecurity experts would analyze the CrowdStrike update to identify the flaw.
  2. Roll Back the Update: If the update is confirmed as the cause, rolling back to a previous stable version can immediately mitigate the issue.
  3. Patch the Flaw: Develop and deploy a patch to fix the identified flaw in the update. This requires close coordination between the cybersecurity firm and affected entities.
  4. Restore Services: Gradually restore services to ensure stability and prevent further disruptions. Continuous monitoring during this phase is crucial to detect any lingering issues.
  5. Communication: Keep all stakeholders, including the public, informed about the progress and expected resolution times. Transparent communication helps manage expectations and reduce panic.
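
Step 4 above (gradual restoration with monitoring) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name restore_in_batches and the restore and is_healthy callbacks are hypothetical stand-ins for whatever recovery and health-check tooling an organization actually uses, and nothing here reflects how CrowdStrike or any affected company performed its recovery.

```python
from typing import Callable


def restore_in_batches(
    hosts: list[str],
    restore: Callable[[str], None],      # hypothetical: brings one host back online
    is_healthy: Callable[[str], bool],   # hypothetical: health check for one host
    batch_size: int = 2,
) -> tuple[list[str], list[str]]:
    """Restore hosts a batch at a time, halting at the first unhealthy batch.

    Returns (restored, pending). "pending" holds the whole failing batch plus
    every host that was never attempted, so operators can investigate first.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    restored: list[str] = []
    for start in range(0, len(hosts), batch_size):
        batch = hosts[start:start + batch_size]
        for host in batch:
            restore(host)
        # Monitor the batch before touching any more machines.
        if any(not is_healthy(host) for host in batch):
            return restored, hosts[start:]
        restored.extend(batch)
    return restored, []
```

Restoring in small batches limits the damage if the fix itself turns out to be faulty: the process stops before the remaining hosts are touched.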

Preventing Future Outages

Preventing similar incidents in the future involves several proactive measures:

  1. Rigorous Testing: Before deploying updates, conduct extensive testing in controlled environments to identify potential issues.
  2. Backup Systems: Maintain robust backup systems and procedures to quickly revert to a previous stable state in case of an update failure.
  3. Redundancy: Implement redundancy in critical systems to ensure continued operation even if one part fails.
  4. Continuous Monitoring: Establish continuous monitoring and incident response protocols to quickly detect and address issues as they arise.
  5. Regular Audits: Perform regular security audits and updates to ensure systems are secure and up-to-date with the latest protection measures.
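
Several of the measures above (testing, the ability to revert, and continuous monitoring) come together in a staged, or "canary", rollout: an update goes to a small fraction of machines first and only proceeds if they stay healthy. The following Python sketch shows the idea. The function staged_rollout, its callbacks, and the stage fractions are illustrative assumptions, not a description of any vendor's actual deployment process.

```python
import math
from typing import Callable, Sequence


def staged_rollout(
    hosts: Sequence[str],
    apply_update: Callable[[str], None],  # hypothetical: installs the update on one host
    rollback: Callable[[str], None],      # hypothetical: reverts one host to the last stable version
    is_healthy: Callable[[str], bool],    # hypothetical: post-update health check
    stages: Sequence[float] = (0.01, 0.1, 0.5, 1.0),  # cumulative fraction of the fleet
    max_failure_rate: float = 0.0,
) -> bool:
    """Deploy in waves; roll back every updated host if a wave is unhealthy.

    Returns True if the update reached all stages, False if it was rolled back.
    """
    updated: list[str] = []
    for fraction in stages:
        target = min(len(hosts), math.ceil(fraction * len(hosts)))
        wave = list(hosts[len(updated):target])
        if not wave:
            continue  # this stage adds no new hosts
        for host in wave:
            apply_update(host)
        updated.extend(wave)

        failures = sum(1 for host in wave if not is_healthy(host))
        if failures / len(wave) > max_failure_rate:
            # Revert in reverse order so the newest changes are undone first.
            for host in reversed(updated):
                rollback(host)
            return False
    return True
```

With the default max_failure_rate of 0.0, a single unhealthy host in any wave stops the rollout, so a flawed update reaches only a fraction of the fleet rather than every machine at once.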

Who Can Fix It?

Fixing such a widespread cyber outage requires the expertise of cybersecurity professionals and IT specialists. Key players involved in the resolution include:

  1. Cybersecurity Firms: Companies like CrowdStrike and Cloudflare play a critical role in identifying and patching the flaw.
  2. IT Departments: Internal IT teams within affected organizations work to implement patches, restore services, and communicate with stakeholders.
  3. Government Agencies: Bodies like the DICT monitor the situation, coordinate with affected parties, and ensure public safety and transparency.
  4. Independent Security Experts: External consultants and experts may be brought in to provide additional support and verification.

Conclusion

The global IT outage on July 19, 2024, serves as a stark reminder of how interconnected modern IT systems are and of the risks that come with that interdependence. By understanding the causes, implementing robust solutions, and taking preventive measures, we can mitigate the impact of such incidents and build a more resilient cyber infrastructure for the future.
