Research Project - Outcome
- Author: Mitchell
Introduction
On 19 July 2024, millions of Windows computers running CrowdStrike Falcon, an anti-virus product, crashed at the kernel level. “You can see folks saying hey uh oh this is affecting us on a Friday and it’s taken out an airport or a bank or whatever state, municipality, government, 911,” said security researcher John Hammond in a video released on the day of the incident, in which he discussed the anti-virus and the problems it caused. As Hammond mentioned, 911 services in many states across the United States went down because of the worldwide crash. According to Yancey-Bragg (2024), both non-emergency and 911 emergency calls in Alaska went unanswered for about seven hours. This report reviews three key findings from research into the event. The first is that millions of businesses depend on the CrowdStrike Falcon anti-virus. The second comes from the root cause analysis published on the official CrowdStrike website, which details how the event occurred and indicates that the root cause was a failure in the RegEx processing within the kernel driver. The third comes from the same document, in which CrowdStrike admits to not testing and staging its code before releasing it to the millions of businesses that rely on it. This report aims to present this information in a way that the general public can understand.
Key Finding 1: Millions of businesses depend on anti-viruses like CrowdStrike
Millions of businesses rely on the security of their anti-virus software across their entire organisational structure, and this includes CrowdStrike’s endpoint protection product, Falcon. Researching recommended anti-viruses led to multiple websites that mention CrowdStrike and give an opinion on it. TechRepublic and Gartner gave CrowdStrike a positive recommendation, with TechRepublic stating that CrowdStrike Falcon was the best for top-tier security and Gartner giving it 4.8/5 stars. However, eSecurity Planet gave CrowdStrike only 3.8/5 and listed reasons why it is not the best: “Frequent console design changes”, “No manual way of quarantining files” and “Limited Linux OS support”.
According to Dell (n.d.), CrowdStrike’s Falcon is an agent-based sensor that can be installed on every common operating system, including Windows, Linux and macOS. It works from a predefined list of prevention hashes in SHA-256 format. SHA-256 is part of the Secure Hash Algorithm 2 (SHA-2) family, a set of cryptographic hash functions designed by the National Security Agency in the United States. According to 101 Blockchains (2020), hashing is a method that converts data into a unique string of text. Hashing is one-way, so the original data cannot be recovered from the hash and, unlike encryption, there is no key or certificate that decrypts it. This lets software recognise a known file by comparing hashes without needing to share or inspect the file itself.
CrowdStrike does not publish direct statistics on the number of customers using its products; however, market share values are available from the tracking platform 6sense (n.d.). 6sense gives CrowdStrike a market share of 23.76%, which ranks it #1 on the 6sense website. 6sense also lists customers that use CrowdStrike, including health systems such as St. Luke’s Health System in Idaho, United States and SCL Health in Colorado, as well as the widely used card company Visa Inc.
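As a rough illustration of how a list of prevention hashes can work, the Python sketch below fingerprints a file's bytes with SHA-256 and checks the result against a blocklist. The names `PREVENTION_HASHES`, `sha256_hex` and `is_blocked` are hypothetical and the single blocklist entry is made up for the example; this is not CrowdStrike's code or data format.

```python
import hashlib

# Hypothetical blocklist of SHA-256 digests. The only entry is the digest of
# the made-up bytes b"known-bad sample", not a real threat.
PREVENTION_HASHES = {hashlib.sha256(b"known-bad sample").hexdigest()}


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()


def is_blocked(data: bytes) -> bool:
    """Return True when the digest of data appears in the prevention list."""
    return sha256_hex(data) in PREVENTION_HASHES
```

Changing even one byte of the input produces a completely different hash, which is why a lookup like this only matches exact copies of a known file.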
So far, CrowdStrike has not faced any repercussions for the negligence behind the incident. However, according to BBC (2024), a senior executive from the company, Adam Meyers, had to apologise before the US Congress and received a “grilling” for the company's negligence. CrowdStrike was also not the only organisation to suffer economic losses, and the organisations relying on it suffered the most. Delta Air Lines stated that it had lost over 500 million US dollars as a result of the incident.
Key Finding 2: An extra input was the primary cause in the kernel driver
Reports released on the official CrowdStrike website, in a document titled “Root Cause Analysis”, show that the root cause of the incident was a failure in the processing stage before the content was consumed and processed further by the RegEx engine deployed on the Falcon sensors (CrowdStrike, 2024). For some background, the kernel runs in ring 0 of the protection rings, a set of levels that define how much access code has to system resources, with ring 3 being the least privileged and ring 0 the most privileged (GeeksforGeeks, 2020). The kernel is responsible for bridging the gap between software and the hardware (Bigelow, 2022). Anything that runs in the kernel is critical to the operation of the system, and if a failure inside a driver is not handled correctly, a bug check (commonly referred to as a Blue Screen of Death, BSoD) is forced and produces a “dump” of the error for the system administrator to diagnose (DOMARS, 2024).
On the day of the CrowdStrike incident, an update was pushed out containing a logic error that the affected computers could not process, so a bug check was triggered and systems worldwide crashed (Mrwhosetheboss, 2024). In technical terms, the Falcon sensor was sent 21 inputs when it expected only 20, which led to an out-of-bounds read (Mudaliar, 2024). According to Mitre.org (2009), an out-of-bounds read occurs when a program attempts to access memory outside of its assigned memory buffer. This out-of-bounds read crashed the kernel driver, which triggered the bug check.
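The 21-versus-20 mismatch can be sketched in Python. Python raises an error rather than reading stray memory, so the sketch below instead shows the kind of bounds check that stops a read past the end of a buffer. `EXPECTED_INPUTS` and `read_field` are hypothetical names, and this is a simplified illustration under those assumptions, not CrowdStrike's implementation.

```python
from typing import List, Optional

EXPECTED_INPUTS = 20  # number of inputs the sensor expected, per the text


def read_field(values: List[str], index: int) -> Optional[str]:
    """Return values[index], or None when index falls outside the buffer."""
    if 0 <= index < len(values):
        return values[index]
    return None  # refuse the read instead of going past the end


# A buffer holding exactly the 20 expected inputs (indexes 0 to 19).
values = [f"input-{i}" for i in range(EXPECTED_INPUTS)]

last_valid = read_field(values, 19)    # the 20th input exists
twenty_first = read_field(values, 20)  # the 21st input does not
```

In a language without automatic bounds checking, which is typical for kernel drivers, the same read of a 21st value without such a check would touch memory outside the buffer.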
Mudaliar (2024), who read over the report, describes the measures CrowdStrike listed to stop the same problem from recurring. These include, but are not limited to, certificate pinning, which ensures that a request cannot be manipulated between the server and the client, and checksum validation, which ensures that files were not corrupted or tampered with in transit. CrowdStrike has also stated that it will rework and improve the content interpreter, adding checks that prevent additional fields from being sent and processed.
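Checksum validation of a downloaded file can be sketched as follows. The function name `checksum_matches` and the sample payload are hypothetical, and SHA-256 is an assumed choice of checksum; the source does not say which algorithm CrowdStrike uses.

```python
import hashlib
import hmac


def checksum_matches(payload: bytes, expected_sha256_hex: str) -> bool:
    """Return True when payload hashes to the checksum published for it."""
    actual = hashlib.sha256(payload).hexdigest()
    # compare_digest compares in constant time to avoid leaking where
    # the two strings first differ.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


# Made-up update contents and the checksum a vendor would publish for them.
payload = b"example update contents"
published = hashlib.sha256(payload).hexdigest()
```

A checksum like this only confirms that the file arrived unchanged. It does not show that the contents are safe to run, which is why testing and staging are still needed.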
Key Finding 3: CrowdStrike did not test before releases
In the same report, CrowdStrike lists the measures it will implement to prevent future catastrophic events like this one. One of them is number 6, summarised as “Template Instances should have staged deployment”. In this section, CrowdStrike explains that future deployments should be staged before release. CrowdStrike should face consequences from the countries that used its software, including the United States, because this shows negligence towards its anti-virus definitions and updates. Currently, Delta Air Lines, one of the companies affected by the incident, is pursuing legal action against CrowdStrike for negligence after losing over 500 million US dollars while its systems were offline (Shepardson, 2024a). CrowdStrike is trying to avoid responsibility by questioning why Delta suffered more than other airlines and claiming that Delta’s IT staff were at fault (Shepardson, 2024b). Delta is seeking the same amount of money that it lost during the incident. It is widely accepted practice that updates should be staged and tested before they are released to customers. However, CrowdStrike failed to follow this rule and, as we can see, this had major repercussions for the business's share price. Even an extremely basic computer or virtual machine setup could have prevented the incident entirely: the crash would have been visible in real time, and CrowdStrike could have diagnosed it before pushing out the update. If organisations and big corporations release updates without testing them first, they risk significant issues and repercussions for their negligence. These issues could include security problems, general bugs, and major vulnerabilities that could lead to company devices being compromised and used as part of a botnet.
According to Palo Alto Networks (n.d.), a botnet is a network of malware-infected computers under the control of a central server, which sends them instructions such as flooding websites with thousands of network requests to take them offline. Because of how botnets work, the infected computers could also be sent over-the-air updates that enable more sophisticated malicious behaviour. Fortunately, the CrowdStrike incident had nothing to do with malware or botnets and was just a bug in the kernel-level driver.
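The staged deployment idea discussed in this section can be sketched in Python. The function `staged_rollout`, the machine names and the health-check callback are all hypothetical; this is a minimal sketch of the general technique, not a description of CrowdStrike's release process.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    stages: Sequence[Sequence[str]],
    is_healthy_after_update: Callable[[str], bool],
) -> List[str]:
    """Send an update one stage at a time and stop at the first failure.

    Returns every machine the update reached before the rollout halted.
    """
    reached: List[str] = []
    for machines in stages:
        reached.extend(machines)
        # Check every machine in this stage before moving to the next one.
        if not all(is_healthy_after_update(m) for m in machines):
            break  # halt so later stages never receive the update
    return reached


# Hypothetical stages: an internal test machine, then a few canaries,
# then customers.
STAGES = [
    ["test-vm"],
    ["canary-1", "canary-2"],
    ["customer-1", "customer-2", "customer-3"],
]

# An update that crashes every machine is caught by the first stage.
faulty = staged_rollout(STAGES, lambda machine: False)
# A healthy update passes every stage and reaches all machines.
healthy = staged_rollout(STAGES, lambda machine: True)
```

With the faulty update, only the internal test machine is affected and no customer receives the update, which is the outcome the report argues a basic test setup would have achieved.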
Conclusion
All things considered, CrowdStrike and its Falcon sensor caused catastrophic damage to services around the world, taking offline systems that were responsible for emergency services, airlines and other critical business services. CrowdStrike showed incompetence and negligence during these events and gave out only generic PR responses instead of having its CEO, George Kurtz, apologise for what occurred on that Friday. As discussed in this report, millions of businesses rely on anti-viruses such as CrowdStrike to protect their confidential and critical services from viruses and malware. Unfortunately, some businesses, such as Delta Air Lines, relied on CrowdStrike too heavily for their critical systems and as a result were badly affected by the event, with their finances suffering significantly. Organisations still using CrowdStrike should consider moving to a more trusted anti-virus such as SentinelOne, which is similar to CrowdStrike’s Falcon. For its part, CrowdStrike should update its content interpreter to fix the issues exposed by the incident and to handle exceptions gracefully so that they do not trigger bug checks, which could prevent further catastrophic incidents. CrowdStrike should also test and stage its updates before releasing them to production environments. Its PR responses need to improve as well: the responses to this event were not clear, and the CEO linked to an article the company had written instead of speaking out about the event. Fortunately, CrowdStrike has taken some steps to prevent such an issue from occurring again, such as setting up machines to test updates before final production releases. This report and its findings should be used as a guide for other cybersecurity companies to consider and learn from.
Reference list:
6sense (n.d.). Crowdstrike - Market Share, Competitor Insights in Endpoint Protection. [online] 6sense.com. Available at: https://6sense.com/tech/endpoint-protection/crowdstrike-market-share.
Alspach, K. (2024). 5 Things To Know On Delta’s Lawsuit Against CrowdStrike. [online] Crn.com. Available at: https://www.crn.com/news/security/2024/5-things-to-know-on-delta-s-lawsuit-against-crowdstrike [Accessed 1 Nov. 2024].
Basan, M. (2024). 7 Best Antivirus for Any Size Business in 2024. [online] eSecurity Planet. Available at: https://www.esecurityplanet.com/products/antivirus-software/.
Bigelow, S. (2022). What is kernel? - Definition from WhatIs.com. [online] SearchDataCenter. Available at: https://www.techtarget.com/searchdatacenter/definition/kernel.
101 Blockchains (2020). Cryptographic Hashing: A Beginner’s Guide. [online] 101 Blockchains. Available at: https://101blockchains.com/cryptographic-hashing/.
CrowdStrike (2024). Falcon Content Update Remediation and Guidance Hub | CrowdStrike. [online] crowdstrike.com. Available at: https://www.crowdstrike.com/falcon-content-update-remediation-and-guidance-hub/.
Dell (n.d.). What is CrowdStrike? | Dell Australia. [online] www.dell.com. Available at: https://www.dell.com/support/kbdoc/en-au/000126839/what-is-crowdstrike.
DOMARS (2024). Bug Checks (Blue Screens) - Windows drivers. [online] Microsoft.com. Available at: https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/bug-checks--blue-screens- [Accessed 23 Sep. 2024].
GeeksforGeeks (2020). Protection Ring. [online] GeeksforGeeks. Available at: https://www.geeksforgeeks.org/protection-ring/.
Gartner (n.d.). Endpoint Security and Protection Software Reviews 2021 | Gartner Peer Insights. [online] Gartner. Available at: https://www.gartner.com/reviews/market/endpoint-protection-platforms.
Mitre.org (2009). CWE - CWE-125: Out-of-bounds Read (3.4.1). [online] Available at: https://cwe.mitre.org/data/definitions/125.html.
Mrwhosetheboss (2024). 13 Most EMBARRASSING Tech Fails of all time. [online] YouTube. Available at: https://www.youtube.com/watch?v=8Vk_2Kq9Njw [Accessed 20 Oct. 2024].
Mudaliar, A. (2024). CrowdStrike Outage Root Cause Analyzed. [online] Spiceworks Inc. Available at: https://www.spiceworks.com/it-security/network-security/news/crowdstrike-reveals-root-cause-analysis-global-outage/.
Yancey-Bragg, N. (2024). Day of chaos: How CrowdStrike outage disrupted 911 dispatches, hospitals, flights. [online] USA TODAY. Available at: https://www.usatoday.com/story/news/nation/2024/07/19/crowdstrike-outages-what-happened/74474725007/.
BBC News (2024). CrowdStrike boss apologises before US Congress for global IT outage. [online] Bbc.com. Available at: https://www.bbc.com/news/articles/c23k4yyjxp3o [Accessed 25 Sep. 2024].
RNZ News (2024). CrowdStrike probably didn’t test update behind global IT outage - experts. [online] RNZ. Available at: https://www.rnz.co.nz/news/world/522672/crowdstrike-probably-didn-t-test-update-behind-global-it-outage-experts.
Palo Alto Networks (n.d.). What is a Botnet? [online] Available at: https://www.paloaltonetworks.com.au/cyberpedia/what-is-botnet [Accessed 3 Nov. 2024].
Shepardson, D. (2024a). CrowdStrike, Delta sue each other over flight disruptions. [online] Reuters. Available at: https://www.reuters.com/legal/crowdstrike-delta-sue-each-other-over-flight-disruptions-2024-10-28/ [Accessed 1 Nov. 2024].
Shepardson, D. (2024b). Delta sues CrowdStrike over software update that prompted mass flight disruptions. [online] Reuters. Available at: https://www.reuters.com/legal/delta-sues-crowdstrike-over-software-update-that-prompted-mass-flight-2024-10-25/.
Wallen, J. (2022). 10 Best antivirus software for businesses in 2022. [online] TechRepublic. Available at: https://www.techrepublic.com/article/best-antivirus-software/.