
Data breaches and data leaks

All companies face the critical task of protecting their data from potential leaks and breaches. Failure to do so can result in fines, regulatory restrictions, reputational damage, and other significant risks.

Data leaks and breaches are not the same, though.

Data leaks vs data breaches: what’s the difference?

Data leaks occur unintentionally, exposing sensitive information due to human error or system flaws. In contrast, data breaches involve unauthorized access and theft of personal or corporate data, often initiated by attackers breaking into systems.

While the terms “data breach” and “data leak” are sometimes used interchangeably, the difference between them lies in intent and method: a leak is accidental, with data slipping out, while a breach is intentional, with an attacker breaking in. Attackers can exploit either one once the data is exposed.

Data leaks

What exactly is a data leak?

A data leak occurs when sensitive or confidential data accidentally slips out of its secure environment. Data leaks don’t necessarily involve hacking. Instead, they usually occur because of human error, system flaws, or misconfigured settings.

Here are some examples of data leaks:

  1. A public API that returns information about any user in the system without checking the caller’s permissions.
  2. A system that sends an email to one customer with the data of another customer.
  3. An AI model that outputs the data of another customer accidentally.
  4. A debug line in code that logs sensitive data to another system, enabling more employees than originally intended to access the sensitive data.

Data leaks have always been stealthy and hard to track. Normally, we find them in hindsight, by looking at the logs we write and finding sensitive customer data there. Worse, sometimes we learn the hard way, when it’s too late and the damage is done: a bad actor is either blackmailing us or publishing our customers’ sensitive data.

Unfortunately, web application firewalls can’t always stop data leaks. There’s simply no pattern or signature to look for: it’s all just legitimate API-based traffic.

Even when we do find leaks in logs, we still need to find the origin in the application, identify the relevant developer, issue a patch, and delete the leaked data.
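One way to reduce the chance of the debug-logging leak described above is to scrub sensitive values before a log record is written. The following is a minimal sketch using Python's standard logging module. The two patterns and the names redact and RedactingFilter are illustrative assumptions, not a complete catalog of sensitive data.

```python
import logging
import re

# Illustrative patterns only; a real deployment would tune these to its own data.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text: str) -> str:
    """Replace anything that looks like an email address or a US SSN."""
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    return SSN_RE.sub("[REDACTED_SSN]", text)


class RedactingFilter(logging.Filter):
    """Hypothetical filter that scrubs the formatted message of each record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = None  # arguments are already merged into msg
        return True  # keep the record; only its text changes


logger = logging.getLogger("app")
handler = logging.StreamHandler()
handler.addFilter(RedactingFilter())
logger.addHandler(handler)

# The handler writes the redacted text, not the raw email address and SSN.
logger.warning("password reset for %s, ssn %s", "jane@example.com", "123-45-6789")
```

Pattern-based redaction is a safety net rather than a guarantee: it only catches values that match the patterns, so it complements code review of what gets logged rather than replacing it.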

How to prevent data leaks

As the digital landscape expands, so does the risk of data breaches and leakage. Protecting sensitive information is paramount for any business. Here’s how you can improve your defenses and prevent data leakage:

  1. Access control and the least privilege principle. Implement access control mechanisms to restrict user access to sensitive data to the minimum needed to perform their jobs.
  2. Employee training. Make sure that employees are up-to-date on security protocols and best practices.
  3. Strong password security and MFA. Fortify the first line of defense with strong password policies and multifactor authentication (MFA).
  4. Encryption of sensitive data. Shield sensitive data at rest and in transit with robust encryption mechanisms. Choose application-level encryption for better protection whenever possible.
  5. Security audits and monitoring. Maintain vigilance through regular security audits and continuous monitoring for suspicious activity.
  6. Use DLP solutions. Implement data loss prevention (DLP) to prevent sensitive data exfiltration.
  7. Incident response and recovery. Prepare for data leaks with a plan for response, investigation, and recovery.

By adopting these measures, you can significantly enhance the security and privacy of your company’s data.
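The first measure, access control with least privilege, also addresses the leaky-API example from earlier. Here is a minimal sketch of an object-level permission check. The Caller type, the in-memory PROFILES store, and the "support" role are hypothetical and stand in for whatever identity and storage layers an application actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caller:
    user_id: str
    roles: frozenset


# Hypothetical in-memory store standing in for a real database.
PROFILES = {
    "u1": {"name": "Alice", "email": "alice@example.com"},
    "u2": {"name": "Bob", "email": "bob@example.com"},
}


class AccessDenied(Exception):
    """Raised when the caller has no right to read the requested profile."""


def get_profile(caller: Caller, target_user_id: str) -> dict:
    """Return a profile only to its owner or to a caller with the support role."""
    is_owner = caller.user_id == target_user_id
    if not (is_owner or "support" in caller.roles):
        raise AccessDenied(f"{caller.user_id} may not read {target_user_id}")
    profile = PROFILES[target_user_id]
    if is_owner:
        return dict(profile)  # owners see their full record
    # Least privilege: support staff get only the fields they need.
    return {"name": profile["name"]}
```

The check runs on every read and is tied to the specific record requested, so being authenticated is not enough to fetch another customer's data.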

Data breaches

What about a data breach?

A data breach is a security incident where unauthorized parties gain access to sensitive or confidential information, including personal or corporate data. A data breach begins with attackers breaking into the system and stealing the data.

In the sense used in this article, a data breach is the result of an intentional cyberattack, although not all cyberattacks lead to data breaches.

GDPR defines a personal data breach more broadly, covering accidental incidents as well:

‘Personal data breach’ means a breach of security leading to the accidental or unlawful destruction, loss, alteration, unauthorised disclosure of, or access to, personal data transmitted, stored or otherwise processed.

Are data breaches unavoidable?

We’ve seen a significant rise in data breaches over the past decade. According to the ITRC Annual Data Breach Report 2023, a total of 3,205 data compromise incidents were reported in 2023, a 78% increase over 2022 and a 273% increase over 2018. Most of the incidents reported in 2023 were classified as data breaches.

Data leaks and breaches can cost businesses millions of dollars in penalties and damage, putting an organization’s reputation and operations at risk.

The industry consensus is that data breaches are inevitable, for several reasons:

  • The complexity of today’s environment, the loss of the perimeter, and the human factor involved.
  • The sophistication of attackers who only need to find one vulnerability, such as only one employee account to take over.
  • The increase in organizations’ digital footprint and number of digital assets.

A breach is more a question of ‘when’ than ‘if.’ Proactively safeguarding your data is no longer optional but a must. Why wait for the fire to start before investing in a fire extinguisher?

Not all data needs equal protection; instead, you should identify and focus on your crown jewels, be it PII or other kinds of information that your company considers sensitive. Apply the principle of least privilege to all data access. Leverage techniques like masking, field-level encryption, and data tokenization to reduce your sensitive data footprint and limit the damage in case of a breach.
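To make two of these techniques concrete, here is a minimal sketch of masking and tokenization using only the Python standard library. The function mask_email and the class TokenVault are hypothetical names, and the in-memory dictionaries stand in for what would normally be a separate, hardened vault service. Field-level encryption is left out because it needs a vetted cryptography library.

```python
import secrets


def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}{'*' * max(len(local) - 1, 0)}@{domain}"


class TokenVault:
    """Hypothetical in-memory vault mapping random tokens to sensitive values."""

    def __init__(self) -> None:
        self._by_token: dict[str, str] = {}
        self._by_value: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        """Return a stable random token that reveals nothing about the value."""
        if value in self._by_value:
            return self._by_value[value]
        token = "tok_" + secrets.token_hex(16)
        self._by_token[token] = value
        self._by_value[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look up the original value; only the vault can do this."""
        return self._by_token[token]


vault = TokenVault()
token = vault.tokenize("123-45-6789")
print(mask_email("jane@example.com"))  # j***@example.com
print(vault.detokenize(token))         # 123-45-6789
```

Systems that store only the token hold nothing of value to an intruder; the sensitive footprint shrinks to the vault itself, which is where least-privilege access and monitoring can then be concentrated.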

Safeguarding your data before a breach happens is crucial. By implementing protective measures in advance, you can better protect your users’ data. Doing so also simplifies privacy compliance and reduces the scope of data that requires active protection.

Examples of data leaks and data breaches

Data leak: GitHub

In 2018, GitHub discovered that a bug in its password reset functionality had inadvertently stored user passwords in plain text within the company’s internal logs. The company reassured users that this security lapse exposed passwords only to a limited number of employees with access to these logs.

Typically, GitHub ensures password security using the bcrypt hashing algorithm. However, in this case, a bug led to logging passwords in plain text.

The incident came to light during a routine audit conducted by GitHub, dispelling any concerns of a malicious hack. This highlights the significance of regular security audits in identifying weaknesses in handling sensitive information before they escalate into more significant security incidents.

While the exact number of affected users remains unknown, GitHub’s transparency and swift response to the issue demonstrate its commitment to addressing security concerns promptly.

The GitHub plaintext password logging incident exemplifies that even the most vigilant organizations can face sensitive data leaks.

Data leak: OWASP

In March 2024, the Open Web Application Security Project (OWASP) disclosed a data leak that exposed around 1,000 member resumes submitted between 2006 and 2014.

OWASP, a non-profit dedicated to software security, discovered a misconfiguration on an older Wiki server that exposed member information. The leak included personal information such as names, email addresses, and phone numbers.

OWASP has mitigated the issue and notified impacted individuals as a precaution.

Data breach: Sisense

Sisense, a business intelligence company, experienced a massive data breach in April 2024. Attackers infiltrated Sisense’s systems and stole several terabytes of customer data, including email passwords, access tokens, and SSL certificates.

The breach originated with unauthorized access to Sisense’s code repository on GitLab. Once there, hackers gained credentials that granted them entry into Sisense’s Amazon S3 buckets containing sensitive information.

This incident triggered an investigation by the US Cybersecurity and Infrastructure Security Agency (CISA), given that Sisense customers include critical infrastructure sector organizations. The investigation found the stolen data wasn’t encrypted at rest, exposing a security gap.

Built-in storage encryption alone would not have helped much either. Once an attacker holds valid credentials to the S3 buckets, the storage service decrypts the data transparently for every authorized request. That transparency is convenient for the owner, but it serves an intruder equally well.

The exposure of such a large volume of sensitive data poses a risk not only to data security but also to the integrity and privacy of the affected customers, potentially leading to loss of trust and reputational damage for Sisense.

Data breach: AT&T

AT&T is facing legal action after a reported data breach exposed personal information of millions of customers.

Filed by Morgan & Morgan on behalf of Patricia Dean and other plaintiffs, the class-action lawsuit alleges that AT&T failed to properly safeguard customer data, resulting in a breach affecting approximately 73 million current and former AT&T customers. The lawsuit contends that compromised sensitive information included names, addresses, social security numbers, and email addresses.

The lawsuit centers on a data breach first reported in 2021. AT&T initially denied the breach, but further investigation confirmed its authenticity in March 2024. The lawsuit contends that AT&T’s delayed response and potential security lapses significantly increased the risk of identity theft and fraud for affected customers.

Plaintiffs seek compensation for damages incurred as a result of the breach, along with court-ordered improvements to AT&T’s data security protocols and credit monitoring services for impacted customers.

Data breach: EquiLend

EquiLend, a financial services company, recently informed customers about a data breach linked to a ransomware attack in January 2024.

The company initially reported a technical issue, then confirmed a ransomware attack that disrupted part of its services for almost two weeks.

The data breach resulting from the attack was significant, exposing sensitive information like names, birth dates, and social security numbers of employees. However, EquiLend says there’s no evidence of malicious access to or theft of financial transaction data related to the company’s clients.

Data breach: Infosys McCamish

Over 57,000 social security and account numbers, along with other types of sensitive customer information, have been compromised in a recent data breach targeting Infosys McCamish, a vendor associated with Bank of America.

Affected customers were informed 90 days after discovery, exceeding the maximum notification period of 30 days allowed in some countries. This delay gave threat actors more time to cause damage.

This incident highlights the finance sector’s growing vulnerability to digital threats and the importance of collective efforts in data protection, signaling a critical need for proactive security measures beyond standard compliance.

Data breach: 23andMe

In October 2023, 23andMe, a genetic testing company, disclosed a data breach impacting the information of 6.9 million users.

Hackers used a common technique called credential stuffing to steal user data. The attack involved automatically trying to log in as users whose passwords had previously been leaked and published online.
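Credential stuffing tends to show up as one source failing logins for many different usernames in a short time. The sketch below shows one possible detection heuristic. It is not a description of 23andMe's systems, and the class name StuffingDetector and its thresholds are illustrative assumptions.

```python
from collections import defaultdict, deque


class StuffingDetector:
    """Flags a source IP that fails logins for many distinct usernames in a short window."""

    def __init__(self, max_usernames: int = 5, window_seconds: float = 60.0) -> None:
        self.max_usernames = max_usernames
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # ip -> deque of (timestamp, username)

    def record_failure(self, ip: str, username: str, now: float) -> bool:
        """Record a failed login; return True if the IP now looks suspicious."""
        events = self._failures[ip]
        events.append((now, username))
        # Drop events that have fallen out of the sliding window.
        while events and now - events[0][0] > self.window_seconds:
            events.popleft()
        distinct = {name for _, name in events}
        return len(distinct) > self.max_usernames
```

A caller would invoke record_failure from the login handler with the current time, for example time.monotonic(), and throttle or challenge the source once it returns True. Attackers who spread attempts across many IP addresses evade this particular check, which is one reason MFA remains the stronger control.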

The impact extended beyond individual accounts. Users who had opted into 23andMe’s DNA Relatives feature share some of their information with matched relatives, so each compromised account also exposed data about many other users, amplifying the scope of the breach.

A class-action lawsuit against 23andMe was filed, and as of September 2024, a proposed settlement agreement required the company to pay $30 million to customers affected by the data breach.

Data breach: British Airways

In the summer of 2018, British Airways (BA) experienced a massive data breach in which the personal and financial details of approximately 500,000 customers were compromised. The breach occurred due to poor security practices and a failure to adequately protect customer data.

The Information Commissioner’s Office (ICO) found that British Airways had not implemented sufficient security measures to protect customer information. More significantly, the company was found to be storing an excessive amount of customer data beyond what was necessary for transactions, violating the principle of data minimization under GDPR.

British Airways was fortunate to face a reduced fine of £20 million, down from the staggering £183 million originally proposed, as the penalty was mitigated in part due to the impact of the COVID-19 pandemic on the aviation industry.

Additionally, British Airways paid an undisclosed amount to customers involved in a group-action legal claim, which was settled out of court.