Detecting Domain Generation Algorithms (DGAs): A Healthcare Perspective

Protecting Healthcare from DGAs: A Deep Dive into Detection Strategies

Reut Ginat Eliash, Data Scientist

Mar 13, 2025


In the world of cybersecurity, few challenges are as dynamic and elusive as detecting Domain Generation Algorithms (DGAs). For organizations like Cynerio, operating in the healthcare sector, the stakes are especially high. Protecting sensitive patient data and ensuring uninterrupted operations is critical, and DGAs represent a potent tool in an attacker’s arsenal.

What Are DGAs?

DGAs are algorithms designed to generate large numbers of domain names in a systematic way. Malware uses this technique to generate domains to communicate with command-and-control (C2) servers, ensuring persistence even if some domains are detected and blocked. This adaptability makes them a formidable challenge for defense systems.

The DGA technique makes malware more resilient and harder to detect. While specific domains can be detected and blocked, blocking every domain that a DGA can generate is infeasible, so traditional tools such as DNS filtering or firewall blocklists become ineffective.
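To make the blocklist problem concrete, here is a deliberately simplified, toy generator. It is not the algorithm of any real malware family; the seed string, the hashing scheme, and the reserved `.example` suffix are all assumptions chosen for illustration. It only shows that two parties sharing a seed and a date can derive the same fresh list of names every day, without any of those names being known in advance.

```python
import hashlib
from datetime import date


def toy_dga(seed: str, day: date, count: int = 5, length: int = 12) -> list[str]:
    """Derive `count` pseudo-random domain names from a shared seed and a date.

    Toy illustration only: real DGAs differ in hashing, alphabet, and TLD choice.
    """
    domains = []
    for i in range(count):
        material = f"{seed}-{day.isoformat()}-{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map each hex digit (0-15) to a letter a-p so the label looks like a hostname.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:length])
        domains.append(label + ".example")
    return domains


if __name__ == "__main__":
    for name in toy_dga("demo-seed", date(2025, 3, 13)):
        print(name)
```

Because the output changes whenever the date changes, a blocklist built from yesterday's names says nothing about today's, which is why detection tends to focus on how the names look and behave rather than on enumerating them.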

Why DGAs Matter in Healthcare

Healthcare networks are a prime target for cyberattacks due to their high-value data and critical role in patient care. In 2024, the team observed a significant increase in ransomware attacks targeted at healthcare. According to the TRM Labs Report (August 2024), 30% of ransomware attacks in the United States targeted healthcare organizations.

A notable example of DGA-enabled attacks in healthcare comes from the LockBit ransomware group, which targeted multiple healthcare facilities in 2024.

LockBit is notorious for using DGAs to dynamically create numerous domain names for its command-and-control servers. This technique allowed the ransomware to evade detection, even as defenders worked to block malicious domains. According to a Reuters report (Reuters, 2024), LockBit and its affiliates extorted at least $500 million in payments from victims and caused significant additional costs from lost revenue, incident response, and recovery.

With hospital security teams stretched thin and understaffed, a DGA-enabled attack could easily fly under the radar, compromising sensitive information or disrupting essential services. This risk drives the team’s commitment to developing robust defenses, including leveraging machine learning to detect and mitigate DGAs.

Developing a Machine Learning Model for Detection

To address this challenge, the CynerioLive team developed a machine learning model using XGBoost, trained on a dataset of tagged domains. Feature engineering began with extracting the second-level domain (SLD) from each domain name to focus on its core component. The team then added an entropy calculation to measure the randomness of each name, since high randomness is a common trait of many DGAs. Lastly, ordinal encoding was applied to transform categorical data into a numerical format suitable for the model. This approach combines domain structure analysis with machine learning techniques.
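The three feature steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the team's actual pipeline: the post does not say how the SLD was extracted or which categorical data was ordinally encoded, so the naive "second label from the right" rule, the per-character encoding, the fixed width of 20, and all function names here are hypothetical.

```python
import math
from collections import Counter

# Assumed alphabet for per-character ordinal encoding; 0 is reserved for
# padding and for any character outside the alphabet.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-"
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def extract_sld(domain: str) -> str:
    """Naive SLD extraction: the label immediately left of the last label.

    Assumption: single-label suffixes such as .com. Multi-label suffixes
    (for example .co.uk) would need a public suffix list.
    """
    labels = domain.lower().strip(".").split(".")
    return labels[-2] if len(labels) >= 2 else labels[0]


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in Counter(text).values())


def ordinal_encode(text: str, width: int = 20) -> list[int]:
    """Map characters to integer codes, truncated or zero-padded to `width`."""
    codes = [CHAR_TO_ID.get(ch, 0) for ch in text[:width]]
    return codes + [0] * (width - len(codes))


def featurize(domain: str) -> list[float]:
    """Feature vector: SLD length, SLD entropy, then the encoded characters."""
    sld = extract_sld(domain)
    return [float(len(sld)), shannon_entropy(sld)] + [float(c) for c in ordinal_encode(sld)]
```

Vectors like these could then be stacked into a matrix and passed to a gradient-boosted classifier such as `xgboost.XGBClassifier`; the hyperparameters and the exact feature set the team used are not given in the post.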

The model performed exceptionally well on the training set, achieving 95% accuracy. However, as with any machine learning model, the true test lies in its ability to generalize to unseen data.

A Surprising Drop in Performance

When the team tested the model on a new dataset, the results were unexpected: accuracy dropped to 63%. Curious and concerned, the team dug deeper into the data. The finding was eye-opening: the validation dataset contained eight different types of DGAs, and the model’s performance varied drastically across them.

Here’s how the model fared against different DGAs in the validation dataset:

This revealed a critical insight: one size doesn’t fit all when it comes to DGA detection. The model excelled at detecting some DGAs, like Zeus and Ramdo, but struggled significantly with others, such as Matsnu and Rovnix. These discrepancies highlighted the importance of training on a diverse dataset that represents the wide variety of DGAs in use today.
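A per-family breakdown of this kind is straightforward to compute once each validation record carries a family tag. The sketch below assumes records of the form (family, true label, predicted label); that record layout and the function name are hypothetical, and no numbers from the team's evaluation are reproduced here.

```python
from collections import defaultdict


def accuracy_by_family(records):
    """Compute accuracy separately for each DGA family.

    `records` is an iterable of (family, true_label, predicted_label) tuples.
    Returns a dict mapping family name to the fraction of correct predictions.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for family, y_true, y_pred in records:
        totals[family] += 1
        hits[family] += int(y_true == y_pred)
    return {family: hits[family] / totals[family] for family in totals}


if __name__ == "__main__":
    # Placeholder records, invented purely to show the call shape.
    sample = [
        ("family_a", 1, 1),
        ("family_a", 1, 1),
        ("family_b", 1, 0),
        ("family_b", 1, 1),
    ]
    for family, acc in sorted(accuracy_by_family(sample).items()):
        print(f"{family}: {acc:.2f}")
```

Reporting accuracy this way, rather than as a single aggregate figure, is what exposes a model that does well on some families while missing others almost entirely.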

Key Lessons Learned

The findings emphasized the nuanced nature of DGA detection. Not all DGAs behave the same; different algorithms generate domains in different ways. For instance, the patterns created by Rovnix and Matsnu differ significantly from those of Zeus or Ramdo, primarily because Rovnix and Matsnu use wordlist-based generation, which produces more human-like domain names that are harder to detect. This variability underscores the need to account for the diversity of DGA behaviors in any detection strategy. It also shows how effective changing the generation algorithm can be for evading detection. It’s critical to stay up to date on emerging threats and to build detection mechanisms that are as resilient as possible to these evasion attempts.
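One way to see why wordlist-based names slip past a randomness feature is to measure how much of a name is made up of dictionary words. The sketch below is an illustrative feature only, not something the post says the team used; the tiny word set, the minimum word length of 3, and the function name are assumptions.

```python
def dictionary_coverage(sld: str, words: set[str], min_len: int = 3) -> float:
    """Fraction of characters in `sld` covered by non-overlapping dictionary words.

    Uses dynamic programming: best[i] is the maximum number of characters
    in sld[:i] that can be covered by words from `words`.
    """
    n = len(sld)
    if n == 0:
        return 0.0
    best = [0] * (n + 1)
    for end in range(1, n + 1):
        # Option 1: leave the character at position end-1 uncovered.
        best[end] = best[end - 1]
        # Option 2: end a dictionary word of at least min_len characters here.
        for start in range(0, end - min_len + 1):
            if sld[start:end] in words:
                best[end] = max(best[end], best[start] + (end - start))
    return best[n] / n


if __name__ == "__main__":
    # Hypothetical miniature word list; a real one would hold thousands of words.
    sample_words = {"stone", "river", "paper", "cloud"}
    for name in ("stoneriver", "xkqzvbnmwp"):
        print(name, dictionary_coverage(name, sample_words))
```

A name built from concatenated words scores high on coverage while looking unremarkable on character entropy, so a feature like this could complement entropy rather than replace it.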

Looking Ahead

The findings highlight several areas for future investigation. One key focus is the overall patterns of DGA activity. For example, when examining DGA-generated domains, it has been observed that many fail to resolve, but a small subset often does. These successfully resolving domains may represent the most critical targets for investigation and blocking. Understanding these patterns, potentially through AI-driven analytics and anomaly detection, could significantly enhance the ability to identify and mitigate threats.
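The resolution pattern described above, many failed lookups and a few successes, can be summarized per client from DNS logs. The following is a rough sketch under assumptions: the log is taken to be a list of (client, domain, resolved) tuples, and the 0.8 failure-ratio threshold is an arbitrary illustrative value, not a recommendation from the post.

```python
from collections import defaultdict


def resolution_summary(dns_log):
    """Summarize DNS outcomes per client.

    `dns_log` is an iterable of (client, domain, resolved) tuples, where
    `resolved` is True if the lookup returned an address.
    """
    queried = defaultdict(set)
    resolved_ok = defaultdict(set)
    for client, domain, resolved in dns_log:
        queried[client].add(domain)
        if resolved:
            resolved_ok[client].add(domain)
    summary = {}
    for client, domains in queried.items():
        total = len(domains)
        failed = total - len(resolved_ok[client])
        summary[client] = {
            "failure_ratio": failed / total,
            "resolved": sorted(resolved_ok[client]),
        }
    return summary


def suspicious_clients(summary, min_ratio: float = 0.8):
    """Clients whose lookups mostly fail, mapped to the few domains that resolved."""
    return {
        client: stats["resolved"]
        for client, stats in summary.items()
        if stats["failure_ratio"] >= min_ratio and stats["resolved"]
    }
```

The domains returned by `suspicious_clients` are the small resolving subset, which under this framing would be the first candidates for investigation and blocking.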

Another priority is preparing for the inevitable emergence of new DGAs. Attackers are constantly innovating, and the next DGA might use techniques that evade current detection methods. Building flexible, adaptive defenses is crucial to staying ahead.

Finally, this work has reinforced the need for continuous learning. By incorporating diverse datasets, applying AI-enhanced feature engineering, and exploring advanced features, models can be refined to better account for the wide variety of DGAs in the wild.

Conclusion

Detecting DGAs is a complex, evolving challenge, particularly in high-stakes industries like healthcare. This journey so far has underscored the importance of understanding the diverse behaviors of different DGAs, training on heterogeneous datasets, and continuously adapting to new threats.
