We benchmarked DAST products, and this is what we learned

When we started, we wanted a way to validate the quality of Escape's scanner findings and to benchmark them. Dynamic application security testing (DAST) solutions are notorious for struggling with complex vulnerabilities, business logic vulnerabilities in particular, among other deficiencies. Even though we believed we could do much better, we needed a way to prove it.

At Escape, we developed a new AI-powered algorithm that can test the business logic of modern APIs at scale. So we benchmarked it on well-known vulnerable applications built with different technologies (REST and GraphQL) against ZAP (the long-standing open-source leader) and StackHawk.

For this benchmark, we are focusing on vulnerabilities found for two vulnerable applications - VAmPI (Vulnerable REST API with OWASP Top 10 vulnerabilities for security testing) and DVGA (Damn Vulnerable GraphQL Application). The goal was to compare the results for different API types. In addition, we compared such variables as the number of requests, setup time, scan duration, and coverage.

We will go through them all in detail below.

As you go through this benchmark, you are welcome to find out more about Escape's testing platform, or try it by yourself to compare it with your DAST scanner and create your own benchmark. If you are interested in learning how you can manage API security at scale, enriched with a contextual understanding of business logic, you can request a demo for Escape's API discovery and security platform.

Landscape

We will review these solutions for two well-known vulnerable applications: VAmPI (Escape vs. ZAP) for REST API and DVGA (Escape vs. ZAP vs. StackHawk) for GraphQL API.

We will review these solutions under 5 sections in this benchmark:

  • Vulnerabilities found: This section assesses the effectiveness of each DAST tool in identifying known vulnerabilities within the target vulnerable applications. The goal is to determine which solution is more thorough and accurate in identifying security weaknesses.
  • Number of requests: This section evaluates the efficiency of each DAST solution in terms of the number of requests made to the target applications during the scanning process. A lower number of requests does not necessarily mean a less thorough scan: it can mean that the scanner sends correct requests sooner, retrieves data, reuses it, and then injects payloads. The number of requests reflects the optimization and effectiveness of the scanning algorithms employed by each solution.
  • Setup time (time to value): This section measures the amount of time and effort required to set up each solution for scanning the target applications. It includes tasks such as configuring the scanning parameters, defining authentication mechanisms, and preparing the environment for testing.
  • Scan duration: This section examines the time each DAST solution takes to complete the scanning process for the target applications. A shorter scan duration minimizes the time required for security testing; however, it should not compromise the thoroughness and accuracy of the testing process.
  • Coverage: This section evaluates the extent to which each solution scans and tests all API routes, ensuring comprehensive coverage of the application's attack surface. 

VAmPI: REST API Showdown

VAmPI is probably the best-known vulnerable REST API; it includes OWASP Top 10 vulnerabilities for security testing.

Since StackHawk is built upon ZAP for REST APIs, we only compare Escape with ZAP in this section.

Here's the detailed head-to-head comparison of Escape and ZAP's VAmPI scanning:

ZAP vs Escape comparison on VAmPI
🗒️
Note: We tried the Access Control plugin from the ZAP marketplace, but it broke the scan results, so we had to remove it.

Vulnerabilities found

regexDOS (Regular Expression Denial of Service): No clear winner.

Main dangers: Regular Expression Denial of Service (regexDOS) attacks occur when a regular expression used in an application's input validation process is crafted in such a way that it takes an excessive amount of time to evaluate certain inputs, leading to a denial-of-service (DoS) condition. The main danger of not being able to identify regexDOS vulnerabilities is the potential for attackers to exploit them to disrupt the availability of the application, causing downtime and impacting user experience.

Challenges in identification: regexDOS vulnerabilities are challenging to identify due to the nature of regular expressions and the complexity of crafting inputs that trigger excessive evaluation times. These vulnerabilities often require deep analysis of the application's input validation logic and the behavior of regular expressions under various input scenarios. Additionally, regexDOS attacks may not manifest with typical inputs, making them difficult to detect using automated scanning tools. As a result, custom configurations or manual inspection may be necessary to uncover and mitigate regexDOS vulnerabilities effectively.
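To make the mechanism concrete, here is a minimal Python sketch of catastrophic backtracking. The pattern `^(a+)+$` and the helper `time_match` are illustrative assumptions, not the expression VAmPI actually uses. The point is only that a nested quantifier makes matching time grow roughly exponentially on a near-miss input, while an equivalent pattern without nesting stays fast.

```python
import re
import time

# Illustrative patterns (assumptions, not taken from VAmPI).
VULNERABLE = re.compile(r"^(a+)+$")  # nested quantifier: backtracks exponentially
SAFE = re.compile(r"^a+$")           # accepts the same strings without nesting


def time_match(pattern, text):
    """Return (matched, seconds) for a single match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # A run of "a" followed by "!" can never match, so the vulnerable
    # pattern tries every way of splitting the run before giving up.
    for n in (12, 16, 20):
        payload = "a" * n + "!"
        _, slow = time_match(VULNERABLE, payload)
        _, fast = time_match(SAFE, payload)
        print(f"n={n}: vulnerable={slow:.4f}s safe={fast:.6f}s")
```

Each extra character roughly doubles the work for the vulnerable pattern, which is why ordinary inputs rarely reveal the problem and a scanner has to craft near-miss payloads on purpose.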

Mass Assignment: Escape: yes, ZAP: no.

Main dangers: Mass Assignment vulnerabilities occur when an application allows client-supplied data to directly modify the internal state of an object, potentially leading to unauthorized access or modification of sensitive data or system resources. The main danger of not identifying Mass Assignment vulnerabilities is the risk of unauthorized data manipulation or privilege escalation, which can compromise the confidentiality, integrity, and availability of the application and its data.
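As a hedged illustration of the pattern, the sketch below contrasts a handler that copies every client-supplied field onto an object with one that filters through an allow-list. The `User` class, the `admin` flag, and both function names are hypothetical; they do not reproduce VAmPI's code.

```python
from dataclasses import dataclass


@dataclass
class User:
    username: str
    email: str
    admin: bool = False  # internal field the client should never control


# Hypothetical allow-list of fields a client may set at registration.
ALLOWED_FIELDS = {"username", "email"}


def register_unsafe(payload: dict) -> User:
    """Vulnerable: every key in the request body is bound to the object."""
    return User(**payload)


def register_safe(payload: dict) -> User:
    """Safer: only allow-listed keys reach the object."""
    filtered = {key: value for key, value in payload.items() if key in ALLOWED_FIELDS}
    return User(**filtered)
```

With a body such as `{"username": "eve", "email": "eve@example.com", "admin": True}`, the unsafe version creates an administrator, while the safe version silently drops the extra field.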

Lack of Resources & Rate Limiting: Escape: yes, ZAP: no.

Main dangers: Without proper resource management and rate-limiting mechanisms, applications are vulnerable to abuse by malicious actors who can perform activities such as brute-force attacks, scraping, or denial-of-service attacks. The main dangers associated with not identifying these vulnerabilities include resource exhaustion, degraded performance, and increased susceptibility to abuse, leading to service disruptions and potential financial losses.

User and Password Enumeration: Escape: yes, ZAP: no.

Main dangers: User and Password Enumeration vulnerabilities allow attackers to enumerate valid usernames or passwords through techniques such as error messages, timing discrepancies, or differences in server responses. The main danger is that attackers can use this information to launch targeted attacks, such as brute-force attacks or credential stuffing, to gain unauthorized access to user accounts or sensitive information.
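The response-difference variant of this issue can be sketched in a few lines. Everything here is hypothetical (the in-memory `USERS` store, the messages, and the function names); the sketch only compares status codes and messages and does not cover timing discrepancies.

```python
USERS = {"alice": "correct-horse"}  # hypothetical user store


def login_leaky(username, password):
    """Vulnerable: the error reveals whether the username exists."""
    if username not in USERS:
        return 404, "Username does not exist"
    if USERS[username] != password:
        return 401, "Password is not correct"
    return 200, "OK"


def login_uniform(username, password):
    """Safer: one indistinguishable response for every failed login."""
    if USERS.get(username) != password:
        return 401, "Invalid username or password"
    return 200, "OK"


def allows_enumeration(login, known_user, unknown_user):
    """True if a failed login looks different for an existing and a missing user."""
    wrong_password = "definitely-wrong"
    return login(known_user, wrong_password) != login(unknown_user, wrong_password)
```

A scanner can apply the same comparison from the outside: if two failed logins are distinguishable, an attacker can sort candidate usernames into valid and invalid ones.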

Excessive Data Exposure Through Debug Endpoint: Escape: yes, ZAP: no.

Main dangers: Exposing sensitive information or debug endpoints in a production environment can lead to unauthorized access or disclosure of sensitive data, such as credentials, session tokens, or internal system details. The main danger is that attackers can exploit this information to gain unauthorized access, escalate privileges, or launch further attacks against the application or its users, potentially leading to data breaches or service disruptions.

SQL Injection (SQLi): Escape: yes, ZAP: yes.

Main dangers: SQL Injection vulnerabilities allow attackers to manipulate SQL queries executed by the application's database, potentially leading to unauthorized access, data manipulation, or even data loss. The main danger is the risk of attackers gaining unauthorized access to sensitive data, modifying database contents, or executing arbitrary commands on the underlying database server, which can result in data breaches, data corruption, or compromise of the entire application.
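The difference between an injectable query and a parameterized one is easiest to see side by side. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table layout and function names are assumptions made for the example, not VAmPI's actual schema.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with two example users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "alice@example.com"), ("bob", "bob@example.com")],
    )
    return conn


def find_user_unsafe(conn, username):
    """Vulnerable: user input is pasted into the SQL text."""
    query = f"SELECT username, email FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, username):
    """Safer: the driver binds the value, so it is never parsed as SQL."""
    query = "SELECT username, email FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()
```

Passing `' OR '1'='1` as the username makes the unsafe function return every row, whereas the parameterized version looks for a user literally named `' OR '1'='1` and finds none.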

Unauthorized Password Change: Escape: needs custom configurations, ZAP: no.

Main dangers: Unauthorized Password Change vulnerabilities allow attackers to change the passwords of user accounts without proper authentication or authorization, leading to account takeover or unauthorized access. The main danger is that attackers can gain control over user accounts, impersonate legitimate users, and access sensitive information or perform malicious actions on behalf of the compromised accounts, posing a significant security and privacy risk to the affected users and the application.

Broken Object Level Authorization: Escape: needs custom configurations, ZAP: no.

Main dangers: Broken Object Level Authorization vulnerabilities occur when an application fails to properly enforce access controls, allowing unauthorized users to access or manipulate sensitive objects or resources. The main danger is that attackers can exploit these vulnerabilities to gain unauthorized access to sensitive data or functionality, bypassing intended access controls and potentially causing data breaches, unauthorized transactions, or other security incidents that compromise the confidentiality, integrity, and availability of the application and its data.

The main advantage of Escape across all the vulnerabilities it found is its detailed explanation of how to fix each one. For example, here is the suggested fix for the injection vulnerability in VAmPI:

Injection fix adapted to Flask

Number of requests

Escape: 860, ZAP: 3546
Escape required significantly fewer requests (860) than ZAP (3546). This difference matters because it reduces the impact on system resources and network bandwidth: with fewer requests, the scan puts less load on the target application and its server. It indicates that Escape's DAST scan sends correct requests sooner, retrieves data, reuses it, and then injects payloads. This efficiency not only speeds up the scanning process but also keeps the application running smoothly during testing, minimizing disruption to regular operations.
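For readers who want to put a number on the gap, the arithmetic is simple. The helper names below are our own; the request counts are the ones reported in this section.

```python
ZAP_REQUESTS = 3546
ESCAPE_REQUESTS = 860


def request_ratio(baseline, candidate):
    """How many times more requests the baseline sent than the candidate."""
    return baseline / candidate


def reduction_percent(baseline, candidate):
    """Percentage of requests saved by the candidate relative to the baseline."""
    return (1 - candidate / baseline) * 100


if __name__ == "__main__":
    print(f"ratio: {request_ratio(ZAP_REQUESTS, ESCAPE_REQUESTS):.2f}x")
    print(f"reduction: {reduction_percent(ZAP_REQUESTS, ESCAPE_REQUESTS):.1f}%")
```

On these figures, ZAP sent roughly four times as many requests as Escape, which is a reduction of about three quarters.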

Setup time

Escape: about 1.5 min, ZAP: 1 min

ZAP's GUI felt user-friendly and easy to use, and the ability to save multiple scan profiles added convenience to our testing process. Escape's setup took slightly longer, approximately 1.5 minutes compared to ZAP's 1 minute, but overall the two are very close. Escape's setup process remained efficient and user-friendly, allowing users to configure and deploy the scanning environment quickly.

Scan duration

Escape: 4 mins, ZAP: 4 mins

The scan was run on small pods with 1 CPU and 2GB of RAM. Escape completed the scan in 4 minutes, providing thorough testing within a reasonable timeframe. ZAP also completed the scan in approximately 4 minutes. Both scan durations are reasonable and ensure that security testing does not unduly prolong the development lifecycle or delay the release of software updates.

It is crucial to consider scan duration, coverage, and vulnerabilities together, as scan duration alone has no value. Therefore, we strongly recommend considering these three data points simultaneously.

Escape's scan took about 20 seconds longer than ZAP's, which makes no significant difference in practice. The extra time allows for a more thorough examination of potential vulnerabilities, resulting in higher confidence in the application's security posture.

Coverage

Escape: 76%, ZAP: 1.90%

Escape demonstrated superior coverage, scanning 76% of the target application, compared to ZAP's mere 1.90%. This extensive coverage is crucial for in-depth security testing and mitigation strategies. Scanning a larger portion of the application allows Escape to identify a wider range of vulnerabilities and security weaknesses, including those in less frequently accessed areas or functionalities. This comprehensive assessment helps organizations prioritize and address the most critical security risks, ultimately enhancing the application's overall security and reducing the likelihood of successful attacks.

DVGA: GraphQL API Showdown

For DVGA (Damn Vulnerable GraphQL Application), we compared results across three DAST scanners: ZAP, Escape, and StackHawk.

To run a scan with ZAP, we also tried the GraphQL support plugin on the marketplace. This add-on allows you to import GraphQL definitions and send queries generated from them.

Unfortunately, although the GraphQL plugin managed to import the schema, the queries it generated were poorly formed and used random arguments: only 62 returned status code 200, and 60 of those were the same query on the home page.

ZAP vs Escape vs StackHawk comparison on DVGA

Vulnerabilities found

GraphQL JWT Token Forgery: Escape: No, ZAP: No, StackHawk: No

Main dangers: This vulnerability could lead to unauthorized access if attackers are able to forge or manipulate JSON Web Tokens (JWTs) used for authentication in GraphQL requests. Without proper token validation, attackers could gain unauthorized access to sensitive data or functionality within the application.

OS Command Injection: Escape: Yes, ZAP: No, StackHawk: Yes

Main dangers: OS Command Injection vulnerabilities pose a significant risk of remote code execution, allowing attackers to execute arbitrary commands on the underlying operating system. This can lead to data breaches, system compromise, and potentially full control over the server hosting the GraphQL API.

SQL Injection: Escape: Yes, ZAP: No, StackHawk: Yes

Main dangers:
SQL Injection vulnerabilities enable attackers to manipulate SQL queries executed by the GraphQL API, potentially leading to unauthorized access, data manipulation, or even data loss. This can result in data breaches, data corruption, or compromise of the entire application's data.

Circular Fragment Attack: Escape: Yes, ZAP: No, StackHawk: No

Main dangers:
Circular Fragment vulnerabilities can result in excessive server-side processing, potentially leading to denial-of-service (DoS) conditions if attackers exploit them to create recursive or infinitely nested GraphQL queries. This can disrupt the availability of the application, leading to downtime and impacting user experience.

Deep Recursion Query Attack: Escape: Yes, ZAP: No, StackHawk: No

Main dangers:
Deep Recursion Query attacks can similarly lead to denial-of-service conditions by consuming excessive server resources with deeply nested GraphQL queries. This can degrade the performance of the application and potentially render it unresponsive to legitimate user requests.
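A common mitigation for deeply nested queries is a depth limit enforced before execution. The sketch below is a deliberately simplified approximation written for this article: it counts brace nesting in the raw query text, skips braces inside string literals, does not expand fragments, and also counts braces of input objects in arguments. A production check would walk the parsed query instead. The function names and the default limit are assumptions.

```python
def max_selection_depth(query: str) -> int:
    """Approximate nesting depth by tracking curly braces outside strings."""
    depth = 0
    deepest = 0
    in_string = False
    previous = ""
    for char in query:
        if char == '"' and previous != "\\":
            in_string = not in_string  # toggle on unescaped quotes
        elif not in_string:
            if char == "{":
                depth += 1
                deepest = max(deepest, depth)
            elif char == "}":
                depth -= 1
        previous = char
    return deepest


def exceeds_depth_limit(query: str, limit: int = 5) -> bool:
    """True if the query nests deeper than the (hypothetical) limit."""
    return max_selection_depth(query) > limit
```

For example, a query that keeps alternating between two mutually referencing types, such as `{ pastes { owner { pastes { owner { pastes { owner { name } } } } } } }`, nests seven levels deep and would be rejected with the default limit of five.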

GraphQL Interface Exploit: Escape: Yes, ZAP: No, StackHawk: No

Main dangers:
GraphQL Interface vulnerabilities may allow attackers to access or manipulate sensitive data or functionality by exploiting flaws in the interface definitions. This can lead to unauthorized access, data breaches, or other security incidents compromising the confidentiality, integrity, and availability of the application.

Server-Side Request Forgery: Escape: Yes, ZAP: No, StackHawk: No

Main dangers:
Server-side Request Forgery vulnerabilities enable attackers to send arbitrary requests from the server hosting the GraphQL API, potentially leading to unauthorized access to internal resources or services. This can facilitate further attacks, such as accessing sensitive data or launching attacks against other systems within the network.

Detecting GraphQL Framework: Escape: Yes, ZAP: No, StackHawk: No

Main dangers:
The inability to detect GraphQL can hinder security monitoring and incident response efforts, making it difficult to identify and mitigate attacks targeting the GraphQL API. This can increase the risk of successful attacks going undetected, allowing attackers to maintain persistence and continue exploiting vulnerabilities over time.

GraphQL Fingerprinting: Escape: Yes, ZAP: No, StackHawk: No

Main dangers:
By fingerprinting the GraphQL engine, attackers can gain insights into the structure and capabilities of the API, aiding in the identification and exploitation of vulnerabilities. This can facilitate targeted attacks, leading to unauthorized access, data breaches, or other security incidents.

Batch Query Attack: Escape: Yes, ZAP: No, StackHawk: No

Main dangers:
A Batch Query Attack involves sending a large number of GraphQL queries at once, typically batched into a single HTTP request, or in quick succession. This can overwhelm the server, leading to denial-of-service (DoS) conditions or performance degradation. Attackers may exploit this vulnerability to disrupt the availability of the GraphQL API, causing downtime and impacting user experience.
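One simple defensive check is to cap how many operations a single request body may carry. The sketch below assumes the common convention in which a batched request is a JSON array of operations and a normal request is a single JSON object; the limit of 10 and the function names are arbitrary choices for illustration.

```python
import json

MAX_BATCH_SIZE = 10  # hypothetical limit, tune per application


def batch_size(body: str) -> int:
    """Number of operations in a request body (1 for a non-batched request)."""
    parsed = json.loads(body)
    return len(parsed) if isinstance(parsed, list) else 1


def is_batch_allowed(body: str, limit: int = MAX_BATCH_SIZE) -> bool:
    """True if the request carries no more operations than the limit."""
    return batch_size(body) <= limit
```

A scanner can probe the same behavior from the outside by sending an array of identical cheap queries and checking whether the server executes all of them.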

Resource-Intensive Query Attack: Escape: Yes, ZAP: No, StackHawk: No

Main dangers:
Resource-intensive Query Attacks involve crafting GraphQL queries that are designed to consume excessive server resources, such as CPU, memory, or network bandwidth. This can result in performance degradation, slowdowns, or even server crashes. Attackers may exploit this vulnerability to exhaust server resources, leading to denial-of-service (DoS) conditions and disrupting the availability of the GraphQL API.

Aliases-Based Attack: Escape: Yes, ZAP: No, StackHawk: No

Main dangers:
Aliases-based Attacks involve manipulating GraphQL query aliases to bypass access controls or retrieve sensitive data. By crafting queries with different aliases, attackers may exploit vulnerabilities in the GraphQL resolver functions to access unauthorized data or execute unauthorized actions. This can lead to data breaches, unauthorized access, or other security incidents compromising the confidentiality and integrity of the application's data.

GraphQL Introspection Exploit: Escape: Yes, ZAP: No, StackHawk: No

Main dangers:
GraphQL Introspection vulnerabilities allow attackers to gain insights into the schema and structure of the GraphQL API. Attackers may use introspection queries to discover sensitive data or identify potential attack vectors within the API. This can aid in the identification and exploitation of vulnerabilities, leading to unauthorized access, data breaches, or other security incidents compromising the confidentiality and integrity of the application's data.

GraphQL Field Suggestions Exploit: Escape: No, ZAP: No, StackHawk: No

Main dangers:
GraphQL Field Suggestions vulnerabilities involve exposing sensitive information or functionality through GraphQL field suggestions. Attackers may leverage field suggestions to discover hidden or undocumented endpoints, access sensitive data, or execute unauthorized actions. This can lead to unauthorized access, data breaches, or other security incidents compromising the confidentiality and integrity of the application's data.

HTML Injection: Escape: No, ZAP: No, StackHawk: No

Main dangers:
HTML Injection vulnerabilities allow attackers to inject and execute arbitrary HTML or JavaScript code within the response generated by the GraphQL API. Attackers may exploit this vulnerability to conduct cross-site scripting (XSS) attacks, steal sensitive information, or perform unauthorized actions on behalf of authenticated users. This can lead to data breaches, unauthorized access, or other security incidents compromising the confidentiality and integrity of the application's data.

Stored XSS: Escape: No, ZAP: No, StackHawk: No

Main dangers:
Stored XSS vulnerabilities occur when untrusted data is stored and later rendered on a web page without proper validation or encoding. Attackers may exploit this vulnerability to inject malicious scripts that execute in the context of authenticated users, leading to unauthorized access, session hijacking, or other security incidents, compromising the confidentiality and integrity of the application's data.

Stack Trace Errors: Escape: No, ZAP: No, StackHawk: No

Main dangers:
Stack Trace Errors vulnerabilities involve exposing detailed error messages or stack traces in GraphQL responses. Attackers may exploit this vulnerability to gather sensitive information about the application's internal workings, such as file paths, library versions, or error details. This information can aid attackers in crafting targeted attacks, identifying vulnerabilities, or conducting reconnaissance activities, ultimately compromising the security of the application.

GraphQL Query Weak Password Protection, Log Injection / Log Spoofing, GraphQL Interface Protection Bypass, GraphQL Query Deny List Bypass, Arbitrary File Write / Path Traversal were found by none of the tools in this scan. However, these vulnerabilities can be found by implementing Escape Rules - custom rules functionality within Escape's platform.

💡
In addition to the vulnerabilities found in the DVGA application, note that Escape's scanner supports many GraphQL-specific checks, such as recursive queries, schema leakage, API brute forcing, and security misconfigurations, which ZAP and StackHawk do not.

Number of requests

Escape: 1282, ZAP: 157, StackHawk: 8743

The number of requests made by each tool during the scanning process varies significantly. In this comparison, Escape sent 1282 requests, ZAP sent only 157, and StackHawk sent 8743. ZAP's low count can be attributed to its lack of GraphQL understanding and to the problems with the dedicated plugin mentioned above (the generated queries were badly formed with random arguments: only 62 returned status code 200, and 60 of those were the same query on the home page). Since ZAP doesn't comprehend the business logic of GraphQL applications, it sends almost nothing in comparison to Escape and StackHawk, which can handle GraphQL requests and therefore generate far more requests during their scans.

The difference between Escape and StackHawk can be explained by Escape's superior algorithm that understands an application quickly and requires fewer requests to cover a broader section of it:

Inferred status codes of DVGA responses over time during an Escape scan

Disclaimer: GraphQL applications typically return HTTP 200 even when a request fails, reporting errors in the response body instead. This is why Escape has developed a “real-error” inference module.

For example, GraphQL servers conventionally return invalid-query errors with a 200 status code, but Escape infers them as “Bad Request” (400).
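To give a rough idea of what such an inference could look like, here is a toy Python sketch. It is not Escape's module: the `RULES` table, the message markers, and the `infer_status` function are all hypothetical, and a real implementation would need far richer signals than substring matching on error messages.

```python
# Hypothetical mapping from error-message markers to an inferred status.
RULES = [
    (400, ("syntax error", "cannot query field", "unknown argument")),
    (401, ("unauthenticated", "not authenticated")),
    (403, ("forbidden", "not authorized", "permission")),
    (404, ("not found", "does not exist")),
]


def infer_status(http_status, body):
    """Map a GraphQL-over-HTTP response to an inferred HTTP-like status."""
    if http_status != 200:
        return http_status  # the transport status is already meaningful
    errors = body.get("errors") if isinstance(body, dict) else None
    if not errors:
        return 200
    messages = " ".join(
        str(error.get("message", "")).lower()
        for error in errors
        if isinstance(error, dict)
    )
    for status, markers in RULES:
        if any(marker in messages for marker in markers):
            return status
    return 500  # an error we cannot classify is treated as a server failure
```

With this sketch, a 200 response whose body contains `Cannot query field "foo" on type "Query".` is reported as 400, which is the kind of reclassification the status-code graph above relies on.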

Escape's scanner is able to navigate through the API smoothly, and the scan can be broken down into three steps:

  • STEP 1: At the beginning of the scan, the graph shows many 404, 400, and 500 errors. This corresponds to the first iterations of the Feedback-Driven Semantic API Exploration algorithm: Escape is trying to understand the business logic of the API, and its first requests are less precise than the later ones.
  • STEP 2: The scan then enters a phase where the algorithm generates sequences of requests that are mainly successful (200). From time to time, Escape still sends “Bad Requests” (400), for two reasons:
  1. Fuzzing is performed for specific security purposes and might lead to bad requests.
  2. The Feedback-Driven Semantic API Exploration algorithm tries not to get stuck in a local minimum of exploration and therefore keeps sending requests with slightly different parameters. Escape never sends the same request twice.
  • STEP 3: Finally, since Escape was run in its most powerful mode during this scan, from some point on (approximately 60% of the way through), DVGA answers with a series of 500 errors and then stops responding for the rest of the scan (408).
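Two ideas from the steps above, never repeating a request and reusing data from successful responses, can be illustrated with a small bookkeeping class. This is a toy sketch under our own assumptions: the `ExplorationState` class and its methods are hypothetical and say nothing about how Escape's algorithm is actually implemented.

```python
import json


class ExplorationState:
    """Toy bookkeeping for a feedback-driven scan (all names hypothetical)."""

    def __init__(self):
        self.sent = set()    # canonical form of every request already sent
        self.learned = {}    # values harvested from successful responses

    @staticmethod
    def canonical(operation, arguments):
        # Sorting keys makes {"a": 1, "b": 2} and {"b": 2, "a": 1} identical.
        return json.dumps([operation, arguments], sort_keys=True)

    def should_send(self, operation, arguments):
        """Return True only the first time a given request is proposed."""
        key = self.canonical(operation, arguments)
        if key in self.sent:
            return False
        self.sent.add(key)
        return True

    def record(self, status, data):
        """Keep data from successful responses so later requests can reuse it."""
        if status == 200 and isinstance(data, dict):
            self.learned.update(data)
```

In this sketch, an identifier returned by one successful call lands in `learned` and can be fed into the arguments of a later call, which is one plausible way for the share of 200 responses to rise after the first iterations.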

Setup time

Escape: 20 sec, ZAP: 1 min, StackHawk: 1 min

Engineers have very little time and, therefore, little patience. The speed and ease of setup are usually a good indicator of a good developer tool. Escape had the shortest setup time for the GraphQL application, although ZAP and StackHawk were close behind. All three tools live up to this expectation when it comes to setup speed.

Scan duration

Escape: 12 min, ZAP: 27 sec, StackHawk: 18 min

The scan was run on small pods with 1 CPU and 2GB of RAM.

Since ZAP only tests the surface of the API without assessing its business logic, its scan ends very quickly. During this test, StackHawk's scan took 50% longer than Escape's (18 minutes versus 12).

It is crucial to consider scan duration, coverage, and vulnerabilities together, as scan duration alone holds no value. Therefore, we strongly recommend keeping these three data points in mind simultaneously.

Coverage

Escape: 84%, ZAP: 1.30%, StackHawk: Unknown

Escape demonstrated superior coverage, scanning 84% of the target application resolvers, compared to ZAP's mere 1.30%. StackHawk coverage is unknown. 

Escape's broad coverage is vital for comprehensive security testing and mitigation strategies. By scanning a larger portion of the application, Escape can attempt better attack scenarios and detect a wider range of vulnerabilities.

Conclusion


We wanted to validate the effectiveness of Escape's scanner and establish benchmarks in a comprehensive comparison against industry-standard tools like ZAP and StackHawk. Our focus was on detecting various types of vulnerabilities within REST and GraphQL APIs.

Through testing on well-known vulnerable applications like VAmPI and DVGA, Escape's AI-powered algorithm identified more vulnerabilities and achieved higher coverage than the other tools, while sending fewer requests than ZAP on VAmPI and fewer than StackHawk on DVGA.

Our goal is to update this benchmark periodically with more vulnerable applications. We are excited to hear your feedback and comments, so please don't hesitate to reach out to us on Slack!