Input Validation and Sanitization in GraphQL


Why are input validation and sanitization important in GraphQL?

GraphQL validates inputs against the type information in your schema. Out of the box, the GraphQL specification defines the Int, Float, String, Boolean, and ID scalar types. Type checks alone, however, cannot tell a well-formed email address from an arbitrary string, or an in-range rating from an out-of-range one. As a conscientious API developer, you have probably come across situations where user input needs to be validated and sanitized before it is processed. Input validation and sanitization are essential to ensure your data's integrity, consistency, and security, and they play a critical role in reducing exposure to vulnerabilities such as SQL injection and cross-site scripting (XSS).

In this blog post, we will discuss three approaches to input validation and sanitization in GraphQL: homemade middleware, directives, and custom scalars. We will use a simple movie rating use case below as an example to demonstrate how each approach can be applied.

Consider the following GraphQL schema for a movie review system:

type Movie {
  id: ID!
  name: String!
  description: String!
  reviews: [Review!]!
}

type Review {
  movieId: ID!
  authorEmail: String!
  rating: Int!
  comment: String
}

input ReviewInput {
  movieId: ID!
  authorEmail: String!
  rating: Int!
  comment: String
}

type Mutation {
  addReview(input: ReviewInput): Review
}

In this example, users can submit a movie review by providing the movie ID, their email address as the author, a rating (from 1 to 5), and an optional comment. We'll use this schema as the basis for demonstrating input validation and sanitization with each of the three approaches discussed in this article.
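To see why the built-in types are not enough on their own, the sketch below mimics, in plain JavaScript, the kind of type-only checks that the schema implies for ReviewInput. The `passesTypeChecks` helper is hypothetical and only a rough approximation of GraphQL's coercion rules; a real server performs these checks through the schema.

```javascript
// Rough, hypothetical approximation of the type-only checks implied by ReviewInput.
// A real GraphQL server derives these from the schema, not from hand-written code.
function passesTypeChecks(input) {
  const isId = (v) => typeof v === 'string' || Number.isInteger(v);
  const isString = (v) => typeof v === 'string';
  const isInt = (v) => Number.isInteger(v);
  const isMissing = (v) => v === null || v === undefined;

  return (
    isId(input.movieId) &&                                 // movieId: ID!
    isString(input.authorEmail) &&                         // authorEmail: String!
    isInt(input.rating) &&                                 // rating: Int!
    (isMissing(input.comment) || isString(input.comment))  // comment: String
  );
}

// Every field has the right type, yet none of the values is acceptable.
const suspiciousReview = {
  movieId: '',
  authorEmail: 'not-an-email',
  rating: 999,
  comment: '<img src=x onerror=alert(1)>',
};

console.log(passesTypeChecks(suspiciousReview)); // true
```

The payload passes because each value has the declared type, which is exactly the gap that the three approaches in this article aim to close.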

Three approaches to GraphQL input validation and sanitization

1. One middleware to rule them all

In this section, we use a code snippet provided by one of our customers, Mehdi Sqalli from Myria. This middleware function sanitizes every string field without requiring any changes to your types, using the free, open-source library DOMPurify. DOMPurify needs a DOM to operate on, so on the server it is paired with jsdom:

npm install dompurify jsdom

Once installed, we can start creating our middleware:

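const createDOMPurify = require('dompurify');
const { JSDOM } = require('jsdom');

// DOMPurify needs a DOM to work with; on the server, jsdom provides one
// (install it alongside dompurify: npm install dompurify jsdom).
const DOMPurify = createDOMPurify(new JSDOM('').window);

// Recursively sanitize every string found in the input.
function sanitizeValue(value) {
  if (typeof value === 'string') {
    return DOMPurify.sanitize(value);
  }
  if (Array.isArray(value)) {
    return value.map((item) => sanitizeValue(item));
  }
  if (value !== null && typeof value === 'object') {
    const sanitizedObj = {};
    for (const key of Object.keys(value)) {
      sanitizedObj[key] = sanitizeValue(value[key]);
    }
    return sanitizedObj;
  }
  // Leave numbers, booleans, null, and undefined untouched,
  // so that a rating of 5 stays a number instead of becoming "5"
  return value;
}

const sanitizeMiddleware = ({ input }, next) => {
  // Sanitize the input object recursively
  const sanitizedInput = sanitizeValue(input);

  // Call the next middleware function or resolver
  return next({ input: sanitizedInput });
};

module.exports = { sanitizeMiddleware };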

Then add the new middleware to the handler:

const handler = createGraphQLHandler({
  schema: makeExecutableSchema({
    typeDefs,
    resolvers,
  }),
  context: createContext,
  middleware: [sanitizeMiddleware],
});

With this homemade middleware in place, all input fields for our movie review system are sanitized before our resolvers process them. The approach is easy to implement, customizable, and keeps sanitization in one central place, but the code is yours to maintain, is harder to reuse across projects, and offers fewer features than a dedicated library. Keep in mind that it only sanitizes: it does not check that a rating is in range or that an email address is well formed.
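The traversal at the heart of such a middleware can be tried without any dependencies. The sketch below is a minimal illustration under stated assumptions: `escapeHtml` is a simple stand-in for DOMPurify.sanitize, and `sanitizeDeep` is a hypothetical helper name. It sanitizes strings only, so arrays keep their shape and values such as `rating` keep their type.

```javascript
// Stand-in sanitizer (an assumption for this sketch): escapes HTML metacharacters.
// In the middleware, DOMPurify.sanitize plays this role.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Walk the input recursively and apply the sanitizer to strings only.
function sanitizeDeep(value, sanitizeString = escapeHtml) {
  if (typeof value === 'string') {
    return sanitizeString(value);
  }
  if (Array.isArray(value)) {
    return value.map((item) => sanitizeDeep(item, sanitizeString));
  }
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, sanitizeDeep(item, sanitizeString)])
    );
  }
  return value; // numbers, booleans, null, and undefined pass through unchanged
}

const result = sanitizeDeep({
  movieId: 'abc',
  rating: 5,
  comment: '<b>Great</b>',
  tags: ['<i>fun</i>'],
});

console.log(result);
```

Passing the sanitizer as a parameter keeps the traversal independent of any particular library, which also makes it straightforward to unit test.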

In the next part, we will explore how to achieve input validation and sanitization using GraphQL directives.

2. Directives constraints

While GraphQL supports basic input validation based on type information, it doesn't cater to more complex validation needs like enforcing the minimum length of a string or a range for a number. One way to handle such validation requirements is by using schema directives.

To implement validation with schema directives, we'll use the graphql-constraint-directive module. First, install the module:

npm install graphql-constraint-directive

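Next, apply the constraint directive to your schema. The snippet below follows the setup used by recent releases of graphql-constraint-directive together with @graphql-tools/schema (install that package as well); older releases registered the directive through the schemaDirectives option of makeExecutableSchema instead:

const { makeExecutableSchema } = require('@graphql-tools/schema');
const {
  constraintDirective,
  constraintDirectiveTypeDefs,
} = require('graphql-constraint-directive');

const typeDefs = `YOUR_SCHEMA_DEFINITION_LANGUAGE`;

// Include the directive's own type definitions, then wrap the schema
// so that the constraints are enforced.
let schema = makeExecutableSchema({
  typeDefs: [constraintDirectiveTypeDefs, typeDefs],
});
schema = constraintDirective()(schema);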

You can now add the @constraint directive to your schema definition to define input validation rules. Here's an example of how to apply the @constraint directive to our movie rating example:

input ReviewInput {
  movieId: ID! @constraint(format: "uuid")
  authorEmail: String! @constraint(format: "email", maxLength: 255)
  rating: Int! @constraint(min: 1, max: 5)
  comment: String @constraint(pattern: "^[0-9a-zA-Z\\s]*$", minLength: 10, maxLength: 255)
}

In this example, the @constraint directive is used to:

  • Require the movieId to be formatted as a UUID
  • Require the authorEmail to be formatted as an email address of at most 255 characters
  • Validate that the rating is an integer between 1 and 5
  • Enforce a minimum length of 10 characters and a maximum length of 255 characters for the optional comment field, and reject comments that contain anything other than letters, digits, and whitespace. The backslash in the pattern is doubled because it must be escaped inside a GraphQL string.

If the input data doesn't meet these validation rules, the server responds with an error message. Using directives for input validation provides a declarative and flexible approach, but it has some limitations: the rules can only express what the directive's arguments support, and you rely on a third-party module. Note also that the directive rejects invalid input rather than cleaning it, so it validates but does not sanitize.
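To make the mapping from directive arguments to runtime checks more concrete, here is a rough sketch of the kind of checks such arguments translate into. It is an illustration only, not how graphql-constraint-directive is implemented; `reviewRules`, `checkField`, and `checkInput` are hypothetical names, and the rule table uses the 1 to 5 rating range from the use case.

```javascript
// Hypothetical rule table in the spirit of the @constraint arguments.
const reviewRules = {
  rating: { min: 1, max: 5 },
  comment: { minLength: 10, maxLength: 255, pattern: /^[0-9a-zA-Z\s]*$/ },
};

// Check a single value against its rule and collect human-readable errors.
function checkField(name, value, rule) {
  const errors = [];
  if (value === null || value === undefined) {
    return errors; // nullability is left to the GraphQL type system
  }
  if (rule.min !== undefined && value < rule.min) {
    errors.push(`${name} must be at least ${rule.min}`);
  }
  if (rule.max !== undefined && value > rule.max) {
    errors.push(`${name} must be at most ${rule.max}`);
  }
  if (rule.minLength !== undefined && value.length < rule.minLength) {
    errors.push(`${name} must be at least ${rule.minLength} characters long`);
  }
  if (rule.maxLength !== undefined && value.length > rule.maxLength) {
    errors.push(`${name} must be at most ${rule.maxLength} characters long`);
  }
  if (rule.pattern !== undefined && !rule.pattern.test(value)) {
    errors.push(`${name} contains characters that are not allowed`);
  }
  return errors;
}

// Apply every rule in the table to the matching field of the input.
function checkInput(input, rules) {
  return Object.entries(rules).flatMap(([name, rule]) => checkField(name, input[name], rule));
}

console.log(checkInput({ rating: 7, comment: 'Too short' }, reviewRules));
```

Keeping the rules as data is what makes the directive approach declarative: the checks themselves stay generic, and only the rule table changes from field to field.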

3. The GraphQL Scalars open-source library

In GraphQL, you can also define custom scalars. Rather than writing them all yourself, you can use the GraphQL Scalars library, developed by our friends at The Guild, which ships a large predefined collection of common type-safe scalars used in GraphQL applications, along with the associated validation functions.

First, you'll need to install the graphql-scalars library to access the predefined type-safe GraphQL Scalars:

npm install graphql-scalars

You can either import all of the resolvers at once or import specific scalar resolvers:

import {
  UUIDResolver,
  EmailAddressResolver,
  PositiveIntResolver,
  NonEmptyStringResolver,
} from 'graphql-scalars';

Add the imported resolvers to your root resolver map:

const myResolverMap = {
  UUID: UUIDResolver,
  EmailAddress: EmailAddressResolver,
  PositiveInt: PositiveIntResolver,
  NonEmptyString: NonEmptyStringResolver,

  Query: {
    // your query resolvers
  },

  Mutation: {
    // your mutation resolvers
  },
};

Now that the custom scalars are set up, declare them in your schema and use them like any other scalar:

scalar UUID
scalar EmailAddress
scalar PositiveInt
scalar NonEmptyString

input ReviewInput {
  movieId: UUID!
  authorEmail: EmailAddress!
  rating: PositiveInt!
  comment: NonEmptyString
}

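As with middleware or directives, this ensures that user input matches the expected format before it reaches your resolvers. Note that PositiveInt only guarantees a value greater than zero; the upper bound of 5 for the rating still has to be enforced elsewhere, for example in the resolver or with a dedicated custom scalar.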

Using custom scalars from the graphql-scalars library offers type safety, input validation, and sanitization, along with reusability and reduced boilerplate code. However, it introduces a third-party dependency, may have limited customization options, and could have a slight performance impact. Overall, custom scalars provide a valuable way to improve data integrity and security in GraphQL APIs.
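When no predefined scalar matches a rule exactly, such as the 1 to 5 rating range from our use case, a small custom scalar can fill the gap. The sketch below is a minimal illustration under stated assumptions: `Rating`, `assertRating`, and `ratingScalarConfig` are hypothetical names, and the config object is meant to be passed to `new GraphQLScalarType(...)` from the graphql package, with a matching `scalar Rating` declaration in the schema. It is kept as a plain object here so that it runs without dependencies.

```javascript
// Shared check used by every entry point of the hypothetical Rating scalar.
function assertRating(value) {
  if (!Number.isInteger(value) || value < 1 || value > 5) {
    throw new TypeError(`Rating must be an integer from 1 to 5, got: ${value}`);
  }
  return value;
}

// Assumption: this object would be passed to `new GraphQLScalarType(...)`
// from the `graphql` package.
const ratingScalarConfig = {
  name: 'Rating',
  description: 'Integer rating from 1 to 5',
  // Value sent back to the client
  serialize: assertRating,
  // Value received through query variables
  parseValue: assertRating,
  // Value written inline in the query document
  parseLiteral(ast) {
    // 'IntValue' is the string behind Kind.INT in graphql-js
    if (ast.kind !== 'IntValue') {
      throw new TypeError('Rating must be written as an integer literal');
    }
    return assertRating(parseInt(ast.value, 10));
  },
};

console.log(ratingScalarConfig.parseValue(4)); // 4
```

Routing all three entry points through the same check keeps the rule in one place, whether the value arrives through variables, appears inline in the query, or is returned by a resolver.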

Conclusion

Input validation and sanitization are crucial steps in ensuring the security and integrity of your GraphQL API. In this article, we have explored three approaches to achieve this:

  • homemade middleware
  • directives
  • custom scalars.

Choose the approach that best fits your use case and development style, and remember to apply these techniques consistently across your API to minimize potential risks.

Or better yet, consider using a specialized security testing tool like Escape, which focuses on GraphQL vulnerabilities. Escape can quickly identify and fix issues across your API endpoints, offering remediation snippets within minutes—without the need for complex integrations or traffic monitoring.
