Stop using Enums


Stop using enums. Graphics courtesy of undraw.co

First things first, was that title clickbait? Yes. But, it was a lot catchier than “Stop using enums where they aren’t the best choice and a different data structure would do” - that was a bit of a mouthful.

Second, what are enums? According to Wikipedia, enums (short for enumerated types) are defined as:

… a data type consisting of a set of named values called elements, members, enumeral, or enumerators of the type.

If all that jargon left a bad taste in your mouth, here’s a simplified version: Essentially, enums are a rather strict data type, wherein you define a set of values that it can take and it will be limited to just those values. 

This works in certain circumstances, particularly those in which you are confident that the set of values will not change. For example, in an email, you can send an email “to” a recipient or you can have them on “cc” or “bcc” - in this instance, having an enum to define what types of recipients there can be (i.e. “to”, “cc” and “bcc”) is fine, as it’s very unlikely that email will introduce a new type of recipient.
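To make that concrete, here’s a rough sketch of the email example in JavaScript. JavaScript has no built-in enum type, so a frozen object stands in for one here; the names `RecipientType` and `isRecipientType` are made up for illustration.

```javascript
// A frozen object acting as an enum of email recipient types.
const RecipientType = Object.freeze({
  TO: 'to',
  CC: 'cc',
  BCC: 'bcc',
});

// Returns true only for values that belong to the fixed set.
function isRecipientType(value) {
  return Object.values(RecipientType).includes(value);
}

console.log(isRecipientType('cc'));       // true
console.log(isRecipientType('reply-to')); // false
```

Because the object is frozen, the set of allowed values can’t be changed at runtime, which is exactly the strictness that makes enums a good fit for data that never changes.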

Third, story time.

What did enums do to hurt me


They called me a defined set of names and that hurt my feelings 😭

JK, they didn’t actually hurt me, I just used them in a few places that weren’t the best places to use an enum and now you (yes, you!) can learn from my mistakes for 9 easy payments of $9.95.


In the first instance...


... I was building a fairly large database structure and, given I had just discovered enums, I used them everywhere I could. Everywhere that I had a type I would use an enum, because I thought, “If I use an enum for these types, then I can clearly define what values are allowed and that should handle validations for me, yay!”. 

This wasn’t inherently a bad idea. My mistake was enforcing the enum data type on the database itself. It was a PostgreSQL database, which meant that changing the set of allowed values on the enum (say, adding a new value) required an entire database migration 😩.

And as luck would have it, I needed to add a new value soon after creating the enum. 

The only problem was, I had no idea how to do that 😅. I was using the Sequelize ORM to interface between my Express application and my database, but Sequelize doesn’t have any built-in methods to adjust the enum values. So after a while of digging through StackOverflow questions and GitHub issues, I found this beautiful piece of raw SQL:

ALTER TYPE "enum_Subscriptions_mailingPreference" ADD VALUE 'to';

Now, I know what you’re thinking: what the heck is that “enum_Subscriptions_mailingPreference” thing? That, my friends, is the name Sequelize automatically gives the enum type when it creates it in PostgreSQL, and you can find it by digging through the SQL structure of the table you want to alter using pgAdmin or something like that. (To translate, it’s basically saying this is an “enum” on the “Subscriptions” table, for the “mailingPreference” column.)

But wait! Every SQL migration needs a “down” method as well, so that the code knows how to undo the migration if needed. That looks like this:

DELETE FROM pg_enum WHERE enumlabel = 'to' AND enumtypid = (SELECT oid FROM pg_type WHERE typname = 'enum_Subscriptions_mailingPreference')

That’s a little more complicated, but it’s essentially doing the reverse. The subquery looks up the “oid” of our enum type in pg_type, and the DELETE then removes the row in pg_enum that belongs to that type and is labeled “to”.
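Put together, the two statements could live in a single Sequelize migration file. This is only a sketch of how I’d wire it up, not the exact file from the project: the constant names are mine, and it assumes the usual sequelize-cli shape where “up” and “down” receive a queryInterface and raw SQL is run through `queryInterface.sequelize.query`.

```javascript
'use strict';

// Name Sequelize generated for the enum type (taken from the SQL above).
const ENUM_NAME = 'enum_Subscriptions_mailingPreference';
const NEW_VALUE = 'to';

const migration = {
  // Apply: add the new value to the existing enum type.
  up: (queryInterface) =>
    queryInterface.sequelize.query(
      `ALTER TYPE "${ENUM_NAME}" ADD VALUE '${NEW_VALUE}';`
    ),

  // Undo: remove the label's row from the pg_enum catalog.
  down: (queryInterface) =>
    queryInterface.sequelize.query(
      `DELETE FROM pg_enum WHERE enumlabel = '${NEW_VALUE}' ` +
        `AND enumtypid = (SELECT oid FROM pg_type WHERE typname = '${ENUM_NAME}');`
    ),
};

module.exports = migration;
```

Keeping the enum name and the new value in constants means the “up” and “down” statements can’t drift apart if you copy the file for the next value you need to add.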

Believe me 😉



In the second instance…


… a colleague of mine was building a separate React app to host some of our company’s products. In this situation, we decided to use GraphCMS as our backend to speed up development.

There were broadly 3 different categories of products (so far) - Reports, Articles and Videos - where each had a number of sub-categories. Given my new-found love for enums, we decided it was a good idea to make the sub-categories enums, for the same reasons as before - being able to enforce the right values. So, we had 3 different enums - ReportType, ArticleType and VideoType.

If you’ve been following along so far, this was clearly a bad idea, because unlike types of recipients in an email, these sub-categories changed all the time. We would add and remove sub-categories of products as they were created or discontinued. But we’ve already spoken about why that’s bad in the previous section. What I want to focus on here is how that made it difficult to dynamically query the backend for that data.

First, a little context. 

GraphCMS (as the name implies) uses GraphQL to query its data. The benefit is that we can query the backend for just the data that we need, instead of all of the data there is. Theoretically, I could, in a single query, get all the required reports, articles and videos from our database, like so:

{
  reports(where: {featured: true}) {
    # list of columns I need from this table
  }
  articles(where: {featured: true}) {
    # list of columns I need from this table
  }
  videos(where: {featured: true}) {
    # list of columns I need from this table
  }
}

Sounds great so far.

But what if I wanted to write a query that dynamically selected only the videos, articles and reports that were in a particular sub-category? Easy enough, we add a variable to the GraphQL query and pass that to each of the “where” clauses in the query:

query getSubcategoryData($typeName: String!) {
  reports(where: {featured: true, reportType: $typeName}) {
    # list of columns I need from this table
  }
  articles(where: {featured: true, articleType: $typeName}) {
    # list of columns I need from this table
  }
  videos(where: {featured: true, videoType: $typeName}) {
    # list of columns I need from this table
  }
}

But that won’t work. Why? See the bit where we declare the variable? It says:

$typeName: String!

The “String!” part is where we define the type of the variable - it’s a string (the “!” just means that this variable cannot be “null” when we run this query).

But hold up. The “reportType”, “articleType” and “videoType” columns in each of those tables are not of type “String” - they’re of type “ReportType”, “ArticleType” and “VideoType”, respectively (i.e. the enums we set earlier). So, GraphCMS throws an error saying that we’ve supplied a variable of the wrong type for what we are trying to do.

OK. No problem. Let’s just pass in 3 different variables:

query getSubcategoryData($reportType: ReportType, $articleType: ArticleType, $videoType: VideoType) {
  # rest of query omitted for brevity
}

But this doesn’t work either. If I pass in a ReportType, ArticleType or VideoType that isn’t part of that enum, then it’ll throw another error saying that the particular type is invalid.

Which means, in order to get this working, I need to first check whether the type exists in the enum and, only if it exists, run a query for that category of product. Which isn’t really the most efficient code.
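Here’s a minimal sketch of what that guard could look like in JavaScript. It assumes the app keeps its own copy of each enum’s values; the `SUB_CATEGORIES` object, the example values in it and the `categoriesToQuery` function are all hypothetical, not something GraphCMS provides.

```javascript
// Hypothetical client-side copy of the enum values defined in the CMS.
const SUB_CATEGORIES = {
  reports: ['ANNUAL', 'QUARTERLY'],
  articles: ['OPINION', 'NEWS'],
  videos: ['WEBINAR', 'NEWS'],
};

// Returns the product categories whose enum contains the requested type,
// i.e. the only categories that are safe to query for it.
function categoriesToQuery(typeName) {
  return Object.keys(SUB_CATEGORIES).filter((category) =>
    SUB_CATEGORIES[category].includes(typeName)
  );
}

console.log(categoriesToQuery('NEWS'));    // [ 'articles', 'videos' ]
console.log(categoriesToQuery('ANNUAL'));  // [ 'reports' ]
console.log(categoriesToQuery('PODCAST')); // []
```

Notice the extra cost: that copy of the enum values has to be kept in sync with the CMS every time a sub-category is added or removed, which for us was all the time.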


OK, so what should we do instead?


There's a better way



In the first instance...


… If you really need an enum, then consider using it at the model level in your code, rather than at the database level. This is something I’ve done before in a Ruby on Rails project. For example, in a Rails environment, let’s assume that we have a Blog model and we use an enum to decide if the status of the blog post is “published” or “draft”.

We could add a “status” column to the blog table on the database and set that to a type of “integer” (i.e. not an enum). That would look something like this:

# Migration adding the status column to the Blog

class AddPostStatusToBlogs < ActiveRecord::Migration[5.1]
  def change
    add_column :blogs, :status, :integer, default: 0
  end
end

Then we could add some code to the blog model to create an enum based on some integers:

# Blog model code to set the enum

class Blog < ApplicationRecord
  enum status: { draft: 0, published: 1 }
end

And now, in our Blog controller, we could use this code, like so:

# Blog controller

def toggle_status
  if @blog.draft?
    @blog.published! # If it’s a draft, publish it
  elsif @blog.published?
    @blog.draft! # If it’s published, make it a draft
  end

  redirect_to blogs_url, notice: 'Post status has been updated'
end

Doing it this way still gives us the automatic validation that enums provide, while allowing us to add and remove values to our enum without a database migration - we just need to adjust the model code. Much easier!


In the second instance…


… Instead of using an enum for something where the data changes often, we can opt to use a separate table of “Subcategories” and associate each Article, Report and Video with the relevant sub-category.

That way, if we need to query all the associated items for a particular sub-category, we could easily run a GraphQL query like so:

query getSubcategoryData($subCategoryName: String!) {
  subCategories(where: {name: $subCategoryName}) {
    # Then get the associated reports, articles and videos
    reports(where: {featured: true}) {
      # list of columns I need from this table
    }
    articles(where: {featured: true}) {
      # list of columns I need from this table
    }
    videos(where: {featured: true}) {
      # list of columns I need from this table
    }
  }
}
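On the app side, the sub-category name is now just a plain string variable. A rough sketch of building the request in JavaScript, assuming the usual GraphQL-over-HTTP convention of posting a JSON body with “query” and “variables” keys (the `title` field, the “Whitepapers” sub-category and the `buildSubcategoryRequest` helper are placeholders I made up):

```javascript
// GraphQL document; `title` is a placeholder field name.
const SUBCATEGORY_QUERY = `
  query getSubcategoryData($subCategoryName: String!) {
    subCategories(where: {name: $subCategoryName}) {
      reports(where: {featured: true}) { title }
      articles(where: {featured: true}) { title }
      videos(where: {featured: true}) { title }
    }
  }
`;

// Builds the JSON body for a GraphQL-over-HTTP POST request.
function buildSubcategoryRequest(subCategoryName) {
  return JSON.stringify({
    query: SUBCATEGORY_QUERY,
    variables: { subCategoryName },
  });
}

const body = buildSubcategoryRequest('Whitepapers');
console.log(JSON.parse(body).variables); // { subCategoryName: 'Whitepapers' }
```

Since the variable is a String rather than an enum, there is no list of allowed values to check first. I’d expect a name that doesn’t exist to simply match no sub-categories instead of throwing a type error.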

...

Overall, the lesson here is to choose how and when you use enums. If your data is unlikely to change over time, go ahead and use an enum. If you really need database level validations for the data, go ahead and use an enum.

But if not, maybe consider less restrictive data types for your data 😅

Until next time, happy coding!