/* Code Comments */

Check For Duplicates - COUNT, CASE, and EXIST In Postgres

July 16, 2019

Before I write to a database, I want to make sure that I don’t create a duplicate record based on the name and parent record id.

To accomplish that, I want to abort early and alert the client if the name they’ve provided is a duplicate.

The pseudocode looks something like:

const createNewRecord({name, parent_id}) => {

const duplicate = await checkForDuplicate({name, parent_id})
if (duplicate) throw new Error(`Oy! A duplicate. Try a different name. We already have a value for --> `, ${name})

const data = await this.pool.query(SQL`
-- insert query
`);
return data.rows;

I’ve written previously about how to insert values into tables with Postgres, but what I wasn’t sure about was how to write a query that would return a boolean value for duplicates (i.e. SQL query to use in checkForDupcliate) (Note: I’m writing in Javascript and interpolating SQL statements using sql-template-string)

Start With COUNT(*)

One way to figure out if any records exist would be to count the total that meet a condition using COUNT(*).

For example:

SELECT COUNT(*) FROM my_table
  WHERE parent_id = ${parent_id}
  AND name ILIKE ${name};

This is asking a different question, however. Instead of asking a true/false question, I’m now asking how many. My conditional in Javascript would no longer be if (duplicate) but more reasonably if (count > 0).

From a purely computational perspective, counting is more expensive than checking for the existence of something.

Move On To Conditionals

Instead of the count, we can use the CASE conditional. The CASE is a generic conditional expression in Postgres.

The Postgres documentation for CASE provides the following definition:

CASE WHEN condition THEN result
     [WHEN ...]
     [ELSE result]
END

Great, but what is my condition?

In this case, all I care about is if something exists… which is convenient, because Postgres’ EXIST checks for exactly that.

EXIST is one of several subquery expressions (similar to IN, NOT IN, ALL, etc.).

It takes a subquery and evaluates to true if rows are returned by the subquery and false otherwise.

The resulting query might look like:

SELECT CASE WHEN EXISTS (
  SELECT * FROM my_table
  WHERE parent_id = ${parent_id}
  AND name ILIKE ${name}
  ) THEN TRUE::bool
  ELSE FALSE::bool
END `;

Refactoring Time: Dropping Case

This CASE approach works. But, it’s unnecessary. If I wanted to return something other than a boolean I could use CASE. E.g.,

SELECT CASE WHEN EXISTS (
  SELECT * FROM my_table
  WHERE parent_id = ${parent_id}
  AND name ILIKE ${name}
  ) THEN 'exists'::text
  ELSE 'does not exist'::text
END `;

But that’s not what I’m looking for. I just need TRUE/FALSE which is what the EXIST returns natively.

As a result, I simplified the query to the following:

SELECT EXISTS (
  SELECT * FROM my_table
  WHERE parent_id = ${parent_id}
  AND name ILIKE ${name}
  ) AS exists `

Additional Reading


Stephen Weiss

Thanks for reading! My name's Stephen Weiss. I live in Chicago with my wife, Kate, and dog, Finn.
Click here to see the archives of my weeks in review and sign up yourself!