How to generate and utilize ULIDs in ClickHouse

Learn how to generate and utilize ULIDs in ClickHouse, understand the benefits of ULIDs over UUIDs, and how to use them as sort keys in your applications.

Photo: Propel

In this post, we will explore the <span class="code-exp">generateULID</span> function and the <span class="code-exp">ULIDStringToDateTime</span> function in ClickHouse, and provide an overview of what a ULID is.

What is a ULID?

A ULID, or Universally Unique Lexicographically Sortable Identifier, is an identifier that is unique across space and time. It is similar to a UUID (Universally Unique Identifier), but with the added benefit of being lexicographically sortable. This means that if you sort ULIDs, they will be in the order of their creation time, which is a valuable property for many applications, especially those that deal with time-series data.

What is the <span class="code-exp-header">generateULID</span> function?

The <span class="code-exp">generateULID</span> function in ClickHouse is a function that generates a ULID. The syntax for the generateULID function is:

generateULID()

The <span class="code-exp">generateULID</span> function returns a FixedString type value. Here's an example of how you can use it:

SELECT generateULID()

┌─generateULID()─────────────┐
│ 01GNB2S2FGN2P93QPXDNB4EN2R │
└────────────────────────────┘

If you need to generate multiple values in one row, you can pass different expressions to the function, like so:

What is the <span class="code-exp-header">ULIDStringToDateTime</span> function?

The <span class="code-exp">ULIDStringToDateTime</span> function in ClickHouse is a function that extracts the timestamp from a ULID. The syntax for the <span class="code-exp">ULIDStringToDateTime</span> function is:

ULIDStringToDateTime(ulid[, timezone])

Here, <span class="code-exp">ulid</span> is the input ULID, which can be a <span class="code-exp">String</span> or <span class="code-exp">FixedString(26)</span>, and <span class="code-exp">timezone</span> is the timezone name for the returned value, which is optional.

The <span class="code-exp">ULIDStringToDateTime</span> function returns a timestamp with milliseconds precision. Here's an example of how you can use it:

SELECT ULIDStringToDateTime('01GNB2S2FGN2P93QPXDNB4EN2R')

┌─ULIDStringToDateTime('01GNB2S2FGN2P93QPXDNB4EN2R')─┐
│                            2022-12-28 00:40:37.616 │
└────────────────────────────────────────────────────┘

ULIDs vs UUIDs: Why we recommend ULIDs

While ULIDs and UUIDs both serve the purpose of providing unique identifiers, there are key differences that set them apart, and may make ULIDs more suitable for certain applications.

A UUID, or Universally Unique Identifier, is a 128-bit number used to uniquely identify some object or entity in the system. While they are great for ensuring uniqueness, they do not carry any other information or have any inherent order.

On the other hand, a ULID, or Universally Unique Lexicographically Sortable Identifier, while also ensuring uniqueness, has the added benefit of being lexicographically sortable. This means that if you sort ULIDs, they will be in the order of their creation time. This property is incredibly valuable for applications dealing with time-series data, or any application where the order of creation is important.

In addition, ULIDs are designed to be friendly with both humans and machines, with a 26-character length that can be easily read and processed. They are URL-safe and can be case-insensitively sorted, offering further advantages over UUIDs.

For these reasons, we recommend using ULIDs over UUIDs in most cases, especially when dealing with time-series data. They offer all the benefits of UUIDs, with added features that make them more versatile and user-friendly.

Advanced example: Using ULIDs as Sort Keys

Due to their lexicographically sortable property, ULIDs serve a dual purpose in the sort key. Not only do they ensure uniqueness, but also maintain order based on time of creation.

Consider a scenario where we have a table storing events that occur in a system. Each event has a timestamp and a ULID. If we want to retrieve events in the order they occurred, we can sort by the ULID.

Here's an example:

CREATE TABLE events (
   event_name String,
   event_time DateTime,
   event_id FixedString(26) DEFAULT generateULID()
) ENGINE = MergeTree()
ORDER BY event_id;

INSERT INTO events (event_name, event_time) VALUES
   ('event1', now()),
   ('event2', now()),
   ('event3', now());

SELECT *
FROM events
ORDER BY event_id;

┌─event_name─┬──────────event_time─┬─event_id─────────────┐
│ event1     │ 2022-12-28 00:40:37 │ 01GNB2S2FGN2P93QPXDN │
│ event2     │ 2022-12-28 00:40:37 │ 01GNB2S2FGN2P93QPXDN │
│ event3     │ 2022-12-28 00:40:37 │ 01GNB2S2FGN2P93QPXDN │
└────────────┴─────────────────────┴──────────────────────┘

As you can see, the events are returned in the order they were inserted into the table, thanks to the lexicographical sorting of ULIDs. This can be very useful in time-series data analysis and other applications where the order of events is important.

Build faster with Propel: A Serverless Clickhouse for developers

At Propel, we offer a fully managed ClickHouse service that allows you to focus more on drawing insights from your data and less on infrastructure management. Propel provides an API for data access, React components, and built-in multi-tenant access controls, making it easier and faster for you to build data-intensive applications.

You can connect your own ClickHouse with Propel, whether it's self-hosted or on the ClickHouse Cloud, or take advantage of our fully managed serverless cloud.

Connect your own ClickHouse



✅ Works with self-hosted ClickHouse.
✅ Works with ClickHouse Cloud.
✅ Full control of your ClickHouse deployment.
✅ Data stays in your infrastructure.
✅ Data is encrypted in transit.
✅ Only pay for data passthrough.

Read the docs

Fully managed serverless cloud



✅ Ingest data from any data source.
✅ No infrastructure to manage or scale.
✅ Serverless auto-scaling.
✅ Mission-critical availability.
✅ Unlimited storage.
✅ Data is encrypted at rest and in transit.
✅ Only pay for the storage you use and queries you make.

Read the docs

Conclusion

In this post, we explored the <span class="code-exp">generateULID</span> and <span class="code-exp">ULIDStringToDateTime</span> functions in ClickHouse. These functions provide a powerful way to generate unique identifiers and extract time information from them, which can be invaluable for applications that deal with time-series data.

ClickHouse offers many other powerful functions to help you manipulate and work with various data types. Don't hesitate to explore the official ClickHouse documentation to learn more about these functions and how to use them effectively.

Further reading

For more insights on how to use ClickHouse for your data operations, check out our other posts. We cover a wide range of topics, from advanced querying techniques to performance tuning for large datasets. Whether you're a beginner or an experienced data professional, there's always something new to learn!

Related posts

Start shipping today

Deliver the analytics your customers have been asking for.