Convert between different character encodings.
Encoding functions in help convert data between different character encodings.
These functions enable you to:
Returns a string with the length equal to the number of passed arguments, where each byte has the value of the corresponding argument.
Syntax:
Arguments:
number_1
, number_2
, …, number_n
(numeric): Numerical arguments interpreted as integers. Types: Int
, Float
.Returns:
String
.Example:
Result:
You can construct a string of arbitrary encoding by passing the corresponding bytes. Here’s an example for UTF-8:
Result:
If the value of an argument is out of the range of the UInt8
data type, it is converted to UInt8
with possible rounding and overflow.
Returns a string containing the argument’s hexadecimal representation.
Syntax
The function uses uppercase letters A-F and does not include any prefixes (like 0x
) or suffixes (like h
).
For integer arguments, it prints hex digits from the most significant to least significant (big-endian or “human-readable” order). It starts with the most significant non-zero byte (leading zero bytes are omitted) but always prints both digits of every byte even if the leading digit is zero.
Alias:
Arguments
arg
(String
, UInt
, Float
, Decimal
, Date
, DateTime
, or UUID
): A value to convert to hexadecimal.Returns
String
.Examples
Converting an integer to hexadecimal:
Result:
Converting a string to hexadecimal:
Result:
Converting a UUID to hexadecimal:
Result:
Float
and Decimal
types, the values are encoded as their representation in memory. As ClickHouse supports little-endian architecture, they are encoded in little-endian. Zero leading/trailing bytes are not omitted.Date
and DateTime
types, the values are formatted as corresponding integers (the number of days since Epoch for Date
and the value of Unix Timestamp for DateTime
).String
and FixedString
types, all bytes are encoded as two hexadecimal numbers. Zero bytes are not omitted.UUID
type, the values are encoded in big-endian order.Performs the opposite operation of hex
. It interprets each pair of hexadecimal digits in the argument as a number and converts it to the byte represented by that number. The return value is a binary string (BLOB).
If you want to convert the result to a number, you can use the reverse
and reinterpretAs<Type>
functions.
Syntax:
Alias:
Arguments:
arg
(String
, FixedString
): A string containing any number of hexadecimal digits.Supports both uppercase and lowercase letters A-F. The number of hexadecimal digits does not have to be even. If it is odd, the last digit is interpreted as the least significant half of the 00-0F byte. If the argument string contains anything other than hexadecimal digits, some implementation-defined result is returned (an exception isn’t thrown).
Returns:
String
)Example:
Result:
Converting the result to a number:
Result:
In this example, we use unhex
to convert the hexadecimal string ‘FFF’ to its binary representation, reverse the bytes (since ClickHouse uses little-endian), and then reinterpret the result as a UInt64.
For a numeric argument, unhex()
does not perform the inverse of hex(N)
.
Returns a string containing the argument’s binary representation.
Syntax:
Date
and DateTime
, it formats as corresponding integers (the number of days since Epoch for Date and the value of Unix Timestamp for DateTime).String
and FixedString
, all bytes are encoded as eight binary numbers. Zero bytes are not omitted.Float
and Decimal
types, they are encoded as their representation in memory (little-endian). Zero leading/trailing bytes are not omitted.UUID
type, it is encoded as a big-endian order string.Alias:
Arguments:
arg
(String
, FixedString
, UInt
, Float
, Decimal
, Date
, or DateTime
): A value to convert to binary.Returns:
String
.Examples:
In this example, the bin
function converts the integer 14 to its binary representation, resulting in the string ‘00001110’.
Result:
This second example converts a sequence of floating-point numbers to their binary representations. The third example converts a UUID to its binary representation.
Result:
In this example, the bin
function converts a UUID to its binary representation. The UUID ‘61f0c404-5cb3-11e7-907b-a6006ad3dba0’ is represented as a 128-bit number, and the bin
function outputs its binary form as a string of 128 binary digits. This can be useful for visualizing the binary structure of UUIDs, which are often used as unique identifiers in databases and distributed systems.
Result:
Performs the opposite operation of bin
. It interprets each pair of binary digits in the argument as a number and converts it to the byte represented by that number.
Syntax:
Alias:
Arguments:
arg
(String
): A string containing any number of binary digits.Returns:
String
)unbin()
does not return the inverse of bin()
.unbin
is invoked from within the clickhouse-client, binary strings are displayed using UTF-8.Example:
Result:
To convert the result to a number, you can use the reverse
and reinterpretAs<Type>
functions:
Result:
This function is particularly useful for decoding binary data or working with binary representations of numbers.
Converts an integer bitmask to a comma-separated string of powers of two that sum up to the original number.
Syntax
Arguments
num
(Int/UInt
): An integer value representing the bitmask.Returns
String
)Example
Result:
In this example, the bitmask 14 (binary 1110) is converted to its constituent powers of two: 2, 4, and 8, which represent different taco toppings when summed.
This function is useful for converting bitmask-encoded data into a human-readable format, such as displaying selected taco toppings from a bitmask representation.
Converts an integer bitmask to an array of powers of two that sum up to the original number.
Syntax
Arguments
num
(Int/UInt
): An integer value representing the bitmask.Returns
UInt64
numbers containing the list of powers of two that sum up to the source number. The numbers in the array are in ascending order. (Array(UInt64)
)Example
Result:
In this example, 14 is represented as 2 + 4 + 8, which could correspond to different taco toppings:
This function can be useful for decoding bitmask-encoded data, such as menu item customizations or feature flags.
If the input is 0, an empty array is returned.
Converts an integer to an array containing the positions of set bits (bits equal to 1) in ascending order.
Syntax
Arguments
arg
(Int/UInt
): Integer value.Returns:
Array(UInt64)
.Examples
Query:
Result:
Query:
Result:
This function is useful for analyzing binary data or working with bitmasks in a more readable format.
Calculates the Morton encoding (Z-curve) for a list of unsigned integers.
The function has two modes of operation:
Syntax:
Arguments:
arg1
, arg2
, …, argN
(UInt8
, UInt16
, UInt32
, UInt64
): Up to 8 unsigned integers or columns of unsigned integer types.Returns:
A UInt64
code.
Example:
Result:
Syntax:
Arguments:
range_mask
(Tuple
): A tuple specifying the amount of range expansion for each argument (1-8x).arg1
, arg2
, …, argN
(UInt8
, UInt16
, UInt32
, UInt64
): Up to 8 unsigned integers or columns of unsigned integer types.Returns:
A UInt64
code.
Example:
Result:
This mode is useful when dealing with arguments of different ranges or cardinality, such as IP addresses (0…FFFFFFFF) and taco shop IDs (0…FF).
Result:
Result:
Implementation details: The function can fit only so many bits of information into the Morton code as UInt64 has. Two arguments will have a range of maximum 2^32 (64/2) each, three arguments a range of max 2^21 (64/3) each, and so on. All overflow will be clamped to zero.
Decodes a Morton encoding (ZCurve) into the corresponding unsigned integer tuple.
The function has two modes of operation:
Syntax:
Arguments:
tuple_size
(Integer
): Integer value no more than 8.code
(UInt64
): UInt64 code.Returns: A tuple of the specified size containing UInt64
values.
Example:
Result:
Syntax:
Arguments:
range_mask
(Tuple
): A tuple specifying bit shifts for each dimension.code
(UInt64
): UInt64 code.Returns: A tuple containing UInt64
values.
Example:
Result:
This mode is useful when dealing with arguments of different ranges or cardinality, such as IP addresses (0…FFFFFFFF) and country codes (0…FF).
For a single argument without a tuple, the function returns the argument itself as a single-element tuple.
Result:
The function can accept a column of codes as the second argument:
Result:
This function is particularly useful for decoding multi-dimensional data that has been encoded into a single dimension using the Morton curve, often used in geospatial applications or for optimizing multi-dimensional database queries.
Calculates the Hilbert curve index for a list of unsigned integers.
The function has two modes of operation:
Syntax:
Arguments:
arg1
, arg2
(UInt
): Up to 2 unsigned integers or columns of unsigned integer type.Returns: A UInt64
code representing the Hilbert curve index.
Example:
Result:
Syntax:
Arguments:
range_mask
(Tuple
): A tuple specifying bit shifts for each dimension.arg1
, arg2
(UInt
): Up to 2 unsigned integers or columns of unsigned integer type.Returns: A UInt64
code representing the Hilbert curve index.
Example:
Result:
This mode is useful when dealing with arguments of different ranges or cardinality, such as IP addresses (0…FFFFFFFF) and country codes (0…FF).
For a single argument without a tuple, the function returns the argument itself as the Hilbert index:
Result:
The function can accept columns as arguments:
Result:
This function is particularly useful for encoding multi-dimensional data into a single dimension while preserving locality, often used in geospatial applications or for optimizing multi-dimensional database queries.
Decodes a Hilbert curve index back into a tuple of unsigned integers, representing coordinates in multi-dimensional space.
The function has two modes of operation:
Syntax:
Arguments:
tuple_size
(Integer
): Integer value no more than 2.code
(UInt64
): UInt64 code.Returns: A tuple of the specified size containing UInt64
values.
Example:
Result:
Syntax:
Arguments:
range_mask
(Tuple
): A tuple specifying bit shifts for each dimension.code
(UInt64
): UInt64 code.Returns: A tuple containing UInt64
values.
Example:
Result:
This mode is useful when dealing with arguments of different ranges or cardinality, such as IP addresses (0…FFFFFFFF) and country codes (0…FF).
For a single argument without a tuple, the function returns the argument itself as a single-element tuple.
Result:
The function can accept a column of codes as the second argument:
Result:
This function is particularly useful for decoding multi-dimensional data that has been encoded into a single dimension using the Hilbert curve, often used in geospatial applications or for optimizing multi-dimensional database queries.
Convert between different character encodings.
Encoding functions in help convert data between different character encodings.
These functions enable you to:
Returns a string with the length equal to the number of passed arguments, where each byte has the value of the corresponding argument.
Syntax:
Arguments:
number_1
, number_2
, …, number_n
(numeric): Numerical arguments interpreted as integers. Types: Int
, Float
.Returns:
String
.Example:
Result:
You can construct a string of arbitrary encoding by passing the corresponding bytes. Here’s an example for UTF-8:
Result:
If the value of an argument is out of the range of the UInt8
data type, it is converted to UInt8
with possible rounding and overflow.
Returns a string containing the argument’s hexadecimal representation.
Syntax
The function uses uppercase letters A-F and does not include any prefixes (like 0x
) or suffixes (like h
).
For integer arguments, it prints hex digits from the most significant to least significant (big-endian or “human-readable” order). It starts with the most significant non-zero byte (leading zero bytes are omitted) but always prints both digits of every byte even if the leading digit is zero.
Alias:
Arguments
arg
(String
, UInt
, Float
, Decimal
, Date
, DateTime
, or UUID
): A value to convert to hexadecimal.Returns
String
.Examples
Converting an integer to hexadecimal:
Result:
Converting a string to hexadecimal:
Result:
Converting a UUID to hexadecimal:
Result:
Float
and Decimal
types, the values are encoded as their representation in memory. As ClickHouse supports little-endian architecture, they are encoded in little-endian. Zero leading/trailing bytes are not omitted.Date
and DateTime
types, the values are formatted as corresponding integers (the number of days since Epoch for Date
and the value of Unix Timestamp for DateTime
).String
and FixedString
types, all bytes are encoded as two hexadecimal numbers. Zero bytes are not omitted.UUID
type, the values are encoded in big-endian order.Performs the opposite operation of hex
. It interprets each pair of hexadecimal digits in the argument as a number and converts it to the byte represented by that number. The return value is a binary string (BLOB).
If you want to convert the result to a number, you can use the reverse
and reinterpretAs<Type>
functions.
Syntax:
Alias:
Arguments:
arg
(String
, FixedString
): A string containing any number of hexadecimal digits.Supports both uppercase and lowercase letters A-F. The number of hexadecimal digits does not have to be even. If it is odd, the last digit is interpreted as the least significant half of the 00-0F byte. If the argument string contains anything other than hexadecimal digits, some implementation-defined result is returned (an exception isn’t thrown).
Returns:
String
)Example:
Result:
Converting the result to a number:
Result:
In this example, we use unhex
to convert the hexadecimal string ‘FFF’ to its binary representation, reverse the bytes (since ClickHouse uses little-endian), and then reinterpret the result as a UInt64.
For a numeric argument, unhex()
does not perform the inverse of hex(N)
.
Returns a string containing the argument’s binary representation.
Syntax:
Date
and DateTime
, it formats as corresponding integers (the number of days since Epoch for Date and the value of Unix Timestamp for DateTime).String
and FixedString
, all bytes are encoded as eight binary numbers. Zero bytes are not omitted.Float
and Decimal
types, they are encoded as their representation in memory (little-endian). Zero leading/trailing bytes are not omitted.UUID
type, it is encoded as a big-endian order string.Alias:
Arguments:
arg
(String
, FixedString
, UInt
, Float
, Decimal
, Date
, or DateTime
): A value to convert to binary.Returns:
String
.Examples:
In this example, the bin
function converts the integer 14 to its binary representation, resulting in the string ‘00001110’.
Result:
This second example converts a sequence of floating-point numbers to their binary representations. The third example converts a UUID to its binary representation.
Result:
In this example, the bin
function converts a UUID to its binary representation. The UUID ‘61f0c404-5cb3-11e7-907b-a6006ad3dba0’ is represented as a 128-bit number, and the bin
function outputs its binary form as a string of 128 binary digits. This can be useful for visualizing the binary structure of UUIDs, which are often used as unique identifiers in databases and distributed systems.
Result:
Performs the opposite operation of bin
. It interprets each pair of binary digits in the argument as a number and converts it to the byte represented by that number.
Syntax:
Alias:
Arguments:
arg
(String
): A string containing any number of binary digits.Returns:
String
)unbin()
does not return the inverse of bin()
.unbin
is invoked from within the clickhouse-client, binary strings are displayed using UTF-8.Example:
Result:
To convert the result to a number, you can use the reverse
and reinterpretAs<Type>
functions:
Result:
This function is particularly useful for decoding binary data or working with binary representations of numbers.
Converts an integer bitmask to a comma-separated string of powers of two that sum up to the original number.
Syntax
Arguments
num
(Int/UInt
): An integer value representing the bitmask.Returns
String
)Example
Result:
In this example, the bitmask 14 (binary 1110) is converted to its constituent powers of two: 2, 4, and 8, which represent different taco toppings when summed.
This function is useful for converting bitmask-encoded data into a human-readable format, such as displaying selected taco toppings from a bitmask representation.
Converts an integer bitmask to an array of powers of two that sum up to the original number.
Syntax
Arguments
num
(Int/UInt
): An integer value representing the bitmask.Returns
UInt64
numbers containing the list of powers of two that sum up to the source number. The numbers in the array are in ascending order. (Array(UInt64)
)Example
Result:
In this example, 14 is represented as 2 + 4 + 8, which could correspond to different taco toppings:
This function can be useful for decoding bitmask-encoded data, such as menu item customizations or feature flags.
If the input is 0, an empty array is returned.
Converts an integer to an array containing the positions of set bits (bits equal to 1) in ascending order.
Syntax
Arguments
arg
(Int/UInt
): Integer value.Returns:
Array(UInt64)
.Examples
Query:
Result:
Query:
Result:
This function is useful for analyzing binary data or working with bitmasks in a more readable format.
Calculates the Morton encoding (Z-curve) for a list of unsigned integers.
The function has two modes of operation:
Syntax:
Arguments:
arg1
, arg2
, …, argN
(UInt8
, UInt16
, UInt32
, UInt64
): Up to 8 unsigned integers or columns of unsigned integer types.Returns:
A UInt64
code.
Example:
Result:
Syntax:
Arguments:
range_mask
(Tuple
): A tuple specifying the amount of range expansion for each argument (1-8x).arg1
, arg2
, …, argN
(UInt8
, UInt16
, UInt32
, UInt64
): Up to 8 unsigned integers or columns of unsigned integer types.Returns:
A UInt64
code.
Example:
Result:
This mode is useful when dealing with arguments of different ranges or cardinality, such as IP addresses (0…FFFFFFFF) and taco shop IDs (0…FF).
Result:
Result:
Implementation details: The function can fit only so many bits of information into the Morton code as UInt64 has. Two arguments will have a range of maximum 2^32 (64/2) each, three arguments a range of max 2^21 (64/3) each, and so on. All overflow will be clamped to zero.
Decodes a Morton encoding (ZCurve) into the corresponding unsigned integer tuple.
The function has two modes of operation:
Syntax:
Arguments:
tuple_size
(Integer
): Integer value no more than 8.code
(UInt64
): UInt64 code.Returns: A tuple of the specified size containing UInt64
values.
Example:
Result:
Syntax:
Arguments:
range_mask
(Tuple
): A tuple specifying bit shifts for each dimension.code
(UInt64
): UInt64 code.Returns: A tuple containing UInt64
values.
Example:
Result:
This mode is useful when dealing with arguments of different ranges or cardinality, such as IP addresses (0…FFFFFFFF) and country codes (0…FF).
For a single argument without a tuple, the function returns the argument itself as a single-element tuple.
Result:
The function can accept a column of codes as the second argument:
Result:
This function is particularly useful for decoding multi-dimensional data that has been encoded into a single dimension using the Morton curve, often used in geospatial applications or for optimizing multi-dimensional database queries.
Calculates the Hilbert curve index for a list of unsigned integers.
The function has two modes of operation:
Syntax:
Arguments:
arg1
, arg2
(UInt
): Up to 2 unsigned integers or columns of unsigned integer type.Returns: A UInt64
code representing the Hilbert curve index.
Example:
Result:
Syntax:
Arguments:
range_mask
(Tuple
): A tuple specifying bit shifts for each dimension.arg1
, arg2
(UInt
): Up to 2 unsigned integers or columns of unsigned integer type.Returns: A UInt64
code representing the Hilbert curve index.
Example:
Result:
This mode is useful when dealing with arguments of different ranges or cardinality, such as IP addresses (0…FFFFFFFF) and country codes (0…FF).
For a single argument without a tuple, the function returns the argument itself as the Hilbert index:
Result:
The function can accept columns as arguments:
Result:
This function is particularly useful for encoding multi-dimensional data into a single dimension while preserving locality, often used in geospatial applications or for optimizing multi-dimensional database queries.
Decodes a Hilbert curve index back into a tuple of unsigned integers, representing coordinates in multi-dimensional space.
The function has two modes of operation:
Syntax:
Arguments:
tuple_size
(Integer
): Integer value no more than 2.code
(UInt64
): UInt64 code.Returns: A tuple of the specified size containing UInt64
values.
Example:
Result:
Syntax:
Arguments:
range_mask
(Tuple
): A tuple specifying bit shifts for each dimension.code
(UInt64
): UInt64 code.Returns: A tuple containing UInt64
values.
Example:
Result:
This mode is useful when dealing with arguments of different ranges or cardinality, such as IP addresses (0…FFFFFFFF) and country codes (0…FF).
For a single argument without a tuple, the function returns the argument itself as a single-element tuple.
Result:
The function can accept a column of codes as the second argument:
Result:
This function is particularly useful for decoding multi-dimensional data that has been encoded into a single dimension using the Hilbert curve, often used in geospatial applications or for optimizing multi-dimensional database queries.