yabase/intid

Integer helpers for short URL-safe identifiers.

The byte-oriented codecs in yabase/facade are the right tool when the input is opaque bytes (hashes, public keys, raw payloads). For the very common short-ID case — DB autoincrement ids, sequence numbers, hash truncations — callers want Int -> compact string directly. Without these helpers every project re-implements the same Int -> big-endian bytes -> trim-leading-zero shim.

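encode_int_* emits canonical form: no leading zero characters beyond what the value itself requires (encode_int_base58(0) == Ok("1"), the alphabet’s zero character; encode_int_base58(58) == Ok("21"), no leading "1"). The one exception is encode_int_base16, which is byte-aligned and may emit a single leading "0"; see its entry below.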

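decode_int_* is tolerant of leading zero characters (decode_int_base10("0042") and decode_int_base10("42") both return the same Int), so input from external sources that zero-pads is accepted without ceremony. The exception is decode_int_base58, which rejects redundant leading "1" characters with Error(NonCanonical); see its entry below.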

decode_int_* rejects the empty string with Error(InvalidLength(0)) rather than treating it as zero. Callers can therefore distinguish “no ID was supplied” from “the ID is zero” — important for URL routing, form parsing, and database lookups. The byte-oriented decoders in yabase/facade retain the Ok(<<>>) round-trip behavior for empty input.
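
The three contracts above can be sketched as a handful of assertions. This is a minimal sketch rather than a test taken from the library: it assumes the error constructors are imported from yabase/core/error (the module named under Types), and decode_int_base10 is used only because its alphabet has "0" as the zero character.

```gleam
import yabase/core/error
import yabase/intid

pub fn main() -> Nil {
  // Canonical encode: both values come from the module notes above.
  let assert Ok("1") = intid.encode_int_base58(0)
  let assert Ok("21") = intid.encode_int_base58(58)

  // Zero-padded decimal input decodes to the same Int as the unpadded form.
  let assert Ok(42) = intid.decode_int_base10("0042")
  let assert Ok(42) = intid.decode_int_base10("42")

  // Empty input is an error, not zero.
  let assert Error(error.InvalidLength(0)) = intid.decode_int_base10("")
  Nil
}
```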

Negative inputs are rejected

Every encode_int_* function in this module returns Result(String, CodecError) and rejects negative inputs with Error(NegativeValue(value)). The integer codecs only define a canonical representation for non-negative values, so silently dropping the sign would break the decode(encode(n)) == n round-trip whenever n < 0 (closed #84, reopened as #100).

If your caller path can produce negatives (offsets that subtracted past zero, POSIX timestamps from before 1970, deliberate -1 sentinels), map them to a sign-preserving wire format like zigzag or to a domain-specific error at the boundary:

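// NegativeId and CodecFailed stand for constructors of your own error type.
// The NegativeValue constructor is matched through yabase/core/error,
// because intid re-exports only the CodecError type, not its constructors.
case intid.encode_int_base32_crockford(n) {
  Ok(s) -> Ok(s)
  Error(error.NegativeValue(_)) -> Error(NegativeId(n))
  Error(other) -> Error(CodecFailed(other))
}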
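
For the zigzag option, a minimal sketch follows. zigzag_encode, zigzag_decode, encode_signed, and decode_signed are hypothetical helpers, not part of yabase. The mapping interleaves negatives with non-negatives (0, -1, 1, -2 become 0, 1, 2, 3), so the value handed to encode_int_* is never negative.

```gleam
import gleam/result
import yabase/intid

/// Hypothetical helper: fold a signed Int onto the non-negative integers.
pub fn zigzag_encode(n: Int) -> Int {
  case n >= 0 {
    True -> 2 * n
    False -> -2 * n - 1
  }
}

/// Hypothetical helper: inverse of zigzag_encode for non-negative input.
pub fn zigzag_decode(z: Int) -> Int {
  case z % 2 == 0 {
    True -> z / 2
    // z is odd here, so z + 1 divides by 2 exactly.
    False -> 0 - { z + 1 } / 2
  }
}

/// Encode a possibly negative Int by zigzag-mapping it first.
pub fn encode_signed(n: Int) -> Result(String, intid.CodecError) {
  intid.encode_int_base58(zigzag_encode(n))
}

/// Decode a string produced by encode_signed back to the signed Int.
pub fn decode_signed(s: String) -> Result(Int, intid.CodecError) {
  intid.decode_int_base58(s)
  |> result.map(zigzag_decode)
}
```

Both sides of the wire have to agree on the mapping: a zigzag-encoded ID decodes to a different Int under plain decode_int_*.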

Bounded decode

decode_int_* accepts inputs of any length, so the decoded Int can exceed any fixed integer width: on the Erlang target, Int is an arbitrary-precision integer (a bignum). Typical backing stores cap IDs at 64 bits (SQLite INTEGER, Postgres bigserial, MySQL BIGINT), so feeding an unbounded decode_int_* result into one of those columns crashes the driver as soon as a user supplies a slightly-too-long string. Similarly, JavaScript-target callers should cap at 53 bits (Number.MAX_SAFE_INTEGER).

Use decode_int_*_bounded(input:, max:) whenever the decoded value flows into a fixed-width sink. The bounded variants return Error(Overflow) if the decoded Int exceeds max. Common caps are exported as int64_max (signed 64-bit, 2^63 - 1) and int53_max (JS-safe integer, 2^53 - 1).
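
A bounded decode in front of a 64-bit column might look like the sketch below. parse_row_id and the LookupError type are hypothetical names for illustration, and the sketch assumes Overflow is matched through yabase/core/error.

```gleam
import yabase/core/error
import yabase/intid

/// Hypothetical application-level error type.
pub type LookupError {
  IdTooLarge
  BadId(intid.CodecError)
}

/// Decode a Base62 ID that will be bound to a signed 64-bit column.
pub fn parse_row_id(raw: String) -> Result(Int, LookupError) {
  case intid.decode_int_base62_bounded(input: raw, max: intid.int64_max) {
    Ok(id) -> Ok(id)
    // The string decoded fine but the value does not fit in 64 bits.
    Error(error.Overflow) -> Error(IdTooLarge)
    // Empty input, bad characters, and so on.
    Error(other) -> Error(BadId(other))
  }
}
```

Swapping intid.int64_max for intid.int53_max gives the JavaScript-safe variant.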

Types

Issue #74: every decode_int_* function in this module returns Result(Int, CodecError). Without this re-export, callers who only import yabase/intid cannot type-annotate a wrapper around a decode call without reaching into yabase/core/error — a module the README does not mention. The alias keeps the type identity (it’s the same CodecError the underlying codec functions already use) so error values flow through unchanged.

pub type CodecError =
  error.CodecError
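
With the alias in place, a wrapper can be annotated using only the yabase/intid import. parse_order_id is a hypothetical name used for illustration.

```gleam
import yabase/intid

/// Hypothetical wrapper: the return type is spelled via the re-exported alias.
pub fn parse_order_id(raw: String) -> Result(Int, intid.CodecError) {
  intid.decode_int_base32_crockford(raw)
}
```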

Values

pub fn decode_int(
  encoding encoding: encoding.Encoding,
  value value: String,
) -> Result(Int, error.CodecError)

Decode a string back to an Int using the supplied Encoding, dispatching to the matching decode_int_* helper.

Empty input returns Error(InvalidLength(0)) so callers can distinguish “no ID was supplied” from “the ID is zero” — the same contract as the per-base helpers.

Returns Error(UnsupportedForInt(name)) for encodings that have no integer codec wired up.

pub fn decode_int_base10(
  input: String,
) -> Result(Int, error.CodecError)

Decode a Base10 (decimal) string back to an Int.

pub fn decode_int_base10_bounded(
  input input: String,
  max max: Int,
) -> Result(Int, error.CodecError)

Decode a Base10 (decimal) string back to an Int, rejecting values greater than max with Error(Overflow).

pub fn decode_int_base16(
  input: String,
) -> Result(Int, error.CodecError)

Decode a Base16 (hexadecimal) string back to an Int. Accepts both uppercase and lowercase input via base16.decode’s case-insensitive alphabet.

Odd-length inputs are accepted and internally zero-padded on the left to the next byte boundary before being passed to base16.decode ("1" is treated as "01", "7E9" as "07E9"). This makes the function tolerant of either output shape from the encode_int_base16* family — decode_int_base16(encode_int_base16(n)) == Ok(n) and decode_int_base16(encode_int_base16_compact(n)) == Ok(n) both hold for every non-negative Int. The byte-oriented base16.decode/1 keeps its strict even-length contract for callers reaching for the low-level codec directly. (#99)
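
The padding rule can be sketched with the values quoted above. This is an illustrative set of assertions, not library test code.

```gleam
import yabase/intid

pub fn main() -> Nil {
  // Odd-length input is treated as if it had one leading "0".
  let assert Ok(1) = intid.decode_int_base16("1")
  let assert Ok(1) = intid.decode_int_base16("01")
  let assert Ok(2025) = intid.decode_int_base16("7E9")
  let assert Ok(2025) = intid.decode_int_base16("07E9")

  // Lowercase hex digits are accepted as well.
  let assert Ok(2025) = intid.decode_int_base16("07e9")
  Nil
}
```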

pub fn decode_int_base16_bounded(
  input input: String,
  max max: Int,
) -> Result(Int, error.CodecError)

Decode a Base16 (hexadecimal) string back to an Int, rejecting values greater than max with Error(Overflow).

pub fn decode_int_base32_crockford(
  input: String,
) -> Result(Int, error.CodecError)

Decode a Crockford Base32 string back to an Int.

pub fn decode_int_base32_crockford_bounded(
  input input: String,
  max max: Int,
) -> Result(Int, error.CodecError)

Decode a Crockford Base32 string back to an Int, rejecting values greater than max with Error(Overflow).

pub fn decode_int_base32_crockford_check(
  input: String,
) -> Result(Int, error.CodecError)

Decode a checksummed Crockford Base32 string back to an Int, verifying the trailing check symbol.

pub fn decode_int_base32_crockford_check_bounded(
  input input: String,
  max max: Int,
) -> Result(Int, error.CodecError)

Decode a checksummed Crockford Base32 string back to an Int, rejecting values greater than max with Error(Overflow).

pub fn decode_int_base32_rfc4648(
  input: String,
) -> Result(Int, error.CodecError)

Decode a Base32 (RFC 4648) string back to an Int.

pub fn decode_int_base32_rfc4648_bounded(
  input input: String,
  max max: Int,
) -> Result(Int, error.CodecError)

Decode a Base32 (RFC 4648) string back to an Int, rejecting values greater than max with Error(Overflow).

pub fn decode_int_base36(
  input: String,
) -> Result(Int, error.CodecError)

Decode a Base36 string back to an Int.

pub fn decode_int_base36_bounded(
  input input: String,
  max max: Int,
) -> Result(Int, error.CodecError)

Decode a Base36 string back to an Int, rejecting values greater than max with Error(Overflow).

pub fn decode_int_base58(
  input: String,
) -> Result(Int, error.CodecError)

Decode a Base58 (Bitcoin alphabet) string back to an Int, rejecting non-canonical wire forms with Error(NonCanonical).

The Bitcoin Base58 alphabet uses "1" as the zero character, so the byte-oriented base58_bitcoin.decode prepends one 0x00 byte for every leading "1" in the input. When that byte string is read back as a big-endian integer the leading zero bytes disappear, which means "5Q", "15Q", "115Q", … all decode to the same Int. That collapses the bijection that ID callers (URL shorteners, idempotency keys, database lookups) rely on: two different wire strings can name the same row, breaking deduplication and cache invariants.

The fix is to require the input to be byte-equal to the canonical encoding (encode_int_base58(decoded) == Ok(input)). The only legal leading "1" is the single-character input "1", which is the canonical encoding of 0. Any other input that starts with "1" is rejected with Error(NonCanonical). Closes #101.
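
A short sketch of the canonical-form rule, using the wire strings from the paragraphs above. It assumes NonCanonical is matched through yabase/core/error.

```gleam
import yabase/core/error
import yabase/intid

pub fn main() -> Nil {
  // "1" alone is the canonical encoding of 0.
  let assert Ok(0) = intid.decode_int_base58("1")

  // The canonical spelling decodes normally.
  let assert Ok(_) = intid.decode_int_base58("5Q")

  // The same digits with redundant leading "1" characters are rejected.
  let assert Error(error.NonCanonical) = intid.decode_int_base58("15Q")
  let assert Error(error.NonCanonical) = intid.decode_int_base58("115Q")
  Nil
}
```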

pub fn decode_int_base58_bounded(
  input input: String,
  max max: Int,
) -> Result(Int, error.CodecError)

Decode a Base58 (Bitcoin alphabet) string back to an Int, rejecting values greater than max with Error(Overflow).

pub fn decode_int_base58_flickr(
  input: String,
) -> Result(Int, error.CodecError)

Decode a Base58 (Flickr alphabet) string back to an Int.

pub fn decode_int_base58_flickr_bounded(
  input input: String,
  max max: Int,
) -> Result(Int, error.CodecError)

Decode a Base58 (Flickr alphabet) string back to an Int, rejecting values greater than max with Error(Overflow).

pub fn decode_int_base58check(
  input: String,
) -> Result(Int, error.CodecError)

Decode a Base58Check string back to an Int, verifying the 4-byte double-SHA-256 checksum.

Issue #73: returns the payload as an Int, ignoring the version byte (which encode_int_base58check always sets to 0). Callers that need to inspect the version byte should reach for yabase/base58check.decode/1 directly.

pub fn decode_int_base58check_bounded(
  input input: String,
  max max: Int,
) -> Result(Int, error.CodecError)

Decode a Base58Check string back to an Int, rejecting payload values greater than max with Error(Overflow). The checksum is verified before the bounds check, so a corrupted input fails as InvalidChecksum rather than Overflow.
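
Because the checksum is verified first, a caller can separate "mistyped" from "too large" when reporting errors. The sketch below uses a hypothetical describe function and assumes the InvalidChecksum and Overflow constructors are matched through yabase/core/error.

```gleam
import gleam/int
import yabase/core/error
import yabase/intid

/// Hypothetical helper: turn a decode outcome into a user-facing message.
pub fn describe(raw: String) -> String {
  case intid.decode_int_base58check_bounded(input: raw, max: intid.int53_max) {
    Ok(n) -> "ok: " <> int.to_string(n)
    // Checked before the bound, so corruption never shows up as Overflow.
    Error(error.InvalidChecksum) -> "mistyped or corrupted ID"
    Error(error.Overflow) -> "ID too large for a JavaScript number"
    Error(_) -> "malformed ID"
  }
}
```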

pub fn decode_int_base62(
  input: String,
) -> Result(Int, error.CodecError)

Decode a Base62 string back to an Int.

pub fn decode_int_base62_bounded(
  input input: String,
  max max: Int,
) -> Result(Int, error.CodecError)

Decode a Base62 string back to an Int, rejecting values greater than max with Error(Overflow).

pub fn decode_int_bounded(
  encoding encoding: encoding.Encoding,
  value value: String,
  max max: Int,
) -> Result(Int, error.CodecError)

Decode a string back to an Int using the supplied Encoding, rejecting values greater than max with Error(Overflow). The runtime sibling of the per-base decode_int_*_bounded helpers.

pub fn encode_int(
  encoding encoding: encoding.Encoding,
  value value: Int,
) -> Result(String, error.CodecError)

Encode an Int to a string using the supplied Encoding, dispatching to the matching encode_int_* helper. Negative inputs surface as Error(NegativeValue(value)) exactly as the per-base helpers do; see the module note on “Negative inputs are rejected” for the rationale and the boundary-handling pattern.

Returns Error(UnsupportedForInt(name)) for encodings that have no integer codec wired up (every byte-only codec: Base2, Base8, Base32(Hex|Clockwork|ZBase32), Base45, every Base64 / Base85 variant, Base91, Bech32). For Base58Check, the existing encode_int_base58check/1 helper uses the fixed version byte 0x00; reach for the generic facade with encoding.base58_check(version) to pin a different version.

pub fn encode_int_base10(
  value: Int,
) -> Result(String, error.CodecError)

Encode a non-negative Int as a Base10 (decimal) string. Returns Error(NegativeValue(value)) for negative inputs; see the module note on “Negative inputs are rejected”.

For non-negative integers the output matches int.to_string. The helper exists so that Base10 has the same call shape as the rest of the intid family, which the switch-case bench harnesses described in #78 depend on. Routing through base10.encode keeps the contract uniform with the other encode_int_* functions: a non-negative Int in, a string in the alphabet out, no padding.

pub fn encode_int_base16(
  value: Int,
) -> Result(String, error.CodecError)

Encode a non-negative Int as a Base16 (uppercase hexadecimal) string. Returns Error(NegativeValue(value)) for negative inputs; see the module note on “Negative inputs are rejected”.

Routing through base16.encode keeps the contract uniform with the rest of the encode_int_* family. The output uses the canonical RFC 4648 §8 uppercase alphabet (0-9 A-F) — callers who need lowercase for interop with sha256sum-style tools can post-process with string.lowercase or use base16.encode_lowercase after int_to_bytes_be themselves.

Byte-aligned vs compact

This function is byte-aligned: the output length is always an even number of hex characters because the encoding pads the integer’s big-endian representation to a whole byte boundary (encode_int_base16(1) == Ok("01"), encode_int_base16(2025) == Ok("07E9")). This is the right shape for ID interop with byte-oriented systems (databases, HTTP headers, content-addressable storage). The other encode_int_* functions in this module are compact — they drop leading zero characters (encode_int_base58(1) == Ok("2"), encode_int_base36(1) == Ok("1"), encode_int_base10(1) == Ok("1")). Issue #99 surfaced the asymmetry; if you want the compact form for base16, use encode_int_base16_compact/1 instead. decode_int_base16/1 accepts either form (it does not require an even-length input).
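
The two output shapes side by side, as an illustrative sketch using the values quoted above. The lowercase step relies on string.lowercase from the Gleam standard library.

```gleam
import gleam/result
import gleam/string
import yabase/intid

pub fn main() -> Nil {
  // Byte-aligned: always an even number of hex characters.
  let assert Ok("01") = intid.encode_int_base16(1)
  let assert Ok("07E9") = intid.encode_int_base16(2025)

  // Compact: leading "0" characters dropped.
  let assert Ok("1") = intid.encode_int_base16_compact(1)
  let assert Ok("7E9") = intid.encode_int_base16_compact(2025)

  // Lowercase for interop: post-process the uppercase output.
  let assert Ok("07e9") =
    intid.encode_int_base16(2025) |> result.map(string.lowercase)
  Nil
}
```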

pub fn encode_int_base16_compact(
  value: Int,
) -> Result(String, error.CodecError)

Encode a non-negative Int as a Base16 (uppercase hexadecimal) string with leading zero characters stripped — the compact counterpart to encode_int_base16/1.

Byte-aligned vs compact

encode_int_base16/1 is byte-aligned (always emits an even number of hex characters: encode_int_base16(1) == Ok("01"), encode_int_base16(2025) == Ok("07E9")). This function is compact — leading "0" characters are dropped so the output matches the shape of the rest of the encode_int_* family (encode_int_base58(1) == Ok("2"), encode_int_base36(1) == Ok("1"), encode_int_base10(1) == Ok("1")). Examples:

encode_int_base16_compact(0)      // Ok("0")
encode_int_base16_compact(1)      // Ok("1")
encode_int_base16_compact(255)    // Ok("FF")
encode_int_base16_compact(2025)   // Ok("7E9")
encode_int_base16_compact(0xdeadbeef) // Ok("DEADBEEF")

Use this when you want column-aligned mixed-base output or round-trip-by-text comparisons across the encode_int_* family. Keep encode_int_base16/1 for byte-oriented sinks where the even-length contract matters.

decode_int_base16/1 accepts the compact form unchanged (decode_int_base16(encode_int_base16_compact(n) |> result.unwrap_or("")) round-trips for every non-negative Int), because decode_int_base16 zero-pads odd-length inputs on the left to the next byte boundary before handing them to base16.decode, which itself keeps its strict even-length contract.

Returns Error(NegativeValue(value)) for negative inputs; see the module note on “Negative inputs are rejected”.

Added in #99.

pub fn encode_int_base32_crockford(
  value: Int,
) -> Result(String, error.CodecError)

Encode a non-negative Int as a Crockford Base32 string. Returns Error(NegativeValue(value)) for negative inputs; see the module note on “Negative inputs are rejected”.

pub fn encode_int_base32_crockford_check(
  value: Int,
) -> Result(String, error.CodecError)

Encode a non-negative Int as a Crockford Base32 string with a trailing checksum symbol (Douglas Crockford’s optional check character).

Issue #73: same shape as encode_int_base32_crockford but with the typo-resistance guard the underlying codec already supports. Use the matching decode_int_base32_crockford_check to recover the integer; the decoder verifies the symbol and returns Error(InvalidChecksum) if the input was mistyped.

Returns Error(NegativeValue(value)) for negative inputs; see the module note on “Negative inputs are rejected”.
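
A round trip through the checksummed pair can be sketched as follows. round_trip is a hypothetical helper for illustration; it chains the two calls with result.try so that either error is passed through unchanged.

```gleam
import gleam/result
import yabase/intid

/// Hypothetical helper: encode with a check symbol, then decode and verify it.
pub fn round_trip(n: Int) -> Result(Int, intid.CodecError) {
  use encoded <- result.try(intid.encode_int_base32_crockford_check(n))
  intid.decode_int_base32_crockford_check(encoded)
}
```

For a non-negative n this should return Ok(n); a negative n surfaces the encoder's NegativeValue error before any decoding happens.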

pub fn encode_int_base32_rfc4648(
  value: Int,
) -> Result(String, error.CodecError)

Encode a non-negative Int as a Base32 (RFC 4648) string. Returns Error(NegativeValue(value)) for negative inputs; see the module note on “Negative inputs are rejected” for the rationale and the recommended boundary-check pattern.

pub fn encode_int_base36(
  value: Int,
) -> Result(String, error.CodecError)

Encode a non-negative Int as a Base36 string. Returns Error(NegativeValue(value)) for negative inputs; see the module note on “Negative inputs are rejected”.

pub fn encode_int_base58(
  value: Int,
) -> Result(String, error.CodecError)

Encode a non-negative Int as a Base58 (Bitcoin alphabet) string. Returns Error(NegativeValue(value)) for negative inputs; see the module note on “Negative inputs are rejected”.

pub fn encode_int_base58_flickr(
  value: Int,
) -> Result(String, error.CodecError)

Encode a non-negative Int as a Base58 (Flickr alphabet) string. Returns Error(NegativeValue(value)) for negative inputs; see the module note on “Negative inputs are rejected”.

pub fn encode_int_base58check(
  value: Int,
) -> Result(String, error.CodecError)

Encode a non-negative Int as a Base58Check string (Bitcoin’s double-SHA-256 checksum format).

Issue #73: this is the int-typed counterpart of yabase/base58check.encode/2. Version is fixed at 0 (Bitcoin mainnet P2PKH) — callers that need a different version should reach for yabase/base58check.encode/2 directly with their own BitArray payload.

Returns Error(NegativeValue(value)) for negative inputs; see the module note on “Negative inputs are rejected”. The underlying yabase/base58check.encode only errors on out-of-range version bytes, and this helper hard-codes a valid one, so the surfaced error is always NegativeValue in practice.

pub fn encode_int_base62(
  value: Int,
) -> Result(String, error.CodecError)

Encode a non-negative Int as a Base62 string. Returns Error(NegativeValue(value)) for negative inputs; see the module note on “Negative inputs are rejected”.

pub const int53_max: Int

Largest value that round-trips losslessly through a JavaScript number (2^53 - 1, Number.MAX_SAFE_INTEGER). Use as the max argument to decode_int_*_bounded when the decoded value is passed across a JS-target boundary or serialized as JSON for a JavaScript consumer.

pub const int64_max: Int

Largest value that fits in a signed 64-bit integer (2^63 - 1). Use as the max argument to decode_int_*_bounded when the decoded value flows into a column declared BIGINT (Postgres, MySQL) or INTEGER (SQLite).
