postgresql – Conversion of Salesforce IDs for performance reasons

I'm writing code for a data warehouse that extracts information from Salesforce. Salesforce IDs have the following format:

001n000000UELLJAA5

Each character is a base-62 digit, drawn from:

a-z (26 values, case sensitive)
A-Z (26 values, case sensitive)
0-9 (10 values)

26 + 26 + 10 = 62 possible values per digit
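As a quick sanity check on the size of that ID space (a sketch that assumes all 18 characters are treated as base-62 digits, as described above): 62^18 is far larger than the bigint range, so a lossless numeric form of the full ID would need numeric/decimal rather than bigint. Only about ten base-62 digits fit in a bigint.

```sql
-- 62^18 is the number of distinct 18-character base-62 strings.
-- 9223372036854775807 is the largest bigint value (2^63 - 1).
SELECT power(62::numeric, 18) > 9223372036854775807 AS full_id_exceeds_bigint,
       power(62::numeric, 10) < 9223372036854775807 AS ten_digits_fit_in_bigint;
```

Both columns should come back true.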

I want to convert them to a format that can be efficiently indexed by Postgres for thousands of rows.

This is what I have so far:

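-- split each id into characters, take each character's ASCII code,
-- then concatenate the codes and cast the result to decimal
select id, array_to_string(array_agg(ascii), '')::decimal
from (
    select id, ascii(regexp_split_to_table(id, '')) from example_schema.example_table
) x
group by id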

but this seems super inefficient and can also be difficult to reverse in some cases because there is no separator between the codes.

  • Is decimal an effective type to use as an ID / identity?
  • Is there a simpler way to convert these values to a numeric value that can still be used as the primary key and is a reversible operation if necessary?
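For illustration, one reversible mapping would be to read the ID directly as a base-62 number. The following is only a minimal sketch: the function names `base62_to_numeric` and `numeric_to_base62` are hypothetical, and the digit order (a-z, then A-Z, then 0-9) is an assumption taken from the list above. Any fixed order works as long as both directions agree.

```sql
-- Hypothetical helper: interpret a string as a base-62 number.
-- Assumed digit order: a-z = 0..25, A-Z = 26..51, 0-9 = 52..61.
CREATE OR REPLACE FUNCTION base62_to_numeric(id text)
RETURNS numeric
LANGUAGE plpgsql IMMUTABLE STRICT AS $$
DECLARE
    alphabet constant text :=
        'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
    result numeric := 0;
    digit  integer;
BEGIN
    FOR i IN 1 .. length(id) LOOP
        -- strpos is case sensitive and returns 0 when the character is absent
        digit := strpos(alphabet, substr(id, i, 1)) - 1;
        IF digit < 0 THEN
            RAISE EXCEPTION 'invalid base-62 character in "%"', id;
        END IF;
        result := result * 62 + digit;  -- exact numeric arithmetic
    END LOOP;
    RETURN result;
END;
$$;

-- Hypothetical inverse: turn a non-negative number back into a
-- fixed-length base-62 string (18 characters by default).
CREATE OR REPLACE FUNCTION numeric_to_base62(n numeric, id_length integer DEFAULT 18)
RETURNS text
LANGUAGE plpgsql IMMUTABLE STRICT AS $$
DECLARE
    alphabet constant text :=
        'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
    result text := '';
    rest   numeric := n;
BEGIN
    FOR i IN 1 .. id_length LOOP
        -- take the lowest base-62 digit and prepend its character
        result := substr(alphabet, (rest % 62)::integer + 1, 1) || result;
        rest := div(rest, 62);
    END LOOP;
    RETURN result;
END;
$$;

-- Round trip on the example ID from this post
SELECT numeric_to_base62(base62_to_numeric('001n000000UELLJAA5'))
       = '001n000000UELLJAA5' AS round_trip_ok;
```

Because the output length is fixed, leading digits are not lost on the way back. Whether a numeric key actually indexes better than the original text column is a separate question that would need measuring.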

Here is the query plan for my ASCII-based query:

GroupAggregate  (cost=109862.92..130267.92 rows=742000 width=51)
  Group Key: account.id
  ->  Sort  (cost=109862.92..111717.92 rows=742000 width=23)
        Sort Key: account.id
        ->  Result  (cost=0.00..14875.99 rows=742000 width=23)
              ->  ProjectSet  (cost=0.00..3745.99 rows=742000 width=51)
                    ->  Seq Scan on account  (cost=0.00..30.42 rows=742 width=19)

Sample output:

001n000000ShgGbAAJ 7465659871103104834848484848110494848
001n000000SIZE7AAP 4848491104848484848488373906955656580
001n000000Sj3NCAAZ 489065656778511068348484848481104948
001n000000SJK1sAAH 48484911048484848484883747549115656572

The codes seem to come out in the wrong order for some rows (the first row is reversed, for example), which will cause problems when reversing the conversion.
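The ordering problem is consistent with the fact that an aggregate without an ORDER BY clause gives no guarantee about the order of its inputs. As a hedged sketch of how the ASCII approach might be made deterministic and unambiguous, the variant below keeps each character's position using WITH ORDINALITY and pads every code to three digits. The aliases `t`, `c`, `ch`, `pos` and `id_numeric` are made up for illustration.

```sql
SELECT t.id,
       -- pad each ASCII code to 3 digits so code boundaries are unambiguous,
       -- and concatenate the codes in the original character order
       string_agg(lpad(ascii(c.ch)::text, 3, '0'), '' ORDER BY c.pos)::numeric
           AS id_numeric
FROM example_schema.example_table AS t
CROSS JOIN LATERAL regexp_split_to_table(t.id, '')
           WITH ORDINALITY AS c(ch, pos)
GROUP BY t.id;
```

Casting to numeric drops leading zeros, so reversing this would first left-pad the digits back to 54 characters (18 codes of 3 digits each) and then split them into groups of three. This addresses correctness only; it still splits and re-aggregates every row.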