Use the MD5 function to create unique IDs

  • data-coding
  • idea
  • sql
  • spring
  • I first encountered this function when trying to join two tables together using about eight separate fields. Not ideal.
  • The natural inclination is to create your own ID by simply concatenating a bunch of fields together. Such columns are a problem because they look like data but operate as an ID. It’s important to have a column whose sole function is to be a unique identifier for that row.
  • Instead, use the MD5 function to create unique IDs on AWS. IDs that are obviously IDs reduce confusion among junior analysts and end users by removing semi-comprehensible data strings from your database.
  • Be aware, though, that MD5 is no longer considered a strong hash function, so do not rely on it to protect sensitive information in the hashed fields.
  • More info on how this works is in Learn about cryptography

    Entropy is a measure of randomness

    Hashing functions

    A cryptographic hash function maps data of arbitrary size to a fixed size
    An example of a hash function is SHA-1, which is used in Git references
    At a high level, a hash function can be thought of as a hard-to-invert random-looking (but deterministic) function.
    A hash function has the following properties:

    Deterministic: the same input always generates the same output.
    Non-invertible: it is hard to f...
select md5('Amazon Redshift');
-- ---
-- f7415e33f972c03abd4f3fed36748f7a (1 row)
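
The same idea can be sketched outside the database. The Python snippet below is a minimal illustration, not the method used in this note: `make_row_id`, the `|` separator, and the sample field values are all hypothetical. It joins several fields with an explicit separator and hashes the result, so the ID is a fixed-length hex string that is clearly an ID rather than data.

```python
import hashlib


def make_row_id(*fields, sep="|"):
    """Hypothetical helper: hash several field values into one 32-character hex ID."""
    # Assumption: missing values (None) are treated as empty strings.
    # The explicit separator keeps ("ab", "c") and ("a", "bc") from
    # producing the same concatenated string.
    joined = sep.join("" if f is None else str(f) for f in fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Illustrative values only; any set of joinable fields would work.
row_id = make_row_id("2021-03-01", "store_17", "sku_42")
print(row_id)
```

Because the hash is deterministic, the same field values always yield the same ID, which is what makes it usable as a join key. Note that with this sketch a missing value and an empty string hash identically, and a field that itself contains the separator could still collide with a different split of the same text, so pick a separator that cannot appear in the data.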