After some privacy concerns, Facebook has made many changes to its APIs, one of them being "identifiers" per application. Indeed, it means that the canonical user ID of Alice for Facebook is never shared by the Facebook APIs, but that each application sees a unique value that never changes.
To formalize this through two functions:
mask (user_id, app_id)
unmask (masked_id, app_id)
These functions should satisfy certain properties:
reversible: userid == unmask (mask (user_id, appid), appid)
secret: given y = mask (user_id, app_id), it should be 'difficult' to get the user_id of origin.
deterministic: mask (userid, appid) == mask (userid, appid), where userid == userid and appid == app_id
collision resistant: for all user IDs, given the same app_id, hide (user_id1, app_id) == mask (user_id2, app_id) should not be the case
After much reading, I proposed two possible implementations:
Stateful: Stores a triple (user_id, app_id, hash (user_id, app_id)).
Stateless: for each app_id, generate and store a secret, then encrypt it (user_id, secret) on the fly. encryption should be a deterministic figure, such as AES-SIV.
These two approaches have fairly significant disadvantages. The stateful approach induces an incredible cost of persistence –
Is there another approach that I might miss here? I think it's a deceptively easy problem, eager to get a glimpse.