I am currently working on a database design problem. I work in the financial domain and I deal with data related to benchmarks and securities. The problem with such databases are that each security has multiple identifiers in the real world. They could be Uniquely identified by a SEDOL, CUSIP or ISIN or the many other type of identification standards available.
This lead me to the question of designing the tables using a surrogate keys but I was still not sure how my team would react as they had already design loads based on real world identifiers and named it security_id. Unfortunately security_id could be a SEDOL, a CUSIP or ISIN or whatever depending on what region or benchmark the security belongs to. That isn't really a reliable identifier. And hence to avoid annoying the guy in my team who does data loading I redesigned my queries to do some intelligent identifier resolutions. This cost us a lot. Queries that would perform in barely 500ms now take about 1500ms.
So I had explain the advantages of a surrogate key. I couldn't do that alone as I am a younger resource, older people needn't take my suggestions seriously. So I kept relying on my complex queries till one day we had a review with some experts in Vertica. And as soon as they saw the query they asked us why we didn't have a unique identifier. That clearly solved my problem at least. I didn't have to convince anyone anymore. The experts made my team aware of the advantages of having a surrogate key. And how that could benefit query performance.
So what is this thing called surrogate keys?
A surrogate key in a database is a unique identifier for either an entity in the modeled world or an object in the database. The surrogate key is not derived from application data. That is taken from (Wikipedia - Surrogate key).
Now I'd like to quote something I read in stack overflow, which I felt is a very good answer that sums it up for me.
Using a natural key is rarely the best option. I would probably go for the surrogate key approach as in your first example.
The following are the main disadvantages of the natural key approach:
Source: (StackOverflow - Surrogate Keys v/s Natural Keys)
Comments
Post a Comment