Tag Archives: DBMS

Abusing Foreign Keys

Recently I came across a database design which made me think that upto what extent one should be using Foreign keys. A colleague created a new table in an existing schema and used maximum number of foreign key constraints possible to make sure no junk data gets entered into this table. On one hand it is a good idea to use foreign key constraints to make sure your data is clean all the time, on the other hand it might introduce some unwanted conditions on your code.

Let me take an example. You have 2 tables employee and salary. Employee has a primary key employee_id which you use as foreign key in your salary table. Sounds smooth, if we have salary entry, we must have the employee, and if employee is moving out(record getting deleted), delete the salary entry. Lets say I have another table called print_history which stores who all has printed salary slips for an employee. Now this table has 2 employee Ids, one for the manager who printed the slip (say Manager_id) and second is the employee whose slip was printed (say emp_id). Now we know that logically these 2 Ids should be in employee table, but is it necessary to add the constraint. It will work fine while insertion of the records, but say there is a case when an employee record is getting deleted, should we enforce deletion of print_history record? should we stop employee record from being deleted if there is an print_history entry? (I know soft deletes might help, but still, why to add unnecessary  constraints?

Deleting duplicate rows from a table

Recently faced an issue of having duplicate rows in a table of production database. Normally to delete or insert data into table, primary key is used, but due to bad database design somehow duplicate rows got added. Exact same data, each column was same.

A little bit googling solved the issue.

delete from temp_table where rowid not in (select min(rowid) from temp_table group by column1, column2);

Note that this might not straight away work for a table with huge data. A simple way to solve that issue is to create a temporary table, say temp_table2, copy data for duplicate rows from temp_table to temp_table2. Now clean up temp_table2 using above delete command and delete all the duplicate rows form temp_table. Insert back all the remaining, cleaned up, rows from temp_table2 to temp_table.