Relational databases tend to be furnished with many foreign key constraints. These prevent some types of invalid data from being added to tables – you can think of them as an ASSERT if you’re more familiar with C# and the like. Normally, the performance impact is minimal, based on the following arrangement:
- The column being referenced by the foreign key will almost always be the primary key of its table.
- As such it will be indexed so that, when adding new records to the referencing table, it will be possible to check that the new values being added are valid relatively quickly, by looking for them in the referenced table via the index.
However, there is a potential performance pitfall when deleting from tables. We came across a situation where a very innocuous-looking piece of code was deleting a relatively small number of rows from a table (which contained over a million rows). This was taking hours to complete. The cause was a foreign key from one column in the table to the ID column (the primary key) of the same table.
When adding rows, this foreign key wouldn’t be a problem because the check would be made on values in the ID column, which is indexed. However, when deleting, the check would be made against values in the referencing column, which didn’t have an index. For example, if a row with an ID of 10 was being deleted, a check would have to be made that there were no other rows which had a value of 10 for referencing column.
Adding an index to the referencing column enabled the deletions to complete in seconds rather than hours.