Several techniques are often deployed in well-designed databases to minimise the amount of invalid data. Common ones include:
- Check Constraints– checks that can be made on a single row in a single table, for example:
StartDate <= EndDate
Height > 0
Age < 199
- Unique Indexes– to make sure we don’t have duplicate entries in a table, for example:
- National Insurance Number
- Car Registration Number
- Triggers – a last resort for cases where “it’s complicated” or we have to check data from more than one table.
Foreign keys are also often used to check data from one table against that in another to ensure its referential integrity. For example:
- If a Product has a Category ID of 27, then there is a Category with an ID of 27.
- If an Employee relates to a Department, the Department exists.
If we want to make additional checks, for example that the Category is of a type appropriate for the Product, or that the Department hasn’t been soft-deleted, these generally have had to be implemented via code in triggers. (One alternative, of implementing cross-table validation using check constraints, can have serious performance implications.)
Check constraints, unique indexes and foreign keys are good both from a performance viewpoint and also because they provide essentially code-free validation. Doing validation in triggers is far from ideal for several, well-documented reasons (such as performance and the need to replicate the same validation in more than one trigger).
So wouldn’t it be good if we could somehow replace trigger-based validation code with foreign keys, which are, after all, all about checking the data in two tables? The following two examples outline an approach in which the use of foreign keys is extended beyond the normal “does that ID exist?” check.
Example 1
Table InvoiceItem has two columns, ExpenditureExpenseCodeID and IncomeExpenseCodeID, both of which reference the Primary Key ExpenseCodeID column of the ExpenseCode table:
However, each ExpenseCode can be marked as being suitable for income only, expenditure only or both via a nullable IsIncome column (where a value of Null means both).
One option would be to put each type in to a separate table (e.g. ExpenseCodeIncome, ExpenseCodeExpenditure or ExpenseCodeBoth) but that would be a bit unwieldy (not only because ‘both’-type rows would have to be stored in all three tables).
Instead, validation code is typically added to InvoiceItem’s trigger to ensure that only appropriate ExpenseCodes are chosen for IncomeExpenseCodeID and ExpenditureExpenseCodeID. (The same validation has to be carried out in ExpenseCode’s trigger in case an attempt is made to change its IsIncome column.)
In the new approach, we have two additional columns in the ExpenseCode table, computed as follows:
ExpenseCodeIDForIncome = Case When IsNull (IsIncome, 1) = 1 Then ExpenseCodeID Else -ExpenseCodeID End ExpenseCodeIDForExpenditure = Case When IsNull (IsIncome, 0) = 0 Then ExpenseCodeID Else -ExpenseCodeID End
So we might have:
ExpenseCodeID | Description | IsIncome | ExpenseCodeID ForIncome | ExpenseCodeID ForExpenditure |
1 | I-3875 | 1 | 1 | -1 |
2 | E-9872 | 0 | -2 | 2 |
3 | 4431 | Null | 3 | 3 |
We could then change the foreign keys from the InvoiceItem table; instead of referencing ExpenseCodeID, the one from IncomeExpenseCodeID would reference ExpenseCodeIDForIncome and the one from ExpenditureExpenseCodeID would reference ExpenseCodeIDForExpenditure. In both cases, if the record referenced in ExpenseCode was not appropriate, the calculated column would have a negative value and an error would result.
Example 2
This one is slightly more complicated. We have two tables, Product and Category. Each Product has a Category.
Both can be soft-deleted but we do not allow Products to be associated with a deleted Category unless the Product is also deleted. Normally, this requirement would be implemented by code in triggers, which would have to be applied to both tables. However, consider a computed column on each table:
- In Category:
CategoryIDForValidation = Case When Deleted = 0 Then CategoryID Else –CategoryID End
- In Product:
CategoryIDForValidation = Case When Deleted = 0 Then CategoryID End
Then, we can set up a foreign key between these two columns which will only be violated if the record in Category is soft-deleted while the record in Product isn’t. Here’s a table listing the possible combinations:
CategoryID | Deleted | CategoryID ForValidation | ProductID | CategoryID | Deleted | CategoryID ForValidation | Allowed? |
10 | No | 10 | 20 | 10 | No | 10 | Yes |
11 | No | 11 | 21 | 11 | Yes | NULL | Yes |
12 | Yes | -12 | 22 | 12 | No | 12 | No |
13 | Yes | -13 | 23 | 13 | Yes | NULL | Yes |
The foreign key won’t attempt to link any records where CategoryIDForValidation is NULL in Product and will succeed where both records are undeleted. However, if the Category has been soft-deleted by the Product hasn’t (third row), an error will be thrown.
Summary
This technique provides a way of replacing trigger-based validation with foreign key validation in some, but not all, situations and has the following benefits:
- Improved performance
- Elimination of duplicated code