January 26, 2015 by Alex Korling
Why You Should Use Referential Integrity in Your Web Application
When you build a database backed application it’s paramount that you keep your data in a sane state. The vast majority of the value our customers gain from our applications is in the data itself. Without it we’re left with a collection of nice, but empty pages. We need to be aware of this since it’s easy to create cracks in which errors slip through, leaving us with a mess of orphaned records and other messy data.
To set the stage, let’s imagine that we have an Rails application containing Customers and Memberships. A simple model which could be implemented like this:
class Customer < ActiveRecord::Base has_many :memberships validates :name, presence: true end class Membership < ActiveRecord::Base belongs_to :customer validates :customer, presence: true end
We launch our application and we get a lot of customers. Soon, we experience some really strange errors.
NoMethodError: undefined method `name' for nil:NilClass
Where did this come from? Well, our administrators, and the customers themselves, occasionally remove customer accounts. This has left us with Membership records without any associated Customer records.
We clean our database and change our model to the following:
class Customer < ActiveRecord::Base has_many :memberships, dependent: destroy validates :name, presence: true end
This change makes sure that all associated Membership rows will get removed from the database when the callback is fired. We no longer have to worry about any orphaned records.
Or do we?
A while after we made the change we need to remove some very old customer records that are no longer of interest. We run the following code:
Suddenly, the original problem reappears. How can this be?
The reason is that the delete_all message doesn't instantiate any objects and thus won't fire any after_delete callbacks. Yet again, we find ourselves having orphaned records in the database.
Foreign Keys to the Rescue
Don't you worry. The solution's been sitting next to you all along. The safeguard you've been looking for is a database feature called foreign keys. They ensure referential integrity which means that this problem will never reappear. We are free to introduce bugs that behave in this manner but the database will throw errors and we'll detect the problem immediately.
Rails 4.2 introduces support for adding and removing foreign keys using migrations. The syntax is the following:
# Add a foreign key to memberships that references customers add_foreign_key :memberships, :customers # Remove the foreign key on `memberships.customer_id` remove_foreign_key :memberships, :customers
Wait a minute, if I use foreign keys then I can't remove customers that have memberships. The database won't allow it! To solve this you have two options. Either make two calls to the database, one that removes all memberships connected with the customer and another that removes the customer. This doesn't sound too bad, but for every new table that we connect to the user, we'll have to make yet another database call. If we follow this procedure the amount of database call could get out of hand quickly. The other way is to use cascading deletes: they remove all the connected rows as well as removing the customer - and all in one call. It almost sounds to good to be true and sometimes it really is.
Imagine a scenario where you have data that mustn't be removed. An example of such data is a customer's invoices. To solve this we need to make sure that all relevant data such as customer name, address etc. is stored as a duplicate directly on the invoice row and then allow the customer foreign key to be null. Using on delete null will now clear the customer_id from the invoice table and all data will still be there.
Another option is to use dependent: restrict_with_exception or restrict_with_error, which will make Rails throw an exception or error if you try to remove the associated record.
One final thing that I must stress; if you use cascading deletes you need to make sure that all foreign keys are created in the right direction. Otherwise you might end up in a scenario where one seemingly innocent delete removes a catastrophic amount of data from a lot of different tables.
If you want to add foreign keys to an existing project I recommend the gem Immigrant.
The database is made with referential integrity in mind and you should utilize this.
Rails defaults to using SQLite in dev and it doesn't support foreign keys. You should use PostgreSQL to make sure that your code works as expected in development to minimize the risk of nasty surprises when you push to production.