Data integrity with Validations

As soon as we start saving data into our tables from external sources, be it from our users, CSVs, APIs, or wherever, we have to start worrying about whether that data is valid. Did they fill out all of the required fields? Did they choose a username that was already taken? Did they enter an age less than 0? Did they vote twice?

If invalid records sneak into our database, and our code assumes our data is valid, then we’re going to have all kinds of problems. Then, we have to start writing our code very defensively, with tons of if/elsif/else/end statements scattered everywhere to guard against invalid data causing errors in otherwise functional code.

For example, in this project (after rails sample_data and bin/server), visit the details page of a movie; say, /movies/2. If you just ran rails sample_data, you should see the details page for The Godfather.

Now, in another tab, visit /rails/db and delete the director of this movie, Francis Ford Coppola.

Now, visit the details page we’ve worked so hard on, /movies/2, again. undefined method 'name' for nil:NilClass?! What’s up with that? It was fine a second ago!

Should you go change Line 51 of app/views/movie_templates/show.html.erb to resolve the issue? All too often, this is what students do. But you don’t have a code problem; you have a data problem.

Even worse, go back to /movies. The entire index page is broken, due to a handful of orphaned movies within the .each loop!
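To see why this is a data problem and not a code problem, here is a minimal plain-Ruby sketch of the failure. DirectorRow, REMAINING_DIRECTORS, and find_director are hypothetical stand-ins (they are not part of this project), and the ids and name are made up for illustration.

```ruby
# Hypothetical stand-in for a row in the directors table.
DirectorRow = Struct.new(:id, :name)

# Pretend contents of the table after the director with id 1 was deleted.
REMAINING_DIRECTORS = [DirectorRow.new(2, "Some Other Director")]

# Mimics a lookup by id: returns the matching row, or nil when nothing matches.
def find_director(director_id)
  REMAINING_DIRECTORS.find { |row| row.id == director_id }
end

# The movie still stores director_id 1, so the lookup comes back empty.
orphaned_lookup = find_director(1)
p orphaned_lookup # prints nil

begin
  orphaned_lookup.name
rescue NoMethodError => e
  puts "#{e.class}: the view called .name on nil"
end
```

The view code did not change between the two page loads; only the row it depended on disappeared.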

Fortunately, we have several tools at our disposal to solve the immediate problem at hand:

  • Use the debugging REPL embedded in the Better Errors page to delete the movies that are causing problems, or assign them valid director_ids and save.
  • Probably better: run rails sample_data again to reset everything.
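If you go the first route, the first step is figuring out which movies are orphaned. Here is a rough plain-Ruby sketch of that check; OrphanCheckMovie and orphaned_movies are hypothetical names, and the sample rows are invented. In the real app you would work with the Movie and Director models instead of these stand-ins.

```ruby
# Hypothetical stand-in for rows of the movies table.
OrphanCheckMovie = Struct.new(:id, :title, :director_id)

# Returns the movies whose director_id is not among the ids that still exist.
def orphaned_movies(movies, existing_director_ids)
  movies.reject { |movie| existing_director_ids.include?(movie.director_id) }
end

sample_movies = [
  OrphanCheckMovie.new(1, "Movie A", 10),
  OrphanCheckMovie.new(2, "Movie B", 11),
  OrphanCheckMovie.new(3, "Movie C", 10)
]

# Suppose director 11 was deleted, so only id 10 remains.
orphans = orphaned_movies(sample_movies, [10])
orphans.each { |movie| puts "#{movie.title} points at a missing director" }
```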

Defensive code

Sometimes, we really do want to allow spotty data, and we just have to be prepared for it when writing our code. In this case, maybe we want to allow movies to be created without the creator knowing who the director is, and so they are allowed to leave director_id blank.

In that case, you have to write your code defensively, with if/else/end statements. For example, we could do something like this:

<% if @the_movie.director != nil %>
  <%= @the_movie.director.name %>
<% else %>
  Uh oh! We weren't able to find a director for this movie.
<% end %>

This will get very tiresome to repeat over and over, so this too might be best encapsulated as an instance method:

class Movie < ApplicationRecord
  def director
    my_director_id = self.director_id

    matching_directors = Director.where({ :id => my_director_id })
    
    the_director = matching_directors.at(0)

    return the_director
  end

  def director_name_or_uh_oh
    if self.director != nil
      return self.director.name
    else
      return "Uh oh! We weren't able to find a director for this movie."
    end
  end
end

Now, we can use this method wherever we need a director name or apology:

<%= @the_movie.director_name_or_uh_oh %>

Data integrity with validations

Many times, we don’t want to allow spotty data; we want our data to be uniform and consistent. In this case, it’s best not to allow invalid data that doesn’t meet our criteria to enter our database in the first place; then, we don’t need to worry about writing lots of defensive conditionals downstream in our code.

If so, then ActiveRecord offers a tremendously useful feature: validations.

Validations are a way that we can specify constraints on what we allow in each column in a model. Crucially, if our validation rules aren’t met, then the .save method won’t insert the record into the database. Instead, it will return false (until now, it has always returned true).

Here’s an example: to start to address the movies/director_id issue, let’s say we don’t want to allow any rows to enter our movies table with a blank director_id attribute.

Then, we can declare a validation in the Movie model with the validates method:

class Movie < ApplicationRecord
  validates(:director_id, { :presence => true })
end

  • The first argument to validates is a Symbol: the name of the column we are declaring constraints for.
  • The second argument to validates is a Hash. The keys in the Hash are the constraints we are applying. There is a fixed list of constraints built into ActiveRecord that we can choose from. By far, the three most common are :presence, :uniqueness, and :numericality.
  • The value associated with each key is either true, which means you want the simplest use of the validation; or, the value can be a whole other Hash containing configuration options, like lower and upper bounds for :numericality (for example, :greater_than_or_equal_to and :less_than_or_equal_to).
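To make the shape of those arguments concrete, here is a toy sketch in plain Ruby. ToyModel and ToyMovie are hypothetical classes written only for this illustration; this is not how ActiveRecord implements validates, and it handles just :presence and a single :numericality option. The lower bound of 1870 is an arbitrary example value.

```ruby
# Toy stand-in for a model base class. It only mirrors the argument shape
# of validates; it is NOT ActiveRecord's implementation.
class ToyModel
  # Each subclass keeps its own list of [column, constraints] pairs.
  def self.rules
    @rules ||= []
  end

  # First argument: a Symbol naming the column.
  # Second argument: a Hash whose keys are constraint names.
  def self.validates(column, constraints)
    rules.push([column, constraints])
  end

  # True only when every constraint on every column passes.
  def valid?
    self.class.rules.all? do |column, constraints|
      value = public_send(column)
      constraints.all? { |name, config| passes?(name, config, value) }
    end
  end

  private

  def passes?(name, config, value)
    case name
    when :presence
      # The value true means "simplest use": just require something non-blank.
      !value.nil? && value.to_s.strip != ""
    when :numericality
      return false unless value.is_a?(Numeric)
      # config is either true or a Hash of configuration options.
      options = config == true ? {} : config
      minimum = options.fetch(:greater_than_or_equal_to, value)
      value >= minimum
    else
      raise ArgumentError, "unknown constraint #{name}"
    end
  end
end

class ToyMovie < ToyModel
  attr_accessor :director_id, :year

  validates(:director_id, { :presence => true })
  validates(:year, { :numericality => { :greater_than_or_equal_to => 1870 } })
end

movie = ToyMovie.new
movie.year = 1972
puts movie.valid? # false, because director_id is still nil

movie.director_id = 1
puts movie.valid? # true
```

Notice that :presence is paired with true, while :numericality is paired with a nested Hash of options; both forms go through the same validates call.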

Let’s give our simple validation a test. Open a rails console and create a new instance of a Movie:

[1] pry(main)> m = Movie.new
   (0.4ms)  SELECT sqlite_version(*)
=> #<Movie:0x00007f93ab387430
 id: nil,
 title: nil,
 year: nil,
 duration: nil,
 description: nil,
 image: nil,
 director_id: nil,
 created_at: nil,
 updated_at: nil>

Now, before assigning any attributes at all, try to do a .save:

[2] pry(main)> m.save
=> false

Ah ha! It returned false, and there’s no SQL output like usual. If you look at m, you’ll see that it still has no id, created_at, or updated_at — it didn’t sneak into the database:

[3] pry(main)> m
=> #<Movie:0x00007f93ab387430
 id: nil,
 title: nil,
 year: nil,
 duration: nil,
 description: nil,
 image: nil,
 director_id: nil,
 created_at: nil,
 updated_at: nil>
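
Since .save now tells us whether the record made it in, our code can branch on its return value. Below is a small plain-Ruby sketch of that pattern. StubMovie and save_and_report are hypothetical stand-ins; the stub's save only imitates the "refuse a blank director_id" rule and never touches a database.

```ruby
# Hypothetical stand-in for a validated model: save refuses a blank director_id.
class StubMovie
  attr_accessor :title, :director_id
  attr_reader :id

  def save
    return false if director_id.nil?

    @id = 1 # pretend the database assigned a primary key
    true
  end
end

# Decide what to do next based on what save returned.
def save_and_report(movie)
  if movie.save
    "Saved movie with id #{movie.id}."
  else
    "Movie was not saved; ask for the missing fields."
  end
end

draft = StubMovie.new
draft.title = "The Godfather"
puts save_and_report(draft) # director_id is blank, so save returns false

draft.director_id = 1
puts save_and_report(draft) # now save returns true
```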

Read more about validations here:

https://guides.rubyonrails.org/active_record_validations.html