Rails and Active Record provide plenty of magic to get web applications up and running quickly, which is great. Associations make it easy to write short, readable code that produces complex SQL behind the scenes. That magic can also lead to some inefficiencies that slow down your app. I’ll try to explain why and how to do it better in this post.
A common task in Rails is to retrieve data and return it as JSON to your front end. This can be easy in Rails, but if you’re not careful you can run into a couple of problems. One is that Active Record magic can lead to running too many inefficient queries. The other, less obvious problem is that instantiating Active Record objects is expensive. I’ve put together a little example to demonstrate this and to show how to improve some typical Rails code.
For this demo, I’m going to create a new Rails app, create some models, then use the Rails console to insert sample data and measure how long it takes to retrieve the data.
Set Up App and Models
First, let’s create the app and models. We’ll build a little school system.
Our school has Students, Teachers, Classrooms, Subjects and Courses. A Course has one subject, one classroom and one teacher, and many students. For the sake of keeping this short, I’ve stuck with the default SQLite database.
Let’s add some extra associations to the Course model to indicate that a Course has many Students:
Now let’s populate some data. We’ll create a School class to contain our helper methods to create data and run the tests.
The create_sample_school method creates 100 courses, each with a classroom, teacher and subject, and 2000 students (20 per course). This gives us a decent set of data to do some performance testing with. Let’s run it in the Rails console.
Do Some Performance Testing
Let’s pretend our school needs to print out all the course information and student listing on the school’s bulletin board. They decided to do this by logging each course with its subject, teacher, classroom and student list to the console. Here’s our method for doing that:
I like to measure how long a Ruby method runs by simply comparing the time at the start and end. The print_school_slowly method uses standard Active Record associations to get the data we need. Let’s run it a few times:
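Here’s a rough sketch of that start-and-end timing approach wrapped in a small helper. It isn’t the code from the demo: the time_it name and its label argument are made up for illustration, and I’m assuming Ruby’s monotonic clock is preferable to Time.now here because it isn’t affected by system clock adjustments.

```ruby
# Hypothetical helper (not from the original demo): runs the block, prints how
# long it took, and returns the elapsed seconds together with the block's result.
def time_it(label)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  puts format("%s took %.4f seconds", label, elapsed)
  [elapsed, result]
end

# Example with a stand-in workload. In the Rails console, the block would
# call the method being measured instead.
elapsed, total = time_it("summing") { (1..100_000).sum }
```

Running the same block a few times and comparing the numbers helps smooth out noise from caching and garbage collection.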
That’s about a quarter of a second to retrieve the data. Remember that this is added to the time a user spends waiting for a page to load, and a typical web application does much more complex work than this. This is too slow. Let’s speed it up.
A well-known way to speed this up is to use includes. This eager loads the associations, so Active Record fetches the associated records up front instead of running an additional query behind the scenes each time we touch an association.
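To make the difference in query counts concrete, here’s a plain-Ruby stand-in that doesn’t touch Active Record at all. The FakeDb class and its data are hypothetical and only model how many queries get issued. In a Rails app the eager-loaded call would look something like Course.includes(:subject, :teacher, :classroom, :students), where the association names are assumed from the models described earlier.

```ruby
# Plain-Ruby stand-in for a database: every finder call counts as one query.
# FakeDb and its data are hypothetical; they only model the query counts.
class FakeDb
  attr_reader :query_count

  def initialize
    @query_count = 0
    @courses = (1..100).map { |id| { id: id, subject_id: id } }
    @subjects = (1..100).to_h { |id| [id, "Subject #{id}"] }
  end

  def all_courses
    @query_count += 1
    @courses
  end

  # One query per call, like a lazily loaded association (course.subject).
  def find_subject(id)
    @query_count += 1
    @subjects[id]
  end

  # One query for many ids, like the batched lookup that eager loading issues.
  def find_subjects(ids)
    @query_count += 1
    @subjects.slice(*ids)
  end
end

# Lazy version: 1 query for the courses plus 1 more per course.
lazy_db = FakeDb.new
lazy_names = lazy_db.all_courses.map { |course| lazy_db.find_subject(course[:subject_id]) }
lazy_queries = lazy_db.query_count

# Eager version: 1 query for the courses plus 1 for all of their subjects.
eager_db = FakeDb.new
courses = eager_db.all_courses
subjects = eager_db.find_subjects(courses.map { |course| course[:subject_id] })
eager_names = courses.map { |course| subjects[course[:subject_id]] }
eager_queries = eager_db.query_count

puts "lazy: #{lazy_queries} queries, eager: #{eager_queries} queries"
# prints "lazy: 101 queries, eager: 2 queries"
```

Both versions produce the same list of names. Only the number of round trips changes, and that’s the part eager loading is meant to fix.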
We’ve improved by 0.1 seconds, nice. We can do better, though. For each course, we’re instantiating the Course, Subject, Teacher and Classroom objects and 20 Student objects. Like I said earlier, instantiating Active Record objects is expensive. In this case, we don’t really use them for anything other than grabbing a string attribute. Instead, we can use Active Record’s pluck method. It returns the requested column values from the database as plain Ruby arrays, without instantiating Active Record objects.
We’ve cut the time in half again. All of that time was being spent instantiating Active Record objects.
Finally, I’m not happy that we are still querying the student_courses table 100 times (once per course). Let’s prevent that. This is where the code starts getting a little uglier, so you have to decide whether you’re willing to trade elegance for performance. In my opinion, performance almost always wins.
Now we’re down to about 0.019 seconds. Instead of querying the student_courses table for each course, I ran one query before the loop to get all the students and built a hash where the keys are course IDs and the values are arrays of student names. Pulling those from memory in the loop is faster than hitting the database every time.
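The hash-building step can be sketched in plain Ruby like this. The rows array is made-up sample data standing in for what a single pluck query might return. In Rails it could come from something like StudentCourse.joins(:student).pluck(:course_id, "students.name"), but the StudentCourse model name and the name column are my assumptions, not something confirmed by the demo.

```ruby
# Stand-in for the rows one pluck query might return: [course_id, student_name].
# The data is made up; a real app would get these rows from the database.
rows = [
  [1, "Ada"],
  [1, "Grace"],
  [2, "Linus"],
  [2, "Ada"],
  [3, "Margaret"]
]

# Build { course_id => [student names] } in a single pass over the rows.
empty_lists = Hash.new { |hash, key| hash[key] = [] }
students_by_course = rows.each_with_object(empty_lists) do |(course_id, name), hash|
  hash[course_id] << name
end

# Inside the course loop, the student list is now a hash lookup, not a query.
# fetch with a default avoids adding keys for courses that have no students.
[1, 2, 3, 4].each do |course_id|
  names = students_by_course.fetch(course_id, [])
  puts "Course #{course_id}: #{names.join(', ')}"
end
```

The trade-off is memory: all the student names sit in the hash at once. That’s fine for 2000 students, but worth keeping in mind for much larger tables.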
The final version is about 15 times faster than the first version. To summarize how we did it, we reduced the number of database queries and reduced the number of Active Record objects we instantiated. In real-life scenarios, these techniques can result in huge performance boosts. You can get better page load times and process background jobs faster. Even though the code is a little uglier, it’s well worth it to make a user’s experience better.