How ActiveRecord builds SQL under the hood
You're not building SQL. You're building a tree. Understand Arel, bind parameters, lazy evaluation and the real difference between includes, preload and eager_load.
How ActiveRecord builds SQL under the hood
Most Rails devs write:
User.where(active: true).order(:name).limit(10)
And think: "Cool, that's going to become SQL."
It will.
But the path to becoming SQL is one of Rails' most beautiful pieces of engineering — and almost nobody understands it.
You're not building SQL. You're building a tree.
Each ActiveRecord query method (where, order, joins, limit, select) doesn't execute SQL.
They build an object called ActiveRecord::Relation.
rel = User.where(active: true)
rel.class # => User::ActiveRecord_Relation
That object carries inside it the intent of the query.
You can chain more:
rel.order(:name).limit(10).where(role: "admin")
Still a Relation. Nothing hit the database yet.
When SQL actually runs
The query is lazy.
It only becomes SQL when you force materialization:
- iterate (
.each,.map) - count (
.count,.size) - pull data (
.to_a,.first,.find_each) - inspect (
.exists?,.pluck(:id))
Before that, it's just described, not executed.
That's powerful because it allows composition:
def active_users
User.where(active: true)
end
def by_role(scope, role)
scope.where(role: role)
end
by_role(active_users, "admin").limit(5).to_a
Single SQL at the end. Not three.
Under the hood: Arel
ActiveRecord doesn't build SQL via string concatenation (thankfully).
It uses Arel — a SQL AST manipulation library.
users = Arel::Table.new(:users)
query = users
.project(users[:id], users[:name])
.where(users[:active].eq(true))
.order(users[:name])
query.to_sql
# SELECT users.id, users.name FROM users
# WHERE users.active = 't' ORDER BY users.name
Every where, order, joins in AR is translated to Arel.
Arel keeps a tree of nodes.
When you ask for SQL, it walks the tree and generates the string.
Why this matters
Because now you understand things like:
User.where("name = ?", params[:name]) # bad
User.where(name: params[:name]) # good
The first interpolates straight into a string.
The second goes through Arel, which knows how to sanitize each value.
SQL injection lives in the first. Not the second.
And:
User.where("name = '#{params[:name]}'") # criminal
Don't. Ever.
Joins, includes, preload, eager_load
Four different methods for something that looks the same.
User.joins(:posts) # INNER JOIN. Doesn't load posts.
User.includes(:posts) # Picks between preload and eager_load
User.preload(:posts) # 2 separate queries (IN clause)
User.eager_load(:posts) # LEFT OUTER JOIN, 1 query
The difference is how SQL is generated.
preload avoids N+1 with a secondary query:
SELECT * FROM users WHERE active = true;
SELECT * FROM posts WHERE user_id IN (1, 2, 3, ...);
eager_load avoids N+1 with a single join:
SELECT users.*, posts.*
FROM users LEFT OUTER JOIN posts ON posts.user_id = users.id
WHERE users.active = true;
includes is "Rails decides for you". Usually preload, but turns into eager_load if you add a where on a join column.
Each strategy has different cost on a large database.
N+1: why it happens
User.all.each do |user|
puts user.posts.count
end
ActiveRecord generates:
SELECT * FROM users;
SELECT count(*) FROM posts WHERE user_id = 1;
SELECT count(*) FROM posts WHERE user_id = 2;
SELECT count(*) FROM posts WHERE user_id = 3;
...
One SELECT per user.
100 users = 101 queries.
Fix with includes:
User.includes(:posts).each do |user|
puts user.posts.size # .size instead of .count
end
.count forces a query. .size uses the already-loaded collection.
One more detail that only makes sense when you understand what's happening underneath.
Bind parameters
When you pass values via hash, AR doesn't interpolate.
It sends a bind parameter to the database driver:
SELECT * FROM users WHERE id = $1 LIMIT 1
-- binds: [["id", 42]]
This:
- prevents SQL injection
- lets Postgres reuse the execution plan (prepared statement cache)
Each User.find(id) call benefits from the same cached plan.
Performance you get without doing anything — as long as you use the right API.
The chasm between ActiveRecord and SQL
ActiveRecord is powerful.
But it's an abstraction.
And like every abstraction, it leaks at some point:
- complex queries turning into inefficient SQL
joinsduplicating rows when you expected uniquedistinctneeded where you didn't expect itpluckvsselectperformance differences
People who only know AR suffer.
People who know SQL + AR know when to drop down to find_by_sql or raw Arel.
Look at the generated SQL
Whenever in doubt:
User.where(active: true).joins(:posts).to_sql
Or in console:
ActiveRecord::Base.logger = Logger.new(STDOUT)
Or in regular Rails: SQL shows up in the development log.
The rule is simple: if you don't understand the SQL, you don't understand the query.
The big shift
ActiveRecord isn't magic.
It's a tree builder that accepts chainable methods and, at the right moment, translates everything to SQL via Arel.
When you understand:
- where the query executes
- how to compose without materializing early
- which methods hit the database
- what the final SQL really looks like
…you stop writing Active Record by superstition.
And start writing it by intent.
Conclusion
Every slow Rails app I've ever seen started somewhere in ActiveRecord.
Not because AR is bad.
Because nobody looked at the SQL.
Reading SQL is a basic skill for senior Rails devs.
And reading ActiveRecord is understanding that it's just a pretty layer on top.
Doesn't replace the database.
Never did.
