How has_one_attached works in Active Storage

Active Storage Internals: How has_one_attached DSL Works

In this post, we'll explore the internals of the has_one_attached method in Active Storage. It relies on two interesting patterns: proxy objects and change objects. We'll trace the control flow from the model DSL to persistence and uploads, and explain how files are created, detached, and purged.


People usually read fiction before going to bed. I have a strange habit of reading the Rails source code at night. Not because it puts me to sleep, but for some reason I find the process of opening the Rails codebase, picking some feature, and just reading Ruby for an hour or two strangely calming. It lets me forget everything and just get in the flow for a few hours.

Anyway, after I published the post on the Active Storage Domain Model, I stayed up until 2 am last night reading the Active Storage codebase to figure out how the has_one_attached method works. It uses a couple of interesting patterns, and I thought I'd share everything I learned. Since it's a rainy Saturday, I've been writing since morning, and what follows is my current understanding of how a single document gets attached to a Rails model. There might be gaps in my understanding, so please let me know if you find any mistakes.

This post explores the internals of Active Storage for the has_one_attached API. We’ll trace the path from the DSL on your model, through the proxy object, into the “change objects” that coordinate persistence and uploads, and finally into purge/detach behavior and background jobs. We'll stick to has_one_attached in this post, and cover has_many_attached in a future post.

I hope that you'll have the Rails source code open on the side while reading the post for browsing surrounding code, more context and better understanding. Otherwise, most of the post won't make any sense and you'll give up halfway. You've been warned.

Here're a couple of tips that I've found useful when reading the Rails source code.

How I Read Rails Source Code
Here’re two techniques I’ve found really helpful for reading the Rails codebase, without getting overwhelmed. If you want to dive into the Rails source code for a deeper understanding, but feel intimidated by the sheer size of the codebase and don’t know where to begin, this post is for you.

The Model DSL: Code Generation and Lifecycle

As you already know, the has_one_attached API specifies the association between a 'single' file attachment and the Rails model. In the following example, we're declaring that a user can have an avatar attached to it.

class User < ApplicationRecord
  has_one_attached :avatar
end

As with many things in Rails (see: sharp knives), this one line is very powerful. Declaring has_one_attached :avatar adds the following capabilities to the User model:

  • Associations: avatar_attachment and avatar_blob
  • A reader: avatar returns a proxy (ActiveStorage::Attached::One)
  • A writer: avatar= records a “change” object (ActiveStorage::Attached::Changes::CreateOne) for later persistence/upload
  • Lifecycle callbacks:
    • after_save to persist attachment and blob in the database
    • after_commit to upload the actual file to the storage service (S3, GCS, etc.)

I highly recommend taking some time to read the source code for this method. Key design decisions include:

  • The writer does not immediately touch the database or upload files. It records an intent (attachment_changes[name] = CreateOne/DeleteOne) to be materialized by callbacks.
  • Persistence (association rows) is handled in the transaction (after_save).
  • Uploads happen only after the transaction commits (after_commit). This prevents orphaned uploads if the DB transaction rolls back.
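
To make the ordering above concrete, here is a minimal plain-Ruby sketch of the "record intent now, materialize later" idea. It is not Active Storage code: StagedRecord, StagedUpload, and the log array are hypothetical stand-ins, and the save method only simulates the after_save and after_commit steps.

```ruby
# Toy model of "record intent now, materialize later".
# StagedRecord and StagedUpload are hypothetical stand-ins, not Rails classes.
class StagedUpload
  def initialize(filename, log)
    @filename = filename
    @log = log
  end

  # What the after_save callback triggers: runs inside the transaction.
  def save
    @log << "persist rows for #{@filename}"
  end

  # What the after_commit callback triggers: runs only after a commit.
  def upload
    @log << "upload bytes for #{@filename}"
  end
end

class StagedRecord
  attr_reader :attachment_changes, :log

  def initialize
    @attachment_changes = {}
    @log = []
  end

  # The writer only records intent. Nothing is persisted or uploaded here.
  def avatar=(filename)
    attachment_changes["avatar"] = StagedUpload.new(filename, log)
  end

  # Simulated save: the after_save step always runs, the after_commit step
  # runs only when the simulated transaction is not rolled back.
  def save(rollback: false)
    attachment_changes["avatar"]&.save
    return false if rollback

    attachment_changes.delete("avatar")&.upload
    true
  end
end

record = StagedRecord.new
record.avatar = "me.png"
puts record.log.inspect # the log is still empty at this point
record.save
puts record.log.inspect # persistence is logged first, the upload second
```

Calling save(rollback: true) on a fresh StagedRecord logs the persistence step but never the upload, which is the property the real callbacks are designed to give you.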

Side Note: What Are Generated Association Methods?

If you read the source of has_one_attached, you'll see a strange method named generated_association_methods.

def has_one_attached(name, dependent: :purge_later, service: nil, strict_loading: false)
  generated_association_methods.class_eval <<-CODE, __FILE__, __LINE__ + 1
    def #{name}
      # ...
    end

    def #{name}=(attachable)
      # ...
    end
  CODE
end

It's an internal core API in Active Record. It stores all the dynamically defined methods Rails creates for your model’s associations. It’s not something you usually call directly, but understanding what it does is helpful if you’re digging into Rails internals or metaprogramming.

Every time you define a new association, Rails defines its helper methods inside this module instead of directly on the model. Because the module is included into the model class, the methods still behave as if they were defined on the class itself.
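
Here is a small plain-Ruby sketch of the same idea, without Rails. ToyModel, generated_methods, and toy_attached are hypothetical names I made up for illustration. The point is that methods generated into an included module can be overridden in the class itself, with super still reaching the generated version.

```ruby
# Plain-Ruby sketch of the idea behind generated_association_methods.
# ToyModel, generated_methods, and toy_attached are hypothetical names.
class ToyModel
  # One anonymous module per class, included once, that holds generated methods.
  def self.generated_methods
    @generated_methods ||= Module.new.tap { |mod| include mod }
  end

  # A tiny DSL method that generates a reader inside that module.
  def self.toy_attached(name)
    generated_methods.class_eval <<-CODE, __FILE__, __LINE__ + 1
      def #{name}
        "generated #{name}"
      end
    CODE
  end
end

class Profile < ToyModel
  toy_attached :avatar

  # The generated method lives in an included module, so the class can
  # override it and still call the generated version through super.
  def avatar
    super.upcase
  end
end

puts Profile.new.avatar
```

If the generated reader were defined directly on Profile, the override would simply replace it and super would have nothing to call.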

Try this in your Rails app:

Post.generated_association_methods.instance_methods
=> [:tagged_post_ids, :tagged_posts=, :tagged_post_ids=, :tag_ids, :tag_ids=, :tagged_posts, :uploaded_images, :tags=, :uploaded_image_ids, :tags, :uploaded_image_ids=, :uploaded_images=]

Okay, let's carry on with our code reading.

The Proxy Object

💡
Proxy means “a figure that can be used to represent something else”. The proxy design pattern wraps the main object so that its complexity stays hidden from the client.

Here's the reader method that was added on the model. The reader returns a proxy instance (ActiveStorage::Attached::One).

def #{name}
  @active_storage_attached ||= {}
  @active_storage_attached[:#{name}] ||= ActiveStorage::Attached::One.new("#{name}", self)
end

For the call has_one_attached :avatar, this translates to:

class User < ApplicationRecord
  def avatar
    @active_storage_attached ||= {}
    @active_storage_attached[:avatar] ||= ActiveStorage::Attached::One.new("avatar", self)
  end
end

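The reader memoizes the proxy: the first call creates the @active_storage_attached hash and stores a new ActiveStorage::Attached::One under the :avatar key, and every subsequent call on the same record returns that same proxy object.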
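
A quick way to convince yourself of the memoization is a plain-Ruby sketch like the one below. ToyProxy and ToyUser are hypothetical stand-ins for the proxy and the model, and the reader body mirrors the generated one.

```ruby
# ToyProxy and ToyUser are hypothetical stand-ins, not Rails classes.
class ToyProxy
  attr_reader :name, :record

  def initialize(name, record)
    @name = name
    @record = record
  end
end

class ToyUser
  # Same shape as the generated reader: build the proxy once, then reuse it.
  def avatar
    @active_storage_attached ||= {}
    @active_storage_attached[:avatar] ||= ToyProxy.new("avatar", self)
  end
end

user = ToyUser.new
puts user.avatar.equal?(user.avatar)        # same object on every call
puts user.avatar.equal?(ToyUser.new.avatar) # a different record gets its own proxy
```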

The proxy object Attached::One has a simple API:

  • attach(attachable) attaches an attachable to the record (returns the attachment or nil depending on record validity)
def attach(attachable)
  record.public_send("#{name}=", attachable)
  if record.persisted? && !record.changed?
    return if !record.save
  end
  record.public_send("#{name}")
end

Note that if the record is persisted and unchanged, attach triggers a synchronous save; otherwise attachment persistence is deferred to the next save. It returns the attachment proxy. If save fails, it returns nil.
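
The branching in attach is easy to misread, so here is a plain-Ruby sketch that exercises it outside Rails. FakeRecord and ToyAttach are hypothetical stand-ins. The body of ToyAttach.call copies the logic of the attach method shown above, with the record passed in explicitly.

```ruby
# FakeRecord is a hypothetical stand-in for an Active Record model.
class FakeRecord
  attr_accessor :avatar
  attr_reader :save_calls

  def initialize(persisted:, changed:, valid: true)
    @persisted = persisted
    @changed = changed
    @valid = valid
    @save_calls = 0
  end

  def persisted?
    @persisted
  end

  def changed?
    @changed
  end

  # Counts calls and reports success or failure, like a validation would.
  def save
    @save_calls += 1
    @valid
  end
end

# Hypothetical wrapper holding the same branching as the proxy's attach.
module ToyAttach
  def self.call(record, name, attachable)
    record.public_send("#{name}=", attachable)
    if record.persisted? && !record.changed?
      return if !record.save
    end
    record.public_send(name.to_s)
  end
end

saved = FakeRecord.new(persisted: true, changed: false)
ToyAttach.call(saved, :avatar, "a.png")
puts saved.save_calls # persisted and unchanged: saved immediately

fresh = FakeRecord.new(persisted: false, changed: false)
ToyAttach.call(fresh, :avatar, "a.png")
puts fresh.save_calls # new record: nothing is saved yet
```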

  • attachment returns the underlying attachment, but you rarely need to call it yourself, because the proxy delegates missing methods to the attachment (so user.avatar.filename just works).
def attachment
  change.present? ? change.attachment : record.public_send("#{name}_attachment")
end
  • attached? returns true if an attachment has been made.
def attached?
  attachment.present?
end
  • purge, purge_later, and detach are delegated to the underlying “change” objects
delegate :purge, to: :purge_one
delegate :purge_later, to: :purge_one
delegate :detach, to: :detach_one

private
  def purge_one
    Attached::Changes::PurgeOne.new(name, record, attachment)
  end
  
  def detach_one
    Attached::Changes::DetachOne.new(name, record, attachment)
  end

This change object pattern is quite interesting, so let's explore it.

Change Objects Encapsulate attach/detach/purge Behavior

The has_one_attached method added a writer method on the model.

def #{name}=(attachable)
    attachment_changes["#{name}"] =
      if attachable.nil? || attachable == ""
        ActiveStorage::Attached::Changes::DeleteOne.new("#{name}", self)
      else
        ActiveStorage::Attached::Changes::CreateOne.new("#{name}", self, attachable)
      end
end

For has_one_attached :avatar, this translates to:

def avatar=(attachable)
    attachment_changes["avatar"] = 
      if attachable.nil? || attachable == ""
        ActiveStorage::Attached::Changes::DeleteOne.new("avatar", self)
      else
        ActiveStorage::Attached::Changes::CreateOne.new("avatar", self, attachable)
      end
end

This method builds one of two “change” objects depending on what you assign.

  1. When you do user.avatar = attachable, or user.avatar.attach(attachable), Rails stores that assignment in the attachment_changes hash with "avatar" as the key and a CreateOne change object as the value.
  2. If instead you assign nil or an empty string (""), Rails stores a DeleteOne change object under the same key.

This change object is then used later in the save cycle to actually create or remove the attachment.
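
The selection logic can be tried out in isolation with a plain-Ruby sketch. ToyAccount, ToyCreateOne, and ToyDeleteOne are hypothetical stand-ins. Only the if/else in the writer is taken from the generated method.

```ruby
# Hypothetical stand-ins for the two change objects.
ToyCreateOne = Struct.new(:name, :record, :attachable)
ToyDeleteOne = Struct.new(:name, :record)

class ToyAccount
  def attachment_changes
    @attachment_changes ||= {}
  end

  # Same branching as the generated writer: nil or "" stages a delete,
  # anything else stages a create.
  def avatar=(attachable)
    attachment_changes["avatar"] =
      if attachable.nil? || attachable == ""
        ToyDeleteOne.new("avatar", self)
      else
        ToyCreateOne.new("avatar", self, attachable)
      end
  end
end

account = ToyAccount.new
account.avatar = "me.png"
puts account.attachment_changes["avatar"].class # ToyCreateOne
account.avatar = nil
puts account.attachment_changes["avatar"].class # ToyDeleteOne
```

Because the hash is keyed by the attachment name, a later assignment replaces an earlier one: only the last staged change survives until save.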

The change objects implement the following responsibilities:

  • Build or find a blob (CreateOne)
  • Build an attachment row (in-memory)
  • Persist association rows on after_save, upload on after_commit
  • Reset local state for detach/purge

CreateOne (attach)

The constructor eagerly builds the blob and identifies its content (without saving), so the content type, byte size, and checksum are available before the record is saved:

def initialize(name, record, attachable)
  @name, @record, @attachable = name, record, attachable
  blob.identify_without_saving
end

The attachable can be one of the following:

  • ActiveStorage::Blob: Instance of a blob
  • ActionDispatch::Http::UploadedFile: A user-uploaded file (most common)
  • Rack::Test::UploadedFile: A convenient uploaded-file stand-in for tests
  • Hash: Data containing at least an open IO object and a filename
  • String: Signed ID of a blob. Useful for direct uploads where the client-side needs to refer to the blob that was created ahead of the upload itself on form submission.
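
One way to picture how a change object could turn these different inputs into a blob is a case dispatch on the attachable's type. The sketch below is plain Ruby with hypothetical names (ToyAttachable, ToyBlobRecord, and a signed_ids lookup hash that fakes signed-ID resolution). It covers only three of the types listed above and is not the real implementation.

```ruby
require "stringio"

# Hypothetical stand-in for a blob record.
ToyBlobRecord = Struct.new(:filename, :origin)

module ToyAttachable
  # One branch per kind of attachable. signed_ids is a made-up lookup table
  # standing in for signed ID verification.
  def self.to_blob(attachable, signed_ids: {})
    case attachable
    when ToyBlobRecord
      attachable # already a blob: reuse it
    when Hash
      attachable.fetch(:io) # an IO object is required
      ToyBlobRecord.new(attachable.fetch(:filename), :built_from_io)
    when String
      signed_ids.fetch(attachable) { raise ArgumentError, "unknown signed id" }
    else
      raise ArgumentError, "expected attachable, got #{attachable.inspect}"
    end
  end
end

existing = ToyBlobRecord.new("old.png", :existing)
puts ToyAttachable.to_blob(existing).equal?(existing)
puts ToyAttachable.to_blob({ io: StringIO.new("hi"), filename: "hi.txt" }).filename
```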

Hooks to Save and Upload Files

When the record is finally saved, the after_save callback calls the save method on the CreateOne change object to persist the attachment and the blob in the database (the file is not uploaded yet!).

# model.rb
after_save { attachment_changes[name.to_s]&.save }

# create_one.rb
def save
  record.public_send("#{name}_attachment=", attachment)
  record.public_send("#{name}_blob=", blob)
end

…and the binary upload happens only after commit:

# model.rb
after_commit(on: %i[ create update ]) { attachment_changes.delete(name.to_s).try(:upload) }

# create_one.rb
def upload
    # ...
end

Blob Creation, Identification, and Upload

In blob construction, you need to understand the difference between two terms: “unfurl” vs “upload”:

  • “Unfurl” computes checksum, sets byte_size, extracts/sets content type, and marks identified.
# blob.rb

def unfurl(io, identify: true)
  self.checksum     = compute_checksum_in_chunks(io)
  self.content_type = extract_content_type(io) if content_type.nil? || identify
  self.byte_size    = io.size
  self.identified   = true
end
  • “Upload” sends bytes to the configured service and writes metadata headers. Normally, you do not have to call this method directly at all. Use the create_and_upload! class method instead.
def upload(io, identify: true)
  unfurl io, identify: identify
  upload_without_unfurling io
end

Separating these steps enables the CreateOne flow:

  • Identify/build the blob before saving (so validations, metadata, content type are available early)
  • Persist associations in-transaction
  • Upload only after commit

It also explains the identify: false option: when the blob already has a content type, passing identify: false skips extracting it again.
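
To see the split between unfurling and uploading in isolation, here is a self-contained plain-Ruby sketch. SketchBlob and SketchService are hypothetical stand-ins, the "%PDF" check is a toy replacement for real content-type detection, and the MD5 digest and tiny chunk size are assumptions chosen for the example.

```ruby
require "digest/md5"
require "stringio"

# Hypothetical in-memory "storage service".
class SketchService
  attr_reader :objects

  def initialize
    @objects = {}
  end

  def upload(key, io)
    @objects[key] = io.read
    io.rewind
  end
end

class SketchBlob
  attr_accessor :key, :checksum, :byte_size, :content_type, :identified

  CHUNK_SIZE = 4 # tiny on purpose, so the example reads several chunks

  def initialize(key:, service:, content_type: nil)
    @key = key
    @service = service
    @content_type = content_type
  end

  # Metadata only: nothing is sent to the service.
  def unfurl(io, identify: true)
    self.checksum     = compute_checksum_in_chunks(io)
    self.content_type = sniff_content_type(io) if content_type.nil? || identify
    self.byte_size    = io.size
    self.identified   = true
  end

  # Unfurl first, then send the bytes.
  def upload(io, identify: true)
    unfurl io, identify: identify
    @service.upload(key, io)
  end

  private
    def compute_checksum_in_chunks(io)
      digest = Digest::MD5.new
      while (chunk = io.read(CHUNK_SIZE))
        digest << chunk
      end
      io.rewind
      digest.base64digest
    end

    # Toy detection: only recognizes a PDF header.
    def sniff_content_type(io)
      head = io.read(4).to_s
      io.rewind
      head.start_with?("%PDF") ? "application/pdf" : "application/octet-stream"
    end
end

service = SketchService.new
blob = SketchBlob.new(key: "abc", service: service)
io = StringIO.new("%PDF-1.7 hello")

blob.unfurl(io)
puts service.objects.empty? # metadata is known, but nothing was uploaded
blob.upload(io)
puts service.objects.key?("abc")
```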

Now let's quickly explore how you delete / dis-associate the attachment and remove the file from the storage.

Delete (unset)

Writing user.avatar = nil (or "") uses DeleteOne, which simply nulls the association at save time. Recall the setter method that has_one_attached added:

# model.rb
def #{name}=(attachable)
  attachment_changes["#{name}"] =
    if attachable.nil? || attachable == ""
      ActiveStorage::Attached::Changes::DeleteOne.new("#{name}", self)
    else
      ActiveStorage::Attached::Changes::CreateOne.new("#{name}", self, attachable)
    end
end

after_save { attachment_changes[name.to_s]&.save }

DeleteOne is a simple change object. Here's the whole class. When the user record is saved, the callback will call the DeleteOne#save method, which sets the avatar_attachment association to nil.

# frozen_string_literal: true

module ActiveStorage
  class Attached::Changes::DeleteOne # :nodoc:
    attr_reader :name, :record

    def initialize(name, record)
      @name, @record = name, record
    end

    def attachment
      nil
    end

    def save
      record.public_send("#{name}_attachment=", nil)
    end
  end
end

Whether the blob is purged upon replacing/removing depends on the :dependent option on the attachment reflection (default :purge_later; see below).

Detach

“Detach” removes the attachment row but does not purge the blob. When you call user.avatar.detach, it's actually calling it on the ActiveStorage::Attached::One proxy object, which delegates it to the Attached::Changes::DetachOne. I leave the exercise of figuring out this flow as a challenge to the reader.

# detach_one.rb

def detach
  if attachment.present?
    attachment.delete
    reset
  end
end

def reset
  record.attachment_changes.delete(name)
  record.public_send("#{name}_attachment=", nil)
end

The test suite documents the invariant: detach keeps the blob and object on the service:

@user.avatar.detach
assert_not @user.avatar.attached?
assert ActiveStorage::Blob.exists?(blob.id)
assert ActiveStorage::Blob.service.exist?(blob.key)

Purge and Purge Later

Purge goes further than detach: it deletes the attachment, destroys the blob, and then deletes the actual file from the storage service.

For the one-attached proxy, purge/purge_later delegate to a change object that calls through to the ActiveStorage::Attachment API, then clears state. The flow (not behavior) works exactly like the detach method.

# purge_one.rb

def purge
  attachment&.purge
  reset
end

def purge_later
  attachment&.purge_later
  reset
end

def reset
  record.attachment_changes.delete(name)
  record.public_send("#{name}_attachment=", nil)
end

The attachment model coordinates whether a blob should be purged when an attachment is destroyed, based on the :dependent option.

Purging ensures the blob (and the remote object) is gone, unless the blob is shared with another attachment:

@user.avatar.purge
assert_not @user.avatar.attached?
assert_not ActiveStorage::Blob.exists?(blob.id)
assert_not ActiveStorage::Blob.service.exist?(blob.key)

# shared blob survives purge of one attachment
@user.avatar.purge
assert ActiveStorage::Blob.exists?(blob.id)
assert ActiveStorage::Blob.service.exist?(blob.key)
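
The difference between detach and purge, including the shared-blob case, can be modelled with three in-memory hashes. ToyStore below is a hypothetical stand-in for the attachments table, the blobs table, and the storage service. It mirrors the invariants in the tests above, not the actual implementation.

```ruby
# Hypothetical in-memory model of attachments, blobs, and service files.
class ToyStore
  attr_reader :attachments, :blobs, :files

  def initialize
    @attachments = {} # owner  => blob id
    @blobs = {}       # blob id => service key
    @files = {}       # service key => bytes
  end

  # Creates a blob, stores its file, and attaches it to the owner.
  def attach(owner, blob_id, bytes)
    key = "key-#{blob_id}"
    @blobs[blob_id] = key
    @files[key] = bytes
    @attachments[owner] = blob_id
  end

  # Attaches an already existing blob to a second owner.
  def attach_existing(owner, blob_id)
    @attachments[owner] = blob_id
  end

  # Drops only the attachment. The blob and file stay.
  def detach(owner)
    @attachments.delete(owner)
  end

  # Drops the attachment, then the blob and file unless still referenced.
  def purge(owner)
    blob_id = @attachments.delete(owner)
    return if blob_id.nil?
    return if @attachments.value?(blob_id)

    key = @blobs.delete(blob_id)
    @files.delete(key)
  end
end

store = ToyStore.new
store.attach("user-1", 1, "bytes")
store.detach("user-1")
puts store.blobs.key?(1) # detach leaves the blob in place

store.attach("user-2", 2, "bytes")
store.purge("user-2")
puts store.blobs.key?(2) # purge removes it
```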

My Mental Model of the System

Proxy and Change Object Patterns in Active Storage
  • Write to record.avatar= to stage a change (create or delete).
  • Save the record to persist attachment rows (in-transaction).
  • The actual upload happens after commit.
  • Use record.avatar.attach(attachable) most of the time, for immediate attach (with conditional save).
  • Use detach to drop the attachment row and leave the blob alone.
  • Use purge/purge_later to delete both the attachment and the blob (unless shared elsewhere).

That's a wrap. I hope you found this article helpful and you learned something new.

As always, if you have any questions or feedback, didn't understand something, or found a mistake, please leave a comment below or send me an email. I reply to all emails I get from developers, and I look forward to hearing from you.

If you'd like to receive future articles directly in your email, please subscribe to my blog. Your email is respected, never shared, rented, sold or spammed. If you're already a subscriber, thank you.