Question about integrating Shrine with Hanami Entities

I’m writing a ROM/Hanami integration for Shrine, and I’m a bit stuck trying to figure out how to best integrate with Hanami entities.

First, I’ll try to illustrate how Shrine handles attachments with code:

class Entities
  class Photo < Hanami::Entity
    include Shrine::Attachment(:image)
  end
end
attacher = Shrine::Attacher.new
attacher.attach(file)
attacher.file #=> #<Shrine::UploadedFile @id="path/to/file" ...>

# returns serialized attachment data that should be persisted
attacher.column_data
#=> '{"id":"...","storage":"...","metadata":{...}}'

photo = photo_repo.create(image_data: attacher.column_data)

# loads attachment from the data attribute
photo.image #=> #<Shrine::UploadedFile @id="path/to/file" ...>

The important bit is, when the record is loaded from the database and you retrieve the attachment, Shrine parses the attachment column data and creates a Shrine::UploadedFile object.

Now, this parsing and loading can add a bit of overhead, especially when storing additional processed files alongside the main file. To remedy this, when working with mutable structs, Shrine lazily memoizes the results, so that parsing and loading Shrine objects happens only once.
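As a rough illustration of that kind of lazy memoization, here is a plain-Ruby sketch. It is not Shrine's actual implementation; `MutablePhoto` and `parse_count` are hypothetical names used only for this example, and `JSON.parse` stands in for loading the attachment.

```ruby
require "json"

# Hypothetical stand-in for a mutable record; not a Shrine or ROM class.
class MutablePhoto
  attr_reader :parse_count

  def initialize(image_data)
    @image_data  = image_data
    @parse_count = 0
  end

  # Parses the column data on first access and memoizes the result,
  # so repeated calls don't parse again.
  def image
    @image ||= begin
      @parse_count += 1
      JSON.parse(@image_data)
    end
  end
end

photo = MutablePhoto.new('{"id":"path/to/file","storage":"store","metadata":{}}')
photo.image
photo.image
photo.parse_count #=> 1
```

This only works because the object can still receive a new instance variable after it has been built.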

With ROM::Struct objects we can still use that approach, but with Hanami::Entity objects we can’t, because Hanami entities are frozen. So, my question is how to idiomatically handle this kind of performance optimization in Hanami?

I’ve investigated several options:

  1. Load the attachment in #initialize, before the instance gets frozen. What I don’t like about this approach is that users are then paying the performance penalty of loading the attachment regardless of whether they will use it.
  2. Initialize a hash instance variable in #initialize, and use it later for memoization. This approach should work, because that hash won’t get frozen. Would this approach suit the Hanami philosophy?

To illustrate what I mean by option 2:

class Photo < Hanami::Entity
  include Shrine::Attachment(:image)

  def initialize(*args)
    # assign before calling super, because Hanami::Entity#initialize freezes the instance
    @attachments = {}
    super
  end

  def image
    @attachments[:image] ||= super
  end
end
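The reason a hash can serve as a cache here is that `Object#freeze` is shallow: it blocks assigning instance variables on the frozen object, but does not freeze the objects those variables already reference. A plain-Ruby sketch of the difference, where `FrozenPhoto` and its methods are hypothetical and `upcase` merely stands in for loading the attachment:

```ruby
# Hypothetical value object that freezes itself, standing in for an entity.
class FrozenPhoto
  def initialize(image_data)
    @image_data = image_data
    @cache      = {} # has to be assigned before the instance is frozen
    freeze
  end

  # Raises FrozenError: memoizing into a new instance variable
  # modifies the frozen object itself.
  def image_via_ivar
    @image ||= @image_data.upcase
  end

  # Works: the hash referenced by @cache is not frozen.
  def image_via_hash
    @cache[:image] ||= @image_data.upcase
  end
end

photo = FrozenPhoto.new("data")
photo.frozen?        #=> true
photo.image_via_hash #=> "DATA"

error = nil
begin
  photo.image_via_ivar
rescue FrozenError => e
  error = e
end
error.class #=> FrozenError
```

The trade-off is that the object still reports itself as frozen while carrying mutable internal state.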

@janko Thanks for taking care of this. I really appreciate that! :green_heart:

Let me clarify the role of entities: they are the result of a database operation (read/write), and they can potentially expose user-defined business logic methods.

With Hanami/ROM entities it’s impossible to lazy load associated entities the way ActiveRecord does (e.g. user.photo). A developer must explicitly load an associated entity if it is needed. The reasons are:

  1. We want to avoid accidental n+1 queries.
  2. Lazy loading exposes entities to failures caused by side effects. Fetching extra data from the database can fail, which means entities are no longer deterministic: one time the lazy load may succeed, another time it may not. For a full explanation of this point, please watch my talk at RailsClub 2017 from this point.

Having attachments tied to an entity, and making the entity itself responsible for fetching the attachment remotely, leads to a problem similar to the one described above: the entity exposes methods that have side effects, and side effects can make the entity’s behavior brittle due to potential failures related to the remote system state, the network, and so on…

This approach seems to be cargo-culting ActiveRecord lazy loading. My question for you is: can we do better? Can we think of a system that, once the entity and the attachment are loaded, returns an object that cannot fail because of side effects?

Alternatively, if you still want to preserve lazy loading as a feature for Shrine, can the gem target (or introduce) something different from entities?

@jodosha Thanks for the thorough reply!

In this context, the “attachment” that’s added to the entity wouldn’t perform any side effects or touch the network. The intention is only to parse the attachment data that’s stored in the entity’s column attribute.

Sure, you would be able to do photo.image.download, but in that case the storage interaction is handled inside the Shrine::UploadedFile object, which doesn’t have any knowledge of entities.

To clarify, at the moment, Shrine’s “entity” integration doesn’t do any memoization or modify any instance variables (unlike the “model” integration that’s used for ActiveRecord/Sequel), precisely because of the entity immutability. So, the Hanami integration definitely works now with Shrine 3.0 beta, and I think for the majority of users the performance will be fine.

The question of memoization is just an extra optimization, which can be important if the same attachment is accessed multiple times or there are many files in a single attachment. From your reply, I feel like it would be best to leave this optimization to the user. For example, if a user is retrieving the attached file in their views, they can cache it to an instance variable in the view object.
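A minimal sketch of that user-side caching, in plain Ruby. `PhotoView` and `StubPhoto` are hypothetical names for this example only; `StubPhoto` imitates a frozen entity that re-parses its data on every `#image` call, and the class-level `loads` counter exists just to make the effect visible.

```ruby
require "json"

# Hypothetical frozen entity stand-in that parses its data on every access.
class StubPhoto
  class << self
    attr_accessor :loads
  end
  self.loads = 0

  def initialize(image_data)
    @image_data = image_data
    freeze
  end

  def image
    self.class.loads += 1 # counts how often the data gets parsed
    JSON.parse(@image_data)
  end
end

# Hypothetical view/presenter object: mutable, so it can memoize freely.
class PhotoView
  def initialize(photo)
    @photo = photo
  end

  def image
    @image ||= @photo.image
  end
end

view = PhotoView.new(StubPhoto.new('{"id":"path/to/file"}'))
view.image
view.image
StubPhoto.loads #=> 1
```

The entity stays immutable, and the cache lives in an object whose lifetime matches a single render.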

To clarify, the Shrine::Attachment module that’s included in the entity class is just a convenience layer:

photo.image #=> #<Shrine::UploadedFile>

# which can be broken down to

attacher = photo.image_attacher
attacher.file #=> #<Shrine::UploadedFile>

# which can be further broken down to

attacher = Shrine::Attacher.from_entity(photo, :image)
attacher.file #=> #<Shrine::UploadedFile>

This “attacher” object loads the attached file on initialization, and keeps it cached. So, a Hanami user can already choose how much they want to decouple attachments from entity classes. They can use the Shrine::Attachment module to reduce boilerplate, or they can skip it and use Shrine::Attacher directly for a more explicit approach.

So, I think we’re good. The ROM and Hanami integrations won’t do any memoization when loading the attached file through the entity; users can apply this optimization in their presenter/view objects as needed.
