Integrating ROM into Hanami 2.2

It’s time to build Hanami 2.2!

The main focus for this version will be the integration of ROM as our DB layer.

In this post, I outline the user experience we’d like to build for this, along with notes on critical bits of implementation.

I also invite your feedback! If there are any aspects of this proposal that are not clear, or which you think could be changed for a better user experience, please let me know!


Principles

Before we go into details, let me outline some principles that underpin our goals here:

  1. It’s just ROM. We want to elevate ROM for Hanami users. Not to hide it away under another level of abstraction, but rather to provide the world’s best ROM experience as part of our full stack framework. All ROM features should be available to the user.
  2. Streamline the essentials, while establishing DB layer separation. Repos are your app’s front door to persistence, while relations are the home for your low-level query logic. Structs wrap the values you pass around your app. All are given special top-level folders by default (repos/, relations/, structs/), while the db/ directory exists as your home for the rest of your persistence logic, which should be kept separate from the rest of your app. Along with this, provide a full-powered CLI experience to make it easy to drive all aspects of the DB layer.
  3. Preserve slice independence. Slices can already work as wholly independent parts of a Hanami app. They can have their own routes, actions, views, assets, and business logic. We should make it possible to build an independent DB layer within each slice too, while still allowing whole-of-app conveniences and sharing where possible.

Proposal

tl;dr

config/
  db/
    migrate/*
    structure.sql
app/
  db/
    relation.rb   # TestApp::DB::Relation < Hanami::DB::Relation
    repo.rb       # TestApp::DB::Repo < Hanami::DB::Repo
    struct.rb     # TestApp::DB::Struct < Hanami::DB::Struct
  relations/
    cafes.rb      # TestApp::Relations::Cafes < TestApp::DB::Relation
  repos/
    cafe_repo.rb  # TestApp::Repos::CafeRepo < TestApp::DB::Repo
  structs/
    cafe.rb       # TestApp::Structs::Cafe < TestApp::DB::Struct

To go over the details, we’ll start from the bottom up.

A config/db/ directory holds migrations and DB structure dump

config/
  db/
    migrate/*
    structure.sql

The contents of the config/db/ directory will be familiar to Hanami (and Rails) users. The one adjustment here is that we place it under config/ rather than in the app root. This allows us to preserve this same structure within slices as well as the app (more on this further below).

Two directories hold ROM runtime components: app/relations/ and app/db/

app/
  db/
    relation.rb   # TestApp::DB::Relation < Hanami::DB::Relation
  relations/
    cafes.rb      # TestApp::Relations::Cafes < TestApp::DB::Relation

Relations are the primary way of interacting with database tables, and in many apps, these will be the only ROM runtime components users need, so we recognise their prominence by giving them a convenient top-level directory. We will still encourage repos as the primary DB interaction layer (more on these later), but relations are important in building any app’s DB layer, so we don’t want the user to dive too deeply to get at them.

The base relation class will be generated into an app/db/ directory, whose purpose is to hold all other ROM runtime components. The idea here is to provide some structural reinforcement that the persistence layer is separate from the domain layer, courtesy of this dedicated directory and matching namespace.

Although relations have their own top-level directory, the base class is in app/db/ to reinforce that relations still belong to this DB layer, despite their placement outside of it for convenience.

We’ll use the base app relation class (TestApp::DB::Relation) to avoid the awkward standard ROM subclassing syntax for relations (MyRelation < ROM::Relation[:sql]). The base relation class will ensure the appropriate adapter type for the relation is set based on the app’s ROM config. To start with, we will support SQL only. Users needing other adapters can still fall back to the existing ROM syntax.

Other ROM runtime components will go into this app/db/ directory. More on this below.

Other ROM runtime components live in subdirectories under app/db/

ROM provides a range of other components that can be used when building up a persistence layer: changesets, commands and mappers. We’ll support generating and (where required) loading these components from the following directories:

  • app/db/changesets/
  • app/db/commands/
  • app/db/mappers/

ROM will be configured explicitly by Hanami

Instead of relying on ROM’s built-in auto-registration feature, we’ll explicitly locate and register ROM’s runtime components (from app/relations/ and app/db/) as part of ROM setup. This will ensure we can use the file structures noted above.

ROM will be configured by a built-in :db provider

As part of its prepare step, the app/slice will detect whether it contains a relations/ or db/ directory, and when it does, it will register a :db provider that will configure ROM.

This will:

  • Set ROM’s inflector to be the slice’s inflector
  • Create an :sql ROM configuration
  • Provide a database URL by checking settings.database_url, and if that method does not exist, by falling back to ENV["DATABASE_URL"]
  • Explicitly load the ROM files (as noted above) rather than using .auto_registration
  • Register the ROM config object as "db.config"
  • Register the ROM container itself as "db.rom"
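To make the database URL step concrete, here is a minimal sketch in plain Ruby of the lookup order described in the list above. The `resolve_database_url` helper and the two settings stand-ins are hypothetical names for illustration only; they are not part of Hanami or ROM.

```ruby
# Hypothetical helper: prefer settings.database_url when that method exists,
# otherwise fall back to the DATABASE_URL environment variable.
def resolve_database_url(settings, env = ENV)
  if settings.respond_to?(:database_url)
    settings.database_url
  else
    env["DATABASE_URL"]
  end
end

# Stand-ins for a settings object (assumptions, not real Hanami classes).
class SettingsWithUrl
  def database_url
    "postgres://localhost/bookshelf"
  end
end

class SettingsWithoutUrl
end

resolve_database_url(SettingsWithUrl.new, {})
resolve_database_url(SettingsWithoutUrl.new, {"DATABASE_URL" => "sqlite::memory"})
```

Passing the environment in as an argument keeps the sketch easy to exercise without touching the real ENV.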

ROM relations will be available directly in the Hanami app container

As part of the ROM setup above, all the relations in relations/ will be registered directly in the app/slice container. So e.g. app/relations/cafes.rb will be available as app["relations.cafes"].

This is important to ensure users have a consistent experience when interacting with the Hanami containers. If we don’t do this, a large part of the persistence subsystem directory becomes a big special case that we then have to explain. In addition, we know from experience that working directly with relations can sometimes be useful, and being able to use all the standard Hanami facilities to do this (like the container and Deps mixin) makes this much more intuitive than having to fetch them via e.g. app["db.rom"].relations[:cafes].
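As a rough illustration of the two access paths compared above, the following plain-Ruby sketch mirrors each relation from a ROM-like object into "relations.*" keys. `SketchContainer`, `FakeRom` and `register_relations` are made-up stand-ins, not Hanami or ROM APIs.

```ruby
# Stand-in for a container; a real Hanami slice container does much more.
class SketchContainer
  def initialize
    @components = {}
  end

  def register(key, component)
    @components[key] = component
  end

  def [](key)
    @components.fetch(key)
  end
end

# Stand-in for the ROM container: it only exposes relations by name.
class FakeRom
  attr_reader :relations

  def initialize(relations)
    @relations = relations
  end
end

# Register "db.rom", then mirror every relation under a "relations.*" key.
def register_relations(container, rom)
  container.register("db.rom", rom)
  rom.relations.each do |name, relation|
    container.register("relations.#{name}", relation)
  end
  container
end

cafes = Object.new
app = register_relations(SketchContainer.new, FakeRom.new({cafes: cafes}))

# Both lookups reach the same object; the first is the one we want users to reach for.
app["relations.cafes"].equal?(app["db.rom"].relations[:cafes])
```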

A change in dry-system will be needed to make it possible to resolve relations from the container. To resolve a relation, the :db provider needs to have been started first, since that is what registers the relations in the first place. Currently, dry-system can implicitly start a provider at the time of lazy component resolution, but only if the provider shares the same name as the root key of the component. To address this, we’ll add provider aliases to dry-system. Thus, when Hanami’s :db provider is registered along with an alias of :relations, it will be started whenever a "relations.*" component is lazily resolved for the first time.
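The alias idea can be sketched with a toy container. This is not dry-system, and the `aliases:` option shown here is only my assumption about how such a feature might look; it models nothing more than "start the provider the first time one of its root keys is resolved".

```ruby
# A toy container, NOT dry-system: it only models lazy provider start-up.
class LazyContainer
  attr_reader :started

  def initialize
    @components = {}
    @providers = {} # provider name => start block
    @roots = {}     # root key (as a string) => provider name
    @started = []
  end

  def register(key, component)
    @components[key] = component
  end

  # `aliases:` is a hypothetical option standing in for provider aliases.
  def register_provider(name, aliases: [], &start)
    @providers[name] = start
    [name, *aliases].each { |root| @roots[root.to_s] = name }
  end

  def [](key)
    start_provider_for(key) unless @components.key?(key)
    @components.fetch(key)
  end

  private

  # Start the provider whose name or alias matches the root of the key.
  def start_provider_for(key)
    name = @roots[key.split(".").first]
    return if name.nil? || @started.include?(name)

    @started << name
    @providers.fetch(name).call(self)
  end
end

container = LazyContainer.new
container.register_provider(:db, aliases: [:relations]) do |c|
  c.register("db.rom", :rom_container)
  c.register("relations.cafes", :cafes_relation)
end

container["relations.cafes"] # starts :db via its :relations alias
```

Without the alias, only keys beginning with "db." would trigger the provider, which is the limitation described above.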

Custom ROM changesets, commands and mappers will not be available directly via the Hanami app container. These are either registered by ROM against specific relations (commands and mappers), or expected to be accessed by class constants (changesets), so they do not make sense to access via the container.

The user may provide their own :db provider to customize or replace the standard ROM configuration

The built-in :db provider will use a provider source, which the user can also explicitly use to create their own :db provider at config/providers/db.rb and provide their own custom configuration. This gives the user a way to tailor the ROM setup while still allowing the Hanami built-in provider to do the standard work.

# Note `.configure_provider` instead of the standard `.register_provider`
Hanami.app.configure_provider :db do
  configure do |config|
    config.database_url = "sqlite::memory"
  end
end

When a concrete config/providers/db.rb file exists in the app, Hanami will use this provider instead of registering its own.

This also means that a user may choose to opt out entirely from using ROM by having a :db provider that sets up something entirely different. For example:

# Note standard usage of register_provider, instead of configure_provider above

Hanami.app.register_provider :db do
  # Wholly distinct provider logic goes here; will not use standard :db provider
end

Slices will have independent ROM setups by default, using a single shared database

There will be two possible arrangements for working with the database via slices:

  1. A fully independent ROM setup per slice
  2. Slices sharing the ROM setup from the app

Option (1) will be the default: full independence to each slice’s ROM setup.

In this arrangement, slices (just like the app) will have relations/ and db/ directories to contain ROM’s runtime components:

slices/[slice]/relations/
slices/[slice]/db/

When Hanami detects a slice with either of these directories, it will register a distinct :db provider for that slice. It will work just like the app-level :db provider, with a couple of differences:

  • If an app-level "db.config" component is present, it will copy all the basic configuration (i.e. everything except the registered ROM runtime files: relations, etc.) from this object into a new config object for the slice’s own ROM setup. This will allow whole-app configurations to be made in one place (the app-level provider) and then shared across all the slices.
  • When registering ROM runtime components, it will only look inside the slice’s own relations/ and db/ directories.

In this arrangement, the user does need to do the extra work of defining relations for every table the slice needs to access. However, it means each slice can be made to know as little (or as much) about the database as the user decides, and it allows for slice-specific database logic to be isolated inside the slice, alongside all its other logic.

An important implementation concern for this: even with multiple per-slice ROM setups like this, if they’re all pointing to the same database, we’ll want these to share a single pool of database connections. This will require some special handling in our provider code and/or in the gem we introduce to hold our library-layer DB integration code.

A slice may use a distinct database for its independent ROM setup

A slice may not want to use the same database as the rest of the app. If the slice has its own distinct settings.database_url, then that URL will be used instead of the app-level database URL. The ENV will also be checked for a slice-specific ENV["SLICE_NAME__DATABASE_URL"] (e.g. "BILLING__DATABASE_URL" for a "billing" slice).
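For the ENV part of this lookup, a small sketch could look like the following. The `slice_database_url` helper is hypothetical, and it deliberately ignores the settings.database_url check, since the precedence between the two has not been spelled out above.

```ruby
# Hypothetical helper: look for SLICE_NAME__DATABASE_URL first,
# then fall back to the app-wide DATABASE_URL.
def slice_database_url(slice_name, env = ENV)
  env["#{slice_name.to_s.upcase}__DATABASE_URL"] || env["DATABASE_URL"]
end

env = {
  "DATABASE_URL" => "postgres://localhost/bookshelf",
  "BILLING__DATABASE_URL" => "postgres://localhost/bookshelf_billing"
}

slice_database_url(:billing, env) # uses the slice-specific variable
slice_database_url(:admin, env)   # falls back to DATABASE_URL
```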

In this arrangement, the slice must have its own config/db/ directory to contain that database’s migrations and structure dumps.

We will not generate config/db/ within slices by default, but we can raise a warning if a slice is configured with a distinct database but is missing this directory (and provide a generator command to generate this directly).

If there are multiple slices that share a database_url that is different from the app-level database_url, then only the slice containing a config/db/ directory will be used for the purpose of running migrations and dumping schema. These slices will also share a single pool of connections to that database, just like the approach for the app-level database mentioned above.
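One possible shape for the connection sharing is a registry keyed by database URL, so that every ROM setup pointing at the same URL receives the same gateway. This is only a sketch of the idea; `GatewayRegistry` and `FakeGateway` are invented names, and the real handling will depend on how ROM and Sequel gateways are constructed.

```ruby
# Stand-in for a real gateway object holding a connection pool.
FakeGateway = Struct.new(:url)

# Toy registry: hands out one gateway per distinct database URL.
class GatewayRegistry
  def initialize(&connect)
    @connect = connect
    @gateways = {}
  end

  # Build the gateway on first request, then reuse it for the same URL.
  def fetch(url)
    @gateways[url] ||= @connect.call(url)
  end

  def size
    @gateways.size
  end
end

registry = GatewayRegistry.new { |url| FakeGateway.new(url) }

app_gateway   = registry.fetch("postgres://localhost/bookshelf")
main_gateway  = registry.fetch("postgres://localhost/bookshelf")         # same object as app_gateway
admin_gateway = registry.fetch("postgres://localhost/bookshelf_admin")   # a second, distinct gateway
```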

Slices may alternatively share the app-level ROM setup and relations, etc.

The second option for slices to work with the database is for slices to share the single ROM setup configured in the app.

In this arrangement, relations and other ROM components would be defined in app/relations/ and app/db/ only, and would need to cater to usage from all slices. Slices, then, would have only their own repo classes that would use these ROM components (more on this in the section below).

This will require the "db.rom" component to be imported from the app into every slice. This can be partially achieved using existing Hanami features: by adding "db.rom" to the app’s config.shared_app_component_keys setting.

However, a more complete implementation of this arrangement will require all the ROM components registered in the app container (i.e. everything from app/relations/) to be imported into each slice. To achieve this, we will introduce a single, more directed app-level setting. Something like:

module MyApp
  class App < Hanami::App
    config.db.import_from_parent = true
  end
end

I’d like this setting to serve two purposes: when configured on the app (like above), it will be copied over into every slice, meaning we can enable the DB sharing for all slices with just a single line. But it will also allow specific individual slices to opt out of the arrangement, inside their own slice config:

module MySlice
  class Slice < Hanami::Slice
    config.db.import_from_parent = false
  end
end

The exact name of this setting is still TBD. We want to find something that feels natural in both of the above contexts. Perhaps adding an alias like export_to_slices will make it feel more natural when used on the app.
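To show the copy-then-override behaviour in isolation, here is a toy model in plain Ruby. `DBConfig` and `slice_db_config` are hypothetical, the default of false is my assumption, and the setting name simply mirrors the placeholder used above.

```ruby
# Toy config object; not Hanami's real config class.
class DBConfig
  attr_accessor :import_from_parent

  def initialize(import_from_parent: false)
    @import_from_parent = import_from_parent
  end
end

# Each slice starts from a copy of the app-level value.
def slice_db_config(app_config)
  DBConfig.new(import_from_parent: app_config.import_from_parent)
end

app_config = DBConfig.new
app_config.import_from_parent = true

main_config = slice_db_config(app_config)    # inherits true from the app
billing_config = slice_db_config(app_config)
billing_config.import_from_parent = false    # this slice opts out on its own
```

Because each slice holds its own copy, opting out in one slice leaves the app and the other slices untouched.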

App and slices use repos

All of the ROM setup above is in aid of enabling repos for apps and slices.

These will be generated with a single base repo class per slice.

# app/db/repo.rb

module MyApp
  module DB
    class Repo < Hanami::DB::Repo
    end
  end
end

We will provide this Hanami::DB::Repo parent class (which itself will inherit from ROM::Repository) in order to provide slice-specific configuration of these repos. This will include:

  • Injecting the slice’s "db.rom" component as a dependency; this is necessary for ROM repos to function in the first place.
  • Specifying a struct namespace for the slice’s repos (more on this in the section below)

A slice’s repo classes (subclasses of Repo) are expected to live in a repos/ subdirectory, and be named using a singular noun followed by “repo”:

app/db/repo.rb
app/repos/cafe_repo.rb
app/repos/review_repo.rb

The trailing “repo” name makes it so these are more self-descriptive as components when injected as deps:

# Allows this dep to be referred to naturally as `cafe_repo`
# within the class
include Deps["repos.cafe_repo"]

Unlike the ROM components in relations/ and db/, repo classes will not depend on living within this structure. Repos live in the app layer, as the interface to the persistence layer, so while we will provide repos/ as a conventional location, the user may choose to create repo classes wherever they like.

An empty repo class will look like:

module MyApp
  module Repos
    class CafeRepo < MyApp::DB::Repo
    end
  end
end 

If the repo’s name matches a relation name, then we will automatically configure that relation as the repo’s root relation. If not, we will not configure a root for the repo.
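A sketch of this name matching, in plain Ruby, might look like the following. The `infer_root` helper is hypothetical, and its pluralisation is deliberately naive (it just appends "s"); a real implementation would presumably go through the slice’s inflector.

```ruby
# Hypothetical helper: derive a candidate relation name from a repo class name
# and return it only when a relation of that name is registered.
def infer_root(repo_class_name, relation_names)
  base = repo_class_name.split("::").last.sub(/Repo\z/, "")
  snake = base.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  candidate = :"#{snake}s" # naive plural, for illustration only

  relation_names.include?(candidate) ? candidate : nil
end

relations = [:cafes, :reviews]

infer_root("MyApp::Repos::CafeRepo", relations)   # => :cafes
infer_root("MyApp::Repos::ReportRepo", relations) # => nil, so no root is configured
```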

App and slices provide structs

Along with repos, the app and slices will provide struct classes for encapsulating the values returned by their repos.

Their base repo class will automatically configure a struct_namespace to match a Structs module inside the app or slice, the equivalent of:

module MyApp
  module DB
    class Repo < Hanami::DB::Repo
      struct_namespace MyApp::Structs
    end
  end
end

If this module does not yet exist, we will define it dynamically, so that repos work consistently in all contexts.
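Defining the module on demand needs nothing beyond standard Ruby constant lookup. A minimal sketch, where `structs_namespace` and `SketchApp` are hypothetical names:

```ruby
# Return the namespace's Structs module, defining it first if it is missing.
def structs_namespace(slice_namespace)
  if slice_namespace.const_defined?(:Structs, false)
    slice_namespace.const_get(:Structs, false)
  else
    slice_namespace.const_set(:Structs, Module.new)
  end
end

# Stand-in for an app namespace that has no Structs module yet.
module SketchApp
end

structs_namespace(SketchApp) # defines and returns SketchApp::Structs
structs_namespace(SketchApp) # returns the same module on later calls
```

Passing false as the second argument restricts the lookup to the namespace itself, so an unrelated top-level Structs constant would not be picked up by accident.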

Structs will also have a base class:

# app/db/struct.rb

module MyApp
  module DB
    class Struct < Hanami::DB::Struct
    end
  end
end

This base struct class is in the DB namespace for two reasons:

  1. To reinforce that these structs are dynamically defined based on the results of DB queries (it is a ROM::Struct in the end)
  2. To leave the non-namespaced MyApp::Struct name available in case the app author wishes to use it for their own explicit, non-DB-defined struct classes

Each struct should be named after a singular noun and put in the structs/ subdirectory:

# app/structs/cafe.rb

module MyApp
  module Structs
    class Cafe < MyApp::DB::Struct
    end
  end
end

Standard generators will create db directories, repos, and structs

Generating a new Hanami app (as well as a new slice within an app) will generate the full complement of the files described above:

  • config/db/
  • config/db/migrate/
  • app/db/relation.rb
  • app/db/repo.rb
  • app/db/struct.rb
  • app/repos/
  • app/structs/

When generating a new app with hanami new, these options will let you tailor the output:

  • --skip-db to skip DB setup
  • --database=sqlite to use an sqlite database
  • --database=postgres to use a postgres database (the default)

When generating a slice with hanami generate slice, these options will be available:

  • --skip-db to skip DB setup
  • --app-db to share the app’s DB setup
  • --slice-db to use a distinct database setup, generating config/db/ in the slice
  • --database=sqlite to use an sqlite database (relevant only for --slice-db)
  • --database=postgres to use a postgres database (the default) (relevant only for --slice-db)

Generators for individual components will also be available.

$ bundle exec hanami generate migration [name]
$ bundle exec hanami generate relation [name]
$ bundle exec hanami generate struct [name]
$ bundle exec hanami generate changeset [name]
$ bundle exec hanami generate command [name]
$ bundle exec hanami generate mapper [name]
$ bundle exec hanami generate repo [name]

Like all other generators, these will accept a --slice argument to generate the component in the given slice.

hanami db CLI commands will provide convenient access to database operations

The spread of hanami db commands will be the same as Luca shared in his original proposal, with one adjustment to the arguments: where we previously had --database=app to target the command to the database configured in the app, and e.g. --database=admin to target the database configured in the admin slice, instead, we’ll have --app and --slice=slice_name to target the databases configured in the app and specified slice respectively. I want to reserve the --database argument for the future, when we may support multiple databases within an app/slice.

Create:

$ bundle exec hanami db create # Creates all databases
$ bundle exec hanami db create --app # Creates app database only
$ bundle exec hanami db create --slice=billing # Creates "billing" slice database

Drop:

$ bundle exec hanami db drop # Drops all databases
$ bundle exec hanami db drop --app # Drops app database
$ bundle exec hanami db drop --slice=billing # Drops "billing" slice database

Migrate:

$ bundle exec hanami db migrate # Migrates all databases
$ bundle exec hanami db migrate --app # Migrates app database
$ bundle exec hanami db migrate --slice=billing # Migrates "billing" slice database

Rollback:

$ bundle exec hanami db rollback # Rolls back app database
$ bundle exec hanami db rollback 3 # Rolls back app database
$ bundle exec hanami db rollback --app # Rolls back app database
$ bundle exec hanami db rollback 3 --app # Rolls back app database
$ bundle exec hanami db rollback --slice=billing # Rolls back "billing" slice database
$ bundle exec hanami db rollback 2 --slice=admin # Rolls back "admin" slice database

Prepare:

$ bundle exec hanami db prepare # Prepares all databases
$ bundle exec hanami db prepare --app # Prepares app database
$ bundle exec hanami db prepare --slice=billing # Prepares "billing" slice database

Structure dump:

$ bundle exec hanami db structure dump # Dumps the structure for all databases
$ bundle exec hanami db structure dump --app # Dumps the structure for the app database
$ bundle exec hanami db structure dump --slice=billing # Dumps the structure for the "billing" slice database

Structure load:

$ bundle exec hanami db structure load # Loads the structure for all databases
$ bundle exec hanami db structure load --app # Loads the structure for the app database
$ bundle exec hanami db structure load --slice=billing # Loads the structure for the "billing" slice database

Version:

$ bundle exec hanami db version # Prints all database versions
$ bundle exec hanami db version --app # Prints app database version
$ bundle exec hanami db version --slice=billing # Prints "billing" slice database version

Note: many of these CLI commands are already implemented (but not activated) in hanami-cli. They’ll need adjusting for the --app/--slice arguments and overall database setups noted above.

Rake task compatibility

For compatibility with Ruby hosting vendors targeting Rails, we’ll expose a db:migrate Rake task (hidden from rake -T output, if possible):

$ bundle exec rake db:migrate # Migrates all databases
$ HANAMI_APP=true bundle exec rake db:migrate # Migrates the app database
$ HANAMI_SLICE=billing bundle exec rake db:migrate # Migrates the "billing" slice database

A new hanami-db gem will contain the custom code and rom dependency

We will create a new hanami-db gem for any specific code needed to support the features above (aside from the code that needs to live in the hanami gem directly). This will also be the gem that holds the parent classes like Hanami::DB::Repo, Hanami::DB::Relation, etc.

This gem would also manage the dependency on the rom gems, to ensure compatibility with ROM changes over time.

Removing this gem from your app’s Gemfile will also disable all of the DB integrations.

Not now: streamlined setup of multiple databases for a single ROM instance

ROM already supports connecting to multiple databases, but we won’t provide a streamlined way of configuring this for Hanami 2.2. This can be provided as a standalone enhancement in a future release.

Instead, we will look to provide simple “escape hatches” for the user who needs to use multiple databases like this.

For example, we can structure our :db provider so that the user can add an after(:prepare) hook that directly configures multiple gateways. We should at least have a test in our test suite to demonstrate that this works.

At minimum, the user will be able to provide their own independent :db provider to configure ROM however they need it.

The CLI commands listed above won’t handle multiple databases by default, so any app requiring this will have to provide their own wrappers in the meantime.

Any incidental support for multiple databases will be considered “nice to haves” and will not be essential for the v2.2 release.

Not now: changes to ROM

All the arrangements above should be possible with the currently available 5.3.x versions of ROM.

My preference is to avoid ROM changes wherever possible, and instead focus exclusively on the Hanami integration.


Further reading

The proposal above is an evolution of Luca’s original proposal for the Hanami DB layer: https://discourse.hanamirb.org/t/hanami-persistence-proposal-for-2-x/782


Overall, I’m really happy with your general direction here. I think this plan will be a major step forward in advancing the state of the art of persistence that has been pretty stagnant in Rails.

DATABASE_URL

A slice may not want to use the same database as the rest of the app. If the slice has its own distinct settings.database_url, then that URL will be used instead of the app-level database URL. The ENV will also be checked for a slice-specific ENV["SLICE_NAME__DATABASE_URL"] (e.g. "BILLING__DATABASE_URL" for a "billing" slice).

Slice isolation in development is tricky, because for me it entails a departure from the production environment. In production, each slice is running in an isolated pod so they can just use DATABASE_URL as usual. So essentially what we’re doing is developing the app as a modular monolith and deploying it as isolated services.

In development, I have multiple slice DB configurations in place so that I can develop each slice without having to shuffle my environment variables around. This problem is handled by SLICE_NAME__DATABASE_URL.

An additional problem we ran into with this pattern is dealing with test databases. We ended up implementing an optional config/database.yaml file that was instantiated as Persistence::Specification and accounted for changing the connection to the test variant.

    def test_override(uri)
      dbname = uri.path.chomp("/")

      if dbname.end_with?("_dev", "_development")
        dbname.sub(/_dev(elopment)?$/, "_test")
      else
        "#{dbname.chomp("_test")}_test"
      end
    end

    def coalesce_database_url(slice)
      if slice.settings.respond_to?(:database_url)
        slice.settings.database_url
      elsif slice.app != slice
        coalesce_database_url(slice.app)
      else
        ENV.fetch("DATABASE_URL")
      end
    end

    # Final URI accounting for all overrides
    #
    # 1. Start with config/database.yaml URI
    # 2. Fall back to DATABASE_URL
    # 3. Change DB name for test environment
    # 4. Apply any component overrides
    #
    # @return [URI::Generic]
    def to_uri
      uri = current_spec.fetch(:uri) { coalesce_database_url(@slice) }.then { URI(_1) }

      if current_spec in { database: }
        uri.path = "/#{database}"
      end

      if uri.path.chomp("/").empty?
        uri.path = "/#{@slice.app.slice_name}_#{Hanami.env}"
      end

      uri.path = test_override(uri) if Hanami.env?(:test)

      %i[hostname port scheme user password].each do |component|
        if current_spec.key?(component)
          uri.public_send "#{component}=", current_spec[component]
        end
      end

      uri
    end

This allows developers to account for individual variation in their local DB without having to construct a custom URI. The overrides are nice-to-have but the motivation is the test_override logic.
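For anyone wanting to try the renaming rule on its own, here is the same logic as test_override pulled out into a standalone method. The `test_database_path` name and the example URLs are made up for illustration; the behaviour is meant to match the snippet above.

```ruby
require "uri"

# Same renaming rule as test_override above, extracted so it runs by itself.
def test_database_path(uri)
  dbname = uri.path.chomp("/")

  if dbname.end_with?("_dev", "_development")
    # Swap a development suffix for the test suffix.
    dbname.sub(/_dev(elopment)?\z/, "_test")
  else
    # Otherwise append "_test", without doubling an existing suffix.
    "#{dbname.chomp("_test")}_test"
  end
end

test_database_path(URI("postgres://localhost/bookshelf_development")) # => "/bookshelf_test"
test_database_path(URI("postgres://localhost/bookshelf_dev"))         # => "/bookshelf_test"
test_database_path(URI("postgres://localhost/bookshelf_test"))        # => "/bookshelf_test"
test_database_path(URI("postgres://localhost/bookshelf"))             # => "/bookshelf_test"
```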

Our database.yaml is strictly development-only and not source controlled, although we also generate one inside docker-compose to help with CI.

Relations

As part of the ROM setup above, all the relations in relations/ will be registered directly in the app/slice container. So e.g. app/relations/cafes.rb will be available as app["relations.cafes"].

This is a good idea, one that I initially didn’t do but ended up adding it for the reasons you state: accessing them off of db.rom is annoying.

My persistence provider does

config.auto_registration(slice.root)
register "persistence.rom", ROM.container(config)
register("relations", memoize: true) { target["persistence.rom"].relations }

Returning to your Principle #1, “It’s just ROM”, this means that relation helpers will be written using a combination of rom-sql’s relational syntax with Sequel as an escape hatch.

This was a bit of an uphill struggle for my team, as nobody was deeply familiar with Sequel and ROM is under-documented.

I think Hanami should make an effort to document not just the simple cases of doing simple CRUD queries, but provide some insight into how to do more complex things as well, like writing a CTE.

Not having the long history of Rails’ query syntax being Q&A’d for decades is a disadvantage, but I’d say that the flexibility to use advanced PostgreSQL features is a major advantage, and one of the reasons I chose it.

I’m willing to help out on the documentation front here.

Repositories

I’ve moved away from Rails’ organizational structure over time, preferring to organize objects by business domain instead. I have observed that creating explicit directories for certain object patterns tends to discourage people from filling in the blanks of what’s missing, and instead encourages shoehorning logic into places it doesn’t really belong.

But, I think the primary reason why you’re going in this direction for Repos is that you’re planning to build your own registration logic, and that would be significantly harder to do if they’re not located in a predictable place.

In my project, it would be named something more like app/cafes/repo.rb, and if I need to interact with multiple repos I will just rename them to be more explicit. I also find API stutter like app/repos/cafe_repo.rb kind of annoying, but I understand your motivation for doing that. Renaming the included dep can’t be the default posture.

Changesets & Mappers

I wrote one custom Changeset to implement upsert on a relation. Both Changesets and Mappers are under-documented in ROM.

Mappers are unfortunate, being derived from transproc, which is deprecated, and having a very different interface from dry-transformer. Perhaps these should be deemphasized until ROM can be updated to incorporate the Dry version.

On the other hand, even dry-transformer is pretty hard to use. There’s room for improvement here that’s certainly out of scope for 2.2.

Provider Customization

Is configure_provider a new concept? I don’t recognize it. This is very important, due to the way ROM (and Sequel) do extensions.

A sampling of things I’m doing:

config.plugin(:sql, relations: :instrumentation) do |plugin_config|
  plugin_config.notifications = target[:notifications]
end

config.gateways.each do |(_, gw)|
  gw.connection.pool.connection_validation_timeout = -1
end

config.plugin(:sql, relations: :auto_restrictions)
config.plugin(:sql, relations: :pick)

I think that last one is an extension I wrote to port Rails’ pick feature.

MyApp::Structs

This is another case where it makes sense to have a dedicated directory because contrary to the majority of the app logic, these entities should not be auto-registered into the container, because it’s ROM’s job to hydrate them.

I prefer to use MyApp::Entities because it’s not the fact that they’re a struct that is actually important; they are DB entities that happen to be structs. This might seem nit-picky, and I agree it sort of is, but I try where I can to discourage thinking about objects in terms of what they are rather than what they do.

There is also the potential confusion here between a Ruby Struct, a Dry::Struct, and a ROM::Struct, all of which are different things :tired_face:


Thanks for your considered feedback, @alassek! And I’m really glad to hear the majority of this hit the mark for you :smiley:

I hear you about this use case. The solution implicit in the current proposal—creating a .env.test containing a different DATABASE_URL (or slice-prefixed equivalent)—isn’t particularly satisfying, and does carry a risk of users targeting (or wiping) incorrect databases.

What do you think about the idea of taking your test_override logic (and thanks for sharing your code!) and building that into our standard database URL determination? This way the user could populate the development-mode DATABASE_URL only (in the standard .env file) and not have to worry about explicitly creating test-mode overrides.

We could have this logic apply only in the “test” and “development” Hanami envs too.

In terms of database config, for this release I’m keen to see if we can avoid introducing any kind of full-blown .yml-style config files, so we can learn from real world usage about what might be needed in this area. I think that by handling the test mode database URLs per the above, that’s probably good enough for now. What do you think?

100% agree. I’d love your help on documentation! I’ll reach out :slight_smile:

Actually, for repositories, this is not the case! They’re “portable” — you can put them anywhere you like. As I mentioned in the proposal:

In terms of a starter location, we have to pick something, and the organise-by-type approach is what we have so far for our standard components.

I agree with you that repos can work well when organised by concept, too, and we should make it possible to provide a full path for our CLI generator for repos, which should hopefully make it clearer these can go anywhere.

Agree that there’s big room for improvement on both of these. Of all of these advanced ROM components, I think only changesets are worth documenting for Hanami usage, with custom mappers and commands being something we can just link over to ROM’s docs for.

Anyway, by choosing “vanilla ROM” for our Hanami integration, any improvements to ROM in this area are ones we can pick up for free in Hanami apps.

Nice pickup. It’s indeed a nice concept, but it’s only a very light method wrapper around dry-system’s external provider sources feature. Here’s the code from my current rom-spike branch on Hanami:

def configure_provider(*args, **kwargs, &block)
  container.register_provider(*args, **kwargs, from: :hanami, &block)
end

So configure_provider exists merely to streamline the API for users needing to customise Hanami’s first-party providers within their own app. Requiring users to write .register_provider :db, from: :hanami is ugly and unnecessary.

I even like .configure_provider as a more intention-revealing name for this use case, too. The users can expect the standard :db provider to be registered by default. What they’re doing in this case is configuring its behaviour.

Once you’re inside the .configure_provider block, it’s all just standard dry-system provider source customisation, so you can:

  • Configure the provider (and I plan to provide a good range of built-in settings for common things)
  • Run lifecycle callbacks, like before(:prepare), after(:prepare), before(:start), and after(:start)

I’d hope this should be enough for users to tailor the standard ROM setup to their liking, but I’d definitely be keen to work with you to make sure your needs are covered, as a good litmus test.

Yep! And we’ll add this directory to the default config.no_auto_register_paths value.

This is an interesting point! To be honest, I’ve always used “entities” myself for naming these things. And in fact, we have some trace of this already with %w[entities] being the current default value for config.no_auto_register_paths.

Why did I pick “structs” here?

  • In keeping with the “vanilla ROM” approach, structs is the name that ROM uses by default, via its ROM::Struct class as well as its struct_namespace feature.
  • If we expose Hanami::DB::Struct as the superclass for these, having the directory name match the class name is at least consistent with what we’re doing for the other existing default code directories (though I realise there’ll be value in breaking that pattern at some point).
  • “Entity” as a term carries a specific meaning in certain circles, like in DDD where an entity is meant to possess “inherent identity value”, versus value objects (aka structs) which do not. Are our repos always going to return “true” entities? Probably not, so maybe it’s better to pick a more inclusive name.

I don’t feel super strongly about this, so I’m open to considering an alternative default.

I take your point about the potential confusion with both Dry::Struct and ::Struct also existing. If we play it out, let’s say a Hanami user wanted to model some plain data objects of their own (not returned by repos or otherwise connected to the DB layer). In this case, with the MyApp::Struct name already taken by the base class inheriting from Hanami::DB::Struct, they’re kind of out of options for sensible base names. Choosing Entity here would allow Struct to be reserved for those non-DB struct use cases.

If we took the above path, I wonder if Hanami::DB::Entity also makes sense as the base class name that we expose too. @flash-gordon, @solnic — what do you reckon?

Either way, we’ll certainly make it possible for users to configure their own struct namespaces courtesy of the struct_namespace config directive available in repo classes.

We can port ROM 5.x to use dry-transformer and in the future figure out a simpler API for custom mappers. Also let’s not forget that a ROM mapper is just an object that implements #call(loaded_relation) and it’s supposed to return an array of objects, so implementing a custom mapper is trivial.
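To show how little is involved, here is a minimal plain-Ruby sketch of that contract. The Cafe struct and CafeMapper class are hypothetical names, and an array of hashes stands in for a loaded relation; nothing here depends on ROM itself.

```ruby
# Plain value object the mapper builds (hypothetical, for illustration only).
Cafe = Struct.new(:id, :name, keyword_init: true)

# A custom mapper is any object that responds to #call, takes the loaded
# relation, and returns an array of objects.
class CafeMapper
  def call(loaded_relation)
    loaded_relation.map do |tuple|
      Cafe.new(id: tuple[:id], name: tuple[:name].strip)
    end
  end
end

# Stand-in for a loaded relation: anything enumerable that yields hash tuples.
tuples = [{id: 1, name: " Decaf Sucks "}, {id: 2, name: "Brew Lab"}]

cafes = CafeMapper.new.call(tuples)
```

Because the only requirement is #call, a lambda such as ->(relation) { relation.map { |t| Cafe.new(**t) } } would satisfy the same contract.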

Yes exactly, I think we should stick to Struct.

I prefer “Entities”, but I don’t feel strongly about it. I especially trust @solnic’s take since he’s thought a lot about this. That said, “Entities” as a name represents what they are modeling in the real world, rather than “Structs” which is describing the data in programming terms.

Yes in DDD “Entity” has a specific meaning (i.e. it has a distinct identity), but that meaning will also very often apply for users taking data out of a RDBMS. If people want to pull identity-less objects out of the database, that’s a relatively rare event (in my experience), so they can map them to a Values namespace themselves. It may just be my lack of experience adhering to full DDD that may be revealing itself here.

Another option, which hasn’t been used much in ROM/Hanami-land, is “Record”. This obviously has overlap with Active Record, which we might want to avoid… but it’s also an important distinction, since many Ruby developers only know what “Active” records are, and haven’t experienced records that are independent of persistence.

Wikipedia even redirects from Struct to Record:
https://en.wikipedia.org/wiki/Struct redirects to https://en.wikipedia.org/wiki/Record_(computer_science) and calls it both.

But: I’m overall hesitant to add a new term that’s neither in ROM nor in Hanami’s history.

Yes, I’m willing to work on this if there’s a consensus about it. We had an incident on our team where DATABASE_URL was connected to a remote environment for diagnosis and the test suite was accidentally run against production, truncating our data. :cold_sweat:

Yes, there were multiple fuckups there for reasons I can’t go into, and the DATABASE_URL was only a contributing factor; but that’s usually how it goes when an incident happens.

I resisted introducing the YAML file until I felt it was the most expedient solution, but I agree with your position on it. I don’t really like it and would prefer to avoid adding that. If we had SLICE_NAME__DATABASE_URL support combined with an automatic test override convention, that would be good enough to satisfy our requirements. And I think our requirements are really advanced compared to the general case, so hopefully that should be good enough.

Fair. The need for generators to put it somewhere does force you to pick something. As long as Hanami doesn’t actively fight me like Rails then that’s all I would ask for. Allowing me to override the location in the generator is more than good enough.

I don’t feel strongly either, which is why I framed this as a preference. The point about Entity being A Thing in DDD is a good one that I hadn’t considered. (Then again, ActiveRecord is not really Active Record but we don’t need to be picking up their bad habits).

I feel like Struct conceptually is unfortunately overloaded, but that is also due to how generic a term it is. A generic term is preferable to a specific one used incorrectly or confusingly.

Good callout. Introducing mappers as just a lambda that does some data mutation would be much easier to grasp.

Naming really is the hardest thing in programming isn’t it?
I would stick to the Struct name too, even though it carries its own weight, just like Entity does, that might not 100% reflect what they will be.

My main concern would be how this integration handles the test environment. Right now the custom setup for ROM with rom-factory works for multiple slices, but to me seems a bit clunky. Maybe I just can’t correctly string it all together, but if Hanami was able to provide a working ROM integration (with rom-factory working), while also setting up RSpec (which is already built in), it would be a huge boost. In Rails, always deleting test/ and switching to RSpec with factory-girl is a bit of a chore.
But is this something desired/planned by the rest?

Funny thing, ROM documentation actually uses Entities namespace for structs :upside_down_face: I am personally in favour of “Entities” (after all that’s what Hanami 1 had), but I understand the arguments for “Structs”. However, I would not worry too much about DDD here. In most cases the returned object will have an identifier, and the word “entity” is not trademarked by DDD.

I have some small doubts about the Hanami::DB namespace. I get that Persistence is long, but I think DB is against Zeitwerk’s defaults (?) and people will always wonder whether it’s DB or Db.

  1. Slices sharing the ROM setup from the app

Not sure I understood this correctly, but does that mean I have to have an app directory to make this setup work? So far, with a similar setup, I have my relations defined in lib/my_app/persistence/relations and, to be honest, I liked that. Relations are really low-level in my understanding and I’m not sure if keeping them in app (which suggests app layer) is good.

But probably there is more than one way to think about relations.

The last sentence is a nice gateway to one final side-problem I see: education. The ideas of relations, repositories etc. will generally be hard to grasp for users. Not on the technical level, but more of a conceptual level. It would be great to develop some Hanami’s “recommended practices” regarding them, so people are not completely lost. Note: this does not necessarily need to be done by the core team, in my opinion. Might be a community effort as well.

Thanks for the good discussion about the “structs” vs “entities” naming above, folks. Let’s stick with structs for Hanami. This will also remain easy for a user to change if they prefer something different.

This is a great point, thank you for raising it, @krzykamil!

I think we should eventually aim for a fully integrated rom-factory experience, but I think for that to work well, it would need to go alongside some more structure that we supply around per-slice tests and how spec support code might target each slice. This would be a good post-2.2 exercise :slight_smile:

For 2.2 itself, I think we can aim to include a page in our documentation about rom-factory setup, along with code that people can paste into their apps.

Thanks for sharing this, @katafrakt :slight_smile:

I will be making sure that an appropriate inflection is configured and available everywhere we reference the DB constant, so I’m not particularly worried about that aspect. I think it would probably even be worth building into the dry-inflector defaults. Hanami Zeitwerk loaders are given the app’s inflector by default, so I’m not concerned about this causing issues with constant resolution.

Also, this provides an example of making acronyms work naturally for us, as opposed to simply living with awkward defaults.

In general with our default naming choices, I’m motivated by improving approachability (“DB” does feel less confronting to me than the much more formal-sounding “Persistence”), as well as picking short/snappy names that reflect terms used in real human conversation, not to mention that are easier to type out when referencing code or running shell commands.

Given all of this, how concerned are you still?

With point (2) here I’m referring to this section in my proposal:

So yes, this will mean that your ROM components will live in app/{relations, db}/ instead of lib/[your_app]/persistence/.

This is for reasons of consistency. We want relations and the other ROM components to work equally well in individual slices as well as in the app. For this to work, these files need to be in equivalent locations across each, and the one place the app and slices both share is their primary code directory: either app/ or slices/[slice_name]/. The top-level lib/ is a special place, and has no equivalent in slices.

In addition, the top-level lib/ does not have matching components registered into any containers. And because various ROM components are registered in the containers (like relations), keeping these files in app/ helps avoid any confusion over the role of the top-level lib/ directory.

I hope that this can just be a matter of getting used to the new locations for you. I also hope that the new built-in setup will make a range of things easier for you and eliminate a bunch of boilerplate, so hopefully there’s some extra sugar to go along with the change :wink:

I do want to make sure the distinction between “app layer” and “db layer” is kept clear, and that’s why most of the ROM components live in a db/ subdirectory, as structural reinforcement of this separation. However, I did receive numerous pieces of feedback as I was preparing this proposal that having relations/ live at the top-level would be useful. And since (unlike the other ROM components) they can be usefully used on their own, this felt to me like a reasonable tweak. We’ll definitely make it clear in our documentation that repos are intended to be the primary interface to the DB.

:100:

We’ll do our best with the initial docs for 2.2, but I know that there’ll be much more to do even after these initial efforts. I’d love whatever support you and the rest of the community can lend towards making the concepts and recommended patterns clear.

Sure, but I feel that there is a slight change in the philosophy going on. So far app was pretty much a regular slice, just placed in a different place in the directory structure and generated by default. Now it sounds like it’s becoming a super-slice on which other slices might depend.

Up until now I kind of thought that you either put everything in app or you put everything in separate slices, deleting app. I see this will change, which I think is fine, although will probably lead to lower adoption of slices.

The App slice has always been special in this way since 2.0. Configuration settings in App flow into the slices as defaults.

As @alassek also mentions above, this positioning (of the app serving as a “base layer” for shared components) has existed since 2.0.

By default, we share the following components from the app to all slices, courtesy of the app’s config.shared_app_component_keys: inflector, logger, notifications, rack.monitor, routes, settings.

What these have in common is that they’re all set up by the framework itself and registered in the app, without the user needing to lay down their own config or code files.

Now, what we’re seeing with the “shared ROM setup across slices” arrangement is really just an extension of this same approach, except in this case it does require the user to manage some files. But besides this, nothing else is different. This is another reason why (in this arrangement) those files live in app/, because they’re part of the set of app components shared to slices.

In terms of the impact on slice adoption, this arrangement will also not be the default. As I note in my proposal, the shared DB layer will require the user to opt in explicitly via config.

So, if users are choosing slices, then via our defaults we’ll be encouraging those slices to have their own independent DB layers.


@timriley, this is a massive amount of information to absorb. :exploding_head: Kudos on the work you’ve done just to lay this out for us. I’m trying to absorb all that you and everyone else has covered here and I have some thoughts.

General

The strategy and directory structure you have outlined is eminently reasonable. I think most of us will have to work with it for some time before we can form ideas for refinement or improvement.

Regarding configuration of persistence per Slice

I love the possibility of managing persistence at the “per slice” level. I recognize the extra work this represents for you, and I am grateful for it. :pray:

I have some concern here that without guidance or some structure that misuse will lead to database coupling across slices, which will diminish the advantages of using slices to “modularize” (or encapsulate, if you prefer) segments of your application.

Separate DBs are cool, but they make one thing more difficult: querying across DBs. While cross-DB operations might be an anti-pattern we’re trying to avoid in our code, it is very useful for business reporting and other broad business information requirements (presumably all read-only, of course).

If I can offer an idea, perhaps as an intermediate step between an all-in-one database and multiple databases: SQL schemas. Schemas are SQL’s idiomatic solution to namespacing within a database. They are underutilized (in my experience), but could offer a nice way to isolate per-slice persistence without resorting to multiple databases. There are pros and cons, of course. Here are some I can think of:

Pros

  • The biggest benefit, in my mind, is the ability to build read-only systems for reporting, etc., across all schema. The same is possible across multiple databases in most DBMSs, but the ease/cost of doing so varies.
  • Could be automated based on slice name.
  • Could be enforced (so to speak) by:
    – Creating per-slice DB users with schema-specific permissions.
    – Some Hanami solution that prefixes all table operations with a schema name based on the slice name.
    – Or, perhaps, just with static analysis to generate warnings.

Cons

  • Not supported in SQLite. You would need multiple databases here, but SQLite also makes it easier than most DBMSs to query across databases.
  • It’s not clear to me how well Sequel supports multiple schema within a database. The documentation is sparse on this topic.

Jeremy Evans has discussed his use of multiple DB users in his apps, for example, to restrict DB access rights for the production user to limit the potential damage from various breaches (like SQL injection). I’ve never seen this done in Rails (not that I have a ton of experience), but I wonder if only varying the DB user name by slice might be easier to implement in conjunction with traditional ENV values. Like, could you configure the per-slice user in a setting, and draw the rest of the database URI from an ENV value (or something like that).

Or, going in a different direction, I wonder if varying only the DB user by environment (dev/test/prod) might help with the issues @alassek raised around DATABASE_URL management. In the latter case, a test or production user might be configured with access to all schemas. I’m kind of thinking out loud here. Not sure if this is workable.

Slice Generators

This is totally tangential. Are there similar switches to skip action/view stuff? Until now, I’ve just been creating slice directories manually to avoid action and view (and spec) generation, but generators would be nice for things like DB setup if I could skip actions and views.

Hanami cli db commands

This will be very cool! I didn’t see this in your DecafSucks bin/hanami script the last time I looked. Will this allow us to delete older migrations instead of saving them all? That would be nice. My migrations folder is already deep and I only have a dozen tables in my current project. Doesn’t Rails essentially dump the structure after every migration is run, and then apply future migrations against that structure file (and repeat)? Do I have that right? Would something like that be possible?

Struct v Entity

On the surface level, I like entity better. I realize that DDD is often cited for the idea that an entity inherently possesses an identity, but I really don’t think that was Eric Evans’ original idea, or that we should avoid it because it’s “DDD-ish.” Heck, in my opinion, 80% of the value of DDD is derived from a reasonable set of definitions for common sense ideas that allow us to discuss and debate those ideas better.

Further, I haven’t created a DB table without an id column (with or without that actual name) since I stopped developing in MSAccess (okay, I still do Access every now and again :roll_eyes:). Can anyone give me an example of a real-world DB table that doesn’t have row identities? I sometimes use single-column, value-list tables, but with those the concept of “value” and “identity” is essentially merged. I can’t think of any other examples. Therefore, it seems to me that any “struct” created from a table with row identities will contain an identity by definition, whether or not its “identity” is recognized or used by the app.

With that said, I would like to dig a little deeper into the purpose of the structs (current name) returned from the DB from my beginner’s perspective. Will we add any behavior to them? For example, I’ve never hesitated to add additional methods to my Ruby Structs, but they are almost always just accessors that provide alternate representations of the attributes contained in the Struct. Like a Money struct that can return its value as dollars (with decimal cents) or as cents (an integer).
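For concreteness, this is the kind of data-only struct I have in mind, written as a plain Ruby Struct rather than anything ROM- or Hanami-specific. The Price name and its attributes are made up for the example.

```ruby
# Hypothetical data-only struct: it stores one canonical value (integer cents)
# and exposes alternate representations of it through read-only accessors.
Price = Struct.new(:cents, :currency, keyword_init: true) do
  # The same value expressed as dollars with decimal cents.
  def dollars
    cents.fdiv(100)
  end

  # A display-friendly representation; still no business behaviour.
  def formatted
    format("%.2f %s", dollars, currency)
  end
end

price = Price.new(cents: 1250, currency: "USD")
price.dollars   # 12.5
price.formatted # "12.50 USD"
```

Nothing here changes state or makes decisions; the methods only re-present attributes the struct already holds.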

As soon as I add any behavior (like addition or subtraction or exchange for Money–maybe not great examples :pensive:), then I would make it a Class. In this case, I would think of the Money Class as a value type or value object. In my thinking, entities combine the concepts of identity, one or more value types (or primitives, if you prefer), and behavior.

I am just beginning to explore this idea of Values and Entities in my current Hanami project, and I confess that I don’t fully appreciate them yet, but it occurs to me that if the objects returned by the DB are just tuples of attributes that aren’t intended to encapsulate any behavior, then maybe struct or record is a better name regardless of whether they contain an identity, and we can reserve Value and Entity for objects that encapsulate behavior and are constructed with (or passed) their matching struct.

Hanami philosophy around app

As you point out, app already does double-duty as both the “starter” slice and the container for framework-level constructs needed by all slices (should a project implement slices). I would point out for others a third important difference (that I’ve raised elsewhere): that app is not permitted to import components from other slices. This is another important distinction. (While totally off topic for this 2.2 discussion, I yearn for this to be changed so that I can continue to use app to contain my actions and views, and reserve slices for my domain logic alone. A topic for another day . . . )

A closing offer

I hope the above is helpful. As soon as I post this, I will post a separate proposal: I would like to issue an open invitation to pair with anyone working on any Hanami issue, whether it be documentation, or any of the gems. I would like to contribute more, but I have little experience and no idea of where or how to start. I’ll put full info in the other post.

Thank You!

I think this is pretty explicit already simply by virtue of the DAG structure of Slices.

Relations are the objects that represent your database objects. They define what tables you access and how the data types are coerced into Ruby types, and how they relate to one another.

If multiple Slices share the same DB data, then they will either 1) share the literal Relation object via a parent Slice, or 2) define their own relationship to this data separately. I can see arguments for both.

I model all my data as use-cases. This avoids a common pitfall of Rails models where they accumulate data from all use-cases and you’re always generally exposed to this expansion in role. This happens because the Active Record pattern defines one class per DB table.

The major conceptual difference with ROM is that you should always model your data to the specific use it’s being applied. So even though it might be expedient to colocate certain data in the DB, this is entirely abstracted from the application layer. Thus the locality of this data can change without impacting business logic.

While many ORMs focus on objects and state tracking, ROM focuses on data and transformations. Users of ROM implement Relations, which give access to data. Then using the relations you can associate them with other relations and query the data using features offered by the datastore. Once raw data has been loaded, it gets coerced into configured data types, and from there can be mapped into whatever format is needed by the application domain, including plain ruby hashes, convenient ROM Structs, or custom objects.
The important concept above is that during the entire process there is no dirty tracking, no identity management, and no mutable state. Just pure data being loaded and mapped as a result of a direct request made from the application domain. Data can be persisted in ways that take advantage of the features provided by the datastore, and the application domain can receive that data in any form it needs.

rom-rb.org: “Why use ROM”

Funny you should mention this, because cross-datasource joins are an explicit design built into ROM.

It’s under-documented, and I don’t have practical experience doing it so it may not have good ergonomics. But the foundation is there to make this possible.

The advantage of using schemas would be total isolation without needing separate connections. (This is implementation-specific, of course. I am referring to PostgreSQL) Database access is handled at the connection level so in order to connect to multiple logical Databases you need multiple connection pools.

The downside is that querying across schemas is only slightly easier; you can’t write a SQL query that will access all schemas at once, you have to manually UNION it all, or do it in the app layer. I think it would be less effort to combine the data in ROM, in which case schemas don’t benefit you here.

This is the use-case for schemas that makes the most sense to me: you need strict data isolation between slices but not process isolation. In my (admittedly weird) case I am running slices as separate API services so we just define separate logical databases for them and use entirely different credentials.

The current way to vary Database connections by Slice is with SLICE_NAME__DATABASE_URL, and you are capable of changing the username per slice there if you need. Overriding specific parts of the URI in a configurable way would require a new concept that doesn’t exist in the plan currently, and it would definitely increase the complexity of this.
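A minimal sketch of that lookup, as I understand the plan: the helper name slice_database_url is hypothetical, and falling back to the plain DATABASE_URL when no slice-prefixed variable is set is an assumption on my part.

```ruby
# Hypothetical lookup (not Hanami's implementation): prefer the slice-prefixed
# variable, then fall back to the app-wide DATABASE_URL.
def slice_database_url(slice_name, env = ENV)
  prefix = slice_name.to_s.upcase
  env["#{prefix}__DATABASE_URL"] || env["DATABASE_URL"]
end

# Example environment: the billing slice gets its own credentials and database.
example_env = {
  "DATABASE_URL" => "postgres://app@localhost/bookshelf",
  "BILLING__DATABASE_URL" => "postgres://billing@localhost/bookshelf_billing"
}

slice_database_url(:billing, example_env) # the billing-specific URL
slice_database_url(:admin, example_env)   # falls back to DATABASE_URL
```

Since the whole URL is replaced per slice, varying only the username already works today; it just means repeating the rest of the URL in each variable.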

My personal rule of thumb here: ROM structs are data only and do not contain business logic. This means that methods on the struct are limited to just presenting the data. For instance, I have a Certificate struct that represents an X.509 cert. I have a method on the struct that provides an OpenSSL::X509::Certificate from the Base64-encoded data. I have another method that extracts the expiration data and presents it as a Ruby Time object.

Your example of providing currency values in different denominations is a good abstract example of this, although in practice I would always do it with the money gem instead by transforming unit + currency into a Money object.


MINI RANT: Seriously, you need to blog. You have a massive amount of practical experience that I would love to learn from. Example: you mentioned your Either helper in passing in the dry-rb Discourse, and I had to Google to find more complete information in the Rails Discourse, and then just last week you dropped it into a convo here–as in, here in this Discourse. :face_with_monocle: Just write a blog, man. Or a Udemy course or something. If you don’t want the hassle, I’m down to help. Seriously. Rant over.

I would love to learn more about the way you model data. I relate what you are saying about ActiveRecord to what Scott Bellware (of Eventide) calls the “monolithic database” (not a compliment). But I’m not clear on how the practices you describe help to avoid potential database coupling across slices (not just down the acyclic graph) if a single database is available to all slices without additional guardrails.

I was aware of this in the docs, but I haven’t used it at all or looked at the code. I’m guessing that it’s implemented in ROM, rather than the DBMS (Postgres needs a specific extension, postgres_fdw, installed to do cross-database joins). Schemas let you do this all in one database. Who knows, in this scenario, maybe cross-schema queries could be implemented through this feature in ROM.

I hadn’t considered the effect of the schema approach on connection pools. It sounds like you are saying it would be a good thing?

This I don’t understand. I’m talking about doing JOINs across tables, not UNIONs. I meant tables in one slice/schema that hold primary key references to tables in another slice/schema. I expect this to be a common occurrence in my projects. With the proper permissions, a “report” or “super” user could perform such joins across schema.

I haven’t actually used schemas (because all of my previous Postgres experience was through ActiveRecord), but they look like just the right tool here. This is from the Postgres docs:

Unlike databases, schemas are not rigidly separated: a user can access objects in any of the schemas in the database they are connected to, if they have privileges to do so.

There are several reasons why one might want to use schemas:

  • To allow many users to use one database without interfering with each other.
  • To organize database objects into logical groups to make them more manageable.
  • Third-party applications can be put into separate schemas so they do not collide with the names of other objects.

Schemas are analogous to directories at the operating system level, except that schemas cannot be nested.

Referencing schema objects (like tables) is as easy as using a “qualified name consisting of the schema name and table name separated by a dot.”

In fact, we could even avoid the inconvenient dot references to schemas by exploiting Postgres’ Schema Search Path, which always looks first for a schema with the same name as the current DB user. This implies a simple convention that we could exploit:

Slice Name == Schema Name == Slice DB Username.

Nice! The “app” DB user could retain all of its super-user privileges–and even create the Slice DB User automatically when generating a slice. The “app” DB user, or a similar higher-privileged user would retain the ability to query across schemas to aggregate data for whatever business purpose arises.
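To sketch what a slice generator could emit under this convention (purely hypothetical, nothing like this exists in Hanami today): the helper name slice_schema_sql and the reporting role are made up, and the statements assume PostgreSQL with its default search path of "$user", public.

```ruby
# Hypothetical helper: build the PostgreSQL statements for the
# "slice name == schema name == slice DB username" convention.
def slice_schema_sql(slice_name, reporting_role: "reporting")
  name = slice_name.to_s

  # Identifiers are interpolated directly, so only accept simple names.
  unless name.match?(/\A[a-z_][a-z0-9_]*\z/)
    raise ArgumentError, "unsafe identifier: #{name.inspect}"
  end

  [
    # A login role named after the slice.
    "CREATE ROLE #{name} LOGIN;",
    # A schema of the same name, owned by that role, so the default
    # search path resolves unqualified table names to it first.
    "CREATE SCHEMA #{name} AUTHORIZATION #{name};",
    # Read-only access for a higher-level reporting role. This covers
    # existing tables only; future tables would need default privileges.
    "GRANT USAGE ON SCHEMA #{name} TO #{reporting_role};",
    "GRANT SELECT ON ALL TABLES IN SCHEMA #{name} TO #{reporting_role};"
  ]
end

puts slice_schema_sql(:billing)
```

The reporting role could then join across slices with qualified names such as billing.invoices and accounts.users, while each slice user only sees its own schema by default.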

Lol, yes you are right of course. You’re not the first to tell me this. Currently experiencing decision paralysis about how to publish, and not having any design chops whatsoever but enough taste to care about it.

See also Sandi Metz: Death Star Anti-Pattern

Complexity can be like a gravity well in your codebase that accumulates more complexity faster than you can refactor it. The solution to this problem is the Bounded Context idea from DDD, and Slices are a very nice implementation of this.

I previously tried this with Rails Engines and… it works but it’s extremely hard to do. Packwerk is another attempt.

My approach to modeling data is not unique, it’s basically just following the advice from @solnic on data-oriented behavior. He is working on an online course on the subject.
