EctoSparkles.DataMigration behaviour (Bonfire v0.9.10-classic-beta.169)
A behaviour implemented by our data migrations (generally backfills).
Based on "A microframework for backfill migrations in Elixir's Ecto", which is in turn based on David Bernheisel's template for deterministic backfills.
A data migration using this behaviour may look like this (you can simply put it among your Ecto migrations, e.g. priv/repo/migrations/20231019004944_data_onboarding_step.exs):
defmodule MyApp.Repo.Migrations.BackfillOnboardingStep do
  alias EctoSparkles.DataMigration
  use DataMigration

  @impl DataMigration
  def base_query do
    # NOTE: This works in cases where:
    # 1. The data can be queried with a condition that no longer applies after
    #    the migration has run, so you can repeatedly query and update the data
    #    until the query result is empty. For example, if a column is currently
    #    null and will be updated to not be null, then you can query for the
    #    null records and pick up where you left off.
    # 2. The migration is written in such a way that it can be run several
    #    times on the same data without causing data loss or duplication (or
    #    crashing).
    from(u in "users", # Notice how we do not use Ecto schemas here.
      where: is_nil(u.onboarding_step),
      select: %{id: u.id}
    )
  end

  @impl DataMigration
  def config do
    %DataMigration.Config{batch_size: 100, throttle_ms: 1_000, repo: MyApp.Repo}
  end

  @impl DataMigration
  def migrate(results) do
    Enum.each(results, fn %{id: user_id} ->
      # Hooks into a context module, which is more likely to be kept up to date
      # as the app evolves, to avoid having to update old migrations.
      user_id
      |> MyApp.Users.set_onboarding_step!()
    end)
  end
end
Callbacks
@callback base_query() :: Ecto.Query.t()
The core of the query you want to use to SELECT a map of your data.
The DataMigration.Runner will take care of limiting this to a batch size, ordering it by row ID, and restricting it to rows you haven't yet handled.
The query must select a map, and that map must have an :id key for the migration runner to reference as the last-modified row in your table.
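To make the batching contract above concrete, here is a minimal sketch of a keyset-style loop in plain Elixir. It is not the EctoSparkles runner: the module `BatchLoopSketch`, its functions, and the in-memory `fetch_from/1` helper are hypothetical stand-ins, and the real runner presumably applies the equivalent of `WHERE id > last_id ORDER BY id LIMIT batch_size` to your `base_query/0` through Ecto.

```elixir
defmodule BatchLoopSketch do
  @moduledoc false
  # Hypothetical illustration only; none of these names come from EctoSparkles.

  # `fetch` takes (last_id, batch_size) and returns rows (maps with an :id key)
  # ordered by id. `migrate` is called once for every non-empty batch.
  def run(fetch, migrate, batch_size, throttle_ms \\ 0, last_id \\ 0) do
    case fetch.(last_id, batch_size) do
      [] ->
        # Nothing left to handle: the migration is complete.
        :ok

      rows ->
        migrate.(rows)
        # Pause between batches, mirroring the `throttle_ms` config option.
        Process.sleep(throttle_ms)
        # The :id of the last row becomes the cursor for the next batch.
        run(fetch, migrate, batch_size, throttle_ms, List.last(rows).id)
    end
  end

  # In-memory equivalent of: WHERE id > last_id ORDER BY id LIMIT batch_size
  def fetch_from(rows) do
    fn last_id, batch_size ->
      rows
      |> Enum.filter(fn row -> row.id > last_id end)
      |> Enum.sort_by(fn row -> row.id end)
      |> Enum.take(batch_size)
    end
  end
end
```

The sketch shows why the selected map needs an `:id` key: the loop uses the last row's id as its cursor, so without it there would be no way to skip rows that were already handled.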
@callback config() :: EctoSparkles.DataMigration.Config.t()
The migrate/1 callback operates on a result set from your query.
Implementations should raise an error if they are unable to process the batch.
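As an illustration of raising when a batch cannot be processed, the following sketch wraps a fallible update in a function that raises on the first failure. The module `MigrateSketch` and the `update_fun` argument are hypothetical; a real `migrate/1` implementation takes only the result list and would call your own context function directly.

```elixir
defmodule MigrateSketch do
  @moduledoc false
  # Hypothetical illustration only. `update_fun` stands in for a context
  # function that returns {:ok, term} or {:error, reason} for a row id.
  def migrate(results, update_fun) do
    Enum.each(results, fn %{id: id} ->
      case update_fun.(id) do
        {:ok, _value} ->
          :ok

        {:error, reason} ->
          # Raising aborts the batch instead of silently skipping the row.
          raise "failed to migrate row #{id}: #{inspect(reason)}"
      end
    end)
  end
end
```

Because the base query only selects rows that still need work, a run that stops on a raised error can be retried later without reprocessing rows that were already updated, provided the update itself is safe to repeat.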