Database Migrations with Alembic#
We use Alembic for database migrations. Alembic compares the database schema defined in the models to the current state of the database and generates revision files containing the changes needed to update the schema.
Docs: https://alembic.sqlalchemy.org/en/latest/index.html
Migration Workflow (Separated: DEV Test and PROD Deploy)#
Warning
Be extremely careful to never mix up DEV and PROD.
DATABASE_URL is what determines which database you are operating on.
Before any command that can change schema/data, verify:
1) DATABASE_URL points to the intended RDS instance (DEV or PROD)
2) alembic current matches the intended environment
A) DEV Migration + Testing (Steps 1–6)#
Make your changes to the models
Make your desired changes to the models in
sds_data_manager/lambda_code/SDSCode/database/models.py.Create a revision (DEV ONLY)
This compares the database RDS instance in aws to the DEV schema and generates a migration file with
upgradeanddowngrade.Important
Never create revisions on PROD.
export DATABASE_URL=dev_database_url alembic revision --autogenerate -m "description"
This generates a new file in
alembic/versions/. Always review the file before applying — Alembic cannot detect all changes. See what it misses here: https://alembic.sqlalchemy.org/en/latest/autogenerate.html#what-does-autogenerate-detect-and-what-does-it-not-detectPreview the SQL (Dry Run)
To see the SQL that would be run without actually applying it:
export DATABASE_URL=dev_database_url alembic upgrade head --sql
Apply migration to DEV
export DATABASE_URL=dev_database_url alembic current alembic upgrade head
Test on DEV
- Test your changes thoroughly to ensure the migration works as expected. Connect to the database on DataGrip and
verify the schema changes.
Downgrade DEV to verify downgrade path
export DATABASE_URL=dev_database_url alembic downgrade -1
Make sure the downgrade works as expected.
B) PROD Verification + Deployment (Steps 7–10)#
Commit and push migration file + code changes
Open PR, review, approve, merge
First re-check DEV after merge:
export DATABASE_URL=dev_database_url alembic current alembic upgrade head
Then repeat Step 3 (Dry Run), but on PROD:
Warning
PROD safety gate (must all be true): -
DATABASE_URL=prod_database_url- URL host/DB clearly match the PROD RDS instance -alembic currentis the expected PROD revisionexport DATABASE_URL=prod_database_url alembic current alembic upgrade head --sql alembic upgrade head
Post-deploy monitoring (DEV + PROD)
Confirm both environments are healthy and at expected revision: run
alembic currentand functional checks.
Setup (This is performed once)#
Ensure you are in the root of the sdc-data-manager repo.
alembic init alembic
Edit alembic/env.py and add the following at the top:
# Override the URL from environment variable
# Do not hardcode the DATABASE_URL in the config file for security reasons
config.set_main_option("sqlalchemy.url", os.environ["DATABASE_URL"])
Import the database schema and point target_metadata to Base.metadata:
from sds_data_manager.lambda_code.SDSCode.database.models import Base
target_metadata = Base.metadata
Connecting to the Database#
The DATABASE_URL follows this format:
driver://username:password@host:port/database_name
We have different databases for DEV and PROD, so you will need to set the DATABASE_URL environment variable before
running any Alembic commands.
To find the credentials:
1. Log into AWS account (dev or prod)
2. Go to Secrets Manager → Secrets → sdp-database-cred
3. Click Retrieve secret value
4. Construct the URL from those values
5. Verify the URL points to the intended RDS instance (DEV vs PROD)
Export the URL:
export DATABASE_URL=postgresql://username:password@host:port/database_name
Test the connection:
alembic current
Warning
Always double check which database your DATABASE_URL is pointing to before running any commands.
DATABASE_URL is the environment boundary between DEV and PROD.
Always test on DEV first before applying to PROD.
Downtime Considerations#
Most schema changes in PostgreSQL are non-blocking, but some operations will lock tables and cause downtime:
Adding a NOT NULL column without a default — locks the table
Dropping a column — locks the table briefly
Renaming a column or table — locks the table
Adding a non-concurrent index — locks the table for writes
Changing a column type — locks the table and may require a full rewrite
To minimize downtime on PROD, consider running heavy migrations during off-hours or using CONCURRENTLY
for index operations where possible.
Troubleshooting#
Can’t locate revision identified by ‘xxxx’#
The database has a revision recorded that no longer has a corresponding file. This usually means a migration file was deleted after being applied. Connect directly to the database and clear the version table:
DELETE FROM alembic_version;
Then stamp to a known good revision:
alembic stamp head
Target database is not up to date#
There are pending migrations. Run alembic current to see where the DB is, then alembic upgrade head
to apply.