Database Schema
OneSelect uses SQLite as its database backend. The schema is managed through Alembic migrations and consists of four main tables for managing users, projects, features, and comparisons.
Overview
The database schema supports: - User authentication (local and OAuth) - Multi-project organization - Bayesian scoring for features in two dimensions (complexity and value) - Binary and graded comparison modes - Soft deletion for comparisons (audit trail)
Database Initialization Process
OneSelect uses a manual initialization approach that requires explicit setup before first use. The database is not automatically created when the application starts - this provides better control over schema management and prevents accidental database creation in production environments.
How It Works:
- Schema Creation (Alembic Migrations)
- The database schema is defined in SQLAlchemy models (
app/models/) - Alembic migrations in
alembic/versions/contain the SQL operations to create tables - Running
poetry run alembic upgrade headexecutes all migrations and creates the database file - If the database file doesn't exist, Alembic creates it and applies all migrations from scratch
-
If the database exists, Alembic only applies new migrations that haven't been run yet
-
Initial Data Population
- After schema creation,
app/initial_data.pycreates the default admin user - The script checks if an admin user already exists to avoid duplicates
-
This is a separate step because schema and seed data have different lifecycles
-
First-Time Setup Workflow
Local Development:
# 1. Install dependencies
poetry install
# 2. Create database schema (creates oneselect.db if it doesn't exist)
poetry run alembic upgrade head
# 3. Create admin user
poetry run python app/initial_data.py
# 4. Start the application
poetry run uvicorn app.main:app --reload
Docker/Container Deployment:
# 1. Build container image
make container-build
# 2. Start services (database is created in /app/data/oneselect.db volume)
make container-up
# 3. Run migrations inside container
podman exec oneselect-backend poetry run alembic upgrade head
# 4. Create admin user inside container
podman exec oneselect-backend poetry run python app/initial_data.py
Note: Container deployments may use an entrypoint script to automate steps 3-4 on first launch.
- Application Behavior Without Database
- If you start the application (
uvicorn app.main:app) without creating the database first, API requests will fail with database connection errors - The application does not automatically run migrations on startup
-
This design prevents accidental schema changes in production
-
Detecting Database State
- No database file: Alembic will create a new SQLite file when running
upgrade head - Database exists, no schema: Alembic creates the
alembic_versiontable and applies all migrations - Database exists with old schema: Alembic reads
alembic_version, determines which migrations are pending, and applies only those -
Database fully up-to-date: Alembic reports "No migrations to apply"
-
Zero-Downtime Updates
Why Manual Initialization?
- Safety: Prevents accidental database creation or schema changes in production
- Control: Explicit migration step makes schema changes visible in deployment pipelines
- Flexibility: Allows dry-run testing with
alembic upgrade head --sql - Audit Trail: Clear record of when and how the database was initialized
- Zero-Trust: Application never assumes database structure - migrations must be run explicitly
Default Credentials (created by app/initial_data.py):
- Username: admin
- Password: admin (configurable via FIRST_SUPERUSER_PASSWORD environment variable)
- Email: admin@example.com (configurable via FIRST_SUPERUSER environment variable)
Entity Relationship Diagram
┌─────────────┐
│ users │
└──────┬──────┘
│
│ owner_id
│
▼
┌─────────────┐ ┌──────────────┐
│ projects │◄────────┤ features │
└──────┬──────┘ └──────┬───────┘
│ │
│ │
│ project_id │ feature_a_id
│ │ feature_b_id
▼ ▼
┌──────────────────────────────┐
│ comparisons │
└──────────────────────────────┘
Tables
users
Stores user accounts with support for local authentication and Google OAuth.
| Column | Type | Nullable | Default | Description |
|---|---|---|---|---|
id |
STRING (UUID) | NO | - | Primary key |
username |
STRING | NO | - | Unique username (used for login) |
email |
STRING | YES | - | User email (unique) |
hashed_password |
STRING | YES | - | Bcrypt hashed password (NULL for OAuth users) |
is_active |
BOOLEAN | YES | 1 |
Account active status |
is_superuser |
BOOLEAN | YES | 0 |
Admin privileges flag |
role |
STRING | YES | 'user' |
User role ('user' or 'root') |
display_name |
STRING | YES | - | Optional display name |
avatar_url |
STRING | YES | - | URL to user avatar image |
google_id |
STRING | YES | - | Google OAuth user ID (unique) |
auth_provider |
STRING | YES | 'local' |
Authentication provider ('local' or 'google') |
Indexes:
- Primary key on id
- Unique index on username
- Unique index on email
- Unique index on google_id
Notes:
- OAuth users have auth_provider='google', google_id set, and hashed_password NULL
- Local users have auth_provider='local' and hashed_password set
projects
Stores projects that contain features to be compared.
| Column | Type | Nullable | Default | Description |
|---|---|---|---|---|
id |
STRING (UUID) | NO | - | Primary key |
name |
STRING | NO | - | Project name |
description |
STRING | YES | - | Project description |
created_at |
DATETIME(TZ) | YES | CURRENT_TIMESTAMP |
Creation timestamp |
owner_id |
STRING (UUID) | NO | - | Foreign key to users.id |
total_comparisons |
INTEGER | NO | 0 |
Total number of comparisons made |
complexity_avg_variance |
FLOAT | NO | 1.0 |
Average variance across complexity scores |
value_avg_variance |
FLOAT | NO | 1.0 |
Average variance across value scores |
comparison_mode |
STRING | NO | 'binary' |
Comparison mode ('binary' or 'graded') |
Indexes:
- Primary key on id
- Index on name
Constraints:
- Foreign key: owner_id → users.id
- Check constraint: comparison_mode IN ('binary', 'graded')
Notes:
- comparison_mode='binary': Simple A vs B vs Tie comparisons
- comparison_mode='graded': 5-point scale (a_much_better, a_better, equal, b_better, b_much_better)
features
Stores features/requirements to be compared within a project.
| Column | Type | Nullable | Default | Description |
|---|---|---|---|---|
id |
STRING (UUID) | NO | - | Primary key |
name |
STRING | NO | - | Feature name |
description |
STRING | YES | - | Feature description |
project_id |
STRING (UUID) | NO | - | Foreign key to projects.id |
tags |
JSON | YES | - | Array of tag strings |
complexity_mu |
FLOAT | NO | 0.0 |
Bayesian mean score for complexity dimension |
complexity_sigma |
FLOAT | NO | 1.0 |
Bayesian standard deviation for complexity |
value_mu |
FLOAT | NO | 0.0 |
Bayesian mean score for value dimension |
value_sigma |
FLOAT | NO | 1.0 |
Bayesian standard deviation for value |
created_at |
DATETIME(TZ) | YES | CURRENT_TIMESTAMP |
Creation timestamp |
updated_at |
DATETIME(TZ) | YES | - | Last update timestamp |
Indexes:
- Primary key on id
- Index on name
Constraints:
- Foreign key: project_id → projects.id
Notes:
- Bayesian scores: mu represents the estimated score, sigma represents uncertainty
- Lower sigma = higher confidence in the score
- Scores are updated after each comparison using Bradley-Terry model
comparisons
Stores pairwise comparison results between features.
| Column | Type | Nullable | Default | Description |
|---|---|---|---|---|
id |
STRING (UUID) | NO | - | Primary key |
project_id |
STRING (UUID) | NO | - | Foreign key to projects.id |
feature_a_id |
STRING (UUID) | NO | - | Foreign key to features.id (first feature) |
feature_b_id |
STRING (UUID) | NO | - | Foreign key to features.id (second feature) |
choice |
STRING | NO | - | Comparison result ('feature_a', 'feature_b', or 'tie') |
dimension |
STRING | NO | - | Comparison dimension ('complexity' or 'value') |
strength |
STRING | YES | - | Graded comparison strength (NULL for binary) |
user_id |
STRING (UUID) | YES | - | Foreign key to users.id (who made comparison) |
created_at |
DATETIME(TZ) | YES | CURRENT_TIMESTAMP |
Creation timestamp |
deleted_at |
DATETIME(TZ) | YES | - | Soft deletion timestamp (NULL if active) |
deleted_by |
STRING (UUID) | YES | - | Foreign key to users.id (who deleted) |
Indexes:
- Primary key on id
- Composite index on (project_id, dimension, deleted_at) for efficient querying
Constraints:
- Foreign key: project_id → projects.id
- Foreign key: feature_a_id → features.id
- Foreign key: feature_b_id → features.id
- Foreign key: user_id → users.id
- Foreign key: deleted_by → users.id
- Check constraint: strength IS NULL OR strength IN ('a_much_better', 'a_better', 'equal', 'b_better', 'b_much_better')
Notes:
- Binary mode: strength is NULL, only choice is used
- Graded mode: strength indicates comparison intensity, choice is derived from strength
- Soft deletion: Comparisons are not physically deleted; deleted_at is set for audit trail
- Strength values:
- a_much_better: Feature A is significantly better (2.0x Bayesian update multiplier)
- a_better: Feature A is better (1.0x multiplier)
- equal: Features are roughly equal (configurable multiplier, default 0.8x)
- b_better: Feature B is better (1.0x multiplier)
- b_much_better: Feature B is significantly better (2.0x multiplier)
Alembic Migrations
The schema is version-controlled using Alembic. The current migration is:
001_init: Initial schema creation with all tables
Running Migrations
# Upgrade to latest schema
poetry run alembic upgrade head
# View migration history
poetry run alembic history
# Rollback one version
poetry run alembic downgrade -1
Common Queries
Get all features with scores for a project
SELECT
f.id,
f.name,
f.description,
ROUND(f.complexity_mu, 3) AS complexity_score,
ROUND(f.complexity_sigma, 3) AS complexity_confidence,
ROUND(f.value_mu, 3) AS value_score,
ROUND(f.value_sigma, 3) AS value_confidence
FROM
features f
WHERE
f.project_id = 'YOUR-PROJECT-ID'
ORDER BY
f.complexity_mu DESC;
Count active comparisons by dimension
SELECT
dimension,
COUNT(*) as comparison_count
FROM
comparisons
WHERE
project_id = 'YOUR-PROJECT-ID'
AND deleted_at IS NULL
GROUP BY
dimension;
Find users with most comparisons
SELECT
u.username,
COUNT(c.id) as comparison_count
FROM
users u
INNER JOIN comparisons c ON u.id = c.user_id
WHERE
c.deleted_at IS NULL
GROUP BY
u.id, u.username
ORDER BY
comparison_count DESC;
Get comparison history with feature names
SELECT
c.created_at,
fa.name as feature_a,
fb.name as feature_b,
c.choice,
c.dimension,
c.strength,
u.username
FROM
comparisons c
INNER JOIN features fa ON c.feature_a_id = fa.id
INNER JOIN features fb ON c.feature_b_id = fb.id
LEFT JOIN users u ON c.user_id = u.id
WHERE
c.project_id = 'YOUR-PROJECT-ID'
AND c.deleted_at IS NULL
ORDER BY
c.created_at DESC
LIMIT 50;
Database File Location
The SQLite database file is located at:
- Development: ./oneselect.db (project root)
- Docker: /app/data/oneselect.db (persisted in volume)
Backup and Maintenance
Manual Backup
# Create backup
sqlite3 oneselect.db ".backup oneselect_backup_$(date +%Y%m%d).db"
# Or use cp
cp oneselect.db oneselect_backup_$(date +%Y%m%d).db
Database Size and Integrity
# Check database size
ls -lh oneselect.db
# Run integrity check
sqlite3 oneselect.db "PRAGMA integrity_check;"
# Optimize database (vacuum)
sqlite3 oneselect.db "VACUUM;"
Performance Considerations
- Indexes: All foreign keys and frequently queried columns are indexed
- Composite Index:
(project_id, dimension, deleted_at)on comparisons table optimizes progress queries - Soft Deletion:
deleted_at IS NULLfilters are fast due to index - JSON Tags: SQLite's JSON1 extension is used for feature tags (available by default in Python's sqlite3)
Future Schema Changes
Alembic supports two approaches for creating migrations: manual and automatic (autogenerate).
Manual Migrations
Manual migrations require you to explicitly specify all schema changes in Python code. This is the default when you create a new migration:
This creates a new file in alembic/versions/ with empty upgrade() and downgrade() functions. You must manually write the schema changes:
def upgrade() -> None:
# Add your schema changes here
op.add_column('features', sa.Column('priority', sa.Integer(), nullable=True))
op.create_index('ix_features_priority', 'features', ['priority'])
def downgrade() -> None:
# Add reverse operations here
op.drop_index('ix_features_priority', table_name='features')
op.drop_column('features', 'priority')
Pros: - Full control over migration operations - Can include data migrations and complex transformations - Explicit and clear what changes are being made
Cons: - Tedious and error-prone for large schema changes - Easy to forget constraints or indexes - Must manually write both upgrade and downgrade operations
Automatic Migrations (Autogenerate)
Alembic can automatically detect schema changes by comparing your SQLAlchemy models against the current database schema:
# Generate migration with automatic change detection
poetry run alembic revision --autogenerate -m "add_priority_field"
How Autogenerate Works:
- Model Inspection: Reads your SQLAlchemy model definitions from
app/models/ - Database Inspection: Connects to the database and reads the current schema using SQLAlchemy's reflection capabilities
- Diff Generation: Compares the two schemas and generates migration operations for detected differences
- Code Generation: Creates a migration file with
upgrade()anddowngrade()functions
What Autogenerate Can Detect:
✅ Detected automatically: - New tables - Removed tables - New columns - Removed columns (if configured) - Column type changes - Column nullable changes - New indexes - Removed indexes
⚠️ Requires manual intervention: - Column renames (appears as drop + add) - Table renames (appears as drop + add) - Changes to check constraints - Changes to column defaults in some cases - Data migrations or transformations - Complex multi-step operations
Important Limitations:
Autogenerate is not perfect and you should always review the generated migration file:
# After running autogenerate, review the file
cat alembic/versions/XXXX_add_priority_field.py
# Edit if needed, then apply
poetry run alembic upgrade head
Example Workflow:
# 1. Update your model in app/models/feature.py
class Feature(Base):
# ... existing fields ...
priority = Column(Integer, nullable=True, index=True) # NEW FIELD
# 2. Generate migration automatically
poetry run alembic revision --autogenerate -m "add_feature_priority"
# 3. Review generated file (alembic/versions/XXXX_add_feature_priority.py)
# The file will contain something like:
def upgrade() -> None:
op.add_column('features', sa.Column('priority', sa.Integer(), nullable=True))
op.create_index(op.f('ix_features_priority'), 'features', ['priority'], unique=False)
def downgrade() -> None:
op.drop_index(op.f('ix_features_priority'), table_name='features')
op.drop_column('features', 'priority')
# 4. If it looks correct, apply it
poetry run alembic upgrade head
Best Practices
- Always Review Autogenerated Migrations: Even with autogenerate, check the generated code before applying
- Use Autogenerate for Simple Changes: Column additions, type changes, new indexes
- Use Manual for Complex Changes: Data migrations, multi-step transformations, renames
- Test Migrations: Run
upgradeanddowngradein a test environment before production - One Logical Change Per Migration: Don't combine unrelated schema changes
- Meaningful Messages: Use descriptive migration messages:
alembic revision -m "add_user_avatar_support"
Common Migration Operations
# Add a column
op.add_column('table_name', sa.Column('column_name', sa.String(), nullable=True))
# Remove a column
op.drop_column('table_name', 'column_name')
# Change column type
op.alter_column('table_name', 'column_name', type_=sa.Integer())
# Make column non-nullable (with default for existing rows)
op.execute("UPDATE table_name SET column_name = 'default' WHERE column_name IS NULL")
op.alter_column('table_name', 'column_name', nullable=False)
# Add an index
op.create_index('ix_table_column', 'table_name', ['column_name'])
# Add a foreign key
op.create_foreign_key('fk_table_other', 'table_name', 'other_table', ['other_id'], ['id'])
# Rename a column (SQLite requires table recreation)
# For SQLite, this is complex - see Alembic docs for batch operations
# Data migration example
op.execute("UPDATE features SET priority = 0 WHERE priority IS NULL")
Configuration for Autogenerate
In alembic/env.py, you can configure what autogenerate detects:
context.configure(
connection=connection,
target_metadata=target_metadata,
compare_type=True, # Detect column type changes
compare_server_default=True, # Detect default value changes
include_schemas=True, # Include schema names
)