Why Deterministic Seeding Matters
Random test data is the enemy of reproducible tests. When your seed data changes on every run, you get:
- Flaky tests — assertions that pass on Tuesday and fail on Wednesday - Debugging nightmares — "it worked on my machine" because the seed was different - Snapshot mismatches — UI tests that break because the mock name changed from "John" to "Maria"
Deterministic seeding means: given the same schema and the same seed value, you get the exact same data every time. Same names, same emails, same foreign key relationships. Your tests become rock-solid and your CI pipeline stops lying to you.
Schema-First Approach
The Antimass Labs SQL Seeder takes a schema-first approach. You define your table structure, and it generates valid INSERT INTO statements that respect your constraints.
Let's start with a realistic e-commerce schema:
CREATE TABLE categories (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
slug VARCHAR(100) UNIQUE NOT NULL
);
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(200) NOT NULL,
price DECIMAL(10, 2) NOT NULL CHECK (price > 0),
category_id INTEGER REFERENCES categories(id),
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
product_id INTEGER REFERENCES products(id),
quantity INTEGER NOT NULL CHECK (quantity > 0),
total DECIMAL(10, 2) NOT NULL,
status VARCHAR(20) DEFAULT 'pending',
ordered_at TIMESTAMP DEFAULT NOW()
);Generated INSERT Statements
Feed this schema into the SQL Seeder tool. It analyzes your foreign key relationships and generates inserts in the correct dependency order — categories first, then products, then orders:
-- Categories (no dependencies)
INSERT INTO categories (id, name, slug) VALUES
(1, 'Electronics', 'electronics'),
(2, 'Furniture', 'furniture'),
(3, 'Clothing', 'clothing');
-- Products (depends on categories)
INSERT INTO products (id, name, price, category_id, created_at) VALUES
(1, 'Wireless Headphones', 79.99, 1, '2026-01-15T10:30:00Z'),
(2, 'Standing Desk', 599.99, 2, '2026-01-16T14:20:00Z'),
(3, 'Mechanical Keyboard', 149.99, 1, '2026-01-17T09:15:00Z'),
(4, 'Cotton T-Shirt', 29.99, 3, '2026-01-18T11:45:00Z'),
(5, 'Ergonomic Chair', 449.99, 2, '2026-01-19T16:00:00Z');
-- Orders (depends on products)
INSERT INTO orders (id, product_id, quantity, total, status, ordered_at) VALUES
(1, 1, 2, 159.98, 'completed', '2026-02-01T08:00:00Z'),
(2, 3, 1, 149.99, 'pending', '2026-02-02T12:30:00Z'),
(3, 5, 1, 449.99, 'shipped', '2026-02-03T15:45:00Z');Foreign Key Resolution
The SQL Seeder handles the most error-prone part of database seeding: foreign key resolution. When it generates a value for products.category_id, it only picks from IDs that exist in the categories table. This guarantees:
- No orphaned records - No constraint violations - No manual cross-referencing
The algorithm walks your schema's dependency graph (a topological sort of the foreign key references) and generates tables in the correct order. Circular references are detected and flagged.
Using the Generated SQL
Copy the generated SQL and pipe it directly into your Postgres instance:
# Direct pipe
psql -h localhost -U postgres -d testdb < seed.sql
# Or use it in a migration script
cat << 'EOF' > migrations/seed_test_data.sql
BEGIN;
-- Paste generated SQL here
COMMIT;
EOF
# Run in your CI pipeline
psql $DATABASE_URL -f migrations/seed_test_data.sqlDeterministic Seeds in CI/CD
For true determinism, use the same seed value across runs. The SQL Seeder's seed parameter ensures byte-identical output:
1. Generate your seed SQL once using a fixed seed value
2. Commit the .sql file to your repository
3. Run it as a pre-test step in your CI pipeline
This gives you reproducible test fixtures that any developer on your team can replicate locally. No more "works on my machine" — the data is version-controlled alongside your tests.
Combined with the Data Generator for generating the initial fixture data, you get a complete pipeline: generate realistic data → export as schema-aware SQL → seed your test database.