How to Seed Postgres Test Data Deterministically

Generate deterministic INSERT statements for PostgreSQL test databases. Learn schema-first seeding with foreign key resolution, realistic data types, and reproducible test fixtures using the Antimass Labs SQL Seeder.

Why Deterministic Seeding Matters

Random test data is the enemy of reproducible tests. When your seed data changes on every run, you get:

- Flaky tests — assertions that pass on Tuesday and fail on Wednesday - Debugging nightmares — "it worked on my machine" because the seed was different - Snapshot mismatches — UI tests that break because the mock name changed from "John" to "Maria"

Deterministic seeding means: given the same schema and the same seed value, you get the exact same data every time. Same names, same emails, same foreign key relationships. Your tests become rock-solid and your CI pipeline stops lying to you.

Schema-First Approach

The Antimass Labs SQL Seeder takes a schema-first approach. You define your table structure, and it generates valid INSERT INTO statements that respect your constraints.

Let's start with a realistic e-commerce schema:

sql
CREATE TABLE categories (
  id SERIAL PRIMARY KEY,
  name VARCHAR(100) NOT NULL,
  slug VARCHAR(100) UNIQUE NOT NULL
);

CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  name VARCHAR(200) NOT NULL,
  price DECIMAL(10, 2) NOT NULL CHECK (price > 0),
  category_id INTEGER REFERENCES categories(id),
  created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE orders (
  id SERIAL PRIMARY KEY,
  product_id INTEGER REFERENCES products(id),
  quantity INTEGER NOT NULL CHECK (quantity > 0),
  total DECIMAL(10, 2) NOT NULL,
  status VARCHAR(20) DEFAULT 'pending',
  ordered_at TIMESTAMP DEFAULT NOW()
);

Generated INSERT Statements

Feed this schema into the SQL Seeder tool. It analyzes your foreign key relationships and generates inserts in the correct dependency order — categories first, then products, then orders:

sql
-- Categories (no dependencies)
INSERT INTO categories (id, name, slug) VALUES
  (1, 'Electronics', 'electronics'),
  (2, 'Furniture', 'furniture'),
  (3, 'Clothing', 'clothing');

-- Products (depends on categories)
INSERT INTO products (id, name, price, category_id, created_at) VALUES
  (1, 'Wireless Headphones', 79.99, 1, '2026-01-15T10:30:00Z'),
  (2, 'Standing Desk', 599.99, 2, '2026-01-16T14:20:00Z'),
  (3, 'Mechanical Keyboard', 149.99, 1, '2026-01-17T09:15:00Z'),
  (4, 'Cotton T-Shirt', 29.99, 3, '2026-01-18T11:45:00Z'),
  (5, 'Ergonomic Chair', 449.99, 2, '2026-01-19T16:00:00Z');

-- Orders (depends on products)
INSERT INTO orders (id, product_id, quantity, total, status, ordered_at) VALUES
  (1, 1, 2, 159.98, 'completed', '2026-02-01T08:00:00Z'),
  (2, 3, 1, 149.99, 'pending', '2026-02-02T12:30:00Z'),
  (3, 5, 1, 449.99, 'shipped', '2026-02-03T15:45:00Z');

Foreign Key Resolution

The SQL Seeder handles the most error-prone part of database seeding: foreign key resolution. When it generates a value for products.category_id, it only picks from IDs that exist in the categories table. This guarantees:

- No orphaned records - No constraint violations - No manual cross-referencing

The algorithm walks your schema's dependency graph (a topological sort of the foreign key references) and generates tables in the correct order. Circular references are detected and flagged.

Using the Generated SQL

Copy the generated SQL and pipe it directly into your Postgres instance:

bash
# Direct pipe
psql -h localhost -U postgres -d testdb < seed.sql

# Or use it in a migration script
cat << 'EOF' > migrations/seed_test_data.sql
BEGIN;
-- Paste generated SQL here
COMMIT;
EOF

# Run in your CI pipeline
psql $DATABASE_URL -f migrations/seed_test_data.sql

Deterministic Seeds in CI/CD

For true determinism, use the same seed value across runs. The SQL Seeder's seed parameter ensures byte-identical output:

1. Generate your seed SQL once using a fixed seed value 2. Commit the .sql file to your repository 3. Run it as a pre-test step in your CI pipeline

This gives you reproducible test fixtures that any developer on your team can replicate locally. No more "works on my machine" — the data is version-controlled alongside your tests.

Combined with the Data Generator for generating the initial fixture data, you get a complete pipeline: generate realistic data → export as schema-aware SQL → seed your test database.

Try the SQL SeederOpen the tool and start using it — no signup required.
← Back to all tutorials