Database Seeding

Seeding populates a database with known data so your app has something to work against during development, demos, and automated tests. Good seeds are reproducible, meaning every machine and CI run ends up with the same data, and idempotent, meaning a second run never duplicates rows or fails on unique constraints. This page shows how to build factories with Faker, run seeds through a standalone Nest application context or a CLI command, and keep the whole thing safe to re-run.

Why seed through the application context

You could write raw SQL fixtures, but then you lose your entities, validation, password hashing, and services. The cleaner approach is to boot a standalone Nest application context with NestFactory.createApplicationContext. This wires up dependency injection — repositories, config, providers — without starting an HTTP server, so your seed code looks exactly like your application code.

// src/database/seed.ts
import { NestFactory } from '@nestjs/core';
import { SeedModule } from './seed.module';
import { SeederService } from './seeder.service';

async function bootstrap() {
  const app = await NestFactory.createApplicationContext(SeedModule, {
    logger: ['error', 'warn', 'log'],
  });

  try {
    await app.get(SeederService).run();
    console.log('Seeding complete.');
  } catch (err) {
    console.error('Seeding failed:', err);
    process.exitCode = 1;
  } finally {
    await app.close();
  }
}

bootstrap();

The SeedModule sets up the TypeOrmModule connection and registers the entities you want to populate with TypeOrmModule.forFeature, so their repositories can be injected into the seeder.

// src/database/seed.module.ts
import { Module } from '@nestjs/common';
import { TypeOrmModule } from '@nestjs/typeorm';
import { ConfigModule } from '@nestjs/config';
import { User } from '../users/user.entity';
import { Post } from '../posts/post.entity';
import { SeederService } from './seeder.service';

@Module({
  imports: [
    ConfigModule.forRoot({ isGlobal: true }),
    TypeOrmModule.forRoot({
      type: 'postgres',
      url: process.env.DATABASE_URL,
      entities: [User, Post],
      synchronize: false,
    }),
    TypeOrmModule.forFeature([User, Post]),
  ],
  providers: [SeederService],
})
export class SeedModule {}

Building factories with Faker

A factory is a function that returns an unsaved entity filled with realistic fake data. Centralising creation in factories keeps tests and seeds consistent and makes overriding specific fields trivial.

// src/database/factories/user.factory.ts
import { faker } from '@faker-js/faker';
import { User } from '../../users/user.entity';

export function makeUser(overrides: Partial<User> = {}): User {
  const user = new User();
  user.email = faker.internet.email().toLowerCase();
  user.name = faker.person.fullName();
  user.avatarUrl = faker.image.avatar();
  user.bio = faker.lorem.sentence();
  return Object.assign(user, overrides);
}
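Random emails can occasionally collide on a unique column. One way to reduce that risk, sketched below, is to feed a running sequence number into the defaults so every generated row is distinct while overrides still win. The `defineFactory` helper and the `DemoUser` shape are hypothetical names introduced only for this illustration; they are not part of Nest, TypeORM, or Faker.

```typescript
// Hypothetical helper for illustration; not a library API.
type Factory<T> = (overrides?: Partial<T>) => T;

function defineFactory<T extends object>(defaults: (seq: number) => T): Factory<T> {
  let seq = 0;
  return (overrides: Partial<T> = {}): T => {
    seq += 1;
    // Defaults first, then caller overrides, so pinned fields always win.
    return { ...defaults(seq), ...overrides };
  };
}

// Stand-in shape; a real factory would build the User entity instead.
interface DemoUser {
  email: string;
  name: string;
}

const makeDemoUser = defineFactory<DemoUser>((seq) => ({
  email: `user${seq}@example.com`, // unique per call within this process
  name: `User ${seq}`,
}));

const first = makeDemoUser();
const pinned = makeDemoUser({ email: 'pinned@example.com' });
```

Note that the spread returns a plain object, whereas makeUser above uses Object.assign on a new User() to keep a class instance. If your entity relies on class methods or decorators at runtime, the Object.assign form is likely the safer choice.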

When you need deterministic output across CI runs, seed Faker with a fixed value. The same seed produces the same sequence of data on every run:

import { faker } from '@faker-js/faker';
faker.seed(1337); // reproducible across machines and runs
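To see why a fixed seed gives reproducible data, it can help to look at a tiny seeded generator. The sketch below uses a small mulberry32-style function purely as an illustration; it is not Faker's implementation, and the `take` helper is a hypothetical name. It also shows a practical consequence of any seeded generator: consuming one extra value earlier in the sequence shifts everything after it, so adding a new field to a factory can change the data every later call produces.

```typescript
// Small seeded pseudo-random generator (mulberry32-style), for illustration only.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Hypothetical helper: draw `count` integers in [0, 1000) after skipping `skip` values.
function take(seed: number, count: number, skip = 0): number[] {
  const next = mulberry32(seed);
  for (let i = 0; i < skip; i++) next(); // consume values, as extra generator calls would
  return Array.from({ length: count }, () => Math.floor(next() * 1000));
}

const runA = take(1337, 5);
const runB = take(1337, 5); // same seed, same sequence
const shifted = take(1337, 4, 1); // one extra draw first: the sequence is offset by one
```

Here runA and runB are identical, while shifted matches runA from its second element onward.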

Writing an idempotent seeder service

The seeder is a normal @Injectable() provider. The key technique for idempotency is upsert: insert a row, or update it in place if a row with the same stable unique key already exists. TypeORM’s Repository.upsert handles the fixed admin row, and a simple count guard keeps the bulk demo data from being inserted twice.

// src/database/seeder.service.ts
import { Injectable, Logger } from '@nestjs/common';
import { InjectRepository } from '@nestjs/typeorm';
import { Repository } from 'typeorm';
import { faker } from '@faker-js/faker';
import { User } from '../users/user.entity';
import { Post } from '../posts/post.entity';
import { makeUser } from './factories/user.factory';

@Injectable()
export class SeederService {
  private readonly logger = new Logger(SeederService.name);

  constructor(
    @InjectRepository(User) private readonly users: Repository<User>,
    @InjectRepository(Post) private readonly posts: Repository<Post>,
  ) {}

  async run(): Promise<void> {
    faker.seed(1337);

    // Stable admin row: created once, never duplicated.
    await this.users.upsert(
      { email: 'admin@example.com', name: 'Admin', bio: 'Site owner' }, // placeholder address
      { conflictPaths: ['email'], skipUpdateIfNoValuesChanged: true },
    );

    const existing = await this.users.count();
    if (existing >= 25) {
      this.logger.log(`Already seeded (${existing} users); skipping.`);
      return;
    }

    const fakeUsers = Array.from({ length: 25 - existing }, () => makeUser());
    const saved = await this.users.save(fakeUsers);

    for (const author of saved) {
      const posts = Array.from({ length: 3 }, () =>
        this.posts.create({
          title: faker.lorem.sentence(),
          body: faker.lorem.paragraphs(2),
          author,
        }),
      );
      await this.posts.save(posts);
    }

    this.logger.log(`Seeded ${saved.length} users and ${saved.length * 3} posts.`);
  }
}

Wire it to an npm script so anyone can run it with one command:

npm run seed

Output:

[Nest] LOG [SeederService] Seeded 24 users and 72 posts.
Seeding complete.

Run it a second time and the guard kicks in:

[Nest] LOG [SeederService] Already seeded (25 users); skipping.
Seeding complete.
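The run-it-twice check above can also be automated. The following is a minimal sketch, assuming you can express the state you care about as a JSON-serialisable snapshot; `isIdempotent` is a hypothetical helper, and the in-memory `table` and `log` stand in for real tables so the example runs on its own.

```typescript
// Hypothetical helper: run a seeder twice and compare snapshots of the data.
async function isIdempotent<S>(
  seed: () => Promise<void>,
  snapshot: () => Promise<S>,
): Promise<boolean> {
  await seed();
  const afterFirst = JSON.stringify(await snapshot());
  await seed();
  const afterSecond = JSON.stringify(await snapshot());
  return afterFirst === afterSecond;
}

// In-memory stand-in for a table with a unique email column.
const table = new Map<string, { email: string; name: string }>();

// Upsert-style seed: keyed on email, so a second run overwrites instead of adding.
async function upsertSeed(): Promise<void> {
  table.set('admin@example.com', { email: 'admin@example.com', name: 'Admin' });
}

// Blind insert: appends a new row on every run.
const log: string[] = [];
async function blindSeed(): Promise<void> {
  log.push('admin@example.com');
}

const upsertOk = isIdempotent(upsertSeed, async () => Array.from(table.values()));
const blindOk = isIdempotent(blindSeed, async () => log.slice());
```

In a real test, seed would typically be something like () => seeder.run() and snapshot a function that returns row counts from the repositories.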

Install Faker as a dev dependency and add the seed script to package.json. The script also assumes ts-node and tsconfig-paths are installed:

npm install --save-dev @faker-js/faker

{
  "scripts": {
    "seed": "ts-node -r tsconfig-paths/register src/database/seed.ts"
  }
}

Seeding for automated tests

Tests need a clean, isolated dataset on every run. Truncate the relevant tables, then call the same factories you use elsewhere. Wrapping each test in a transaction that rolls back is even faster, but explicit truncation is the most portable starting point.

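// test/seed-test-db.ts
import { DataSource } from 'typeorm';
import { User } from '../src/users/user.entity';
import { makeUser } from '../src/database/factories/user.factory';

export async function seedTestDb(ds: DataSource): Promise<void> {
  // Table names assume TypeORM's default naming for the Post and User entities.
  await ds.query('TRUNCATE "post", "user" RESTART IDENTITY CASCADE');
  // Passing the entity class keeps the repository typed as Repository<User>.
  const repo = ds.getRepository(User);
  await repo.save([makeUser({ email: 'test@example.com' })]); // placeholder address
}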
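The transaction-per-test alternative mentioned above could look roughly like this. It is a sketch, not a drop-in utility: `withRollback` and `TxRunner` are hypothetical names, and the interface is declared locally so the example type-checks without TypeORM installed. TypeORM's QueryRunner exposes methods with these names, so an instance from ds.createQueryRunner() should satisfy it.

```typescript
// Minimal shape of the transaction methods used below, declared locally.
interface TxRunner {
  connect(): Promise<unknown>;
  startTransaction(): Promise<void>;
  rollbackTransaction(): Promise<void>;
  release(): Promise<void>;
}

// Hypothetical helper: run `body` inside a transaction, then always roll back.
async function withRollback<R extends TxRunner, T>(
  runner: R,
  body: (runner: R) => Promise<T>,
): Promise<T> {
  await runner.connect();
  await runner.startTransaction();
  try {
    return await body(runner);
  } finally {
    try {
      await runner.rollbackTransaction(); // discard everything the test wrote
    } finally {
      await runner.release(); // return the connection even if rollback fails
    }
  }
}

// Fake runner that records calls, so the sketch can run without a database.
const calls: string[] = [];
const fakeRunner: TxRunner = {
  connect: async () => { calls.push('connect'); },
  startTransaction: async () => { calls.push('start'); },
  rollbackTransaction: async () => { calls.push('rollback'); },
  release: async () => { calls.push('release'); },
};

const result = withRollback(fakeRunner, async () => {
  calls.push('body');
  return 42;
});
```

With a real QueryRunner, writes inside the body would need to go through that runner (for example via its manager property) to take part in the transaction; writes made through other connections would not be rolled back.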

Tip: Never run demo or destructive seeders against production. Guard seed.ts with a check such as if (process.env.NODE_ENV === 'production') throw new Error('Refusing to seed production'), and keep destructive truncation out of any code path that can touch a live database.
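The guard in the tip blocks one specific value. A slightly stricter variant, sketched here as an assumption rather than a requirement, is an allow-list: seeding proceeds only when NODE_ENV is explicitly one of the permitted values, so an unset or misspelled environment is refused too. The names `assertSeedingAllowed`, `Env`, and `tryGuard` are hypothetical.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical guard: throws unless NODE_ENV is explicitly in the allow-list.
function assertSeedingAllowed(
  env: Env,
  allowed: string[] = ['development', 'test'],
): void {
  const current = env['NODE_ENV'] ?? '';
  if (!allowed.includes(current)) {
    throw new Error(`Refusing to seed: NODE_ENV is "${current || 'unset'}"`);
  }
}

// Small wrapper so the outcomes can be inspected without crashing the example.
function tryGuard(env: Env): string {
  try {
    assertSeedingAllowed(env);
    return 'allowed';
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}

const devResult = tryGuard({ NODE_ENV: 'development' });
const prodResult = tryGuard({ NODE_ENV: 'production' });
const unsetResult = tryGuard({});
```

In seed.ts this would be called as assertSeedingAllowed(process.env) before the application context is created, so a refused run never opens a database connection.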

Seed strategies compared

Approach	Idempotent?	Best for	Trade-off
`upsert` on unique key	Yes	Reference data, admin accounts	Requires a stable unique column
Count guard + `save`	Yes	Bulk demo data	Coarse-grained; all-or-nothing
`TRUNCATE` then insert	Yes (resets each run)	Test databases	Destructive — dev/test only
Blind `insert`	No	One-off scripts	Fails or duplicates on re-run

Best Practices

Seed through a Nest application context so factories reuse your real entities, validation, and services.
Make every seeder idempotent — use upsert on a unique key or a count guard so re-running never duplicates rows.
Call faker.seed(n) with a fixed value for deterministic, reviewable data across machines and CI.
Keep factories small and composable; accept an overrides argument so tests can pin specific fields.
Separate reference seeds (always present) from demo seeds (development volume) so production can run only the former.
Refuse to run demo or truncating seeds in production; gate them behind a NODE_ENV check.
Run seeds only after migrations have brought the schema up to date; do not rely on synchronize: true to create the tables.
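The split between reference and demo seeds can be sketched as a small registry that tags each step and filters by environment. All names here (`SeedStep`, `selectSteps`, `runSteps`, and the step names) are hypothetical, and the steps are no-ops so the example runs without a database.

```typescript
type SeedKind = 'reference' | 'demo';

interface SeedStep {
  name: string;
  kind: SeedKind;
  run: () => Promise<void>;
}

// Hypothetical selector: production gets reference data only.
function selectSteps(steps: SeedStep[], nodeEnv: string | undefined): SeedStep[] {
  return nodeEnv === 'production'
    ? steps.filter((step) => step.kind === 'reference')
    : steps;
}

// Runs steps one after another so later steps can rely on earlier rows.
async function runSteps(steps: SeedStep[]): Promise<string[]> {
  const ran: string[] = [];
  for (const step of steps) {
    await step.run();
    ran.push(step.name);
  }
  return ran;
}

const noop = async (): Promise<void> => {};
const steps: SeedStep[] = [
  { name: 'roles', kind: 'reference', run: noop },
  { name: 'admin-user', kind: 'reference', run: noop },
  { name: 'demo-users', kind: 'demo', run: noop },
];

const prodPlan = selectSteps(steps, 'production').map((step) => step.name);
const devPlan = selectSteps(steps, 'development').map((step) => step.name);
```

In a real seeder each run function would call an idempotent method on a service, so the reference steps stay safe to execute on every deploy.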