Mongoose Relations & Population

MongoDB is a document database, so it has no joins in the relational sense. Mongoose models relationships in two ways: by embedding one document inside another, or by referencing it with a stored ObjectId that you resolve on demand. Choosing between them, and knowing how to hydrate references with populate, is one of the most consequential modelling decisions in a Mongoose-backed NestJS app. This page covers defining refs, populating them, embedding subdocuments, declaring virtual relationships, and the trade-offs that decide which approach to reach for.

Referencing documents with ObjectId refs

A reference stores the _id of a related document, and the schema's ref option names the target model. In a @nestjs/mongoose schema you declare the property type as mongoose.Schema.Types.ObjectId and set ref to the model name. Typing the property as the related class gives typed access to the populated document, but the value stored in MongoDB, and returned by any query that does not populate it, is still an ObjectId.

import { Prop, Schema, SchemaFactory } from '@nestjs/mongoose';
import mongoose, { HydratedDocument } from 'mongoose';
import { User } from './user.schema';

export type PostDocument = HydratedDocument<Post>;

@Schema({ timestamps: true })
export class Post {
  @Prop({ required: true })
  title: string;

  @Prop({ type: mongoose.Schema.Types.ObjectId, ref: 'User', required: true })
  author: User;

  @Prop({ type: [{ type: mongoose.Schema.Types.ObjectId, ref: 'Comment' }] })
  comments: mongoose.Types.ObjectId[];
}

export const PostSchema = SchemaFactory.createForClass(Post);

The author field holds a single reference; comments holds an array of them. Until you populate, querying a post returns the raw ObjectId values, not the referenced documents.
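Because the value is an ObjectId until populate runs, code that reads a reference often needs to narrow its type first. The sketch below shows one possible type guard for that. FakeObjectId is a hypothetical stand-in for mongoose.Types.ObjectId so the example runs without a database, and Ref, isPopulated, authorLabel and the sample id are illustrative names, not part of Mongoose or @nestjs/mongoose.

```typescript
// Hypothetical stand-in for mongoose.Types.ObjectId, so the sketch runs without a database.
class FakeObjectId {
  constructor(readonly hex: string) {}
  toString(): string {
    return this.hex;
  }
}

interface AuthorShape {
  _id: FakeObjectId;
  name: string;
}

// A reference is either the raw id (not populated) or the full document (populated).
type Ref<T> = FakeObjectId | T;

// Type guard: true once the id has been swapped for the document.
function isPopulated<T extends object>(ref: Ref<T>): ref is T {
  return !(ref instanceof FakeObjectId);
}

function authorLabel(author: Ref<AuthorShape>): string {
  return isPopulated(author) ? author.name : `unpopulated:${author.toString()}`;
}

const rawLabel = authorLabel(new FakeObjectId('abc123'));
const populatedLabel = authorLabel({ _id: new FakeObjectId('abc123'), name: 'Ada' });
```

In a real schema the same idea would check against mongoose.Types.ObjectId instead of the stand-in class.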

Resolving references with populate

populate issues one or more follow-up queries and swaps each ObjectId for the full document it points to. These are ordinary queries sent from the application, not an aggregation $lookup executed inside the database. You can populate one path, several paths, or nested paths, and you can project which fields come back.

import { Injectable } from '@nestjs/common';
import { InjectModel } from '@nestjs/mongoose';
import { Model } from 'mongoose';
import { Post, PostDocument } from './post.schema';

@Injectable()
export class PostsService {
  constructor(
    @InjectModel(Post.name) private readonly postModel: Model<PostDocument>,
  ) {}

  findOneDeep(id: string) {
    return this.postModel
      .findById(id)
      .populate('author', 'name email')          // only name + email
      .populate({
        path: 'comments',
        select: 'body author',
        populate: { path: 'author', select: 'name' }, // nested populate
      })
      .exec();
  }
}

Example output (abridged; timestamps, __v and some _id fields are omitted):

{
  "_id": "665f0a...",
  "title": "Modelling relations in Mongoose",
  "author": { "_id": "664a1c...", "name": "Ada", "email": "ada@example.com" },
  "comments": [
    { "_id": "6660b2...", "body": "Great post!", "author": { "name": "Linus" } }
  ]
}
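To make the mechanics concrete, the sketch below imitates populate in memory for a single path: it collects the ids, fetches the matching documents in one batched lookup, and swaps them in. It is a simplified model of the behaviour, not Mongoose's implementation; UserRow, PostRow, fetchUsersByIds, populateAuthors and the sample data are all hypothetical.

```typescript
interface UserRow { id: string; name: string; }
interface PostRow { title: string; author: string; } // author holds a user id
interface PopulatedPostRow { title: string; author: UserRow | null; }

const userCollection: UserRow[] = [
  { id: 'u1', name: 'Ada' },
  { id: 'u2', name: 'Linus' },
];

let lookupCount = 0; // counts simulated round trips to the users collection

// Simulates one query of the form { _id: { $in: ids } }.
function fetchUsersByIds(ids: string[]): UserRow[] {
  lookupCount += 1;
  return userCollection.filter((u) => ids.includes(u.id));
}

// Simulated populate for the `author` path: one batched lookup for all posts.
function populateAuthors(postRows: PostRow[]): PopulatedPostRow[] {
  const uniqueIds = Array.from(new Set(postRows.map((p) => p.author)));
  const fetched = fetchUsersByIds(uniqueIds);
  const byId = new Map<string, UserRow>(
    fetched.map((u) => [u.id, u] as [string, UserRow]),
  );
  // An id with no matching document resolves to null.
  return postRows.map((p) => ({ title: p.title, author: byId.get(p.author) ?? null }));
}

const populatedRows = populateAuthors([
  { title: 'First', author: 'u1' },
  { title: 'Second', author: 'u2' },
  { title: 'Third', author: 'u1' },
]);
```

Three posts are resolved with a single simulated lookup, which is why populating a whole result set at once is cheaper than populating documents one by one.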

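Every populated path costs at least one extra query. Mongoose batches the ids for a path into a single $in query per model, so populating a result set is not one query per document, but calling populate inside a loop over documents is a classic N+1 trap, and large reference arrays make each batched query expensive. Select only the fields you need and consider an aggregation pipeline when you must join across thousands of documents.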
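For the aggregation route, a $lookup stage performs the join inside the database. The pipeline below is a minimal sketch that assumes the Post and User models map to collections named posts and users (Mongoose's default pluralised names); it would be passed to postModel.aggregate(pipeline). The date filter and the variable names are hypothetical.

```typescript
// Hypothetical cut-off; createdAt exists because the schema sets timestamps: true.
const sinceDate = new Date('2024-01-01T00:00:00Z');

// Aggregation pipeline as a plain array; the stage operators are standard MongoDB.
const postsWithAuthorPipeline = [
  { $match: { createdAt: { $gte: sinceDate } } },
  {
    $lookup: {
      from: 'users',        // target collection name, not the model name
      localField: 'author', // ObjectId stored on the post
      foreignField: '_id',
      as: 'author',         // replaces the id with an array of matching users
    },
  },
  // Each post has one author, so flatten the array.
  // Posts whose author no longer exists are dropped by this stage.
  { $unwind: '$author' },
  { $project: { title: 1, 'author.name': 1, 'author.email': 1 } },
];
```

Unlike populate, aggregate returns plain objects rather than hydrated Mongoose documents, so virtuals and document methods are not available on the results.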

Embedding subdocuments

When related data is owned by the parent, has bounded size, and is always loaded with it, embed it. Subdocuments are nested schemas stored inline; there is no second query because the data already lives in the same document.

import { Prop, Schema, SchemaFactory } from '@nestjs/mongoose';
import { HydratedDocument } from 'mongoose';

@Schema({ _id: false })
export class Address {
  @Prop({ required: true })
  street: string;

  @Prop({ required: true })
  city: string;

  @Prop()
  postcode?: string;
}
export const AddressSchema = SchemaFactory.createForClass(Address);

export type CustomerDocument = HydratedDocument<Customer>;

@Schema({ timestamps: true })
export class Customer {
  @Prop({ required: true })
  name: string;

  @Prop({ type: AddressSchema })
  billingAddress: Address;

  @Prop({ type: [AddressSchema], default: [] })
  shippingAddresses: Address[];
}
export const CustomerSchema = SchemaFactory.createForClass(Customer);

Setting _id: false on the embedded schema avoids generating an ObjectId for each subdocument, which you usually do not need. Embedded arrays support the full Mongoose array API (push, pull, positional updates) and are validated as part of the parent.
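The array operations mentioned above map onto standard MongoDB update operators. As a rough sketch, the objects below are the kind of filter and update arguments you might pass to customerModel.updateOne(filter, update); the placeholder id, the addresses and the variable names are hypothetical.

```typescript
const customerIdExample = '<customer id>'; // placeholder, not a real id

// Append a new embedded address to the array.
const pushAddressUpdate = {
  $push: { shippingAddresses: { street: '1 Example Road', city: 'Leeds' } },
};

// Remove every embedded address in a given city.
const pullAddressUpdate = {
  $pull: { shippingAddresses: { city: 'Leeds' } },
};

// Positional update: the filter selects an array element, and `$` in the
// update path refers to the first element that matched.
const positionalFilter = {
  _id: customerIdExample,
  'shippingAddresses.city': 'Leeds',
};
const positionalUpdate = {
  $set: { 'shippingAddresses.$.postcode': 'LS1 1AA' },
};
```

Update operations like these skip schema validation unless you pass the runValidators: true option; loading the customer, mutating the array and calling save() validates the subdocuments along with the parent.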

Virtual relationships

A virtual is a computed property that is not stored in MongoDB. Virtual populate lets the “one” side of a one-to-many reference resolve its children without storing an array of ids on the parent — the foreign key lives only on the child. This keeps parent documents small and avoids unbounded arrays.

import { Prop, Schema, SchemaFactory } from '@nestjs/mongoose';

@Schema({
  timestamps: true,
  toJSON: { virtuals: true },
  toObject: { virtuals: true },
})
export class User {
  @Prop({ required: true })
  name: string;
}
export const UserSchema = SchemaFactory.createForClass(User);

// Posts reference User via `author`; expose them as a virtual on User.
UserSchema.virtual('posts', {
  ref: 'Post',
  localField: '_id',
  foreignField: 'author',
});

Now userModel.findById(id).populate('posts') returns every post whose author matches, even though User stores no posts field. You must enable virtuals in toJSON/toObject for the field to appear in serialized responses.
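The matching rule behind a virtual populate can be pictured as a reverse lookup: for each parent, gather the children whose foreignField equals the parent's localField. The in-memory sketch below illustrates that rule only; it is not how Mongoose is implemented, and AuthorRow, ArticleRow, attachArticles and the sample data are hypothetical.

```typescript
interface AuthorRow { id: string; name: string; }
interface ArticleRow { title: string; author: string; } // foreign key lives on the child

// Simulated virtual populate: group children by their foreign key, then
// attach each group to the parent whose key matches.
function attachArticles(
  authors: AuthorRow[],
  articles: ArticleRow[],
): Array<AuthorRow & { posts: ArticleRow[] }> {
  const grouped = new Map<string, ArticleRow[]>();
  for (const article of articles) {
    const bucket = grouped.get(article.author) ?? [];
    bucket.push(article);
    grouped.set(article.author, bucket);
  }
  // Parents with no children get an empty array.
  return authors.map((a) => ({ ...a, posts: grouped.get(a.id) ?? [] }));
}

const storedAuthors: AuthorRow[] = [
  { id: 'u1', name: 'Ada' },
  { id: 'u2', name: 'Linus' },
];

const authorsWithPosts = attachArticles(storedAuthors, [
  { title: 'Refs', author: 'u1' },
  { title: 'Virtuals', author: 'u1' },
]);
```

The stored author records never gain a posts field; the array exists only on the result, which mirrors how the virtual stays out of the database.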

Virtuals are not queryable. You cannot filter or sort by a virtual field in a find(); use the real foreign key or an aggregation for that.

Embedding vs referencing

Factor	Embed	Reference
Read pattern	Loaded together, atomically	Loaded independently
Cardinality	Bounded, “few”	Unbounded or many-to-many
Update frequency	Changes with parent	Changes on its own
Duplication risk	High if shared	None — single source of truth
Query cost	One read	At least one extra query per populated path
Document growth	Risk of 16 MB limit	Parent stays small

As a rule: embed “contains” / “owns” relationships (an order’s line items), reference “links to” relationships shared across many parents (the user who placed many orders).
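The rule of thumb can be written down as a small decision helper. This is only a heuristic sketch of the table above, not a substitute for looking at real access patterns; RelationTraits, suggestModelling and the two sample calls are hypothetical.

```typescript
type Modelling = 'embed' | 'reference';

interface RelationTraits {
  ownedByParent: boolean;        // "contains" / "owns" rather than "links to"
  boundedSize: boolean;          // a known, small upper bound on the number of children
  alwaysReadWithParent: boolean; // loaded together rather than independently
  sharedAcrossParents: boolean;  // the same child is linked from many parents
}

// Heuristic: embed only when every embedding condition holds and nothing is shared.
function suggestModelling(traits: RelationTraits): Modelling {
  const embeddable =
    traits.ownedByParent &&
    traits.boundedSize &&
    traits.alwaysReadWithParent &&
    !traits.sharedAcrossParents;
  return embeddable ? 'embed' : 'reference';
}

// An order's line items: owned, bounded, always read with the order.
const lineItemsChoice = suggestModelling({
  ownedByParent: true,
  boundedSize: true,
  alwaysReadWithParent: true,
  sharedAcrossParents: false,
});

// The user who placed many orders: shared across parents.
const orderUserChoice = suggestModelling({
  ownedByParent: false,
  boundedSize: true,
  alwaysReadWithParent: false,
  sharedAcrossParents: true,
});
```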

Best Practices

Default to embedding for data that is owned by, bounded in size, and always read with its parent; reference everything else.
Always project with a select argument when populating to avoid pulling whole documents over the wire.
Watch for N+1 queries when populating inside a loop over parents; populate the whole result set in one call, and reach for an aggregation $lookup at scale.
Use virtual populate for one-to-many so the foreign key lives on the child and parent documents stay small and bounded.
Enable toJSON: { virtuals: true } (and toObject) when you expect virtual fields in API responses.
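Index the fields that populate-related queries filter on: the foreignField of every virtual populate (for example author on Post) and any ref field you query by. Ordinary populate looks up targets by _id, which is always indexed, but an unindexed foreignField forces a collection scan.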
Never embed unbounded arrays that grow forever — you risk hitting the 16 MB document limit.