Monday, 31 December, 2018 UTC


Summary

Today we dive into MongoDB relationships between documents. To do that, we use Mongoose and its populate feature. As always, the code that we cover here is available in the express-typescript repository. You’re welcome to give it a star.
MongoDB relationships between documents
MongoDB is a NoSQL database, which means, among other things, that it is non-relational. To model relations between documents, we either reference other documents by their IDs or embed documents directly.
In the previous part of the tutorial, we saved a reference to the author of the post as authorId. Let’s look into how we can handle that using features of Mongoose like the populate function. To start, we explain relationships in the traditional sense, with examples of implementing them using MongoDB and Mongoose.
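To make the two options concrete, here is a minimal sketch of the shapes such documents might have. The interfaces and sample values are hypothetical, and the code uses plain TypeScript objects instead of Mongoose models, so the lookup is only an in-memory stand-in for a database query.

```typescript
// Hypothetical shapes, used only to contrast the two approaches.
interface Address {
  city: string;
  street: string;
}

// Embedding: the address lives inside the user document itself.
interface UserWithEmbeddedAddress {
  _id: string;
  fullName: string;
  address: Address;
}

// Referencing: the post stores only the id of its author.
interface PostWithAuthorReference {
  _id: string;
  title: string;
  author: string; // the _id of a user document
}

const sampleUser: UserWithEmbeddedAddress = {
  _id: 'user-1',
  fullName: 'John',
  address: { city: 'Warsaw', street: 'Example Street' },
};

const samplePost: PostWithAuthorReference = {
  _id: 'post-1',
  title: 'Hello',
  author: sampleUser._id,
};

// Embedded data is available right away...
const embeddedCity = sampleUser.address.city;

// ...while a reference has to be resolved with a separate lookup.
const usersById = new Map<string, UserWithEmbeddedAddress>([
  [sampleUser._id, sampleUser],
]);
const resolvedAuthor = usersById.get(samplePost.author);
const resolvedAuthorName = resolvedAuthor ? resolvedAuthor.fullName : undefined;
```

The embedded address needs no extra step to read, while the reference costs one more lookup but lets the user exist as a standalone entity.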
One-To-One (1:1)
We can use it to describe MongoDB relationships between two entities where the element of type A may be linked to just one element of type B and vice versa. Imagine having an address schema:
import * as mongoose from 'mongoose';

const addressSchema = new mongoose.Schema({
  city: String,
  street: String,
});
In our application there is a big chance that just one user is living at a particular address, meaning a combination of a street and a city. Therefore, it makes sense to relate a specific object of address to just one user, and the user to just one address. User-address is a decent example of a One-To-One relationship, and because of that, we can use the fastest way of creating MongoDB relationships: embedding documents.
import * as mongoose from 'mongoose';
import User from './user.interface';

const addressSchema = new mongoose.Schema({
  city: String,
  street: String,
});

const userSchema = new mongoose.Schema({
  address: addressSchema,
  email: String,
  name: String,
  password: String,
});

const userModel = mongoose.model<User & mongoose.Document>('User', userSchema);

export default userModel;
The address object is created along with the user and is given a distinct ID. We can go to the mLab interface and look it up. There, you can see that it is embedded straight into the user document.
The One-To-One relationship in this example means that a user has just one address and this address belongs to only one user. Since this is the case, it makes sense to embed the address straight into the user document.
One-To-Many (1:N)
One-To-Many can be used to describe MongoDB relationships in which one side can be linked to more than one element of the other side, while each of those elements can be linked to only one element of the first side. Let’s implement it in our application with blog posts and authors, assuming that a blog post can have just one author. It is a One-To-Many relationship because a user can be the author of many blog posts, but a blog post can only have one author. We could embed the blog posts into the author document, but it would be difficult to maintain. A better idea is to refer to a user inside the post document.
import * as mongoose from 'mongoose';
import Post from './post.interface';

const postSchema = new mongoose.Schema({
  author: {
    ref: 'User',
    type: mongoose.Schema.Types.ObjectId,
  },
  content: String,
  title: String,
});

const postModel = mongoose.model<Post & mongoose.Document>('Post', postSchema);

export default postModel;
The ref: 'User' option refers to the “User” model because we named it like that here:
mongoose.model<User & mongoose.Document>('User', userSchema);
Thanks to defining such a reference, now you can assign the user id to the author property of a post.
private createPost = async (request: RequestWithUser, response: express.Response) => {
  const postData: CreatePostDto = request.body;
  const createdPost = new this.post({
    ...postData,
    author: request.user._id,
  });
  const savedPost = await createdPost.save();
  response.send(savedPost);
}

Populating the data with Mongoose

A great thing about it is that you can very easily replace the id with the actual data of an author with the populate function that Mongoose implements.
private createPost = async (request: RequestWithUser, response: express.Response) => {
  const postData: CreatePostDto = request.body;
  const createdPost = new this.post({
    ...postData,
    author: request.user._id,
  });
  const savedPost = await createdPost.save();
  await savedPost.populate('author').execPopulate();
  response.send(savedPost);
}
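Conceptually, populating means replacing the stored id with the document that it points to. The sketch below imitates that with an in-memory Map; the names usersCollection and populateAuthor are hypothetical and this is not how Mongoose implements it internally.

```typescript
interface UserDoc {
  _id: string;
  fullName: string;
  password: string;
}

interface PostDoc {
  _id: string;
  title: string;
  author: string; // id of the author
}

type PopulatedPost = Omit<PostDoc, 'author'> & { author: UserDoc | null };

// Hypothetical in-memory stand-in for the users collection.
const usersCollection = new Map<string, UserDoc>([
  ['user-1', { _id: 'user-1', fullName: 'John', password: 'hashed-password' }],
]);

// Replace the stored author id with the matching user document,
// or with null when no such user exists.
function populateAuthor(post: PostDoc): PopulatedPost {
  const author = usersCollection.get(post.author) ?? null;
  return { ...post, author };
}

const storedPost: PostDoc = { _id: 'post-1', title: 'Hello', author: 'user-1' };
const populatedPost = populateAuthor(storedPost);
```

Note that the whole user document ends up in the result here, including the password field.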
A small issue with that is that it also returns the password hash of the user. Fortunately, the populate function accepts additional options.
The first option is to choose the properties that you want to select:
populate('author', 'name')
In the example above, only the name of the author and his id are attached. The second way is to exclude the properties that you want to omit:
populate('author', '-password')
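The effect of the two select strings can be imitated on a plain object. The selectFields helper below is a hypothetical, simplified sketch: it assumes the string holds either only inclusions or only exclusions, and it always keeps _id in inclusion mode.

```typescript
type PlainDoc = Record<string, unknown>;

// Simplified sketch of a select string such as 'name' or '-password'.
function selectFields(doc: PlainDoc, select: string): PlainDoc {
  const fields = select.split(' ').filter((field) => field.length > 0);
  const excluded = fields
    .filter((field) => field.charAt(0) === '-')
    .map((field) => field.slice(1));

  const result: PlainDoc = {};
  for (const key of Object.keys(doc)) {
    const keep = excluded.length > 0
      ? excluded.indexOf(key) === -1 // exclusion mode: drop the listed fields
      : key === '_id' || fields.indexOf(key) !== -1; // inclusion mode: keep listed fields and _id
    if (keep) {
      result[key] = doc[key];
    }
  }
  return result;
}

const sampleAuthor = {
  _id: 'user-1',
  name: 'John',
  email: 'john@example.com',
  password: 'hashed-password',
};

const onlyName = selectFields(sampleAuthor, 'name');
const withoutPassword = selectFields(sampleAuthor, '-password');
```

Excluding the password is usually the safer default for a response, because fields added to the user later still show up without having to extend the list.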
You can do the same thing in other parts of the application, for example when fetching all the posts:
private getAllPosts = async (request: express.Request, response: express.Response) => {
  const posts = await this.post.find()
    .populate('author', '-password');
  response.send(posts);
}
An important thing to notice here is the difference between savedPost.populate('author').execPopulate() and this.post.find().populate('author', '-password'). In the first example, we call populate on an instance of a Document. To execute it, we need to call execPopulate. In the second example, we call populate on an instance of a Query. To execute it, we need to call exec, which happens indirectly when the async/await mechanism calls the then function of the query. Note that this describes the Mongoose version used in this series: Mongoose 6 and later removed execPopulate, and populate called on a document returns a promise directly.
If you want to know more, check out the documentation of Query.prototype.then and the article “Explaining async/await. Creating dummy promises”.
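The way await triggers the execution can be shown with a tiny thenable. LazyQuery below is a hypothetical stand-in written for illustration, not the Mongoose Query class: it only shows that nothing runs until then is called, and that await calls then for us.

```typescript
// Hypothetical stand-in for a query object: nothing runs until then() is called.
class LazyQuery<T> implements PromiseLike<T> {
  public executions = 0;
  private readonly run: () => Promise<T>;

  constructor(run: () => Promise<T>) {
    this.run = run;
  }

  // Explicit execution, comparable in spirit to calling exec() on a query.
  public exec(): Promise<T> {
    this.executions += 1;
    return this.run();
  }

  // Awaiting the object calls then(), which triggers exec().
  public then<R1 = T, R2 = never>(
    onFulfilled?: ((value: T) => R1 | PromiseLike<R1>) | null,
    onRejected?: ((reason: any) => R2 | PromiseLike<R2>) | null,
  ): Promise<R1 | R2> {
    return this.exec().then(onFulfilled, onRejected);
  }
}

const lazyQuery = new LazyQuery(async () => ['first post', 'second post']);

// Building the query does not run it yet.
const executionsBeforeAwait = lazyQuery.executions;

async function loadPosts(): Promise<string[]> {
  // await calls then() on the thenable, which in turn calls exec().
  return await lazyQuery;
}
```

In this sketch every call to then runs the underlying function again; that is a property of the sketch and not a claim about how Mongoose behaves.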

The direction of the reference

A good question to ask is: can we attach blog post IDs in the user documents, instead of the other way around? Surely we can. When deciding the direction of the reference, we need to ask ourselves a few questions:
How many references will we have? Imagine storing logs for different machines in your server room. The Log documents would be tiny and have just a few parameters like a message and a time. The Machine documents would have an array of either embedded logs or just ID references to the Log documents. The issue is that the maximum size of a MongoDB document is 16MB. If you have an array of all the IDs, you might eventually run out of space for the Machine document. Instead of keeping an array of all the logs, you can store the id of the machine in each Log document. It is an example of parent-referencing.
The second question is which queries will be used most often in our application. In our Blog post – Author implementation, it is effortless to retrieve the data of the author if you have the blog post because we store his id in the Post document. On the other hand, it is more difficult to retrieve all the posts of a single user, given his id. To do that, you need to query the blog posts and filter them by the ID of the author. Let’s implement it.
import * as express from 'express';
import NotAuthorizedException from '../exceptions/NotAuthorizedException';
import Controller from '../interfaces/controller.interface';
import RequestWithUser from '../interfaces/requestWithUser.interface';
import authMiddleware from '../middleware/auth.middleware';
import postModel from '../post/post.model';

class UserController implements Controller {
  public path = '/users';
  public router = express.Router();
  private post = postModel;

  constructor() {
    this.initializeRoutes();
  }
  
  private initializeRoutes() {
    this.router.get(`${this.path}/:id/posts`, authMiddleware, this.getAllPostsOfUser);
  }

  private getAllPostsOfUser = async (request: RequestWithUser, response: express.Response, next: express.NextFunction) => {
    const userId = request.params.id;
    if (userId === request.user._id.toString()) {
      const posts = await this.post.find({ author: userId });
      response.send(posts);
    } else {
      // Reject the request only when the ids do not match
      next(new NotAuthorizedException());
    }
  }
}

export default UserController;
In our simple example, only the person logged in can get a list of all his posts.
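Going back to the first question, the parent-referencing layout for logs and machines could be sketched as follows. The MachineDoc and LogDoc shapes and the getLogsOfMachine helper are hypothetical, and the arrays are an in-memory stand-in for two collections.

```typescript
interface MachineDoc {
  _id: string;
  label: string;
}

// Parent-referencing: each log points at its machine, so the machine
// document does not grow as logs accumulate.
interface LogDoc {
  _id: string;
  machine: string; // id of the machine that produced the log
  message: string;
  time: Date;
}

const machineDocs: MachineDoc[] = [
  { _id: 'machine-1', label: 'rack-a' },
  { _id: 'machine-2', label: 'rack-b' },
];

const logDocs: LogDoc[] = [
  { _id: 'log-1', machine: 'machine-1', message: 'started', time: new Date('2018-12-31T10:00:00Z') },
  { _id: 'log-2', machine: 'machine-2', message: 'started', time: new Date('2018-12-31T10:01:00Z') },
  { _id: 'log-3', machine: 'machine-1', message: 'stopped', time: new Date('2018-12-31T11:00:00Z') },
];

// Comparable in spirit to querying the logs collection by the machine field.
function getLogsOfMachine(machineId: string): LogDoc[] {
  return logDocs.filter((log) => log.machine === machineId);
}

const logsOfFirstMachine = getLogsOfMachine(machineDocs[0]._id);
```

This is the same direction of reference that the posts use: the "many" side stores the id of the "one" side.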

Embedding vs. referencing

Having One-To-Many MongoDB relationships does not mean that using embedding instead of referencing is a bad idea. You can surely do that, for example when you have multiple addresses for a person. The main advantage of that is you don’t have to perform any additional database traversing to get the embedded details. You can’t access the details as standalone entities though, which we want to do with documents like blog posts.
In general, embedding the data works well for small subdocuments that do not change a lot. Referencing by id is slower because you need additional queries, but it works well for large subdocuments that change often and are often excluded from the result.
Many-To-Many (N:M)
Another MongoDB relationship that you may implement is Many-To-Many. It occurs when two entities can have many relationships with each other. An example of that is when:
  • a blog post can have multiple authors
  • a user can be the author of many blog posts.
A way to implement it is to change the author property in the Post schema to the authors array.
const postSchema = new mongoose.Schema({
  authors: [
    {
      ref: 'User',
      type: mongoose.Schema.Types.ObjectId,
    }
  ],
  content: String,
  title: String,
});
The above is an elementary example, but there are a lot of additional things that we can implement.
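With an authors array, finding the posts of a given user means checking whether the array contains his id. Below is a minimal in-memory sketch of that lookup with hypothetical names. With Mongoose, a filter such as { authors: userId } should play the same role, since MongoDB matches an array field against a single value when the array contains it.

```typescript
interface MultiAuthorPost {
  _id: string;
  title: string;
  authors: string[]; // ids of all the authors
}

const multiAuthorPosts: MultiAuthorPost[] = [
  { _id: 'post-1', title: 'Intro', authors: ['user-1', 'user-2'] },
  { _id: 'post-2', title: 'Follow-up', authors: ['user-2'] },
];

// Keep the posts whose authors array contains the given user id.
function getPostsOfAuthor(userId: string): MultiAuthorPost[] {
  return multiAuthorPosts.filter((post) =>
    post.authors.some((authorId) => authorId === userId),
  );
}

const postsOfFirstUser = getPostsOfAuthor('user-1');
const postsOfSecondUser = getPostsOfAuthor('user-2');
```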

Two-Way referencing

An additional technique that you can use with both Many-To-Many and One-To-Many relationships is two-way referencing. It works by storing references in both entities that refer to each other. An example of it is storing references both in the Post and the User:
const postSchema = new mongoose.Schema({
  authors: [
    {
      ref: 'User',
      type: mongoose.Schema.Types.ObjectId,
    },
  ],
  content: String,
  title: String,
});
const userSchema = new mongoose.Schema({
  address: addressSchema,
  email: String,
  name: String,
  password: String,
  posts: [
    {
      ref: 'Post',
      type: mongoose.Schema.Types.ObjectId,
    },
  ],
});
This implementation has some advantages. Thanks to referencing posts in the user document, we can quickly find every post that a user wrote without the need to do an additional query that traverses all the posts in the database. The downside to this is that every time you create or modify documents, you need to take care of the consistency of the data and update both sides of the reference.
private createPost = async (request: RequestWithUser, response: express.Response) => {
  const postData: CreatePostDto = request.body;
  const createdPost = new this.post({
    ...postData,
    authors: [request.user._id],
  });
  const user = await this.user.findById(request.user._id);
  user.posts = [...user.posts, createdPost._id];
  await user.save();
  const savedPost = await createdPost.save();
  await savedPost.populate('authors', '-password').execPopulate();
  response.send(savedPost);
}
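One way to keep both sides consistent is to change them only through a single helper. The sketch below shows the idea on plain objects; linkAuthor, unlinkAuthor and addUnique are hypothetical names, and in a real database the two writes are separate operations, so a failure between them can still leave the references out of sync.

```typescript
interface LinkedPost {
  _id: string;
  authors: string[];
}

interface LinkedUser {
  _id: string;
  posts: string[];
}

// Add an id only if it is not stored yet, so linking twice is harmless.
function addUnique(ids: string[], id: string): string[] {
  return ids.indexOf(id) === -1 ? [...ids, id] : ids;
}

// Update both sides of the reference in one place.
function linkAuthor(post: LinkedPost, user: LinkedUser): void {
  post.authors = addUnique(post.authors, user._id);
  user.posts = addUnique(user.posts, post._id);
}

// Remove the reference from both sides as well.
function unlinkAuthor(post: LinkedPost, user: LinkedUser): void {
  post.authors = post.authors.filter((id) => id !== user._id);
  user.posts = user.posts.filter((id) => id !== post._id);
}

const linkedPost: LinkedPost = { _id: 'post-1', authors: [] };
const linkedUser: LinkedUser = { _id: 'user-1', posts: [] };

linkAuthor(linkedPost, linkedUser);
linkAuthor(linkedPost, linkedUser); // the second call changes nothing
```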
For the sake of simplicity, the version of the application in the express-typescript repository will have the One-To-Many relationship implemented between posts and users.
Summary
In this article, we covered how to create MongoDB relationships between documents. We also learned what their types and different implementations are and how they impact the performance of the application in different actions such as creating, updating and finding documents. In the next part of the tutorial, we will dive deeper into aggregations, so stay tuned!
The post TypeScript Express tutorial #5. MongoDB relationships between documents appeared first on Marcin Wanago Blog - JavaScript, both frontend and backend.