Tracking request in Node.js

Introduction

In this blog post, we will take a look at how we will be tracking request in Node.js.

Motivation (WHY?)

If we look at postal services like Fedex, UPS etc., they will provide us with a tracking number using which we can track the package status from their website.

Screen Shot 2017-11-04 at 5.41.27 PM

It is such an important functionality, using which each customer can track their package status at any time.

Similarly, in a Service Oriented Architecture, we will have a plethora of services talking to one another using HTTP protocol following REST or SOAP architecture. When a client initiates a Web request using browser, request traverses through all these distributed services to fulfill a particular request. Diagrammatically it can be shown as below:

Screen Shot 2017-11-04 at 5.44.32 PM

Similar to how we track packages, we need to track what happened to a particular request r1 initiated by a client. Typically when the web application receives the request, it will generate a unique identifier called correlationId and track any information in its application logs by tagging this correlationId. Also when the web application makes any service calls it will pass along this correlationId as a Request HEADER key/value pair. Subsequently, those services will use the same correlationId while they log their information. This unique identifier is thus carry forward to all the services which are involved in fulfilling a single request r1.

Screen Shot 2017-11-04 at 5.56.46 PM

Passing along the same correlationId for a given request is very important for a variety of reasons:

  • When a given request fails, we can exactly know the reason why it failed and which service caused the failure.
  • We can tell how long a given request took and can measure exactly how long each services in turn took to process a given request
  • If we have all information logged for a given request r1 we can write aggregate queries to figure out:
    • How many request was served by a given endpoint in a min/hr/day range?
    • How many 2xx/4xx/5xx occurred for a given endpoint in a min/hr/day range?
    • What is the average/95th percentile response time taken for a given api to respond for a given request r1?
    • etc.,

It is very clear that tracking a given request r1 using correlationId in Service Oriented Architecture is critical to understand how our systems are behaving.

Implementation (HOW?)

Solution in JAVA

We will take a simple example of providing a stock price web application which retrieves stock price based on the company name. The flow is as follows:

  1. Client (Browser) makes a call by passing company name. for eg: Walmart
  2. Web application receives company name and then calls  StockService to retrieve the stock ticker symbol. (Walmart => WMT)
  3. Web application gets the stock price by passing stock ticker symbol to StockService and renders the webpage with the result. (WMT => price)

In the above example, we will focus how logging is implemented in the web application layer, since the process is the same across other layers in the stack.

UML diagram can be shown as follows:

Stock Price Sequence (2)

In this example, web application is written in Java and it includes WebController and StockBizImpl in the above diagram. We will not worry about the implementation of Stock Service(Separate Service).

To satisfy our logging requirements in JAVA web application we need to do:

  • Generate a unique correlationId which is tagged along in every log we generate in the web application
  • Pass this unique correlationId when we make 2 service calls. One to retrieve stock ticker symbol and the other to retrieve stock price.

Solution 1:

A quick solution would be to create a class called Context which will hold the generated correlationId. This object will be created by WebController and then each function which needs to either log or make a service call, will accept a new function parameter called context. For eg: public void process(String value, Context ctx, ...)​. This solution works but as we can see, the function signature gets tainted with Context ctx everywhere in the code and also we need to keep passing Context ctx wherever it’s required which is not elegant.

Solution 2:

How do we solve it in a way, that any portion of the code which is handling a given request somehow automatically gets the Context object without having to explicitly pass it as a parameter? THREADLOCAL to the rescue.

Java provides a ThreadLocal object using which you can set/get/remove thread scoped variables3 key functions:

  • set(T value) – It is generic over type T
  • get()
  • remove()

Assuming we are developing a J2EE web application running within Tomcat, blocked thread pool implementation., Tomcat will have a worker thread pool of size n and each thread will be allocated to handle a given request and once completed the thread goes back to the thread pool. Typically, we will register a filter which will create a Context object and set in ThreadLocal. This object can be accessed in the entire request lifecycle within the same thread and when the control comes back to filter, it will remove the context from ThreadLocal since threads will be reused by Tomcat as they go back to the thread pool. Solution is a Filter acting as a Decorator and ThreadLocal acting as the container of the object within the given thread of execution.

Stock_Price_Sequence__1__png_and_Stock_Price_Sequence.png

Solution in Node.js

We will take a simple example of generating a correlationId in one event loop and subsequently retrieve the previously set correlationId value in another event loop which is part of the same callback chain.

Solution 1:

Similar to Java solution 1, we can append the Context to request object under request.app.ctx with a new value for correlationId.  In the code wherever we need the correlationId from the Context object in the entire callback chain for a given request, we will simply use closure and close over the request object. Like in Java solution 1, it is not elegant as we have to close over the request object to access request.app.ctx wherever we need the ctx object in the request lifecycle.

Solution 2:

If we look at the above Java solution, the 3 key methods set(T value), get() and remove() do not have to pass in thread id to do CRUD on thread local objects. It is completely abstracted away by the implementation of ThreadLocal which makes the solution elegant. If we are looking at the signatures of set, get and remove from the lens of a functional programmer, we can clearly see that they are side-effecting operations. To me, both OOP and FP offer elegant solutions for various kinds of problems. In this use case, even though they are side-effecting operations, it does not clutter the code to pass the Context object all around and provides a very simple and elegant solution.

As we all know, Node.js is single threaded following Continuation-passing style.  How do we solve this problem in this environment?

We will look at a simple Node.js code to understand what we are trying to achieve:

Screen Shot 2017-11-07 at 8.24.55 AM

What we would like to achieve is set an object with key:value pair  and register some callback events using setTimeout for example in one event loop and then we should be able to retrieve the previous set value by passing  the key in the subsequent event loop when the callback actually gets fired.

We have taken the cue from the Java implementation. In the above Node.js implementation, we have come up with setters and getters using set(key, value) and get(key)​respectively and the object values stored using set should be  automatically made available to the entire callback chain in a transparent manner and should be able to be retrieved using getDecorator and ThreadLocal solution in Java becomes Monkey Patching and Use of Closures in Javascript.

What is MonkeyPatching ?

A monkey patch is a way for a program to extend or modify supporting system software locally (affecting only the running instance of the program).

A simple example of monkey patching setTimeout and the callback we pass to it can be shown as follows:

Screen Shot 2017-11-07 at 8.27.55 AM
As we can see above, both setTimeout and callback that we pass to it is being monkey patched. Since the client has already tied their code with setTimeout / callback the only way to augment the behavior of setTimeout / callback is by monkey patching.

Solution using Monkey patching and Closures

Screen_Shot_2017-11-05_at_10_12_19_PM

namespace.js

Screen_Shot_2017-11-05_at_10_17_14_PM

As we can see above, setTimeout and callback passed to it is monkey patched. In the patched setTimeout function, we are closing over the active context value at that point in time t. In the patched callback function, we are setting the closed active value what was at time t while entering, run the actual callback function, resetting the active to an empty object while exiting. We start everything off by running namespace.run(fn) where run function creates a brand new active object, sets the active object while entering, executes fn (in this case start) and then resets the active value while exiting.

Continuation Local Storage

An open sourced module called Continuation Local Storage solves the above problem for Node.js in a powerful and elegant way. You can go over the source code link to understand more about the module.

General code flow for passing around context in a generic way using Continuation Local Storage can be shown as below: (Took the below example from Continuation Local Storage )

Screen Shot 2017-11-06 at 10.28.56 AM

Continuation Local Storage  is a very powerful and complex module. Since in Node.js, IO are non-blocking operations, this effectively means Continuation Local Storage  module needs to apply monkey patching to a variety of functions. The module at a very high level does the following:

  • Handling variations of logic for different node versions
  • Monkey patch functions from the below objects:
    • net.Server.prototype
    • http.Agent.prototype
    • childProcess.ChildProcess.prototype
    • childProcess
    • process
    • global
    • dns
    • fs
    • zlib
    • crypto
    • Promise
    • etc…
  • Support CLS for EventEmitter
  • Handling error scenarios
  • Provides namespace using which we can start a new execution using run, and for read and write operation, we can use get and set respectively.

Modules

Screen Shot 2017-11-06 at 10.46.12 AM

Summary

Tracking a request in Node.js using correlationId is not very easy compared to how it can be implemented in a language like Java. Node.js asynchronous IO makes it very hard as the code structure becomes completely callback-driven.  Continuation Local Storage is one of the most powerful and complex modules which solves this problem in a very elegant way for Node js. Even if we are not planning to use this module in a real world application, just by looking at the source code implementation we can learn so much about the complexity involved in solving such a simple use case in Node js and the author’s brilliance in the way of its implementation.

3 thoughts on “Tracking request in Node.js

Leave a comment