JSFeeds: objectpartners.com - JavaScript Modules: A Brief History

Friday, 24 May, 2019 UTC

JavaScript Modules: A Brief History

Summary

In this article, you will learn about the history of modules in JavaScript, what is possible today, and how to use JavaScript modules in your own applications.

Before modules

JavaScript was designed for use in a web browser. To run JavaScript, the HTML page needs to include a <script> tag, and then the code is downloaded (if not included inline) and executed. Until recently, there was no built-in way to split JavaScript code into modules that other JavaScript code could import. In the browser, such a system was not really necessary. JavaScript was used rather sparingly, and if you wanted to use a library such as jQuery, all you had to do was include a <script> tag to load jQuery and then load your own code that consumed it in a later <script> tag. The library was responsible for binding itself to the global context (window.$ for jQuery), and your code could then find and use it.

This method works, and still works today, but it has some drawbacks:

The order of <script> tags matters. The library needs to define its own global variables first, or the code that consumes it will throw an error.
There is no (easy) way to exclude library code that was never used, especially if it is minified or obfuscated. If you want to use any part of a library or framework, you must load the entire thing.
If two libraries define the same global variable name, then you cannot use both without overriding one of them.
If you want to break up your own JavaScript code into multiple files, you must use this technique of binding to the global context in order to share state or functions.
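
The ordering and name-collision drawbacks can be seen in a small sketch. The library names and the tinyMath global below are hypothetical, and globalThis stands in for the browser's window object so that the snippet also runs outside a browser.

```javascript
// tiny-math.js (hypothetical library): publishes itself on the global object
globalThis.tinyMath = { add: (x, y) => x + y }

// app.js: only works if tiny-math.js has already run
console.log(tinyMath.add(3, 4)) // 7

// other-lib.js (hypothetical): happens to pick the same global name
globalThis.tinyMath = { version: '2.0' }

// The first library has been silently replaced
console.log(typeof tinyMath.add) // undefined
```

In a real page, each commented section would be a separate file loaded by its own <script> tag, in this order.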

Node.js

Node.js paved the way for using JavaScript outside of the web. It provides a runtime that can execute JavaScript code on a computer without needing a browser. Along with Node.js came a stronger need to make JavaScript code modular: without any HTML tags, how do you load a library before using it in your code?

Node.js needed a way to allow JavaScript code to declare its dependencies on other JavaScript code, not because modular code can make things conceptually easier (although that is a nice feature), but because JavaScript no longer ran within the context of a web page.

Goals for a module system

To better understand the motivation for a module system in JavaScript (or any language, for that matter), let’s outline some of the primary goals and problems that modules solve:

Code providers: Modules enable source code to declare what it is making available for other code to depend upon. Modules provide a mechanism for saying “if you load me into your code, this is what I provide”.
Code consumers: Modules enable source code to declare its dependencies. Instead of code assuming that variables exist in the global context, the code can require that other code is loaded and available before proceeding to use it.

CommonJS

CommonJS (CJS) was the first JavaScript module system that became really popular. This is the system that Node.js adopted, and with the rise of Node.js came the ubiquity of CommonJS.

The CommonJS project is not only a specification for JavaScript modules, but an entire vision for creating a standard library and code distribution platform for JavaScript. The success of this vision is seen today with the immense popularity of NPM.

Code providers

In CommonJS, every JavaScript file is a module. Out of the box, a CJS module does not provide any usable code for other modules to consume. In order to make something available, it needs to be exported.

Every CJS module has an object called module available in its scope (it looks like a global, but each module gets its own), with a property called exports. Any value you assign to module.exports becomes available for other modules to include in their code.

Here is a simple example:

// simple-math.js
 
const add = (x, y) => x + y
const multiply = (x, y) => x * y
 
module.exports = { add, multiply }

Code consumers

In CJS modules, there is a function called require, available in every module as if it were global, which is used to load the value of module.exports from other CJS modules. The require function takes as an argument a string which tells Node.js how to find the module. This string can be a relative or absolute path to a CJS JavaScript file, or it can be the name of a package inside the local node_modules directory, in which case Node.js looks at the "main" field in the package.json file to find the path to the CJS module file.

Here is an example of an application using the add function from our simple-math.js CJS module:

// my-app.js
 
const simpleMath = require('./simple-math.js')
 
console.log(simpleMath.add(3, 4)) // 7

How it works

Using Node.js, we can run my-app.js like so:

$ node my-app.js

Under the hood, Node.js executes its own JavaScript program which will read the contents of my-app.js and wrap them inside of a function like so:

// Node.js runtime
 
(function(exports, require, module, __filename, __dirname) {
    // my-app.js
 
    const simpleMath = require('./simple-math.js')
 
    console.log(simpleMath.add(3, 4))
})(
    module.exports,
    require,
    module,
    '/home/user/my-app/src/my-app.js',
    '/home/user/my-app/src'
)

Node.js then executes this function, where

exports is an alias to module.exports (for convenience).
require is a function which
1. Resolves the argument to a file
2. Wraps the contents of the file in a function just like above
3. Executes the function and returns the value of module.exports
module is a pre-initialized object with some data regarding your module and an empty exports property.
__filename and __dirname are strings set to the absolute path of the module file and its containing directory, respectively.
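
The three steps performed by require can be sketched in a few lines. This is a simplified illustration, not Node's actual loader: miniRequire is a hypothetical name, and the sketch skips node_modules lookup, caching, built-in modules, and error handling. It writes a copy of simple-math.js to a temporary directory so that it can run on its own.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical, simplified stand-in for Node's require (file paths only)
function miniRequire(filename) {
  // 1. Resolve the argument to a file and read it
  const resolved = path.resolve(filename)
  const dirname = path.dirname(resolved)
  const source = fs.readFileSync(resolved, 'utf8')

  // 2. Wrap the contents of the file in a function
  const wrapper = new Function(
    'exports', 'require', 'module', '__filename', '__dirname',
    source
  )

  // 3. Execute the function and return the value of module.exports
  const module = { exports: {} }
  const nestedRequire = (request) => miniRequire(path.resolve(dirname, request))
  wrapper(module.exports, nestedRequire, module, resolved, dirname)
  return module.exports
}

// Demo: write simple-math.js to a temporary directory, then load it
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'mini-require-'))
const file = path.join(dir, 'simple-math.js')
fs.writeFileSync(file, 'const add = (x, y) => x + y\nmodule.exports = { add }\n')

const simpleMath = miniRequire(file)
console.log(simpleMath.add(3, 4)) // 7
```

Because the wrapper runs synchronously inside miniRequire, the caller does not get a value back until the whole file has executed.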

Note that each require is called synchronously so that the entire dependency tree is resolved before proceeding. Additionally, the code of a module is executed at the time of require, so it is possible for modules to provide hidden side-effects to your application.

Benefits of CommonJS

CJS was standardized early, and Node.js has always provided support for it. Because of this, lots of tooling and packages are built around CJS.

For example, NPM hosts many thousands of JavaScript modules in CommonJS format (over 800,000 at the time of writing). With NPM, Node.js, and CommonJS, using jQuery is as simple as installing the package…

$ npm install jquery

…and requiring it in your code.

const $ = require('jquery')

Furthermore, since require is just a regular ol’ function, it can be used programmatically to dynamically load different modules. This is commonly used to switch between minified and non-minified files depending on whether the module is being loaded in production mode.

const $ = process.env.NODE_ENV === 'production'
    ? require('jquery/dist/jquery.min.js')
    : require('jquery/dist/jquery.js')

Drawbacks of CommonJS

Despite its ubiquity, CommonJS has some severe drawbacks.

First, CJS was developed and implemented independently, without support from the ECMA standards body (the organization that defines the JavaScript specification). This means that the language itself does not include anything regarding CJS, and there is no native support for CJS from browsers.

Second, the implementation details around CJS are based on a local environment and do not make much sense in a browser context (__dirname?). Furthermore, module resolution and loading must be done synchronously, which is bad for the browser: your script would need to be loaded, parsed, and executed before the first require is seen, and only then would the browser download that dependency, parse it, and execute it, repeating this through the entire dependency tree. All of this would happen while blocking rendering and interaction on the page.

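Third, because require is an ordinary function that can be called with any value at run time, there is no reliable way to statically analyze CJS modules and perform tree shaking (removing unused code) on module code. This causes the same bundle size problems as before.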
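
As a small illustration of the static analysis problem, consider a module name that is computed at run time. The loadHelper function is hypothetical; it uses two Node.js built-in modules only so that the snippet runs without extra files.

```javascript
// Which module gets loaded depends on a value known only at run time,
// so a tool that merely reads this source cannot tell which one is needed.
function loadHelper(kind) {
  const name = kind === 'paths' ? 'path' : 'os'
  return require(name)
}

const helper = loadHelper(process.argv.includes('--paths') ? 'paths' : 'system')
console.log(typeof helper) // object
```

A tool reading this file would have to keep both modules in full to be safe.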

The module bundler

I’ll take a moment here to talk about the role of module bundlers.

Since there is no support for CommonJS modules in browsers, applications built using these modules in Node.js had to be bundled into the old format of individual files loaded manually via <script> tags. This is where module bundlers come in.

The primary goal of the bundler is to start at some entrypoint file and recursively resolve all of its require() calls, packing all of the imported code into a single file. This single file, called a bundle, could then be loaded in the browser as old-fashioned non-modular JavaScript, with all of its dependencies built in.

Since module bundlers need to read all of the JavaScript code that ends up in the bundle, there is an opportunity for the bundler to modify this code in useful ways. Webpack takes this idea to the max, which is why it has become the most popular JavaScript module bundler.

Other module syntaxes

There are two other module syntaxes that have attempted to improve upon the shortcomings of CommonJS and have gained some traction:

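Asynchronous Module Definition (AMD): This addresses the synchronous loading problem of CommonJS by having each module hand a function to the loader (through a define call) instead of assigning an object to module.exports. The loader executes the module contents only when the module is actually used and its dependencies are available, not at the time of require.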
Universal Module Definition (UMD): This syntax is actually a “superset-syntax” that provides interoperability between JavaScript that uses CJS, AMD, or no modules at all. It’s clever, but not an optimal solution.

For the remainder of this article, we will focus on the ECMAScript Modules syntax (ESM), a change to the JavaScript language specification, standardized by the ECMA standards body, which adds native support for modules.

ESM

The ESM standard defines two new keywords: import and export. Here is our simple-math module and application module rewritten using ESM module syntax:

// simple-math.js
 
const add = (x, y) => x + y
const multiply = (x, y) => x * y
 
export { add, multiply }

// my-app.js
 
import * as simpleMath from './simple-math.js'
 
console.log(simpleMath.add(3, 4)) // 7

An important difference between ESM and other module syntaxes is that import and export are keywords implemented by the language itself, not JavaScript objects/functions provided by a runtime.

Benefits of ESM

Since ESM is specified by the ECMA standards body, it is an official part of the JavaScript language. As such, many modern browsers are offering support for this syntax natively.

ESM allows for dynamic importing, using Promises.
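
A minimal sketch of a dynamic import follows. It assumes a runtime that supports import() and data: URLs (recent Node.js versions and modern browsers); the data: URL merely stands in for a real module path such as './simple-math.js' so that the snippet runs without extra files.

```javascript
// The module source, inlined so the example is self-contained
const source = 'export const add = (x, y) => x + y'

// import() returns a Promise that resolves to the module's exports
const loaded = import('data:text/javascript,' + encodeURIComponent(source))

loaded.then((simpleMath) => {
  console.log(simpleMath.add(3, 4)) // 7
})
```

Unlike the import keyword, import() can be called from anywhere in the code, so a module can be loaded only when it is needed.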

ESM also enables dependency tree-shaking. For example, if I modify my-app.js to this…

// my-app.js
 
import { add } from './simple-math.js'
 
console.log(add(3, 4)) // 7

…then a smart bundler can exclude multiply from simple-math.js in the resulting bundle.

Drawbacks of ESM

ESM is late to the game. There are already many thousands of established packages in NPM built with CJS modules, and it’s difficult to provide interoperability between CJS and ESM modules (there are complications around name resolution).

Additionally, Node.js has only recently started to provide support for the ESM syntax. For a long time, no one could really use ESM natively except in isolated circumstances. Bundlers and loaders will still be needed for a while to convert ESM modules to CJS or non-modular JS. In the meantime, the esm package is a great way to bridge the gap.

The esm package

The esm package by John-David Dalton (creator/author of Lodash) provides an excellent approach to writing JavaScript modules using the ESM syntax, with just a little bit of boilerplate to work nicely with CommonJS modules.

How it works

Recall that in the Node.js implementation of CJS modules, your module code is wrapped in an anonymous function, which is called with a require function (among other things). The esm package works by replacing that require function with its own implementation, so that nested requires are able to load files using both ESM and CJS syntax.

It also takes advantage of some conventions around loading packages and the package.json file: the "main" field tells Node which file to load when require('package-name') is called, whereas the "module" field is used by ESM-aware tools to load an ESM module file instead.

How to use it

1. Create two entrypoint files

To enable loading modules in ESM syntax, we need two entry files. One, called index.js, will define a CommonJS module for other CommonJS modules to require. The second, called main.js, is the real entrypoint to our package, written in ESM syntax.

The index.js file should look like this:

// index.js
require = require('esm')(module)
module.exports = require('./main.js')

The first line loads the esm module (remember, we’re inside of a CJS module). The esm module exports a function, which is called with the module object, and returns a new require function. The new require function is where all the magic happens: modules loaded from this point on are understood and load correctly, whether they use ESM or CJS syntax.

The main.js file is written using ESM syntax. The rest of the source code files are written in ESM as well.

// main.js
import { add } from './simple-math.js'
 
console.log(add(3, 4)) // 7

2. Update your package.json file

This is where we take advantage of the package.json conventions mentioned earlier. Update your package.json to look like this:

{
    ...,
    "main": "index.js",
    "module": "main.js"
}

3. Profit

With this setup, importers of your package will consume it in the module syntax they understand. In other words, this setup provides graceful degradation of your Node.js package. ESM-aware consumers will recognize the module field in your package.json and load your package using ESM directly via your main.js file. CJS consumers will instead load the index.js file, which serves as the interpreter for the ESM modules behind it.

What about the .mjs extension?

If you have read about module support in Node.js before, then you might know that Node.js needs some external way to know whether a JavaScript file uses CommonJS or ESM syntax. One of those ways is by changing the file extension on ESM modules to .mjs, and this article provides a great explanation for why this is necessary.

The esm package supports this file extension as well, and using it turns on a stricter mode which is fully compliant with the specification. Browsers will also behave just fine if you use the .mjs extension. You can actually use any file extension in the browser; all that matters is that the <script> tag defines an attribute type="module" and that the file is served with a MIME type of application/javascript.

Additional esm package features

The configuration shown above will work out of the box, but esm can also be customized with additional features and options.

You can also use Node’s -r (--require) option to preload esm, which enables ESM loading when running scripts with the node command, or even in the REPL itself.

$ node -r esm
> import { add } from './simple-math.js'
undefined
> add(20, 22)
42

Final thoughts

JavaScript may have had humble beginnings as a simple language that brought some interactivity to the web browser, but by taking the language out of the browser and supercharging it with a module syntax, it has grown into a bona fide language for many other platforms. I do hope that this article has provided an informative walk through the evolution of modular JavaScript. The future is exciting, and the best is yet to come.
