Wednesday, 3 January, 2018 UTC


Summary

It is time to have a look at string manipulation and template literals. There has been a shift in the trends of web development in the last ten years that moves the responsibility of rendering from the server side to the client side. Therefore, the HTML code that you can see on screen is often assembled using JavaScript. Furthermore, other trends such as isomorphic JavaScript have also emerged, where full stack developers can use JavaScript to render templates both on the server side and on the client side.
This is why it is important for JavaScript developers to learn about strings and template literals in depth.
As a side benefit, template literals are more convenient to use than ordinary string literals, especially when it comes to string concatenation.

New string methods

Some popular checks that can be done with regular expressions are now possible in a more semantic way:
  • startsWith: s1.startsWith( s2 ) is true if and only if s1 starts with s2
  • endsWith: s1.endsWith( s2 ) is true if and only if s1 ends with s2
  • includes: s1.includes( s2 ) is true if and only if s2 is a substring of s1
  • repeat: s.repeat( n ) returns a new string made of n copies of s joined together
Examples:
'Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz'
    .startsWith( 'Rindfleisch' );
> true

'not good'.endsWith( 'good' );
> true

'good or bad'.includes( ' or ' );
> true

'ha'.repeat( 4 );
> 'hahahaha'
Side note: if you are wondering where I got the long word from, it was a valid German word not too long ago.
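The three search methods also accept an optional second argument that restricts where the check happens. The sketch below illustrates this on a made-up file name (fileName and the other variable names are assumptions for illustration only), and compares endsWith with a regular expression doing the same job.

```javascript
// Hypothetical sample value, chosen only for illustration.
const fileName = 'report-2018-01.csv';

// startsWith( s, position ): start matching at the given index.
const hasYearAtSeven = fileName.startsWith( '2018', 7 );

// endsWith( s, endPosition ): treat the string as if it ended at endPosition.
const stemEndsWithMonth = fileName.endsWith( '-01', fileName.length - 4 );

// includes( s, position ): search only from the given index onwards.
const hasLaterDash = fileName.includes( '-', 12 );

// The regular expression version needs an anchor and an escaped dot.
const isCsvRegex = /\.csv$/.test( fileName );
const isCsvMethod = fileName.endsWith( '.csv' );

console.log( hasYearAtSeven, stemEndsWithMonth, hasLaterDash, isCsvRegex, isCsvMethod );
// prints: true true false true true
```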

Better unicode support

We are going a bit lower level in this section.
In ES5, string operations are handled in two byte chunks (UTF-16 code units). As a consequence, handling Unicode strings with a mixture of two and four byte long characters was often confusing. The two halves of a four byte long character are not printable on their own, and in general, any string operation required extra care.
For instance, if you recall the first exercise of the last section, the reference solution would have failed if a title had started with a four byte long character.
Even though the length of a string is still the number of two byte chunks it occupies, not the number of characters, there are some updates in ES6 that make string handling more user friendly.
  • codePointAt( n ) returns the code point of the character starting at index n, regardless of whether it is a two byte or a four byte long character. If n points at the second half of a four byte long Unicode character, only the code of the second half is returned
In ES6, it is possible to define four byte long Unicode characters with their code:
'\u{1f600}'                  // becomes an emoji
'\u{1f600}'.length           // is 2
'\u{1f600}'.charCodeAt( 0 )  // is 55357
'\u{1f600}'.codePointAt( 0 ) // is 128512
'\u{1f600}'.charCodeAt( 1 )  // is 56832
'\u{1f600}'.codePointAt( 1 ) // is 56832
'\u{1f600}'.startsWith( '\u{1f600}'[0] ) // is true
Note that the startsWith, endsWith, and includes methods compare strings in two byte chunks, so a search for only one half of a four byte long character still succeeds.
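To make the two byte chunk behavior more tangible, here is a small sketch that takes the emoji apart into its two halves. The variable names are assumptions for illustration; String.fromCodePoint is the ES6 counterpart of codePointAt.

```javascript
// Build the emoji from its code point; equivalent to '\u{1f600}'.
const smiley = String.fromCodePoint( 0x1f600 );

// The two halves (surrogates) are the two 16-bit chunks of the emoji.
const high = smiley[0];
const low = smiley[1];

const sameAsEscape = smiley === '\u{1f600}';   // true
const rebuilt = high + low === smiley;         // true

// The search methods match single halves, because they compare chunks.
const includesHalf = smiley.includes( low );   // true
const endsWithHalf = smiley.endsWith( low );   // true

// A code point above 0xffff signals a four byte long character.
const isFourBytes = smiley.codePointAt( 0 ) > 0xffff;  // true

console.log( sameAsEscape, rebuilt, includesHalf, endsWithHalf, isFourBytes );
```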
The for...of loop interprets four byte long characters as one unit, scoring another convenience point for ES6. Let’s execute the following code:
let str = '\u{1f600}\u{00fa}é';

for (const ch of str) {
    console.log( ch );
}
After executing the code, the response is as follows:
😀
ú
é
The for...of loop prints three characters: an emoji, ú, and é.
[...str] spreads str character by character:
> [...str]
["😀", "ú", "é"]
Even though [...str] has three elements, the length of the str string is 4. This is because the length of [...str][0] is 2:
> str.length
4
Use the for...of loop or the spread operator to process characters of a string regardless of their length in bytes.
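As a hedged illustration of this advice, the sketch below counts and reverses the characters of the same string in two ways. The variable names are assumptions, and é is written as the escape \u{00e9} only to make sure it is stored as a single two byte chunk.

```javascript
// Emoji, ú, é: the same characters as in the example above.
const str = '\u{1f600}\u{00fa}\u{00e9}';

const chunkCount = str.length;          // 4: counts two byte chunks
const charCount = [...str].length;      // 3: counts code points

// split( '' ) cuts by two byte chunks, so reversing swaps the emoji halves.
const naiveReversed = str.split( '' ).reverse().join( '' );

// Spreading keeps the emoji in one piece.
const safeReversed = [...str].reverse().join( '' );

console.log( chunkCount, charCount );
console.log( safeReversed.endsWith( '\u{1f600}' ) );   // true
console.log( naiveReversed.endsWith( '\u{1f600}' ) );  // false
```

Note that the spread operator works on code points only. Characters built from several code points, such as a letter followed by a combining accent, are still split into their parts.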
This section contains a minimalistic summary of the Unicode features of ES6 from a practical point of view. If you have to handle four byte long characters on a regular basis, do your research. Make sure you know the normalize method for Unicode normalization. Make sure you know that there is a new u flag to influence whether to consider Unicode characters when testing regular expressions.
Normalization and Unicode support for regular expressions are outside the scope of this book.

Template literals

The purpose of template literals is to evaluate and insert values of JavaScript expressions in a template string.
let x = 555;
let evaluatedTemplate = `${x} === 555 is ${x === 555}`;

// evaluatedTemplate becomes "555 === 555 is true"

let y = '555';
evaluatedTemplate = `${y} === 555 is ${y === 555}`;

// evaluatedTemplate becomes "555 === 555 is false"
One practical application of template literals is the creation of DOM markup with static structure and dynamic values. For more information, check out my article on Underscore templating. If you compare the ES6 syntax with the Underscore syntax, you will find that it is a lot more compact.
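A minimal sketch of such markup generation could look like the following. The product object, its field names, and the renderProduct function are assumptions made up for this example.

```javascript
// Hypothetical product data; the field names are assumptions for this sketch.
const product = { name: 'Titanium Toothbrush', price: 5999.9 };

// Static structure in the literal, dynamic values in the ${...} substitutions.
const renderProduct = ( { name, price } ) => `<div class="js-product">
    <span class="js-name">${ name }</span>
    <span class="js-price">${ price.toFixed( 2 ) }</span>
</div>`;

const markup = renderProduct( product );
console.log( markup );
```

The sketch inserts the values as they are. If the values come from user input, they have to be escaped before they are placed into HTML.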
Another application of template literals is to simplify writing strings that span multiple lines:
`first
second
third
fourth`
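The following sketch compares the multi-line template literal with its concatenated equivalent. The variable names are assumptions for illustration. It also shows that any indentation inside the literal becomes part of the string.

```javascript
const multiLine = `first
second
third
fourth`;

// The same value built the ES5 way.
const concatenated = 'first\n' + 'second\n' + 'third\n' + 'fourth';

const same = multiLine === concatenated;          // true
const lineCount = multiLine.split( '\n' ).length; // 4

// Whitespace at the start of a line is kept as it is.
const indented = `a
    b`;
const keepsIndent = indented === 'a\n    b';      // true

console.log( same, lineCount, keepsIndent );
```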

Tagged templates

A template tag is a function performing a transformation on a template literal, typically returning a string. A tag is applied by writing its name directly in front of the template literal. The signature of a tag function is the following:
tagFunction( literalFragments, ...substitutionValues )
  • literalFragments is an array of Strings that store fragments of the template literal. The original template literal is split by substitutions.
  • the rest parameter ...substitutionValues contains the values of ${...} substitutions.
For the sake of simplicity, suppose that we only use alphanumeric characters in template substitutions. In order to understand how literalFragments are constructed, study the behavior of the JavaScript split method:
let emulatedSubs = '${sub1}abc ${sub2} def${sub3}'
        .split( /\${\w*}/ );
> ["", "abc ", " def", ""]
The emulatedSubs array contains all text fragments in order. The fragment at index i comes directly before the substitution ${sub(i+1)}, whose value the tag function receives as its argument at index i + 1. The last fragment comes after the last substitution.
Let’s now observe how the real literalFragments are constructed. We will create a tag function that prints all of its arguments. Let’s execute this tag function on the template ${sub1}abc ${sub2} def${sub3}.
let sub1 = 1, sub2 = 2, sub3 = 3;
( ( x, ...subs ) => {
    console.log( x, ...subs );
} )`${sub1}abc ${sub2} def${sub3}`;
> ["", "abc ", " def", "", raw: Array[4]] 1 2 3
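There is one small difference in the construction of the literalFragments array: the array has an additional property called raw, containing the same four literal fragments in their raw form, that is, as they were typed in the source code, without processing escape sequences such as \n.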
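The difference between the fragments and their raw values only shows when the template contains escape sequences. Here is a small sketch under that assumption. The showFragments tag is a made-up helper, while String.raw is a built-in tag function of ES6.

```javascript
// Hypothetical tag that exposes everything it receives.
const showFragments = ( fragments, ...subs ) => ( {
    cooked: [ ...fragments ],
    raw: [ ...fragments.raw ],
    subs
} );

const result = showFragments`line1\n${ 1 }line2`;

// cooked[0] ends with a real newline character (6 characters in total),
// raw[0] ends with a backslash followed by the letter n (7 characters).
console.log( result.cooked[0].length, result.raw[0].length );  // 6 7

// The built-in String.raw tag returns the raw form of the whole template.
const path = String.raw`C:\new\table`;
console.log( path.length );  // 12
```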
As a first real example, let’s create a salutation tag.
let salutation = literalFragments =>
    'Hello, ' + literalFragments[0];

console.log( salutation`Ashley` );
> "Hello, Ashley"
If variable substitutions occur inside the template literal, their values can also be manipulated using tag functions.
let price = 5999.9;
let currencySymbol = '€';
let productName = 'Titanium Toothbrush';

let formatCurrency = function( currency, amount ) {
    return amount.toFixed(2) + currency;
}

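let format = (textArray, ...substitutions) => {
    // textArray[0]: markup before ${productName}
    let template = textArray[0];
    template += substitutions[0];
    // textArray[1]: markup between ${productName} and ${currencySymbol}
    template += textArray[1];
    // textArray[2] is the empty string between ${currencySymbol} and
    // ${price}; it is skipped because both values are formatted together
    template += formatCurrency( substitutions[1], substitutions[2] );
    // textArray[3]: markup after ${price}
    template += textArray[3];

    return template;
};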

format`
<div class="js-product">
    Product: ${productName}
</div>
<div class="js-price">
    Price: ${currencySymbol}${price}
</div>
`
In the format function, we can access all variable substitutions and template fragments, and we can concatenate them in any order. Substitutions come from evaluating the values of productName, currencySymbol, and price in the scope of the template evaluation.
The result of the above tagged template looks like this:
<div class="js-product">
    Product: Titanium Toothbrush
</div>
<div class="js-price">
    Price: 5999.90€
</div>
Even though the format function relies on knowledge of the structure of the template in this specific case, templates often have a variable number of substitutions. One of the exercises will require you to create tag functions handling a variable number of substitutions.
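As a starting point, here is one possible sketch of a tag function that does not depend on the number of substitutions. The upperValues name and the choice of upper-casing the values are assumptions for illustration, not the reference solution of the exercise. The sketch relies on the fact that the fragments array always has exactly one more element than the list of substitutions.

```javascript
// Hypothetical tag: upper-cases every substituted value and leaves
// the literal fragments untouched.
const upperValues = ( fragments, ...substitutions ) =>
    fragments.reduce( ( result, fragment, i ) => {
        // The last fragment has no substitution after it.
        const value = i < substitutions.length
            ? String( substitutions[i] ).toUpperCase()
            : '';
        return result + fragment + value;
    }, '' );

const none = upperValues`no substitutions`;
const two = upperValues`${ 'a' } and ${ 'b' }`;
const three = upperValues`${ 1 }-${ 'x' }-${ true }!`;

console.log( none );   // no substitutions
console.log( two );    // A and B
console.log( three );  // 1-X-TRUE!
```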