Loading WebAssembly modules efficiently

When working with WebAssembly, you often want to download a module, compile it, instantiate it, and then use whatever it exports in JavaScript. This post starts off with a common but suboptimal code snippet doing exactly that, discusses several possible optimizations, and eventually shows the simplest, most efficient way of running WebAssembly from JavaScript.

Note: Tools like Emscripten can produce the needed boilerplate code for you, so you don’t necessarily have to code this yourself. In cases where you need fine-grained control over the loading of WebAssembly modules however, it helps to keep the following best practices in mind.

This code snippet does the complete download-compile-instantiate dance, albeit in a suboptimal way:

// Don’t use this!
(async () => {
  const response = await fetch('fibonacci.wasm');
  const buffer = await response.arrayBuffer();
  const module = new WebAssembly.Module(buffer);
  const instance = new WebAssembly.Instance(module);
  const result = instance.exports.fibonacci(42);
  console.log(result);
})();

Note how we use new WebAssembly.Module(buffer) to turn a response buffer into a module. This is a synchronous API, meaning it blocks the main thread until it completes. To discourage its use, Chrome disallows WebAssembly.Module on the main thread for buffers larger than 4 KB. To work around the size limit, we can use await WebAssembly.compile(buffer) instead:

(async () => {
  const response = await fetch('fibonacci.wasm');
  const buffer = await response.arrayBuffer();
  const module = await WebAssembly.compile(buffer);
  const instance = new WebAssembly.Instance(module);
  const result = instance.exports.fibonacci(42);
  console.log(result);
})();

await WebAssembly.compile(buffer) is still not the optimal approach, but we’ll get to that in a second.

Almost every operation in the modified snippet is now asynchronous, as the use of await makes clear. The only exception is new WebAssembly.Instance(module). For consistency, we can use the asynchronous WebAssembly.instantiate(module).

(async () => {
  const response = await fetch('fibonacci.wasm');
  const buffer = await response.arrayBuffer();
  const module = await WebAssembly.compile(buffer);
  const instance = await WebAssembly.instantiate(module);
  const result = instance.exports.fibonacci(42);
  console.log(result);
})();

Let’s get back to the compile optimization I hinted at earlier. With streaming compilation, the browser can already start to compile the WebAssembly module while the module bytes are still downloading. Since download and compilation happen in parallel, this is faster — especially for large payloads.

When the download takes longer than compiling the WebAssembly module, WebAssembly.compileStreaming() finishes compilation almost immediately after the last bytes are downloaded.

To enable this optimization, use WebAssembly.compileStreaming instead of WebAssembly.compile. This change also allows us to get rid of the intermediate array buffer, since we can now pass the Response instance returned by await fetch(url) directly.

(async () => {
  const response = await fetch('fibonacci.wasm');
  const module = await WebAssembly.compileStreaming(response);
  const instance = await WebAssembly.instantiate(module);
  const result = instance.exports.fibonacci(42);
  console.log(result);
})();

Note: The server must be configured to serve the .wasm file with the correct MIME type by sending the Content-Type: application/wasm header. In previous examples, this wasn’t necessary since we were passing the response bytes as an array buffer, and so no MIME type checking took place.
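
If you can't control the server's MIME type configuration, or you need to support environments where the streaming API isn't available, you can fall back to the buffer-based approach. Here's a hedged sketch of such a fallback; the compileWasm helper name is not part of the original snippets:

async function compileWasm(url) {
  if ('compileStreaming' in WebAssembly) {
    try {
      // Preferred path: compile while the bytes are still downloading.
      return await WebAssembly.compileStreaming(fetch(url));
    } catch (e) {
      // Most likely a missing or incorrect MIME type; fall back below.
    }
  }
  // Fallback path: download everything first, then compile.
  const response = await fetch(url);
  const buffer = await response.arrayBuffer();
  return WebAssembly.compile(buffer);
}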

The WebAssembly.compileStreaming API also accepts a promise that resolves to a Response instance. If you don’t have a need for response elsewhere in your code, you can pass the promise returned by fetch directly, without explicitly awaiting its result:

(async () => {
  const fetchPromise = fetch('fibonacci.wasm');
  const module = await WebAssembly.compileStreaming(fetchPromise);
  const instance = await WebAssembly.instantiate(module);
  const result = instance.exports.fibonacci(42);
  console.log(result);
})();

If you don’t need the fetch result elsewhere either, you could even pass it directly:

(async () => {
  const module = await WebAssembly.compileStreaming(
    fetch('fibonacci.wasm'));
  const instance = await WebAssembly.instantiate(module);
  const result = instance.exports.fibonacci(42);
  console.log(result);
})();

I personally find it more readable to keep it on a separate line, though.

See how we compile the response into a module, and then instantiate it immediately? As it turns out, WebAssembly.instantiate can compile and instantiate in one go. The WebAssembly.instantiateStreaming API does this in a streaming manner:

(async () => {
  const fetchPromise = fetch('fibonacci.wasm');
  const { module, instance } = await WebAssembly.instantiateStreaming(fetchPromise);
  // To create a new instance later:
  const otherInstance = await WebAssembly.instantiate(module); 
  const result = instance.exports.fibonacci(42);
  console.log(result);
})();

If you only need a single instance, there's no point in keeping the module object around, so the code can be simplified further:

// This is our recommended way of loading WebAssembly.
(async () => {
  const fetchPromise = fetch('fibonacci.wasm');
  const { instance } = await WebAssembly.instantiateStreaming(fetchPromise);
  const result = instance.exports.fibonacci(42);
  console.log(result);
})();
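
If your module imports functions from JavaScript, both WebAssembly.instantiate and WebAssembly.instantiateStreaming accept an import object as their second argument. The env.log import below is purely hypothetical; the fibonacci.wasm module used in this post doesn't need any imports:

(async () => {
  // Hypothetical import object; its shape must match the module's imports.
  const importObject = {
    env: {
      log: (value) => console.log(value),
    },
  };
  const { instance } = await WebAssembly.instantiateStreaming(
    fetch('fibonacci.wasm'), importObject);
  // Use instance.exports as before.
})();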

The optimizations we applied can be summarized as follows:

  • Use asynchronous APIs to avoid blocking the main thread
  • Use streaming APIs to compile and instantiate WebAssembly modules more quickly
  • Don’t write code you don’t need

Have fun with WebAssembly!


New in Chrome 66

Chrome 66 adds support for the CSS Typed Object Model, the Async Clipboard API, a new rendering context for canvas, and AudioWorklets. And there's plenty more!

I’m Pete LePage. Let’s dive in and see what’s new for developers in Chrome 66!

Note: Want the full list of changes? Check out the Chromium source repository change list.

CSS Typed Object Model

If you’ve ever updated a CSS property via JavaScript, you’ve used the CSS object model. But it returns everything as a string.

el.style.opacity = 0.3;
console.log(typeof el.style.opacity);
> 'string' // A string!?

To animate the opacity property, I’d have to cast the string to a number, then increment the value and apply my changes. Not exactly ideal.

function step(timestamp) {
  const currentOpacity = parseFloat(el.style.opacity);
  const newOpacity = currentOpacity + 0.01;
  el.style.opacity = newOpacity;
  if (newOpacity <= 1) {
    window.requestAnimationFrame(step);
  }
}

With the new CSS Typed Object Model, CSS values are exposed as typed JavaScript objects, eliminating a lot of the type manipulation, and providing a more sane way of working with CSS.

Instead of using element.style, you access styles through the .attributeStyleMap property on elements, or through .styleMap on CSS rules. They return a map-like object that makes it easy to read or update styles.

el.attributeStyleMap.set('opacity', 0.3);
const oType = typeof el.attributeStyleMap.get('opacity').value;
console.log(oType);
> 'number' // Yay!

Compared to the old CSS Object Model, early benchmarks show about a 30% improvement in operations per second - something that’s especially important for JavaScript animations.

el.attributeStyleMap.set('opacity', 0.3);
el.attributeStyleMap.has('opacity'); // true
el.attributeStyleMap.delete('opacity');
el.attributeStyleMap.clear(); // remove all styles

It also helps to eliminate bugs caused by forgetting to cast the value from a string to a number, and it automatically handles rounding and clamping of values. Plus, there’s some pretty neat new methods for dealing with unit conversions, arithmetic and equality.

el.style.opacity = 3;
const opacity = el.computedStyleMap().get('opacity').value;
console.log(opacity);
> 1
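
Here's a quick, hedged sketch of those unit conversion, arithmetic, and equality helpers (these lines aren't from the original post):

const size = CSS.px(10).add(CSS.px(4)); // arithmetic on typed values
console.log(size.to('px').value);
// → 14
console.log(CSS.cm(1).to('px').value);  // unit conversion
// → 37.795…
console.log(CSS.px(14).equals(CSS.px(14)));
// → true (equality)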

Eric has a great post with several demos and examples in his explainer.

Async Clipboard API

const successful = document.execCommand('copy');

Synchronous copy & paste using document.execCommand can be OK for small bits of text, but for anything else, there's a good chance its synchronous nature will block the page, causing a poor experience for the user. And the permission model between browsers is inconsistent.

The new Async Clipboard API is a replacement that works asynchronously, and integrates with the permission API to provide a better experience for users.

Text can be copied to the clipboard by calling writeText().

navigator.clipboard.writeText('Copy me!')
  .then(() => {
    console.log('Text is on the clipboard.');
  });

Since this API is asynchronous, the writeText() function returns a Promise that will be resolved or rejected depending on whether the text we passed is copied successfully.

Similarly, text can be read from the clipboard by calling readText() and waiting for the returned Promise to resolve with the text.

navigator.clipboard.readText()
  .then((text) => {
    console.log('Clipboard: ', text);
  });

Check out Jason’s post and demos in the explainer. He’s also got examples that use async functions.
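
For instance, a minimal sketch using an async function and basic error handling might look like this (reading can fail if the clipboard permission is denied):

async function pasteFromClipboard() {
  try {
    const text = await navigator.clipboard.readText();
    console.log('Clipboard: ', text);
  } catch (err) {
    // For example, the user denied the clipboard-read permission.
    console.error('Failed to read from clipboard:', err);
  }
}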

New Canvas Context BitmapRenderer

The canvas element lets you manipulate graphics at the pixel level: you can draw graphs, manipulate photos, or even do real-time video processing. But, unless you're starting with a blank canvas, you need a way to render an image on the canvas.

Historically, that's meant creating an image tag, then rendering its contents onto the canvas. Unfortunately, that means the browser needs to store multiple copies of the image in memory.

const context = el.getContext('2d');
const img = new Image();
img.onload = function () {
  context.drawImage(img, 0, 0);
}
img.src = 'llama.png';

Starting in Chrome 66, there's a new asynchronous rendering context that streamlines the display of ImageBitmap objects. They now render more efficiently and with less jank by working asynchronously and avoiding memory duplication.

To use it:

  1. Call createImageBitmap and hand it an image blob to create the image.
  2. Grab the bitmaprenderer context from the canvas.
  3. Then transfer the image in.

const image = await createImageBitmap(imageBlob);
const context = el.getContext('bitmaprenderer');
context.transferFromImageBitmap(image);

Done, I’ve rendered the image!
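
Putting those steps together, a fuller sketch might look like this; the llama.png URL is carried over from the earlier snippet:

async function renderImage(canvasEl, url) {
  // 1. Fetch the image and create an ImageBitmap from the blob.
  const response = await fetch(url);
  const imageBlob = await response.blob();
  const image = await createImageBitmap(imageBlob);
  // 2. Grab the bitmaprenderer context and 3. transfer the image in.
  const context = canvasEl.getContext('bitmaprenderer');
  context.transferFromImageBitmap(image);
}

renderImage(el, 'llama.png');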

AudioWorklet

Worklets are in! PaintWorklet shipped in Chrome 65, and now we’re enabling AudioWorklet by default in Chrome 66. This new type of Worklet can be used to process audio in the dedicated audio thread, replacing the legacy ScriptProcessorNode which ran on the main thread. Each AudioWorklet runs in its own global scope, reducing latency and increasing throughput stability.

There are some interesting examples of AudioWorklet over on Google Chrome Labs.
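
To give a flavor of the API, here's a minimal sketch of the two-file setup. The file name noise-processor.js and the processor name 'noise' are assumptions for illustration, not something shipped with Chrome 66:

// main.js (runs on the main thread)
const context = new AudioContext();
context.audioWorklet.addModule('noise-processor.js').then(() => {
  const noiseNode = new AudioWorkletNode(context, 'noise');
  noiseNode.connect(context.destination);
});

// noise-processor.js (runs on the dedicated audio thread)
class NoiseProcessor extends AudioWorkletProcessor {
  process(inputs, outputs) {
    for (const channel of outputs[0]) {
      for (let i = 0; i < channel.length; i++) {
        channel[i] = Math.random() * 2 - 1; // white noise
      }
    }
    return true; // keep the processor alive
  }
}
registerProcessor('noise', NoiseProcessor);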

And more!

These are just a few of the changes in Chrome 66 for developers, of course, there’s plenty more.

  • TextArea and Select now support the autocomplete attribute.
  • Setting autocapitalize on a form element will apply to any child form fields, improving compatibility with Safari’s implementation of autocapitalize.
  • trimStart() and trimEnd() are now available as the standards-based way of trimming whitespace from strings.
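
For example, the new trimming methods work like this:

'  padded  '.trimStart();
// → 'padded  '
'  padded  '.trimEnd();
// → '  padded'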

Be sure to check out New in Chrome DevTools, to learn what's new in DevTools in Chrome 66. And, if you're interested in Progressive Web Apps, check out the new PWA Roadshow video series. Then, click the subscribe button on our YouTube channel, and you'll get an email notification whenever we launch a new video, or add our RSS feed to your feed reader.

I’m Pete LePage, and as soon as Chrome 67 is released, I’ll be right here to tell you -- what’s new in Chrome!

Deprecations and removals in Chrome 67

Deprecate HTTP-Based Public Key Pinning

HTTP-Based Public Key Pinning (HPKP) was intended to allow websites to send an HTTP header that pins one or more of the public keys present in the site's certificate chain. It has very low adoption, and although it provides security against certificate mis-issuance, it also creates risks of denial of service and hostile pinning.

To defend against certificate misissuance, web developers should use the Expect-CT header, including its reporting function. Expect-CT is safer than HPKP due to the flexibility it gives site operators to recover from configuration errors, and due to the built-in support offered by a number of certificate authorities.

We expect to remove this in Chrome 69.

Intent to Remove | ChromeStatus | Chromium Bug

Deprecate AppCache on Non-secure Contexts

AppCache over HTTP is deprecated. AppCache is a powerful feature that allows offline and persistent access to an origin. Allowing AppCache to be used over non-secure contexts makes it an attack vector for cross-site scripting hacks.

Removal is expected in Chrome 69.

Intent to Remove | ChromeStatus | Chromium Bug

Layout

Several -webkit- prefixed CSS properties will be removed in this release:

  • -webkit-box-flex-group: This property has virtually zero usage based on the UseCounter in stable.
  • Percent (%) values for -webkit-line-clamp: There is interest in finding a standards-based solution to the number values use case, but we haven't seen demand for the %-based values.
  • -webkit-box-lines: This property was never fully implemented. It was originally intended such that a "vertical"/"horizontal" -webkit-box could have multiple rows/columns.

Intent to Remove | ChromeStatus | Chromium Bug

BigInt: arbitrary-precision integers in JavaScript

BigInts are a new numeric primitive in JavaScript that can represent integers with arbitrary precision. With BigInts, you can safely store and operate on large integers even beyond the safe integer limit for Numbers. This article walks through some use cases and explains the new functionality in Chrome 67 by comparing BigInts to Numbers in JavaScript.

Use cases

Arbitrary-precision integers unlock lots of new use cases for JavaScript.

BigInts make it possible to correctly perform integer arithmetic without overflowing. That by itself enables countless new possibilities. Mathematical operations on large numbers are commonly used in financial technology, for example.

Large integer IDs and high-accuracy timestamps cannot safely be represented as Numbers in JavaScript. This often leads to real-world bugs, and causes JavaScript developers to represent them as strings instead. With BigInt, this data can now be represented as numeric values.

BigInt could form the basis of an eventual BigDecimal implementation. This would be useful to represent sums of money with decimal precision, and to accurately operate on them (a.k.a. the 0.10 + 0.20 !== 0.30 problem).

Previously, JavaScript applications with any of these use cases had to resort to userland libraries that emulate BigInt-like functionality. When BigInt becomes widely available, such applications can drop these run-time dependencies in favor of native BigInts. This helps reduce load time, parse time, and compile time, and on top of all that offers significant run-time performance improvements.

The native BigInt implementation in Chrome performs better than popular userland libraries.

“Polyfilling” BigInts requires a run-time library that implements similar functionality, as well as a transpilation step to turn the new syntax into a call to the library’s API. Babel currently supports parsing BigInt literals through a plugin, but doesn’t transpile them. As such, we don’t expect BigInts to be used in production sites that require broad cross-browser compatibility just yet. It’s still early days, but now that the functionality is starting to ship in browsers, you can start to experiment with BigInts. Expect wider BigInt support soon.

The status quo: Number

Numbers in JavaScript are represented as double-precision floats. This means they have limited precision. The Number.MAX_SAFE_INTEGER constant gives the greatest possible integer that can safely be incremented. Its value is 2**53-1.

const max = Number.MAX_SAFE_INTEGER;
// → 9_007_199_254_740_991

Note: For readability, I’m grouping the digits in this large number per thousand, using underscores as separators. The numeric literal separators proposal enables exactly that for common JavaScript numeric literals.

Incrementing it once gives the expected result:

max + 1;
// → 9_007_199_254_740_992 ✅

But if we increment it a second time, the result is no longer exactly representable as a JavaScript Number:

max + 2;
// → 9_007_199_254_740_992 ❌

Note how max + 1 produces the same result as max + 2. Whenever we get this particular value in JavaScript, there is no way to tell whether it’s accurate or not. Any calculation on integers outside the safe integer range (i.e. from Number.MIN_SAFE_INTEGER to Number.MAX_SAFE_INTEGER) potentially loses precision. For this reason, we can only rely on numeric integer values within the safe range.
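
If you need to check at run time whether a value is still within that safe range, Number.isSafeInteger helps (a small addition that isn't in the original examples):

Number.isSafeInteger(Number.MAX_SAFE_INTEGER);
// → true
Number.isSafeInteger(Number.MAX_SAFE_INTEGER + 2);
// → false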

The new hotness: BigInt

BigInts are a new numeric primitive in JavaScript that can represent integers with arbitrary precision. With BigInts, you can safely store and operate on large integers even beyond the safe integer limit for Numbers.

To create a BigInt, add the n suffix to any integer literal. For example, 123 becomes 123n. The global BigInt(number) function can be used to convert a Number into a BigInt. In other words, BigInt(123) === 123n. Let’s use these two techniques to solve the problem we were having earlier:

BigInt(Number.MAX_SAFE_INTEGER) + 2n;
// → 9_007_199_254_740_993n ✅

Here’s another example, where we’re multiplying two Numbers:

1234567890123456789 * 123;
// → 151851850485185200000 ❌

Looking at the least significant digits, 9 and 3, we know that the result of the multiplication should end in 7 (because 9 * 3 === 27). However, the result ends in a bunch of zeroes. That can’t be right! Let’s try again with BigInts instead:

1234567890123456789n * 123n;
// → 151851850485185185047n ✅

This time we get the correct result.

The safe integer limits for Numbers don’t apply to BigInts. Therefore, with BigInt we can perform correct integer arithmetic without having to worry about losing precision.

A new primitive

BigInts are a new primitive in the JavaScript language. As such, they get their own type that can be detected using the typeof operator:

typeof 123;
// → 'number'
typeof 123n;
// → 'bigint'

Because BigInts are a separate type, a BigInt is never strictly equal to a Number, e.g. 42n !== 42. To compare a BigInt to a Number, convert one of them into the other’s type before doing the comparison or use abstract equality (==):

42n === BigInt(42);
// → true
42n == 42;
// → true

When coerced into a boolean (which happens when using if, &&, ||, or Boolean(int), for example), BigInts follow the same logic as Numbers.

if (0n) {
  console.log('if');
} else {
  console.log('else');
}
// → logs 'else', because `0n` is falsy.

Operators

BigInts support the most common operators. Binary +, -, *, and ** all work as expected. / and % work, and round towards zero as needed. Bitwise operations |, &, <<, >>, and ^ perform bitwise arithmetic assuming a two’s complement representation for negative values, just like they do for Numbers.

(7 + 6 - 5) * 4 ** 3 / 2 % 3;
// → 1
(7n + 6n - 5n) * 4n ** 3n / 2n % 3n;
// → 1n

Unary - can be used to denote a negative BigInt value, e.g. -42n. Unary + is not supported because it would break asm.js code which expects +x to always produce either a Number or an exception.
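
A quick illustration, following the notation used above:

-42n;
// → -42n
+42n;
// → TypeError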

One gotcha is that it’s not allowed to mix operations between BigInts and Numbers. This is a good thing, because any implicit coercion could lose information. Consider this example:

BigInt(Number.MAX_SAFE_INTEGER) + 2.5;
// → ?? 🤔

What should the result be? There is no good answer here. BigInts can’t represent fractions, and Numbers can’t represent BigInts beyond the safe integer limit. For that reason, mixing operations between BigInts and Numbers results in a TypeError exception.

The only exceptions to this rule are comparison operators such as === (as discussed earlier), <, and >=; because they return booleans, there is no risk of precision loss.

1 + 1n;
// → TypeError
123 < 124n;
// → true

Note: Because BigInts and Numbers generally don’t mix, please avoid overloading or magically "upgrading" your existing code to use BigInts instead of Numbers. Decide which of these two domains to operate in, and then stick to it. For new APIs that operate on potentially large integers, BigInt is the best choice. Numbers still make sense for integer values that are known to be in the safe integer range.

Another thing to note is that the >>> operator, which performs an unsigned right shift, does not make sense for BigInts since they’re always signed. For this reason, >>> does not work for BigInts.
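
For example:

8n >> 1n;
// → 4n
8n >>> 1n;
// → TypeError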

API

Several new BigInt-specific APIs are available.

The global BigInt constructor is similar to the Number constructor: it converts its argument into a BigInt (as mentioned earlier). If the conversion fails, it throws a SyntaxError or RangeError exception.

BigInt(123);
// → 123n
BigInt(1.5);
// → RangeError
BigInt('1.5');
// → SyntaxError

Two library functions enable wrapping BigInt values as either signed or unsigned integers, limited to a specific number of bits. BigInt.asIntN(width, value) wraps a BigInt value to a width-digit binary signed integer, and BigInt.asUintN(width, value) wraps a BigInt value to a width-digit binary unsigned integer. If you’re doing 64-bit arithmetic for example, you can use these APIs to stay within the appropriate range:

// Highest possible BigInt value that can be represented as a
// signed 64-bit integer.
const max = 2n ** (64n - 1n) - 1n;
BigInt.asIntN(64, max);
// → 9223372036854775807n
BigInt.asIntN(64, max + 1n);
// → -9223372036854775808n
//    ^ negative because of overflow

Note how overflow occurs as soon as we pass a BigInt value exceeding the 64-bit integer range (i.e. 63 bits for the absolute numeric value + 1 bit for the sign).

BigInts make it possible to accurately represent 64-bit signed and unsigned integers, which are commonly used in other programming languages. Two new typed array flavors, BigInt64Array and BigUint64Array, make it easier to efficiently represent and operate on lists of such values:

const view = new BigInt64Array(4);
// → [0n, 0n, 0n, 0n]
view.length;
// → 4
view[0];
// → 0n
view[0] = 42n;
view[0];
// → 42n

The BigInt64Array flavor ensures that its values remain within the signed 64-bit limit.

// Highest possible BigInt value that can be represented as a
// signed 64-bit integer.
const max = 2n ** (64n - 1n) - 1n;
view[0] = max;
view[0];
// → 9_223_372_036_854_775_807n
view[0] = max + 1n;
view[0];
// → -9_223_372_036_854_775_808n
//    ^ negative because of overflow

The BigUint64Array flavor does the same using the unsigned 64-bit limit instead.
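
For example, a sketch mirroring the signed case above:

const unsignedView = new BigUint64Array(1);
unsignedView[0] = 2n ** 64n - 1n; // highest unsigned 64-bit value
unsignedView[0];
// → 18_446_744_073_709_551_615n
unsignedView[0] = 2n ** 64n;      // one past the unsigned 64-bit limit
unsignedView[0];
// → 0n
//    ^ wraps around because of overflow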

Have fun with BigInts!

Note: Thanks to Daniel Ehrenberg, the BigInt proposal champion, for reviewing this article.

Announcing Lighthouse 3.0

Lighthouse Logo

Lighthouse 3.0 is out! 3.0 features faster audits, less variance, a new report UI, new audits, and more.

How to update to 3.0

  • CLI. Run npm install -g lighthouse@next.
  • Node. Run npm install lighthouse@next.
  • Chrome Extension. Your extension should auto-update to 3.0.
  • Chrome DevTools. Lighthouse 3.0 will be available in Chrome 69.

Faster audits and less variance

Lighthouse 3.0 completes your audits faster, with less variance between runs, thanks to a few changes:

  • Simulated throttling. Previously, Lighthouse actually throttled your page before running audits. Now, Lighthouse uses a new internal auditing engine, codenamed Lantern, that runs your audits under your normal network and CPU settings, and then estimates how long the page would take to load under mobile conditions.
  • Smaller waiting periods. To determine that a page has finished loading, Lighthouse needs to wait for the network and CPU to have no activity. This waiting period is smaller in v3.

New Report UI

Lighthouse 3.0 features a brand-new report UI, thanks to a collaboration between the Lighthouse and Chrome UX (Research & Design) teams.

Figure 1. Lighthouse v3 report run on GMail's about page

Invocation changes

The Node version of Lighthouse now supports the same configuration options as the CLI version. This could be a breaking change, depending on how you configured your Node Lighthouse module in v2. See Invocation changes for more information.

Scoring changes

In Lighthouse 3.0 the scoring model for Performance audits changes. A score of 50 represents the 75th percentile, and a perfect score of 100 represents the 98th percentile, which is the point of diminishing returns.

The Performance score is a weighted average of the Performance audits. The weighting of the audits also changes in v3.

Audit Name                                             v2 Weight   v3 Weight
First Contentful Paint (new in v3)                     N/A         3
First Meaningful Paint                                 5           1
First CPU Idle (First Interactive in v2)               5           3
Time To Interactive (Consistently Interactive in v2)   5           5
Perceptual Speed Index                                 1           N/A
Speed Index                                            N/A         4
Estimated Input Latency                                1           0

Going forward, the Lighthouse v3 Scoring Guide is the source of truth for anything you need to know regarding how scoring works in Lighthouse v3.

New output formats and changes

CSV output support

Report results can now be output in CSV. Each row contains information and results for one audit, including:

  • The name of the category that the audit belongs to.
  • The name of the audit.
  • A description of the audit.
  • The score type used for the audit.
  • The score value.

JSON output changes

Version 3.0 introduces many changes to Lighthouse's JSON output format. See Lighthouse v3 Migration Guide for more details.

New audits

First Contentful Paint

Measure the time at which text or image content is first painted to the user's screen.

robots.txt is not valid

Ensure that your site's robots.txt file is properly formed so that search bots can crawl your site.

Use video formats for animated content

Replace GIFs with video tags for massive potential savings in video file sizes.

See Replace Animated GIFs with Video to learn more.

Avoid multiple, costly round trips to any origin

Improve your load performance by adding rel="preconnect" attributes to link tags, which informs the browser to establish a connection to an origin as soon as possible.

See Preconnect to learn more.

Audit changes

First Interactive ➡ First CPU Idle

The First Interactive audit has been renamed to First CPU Idle to better describe how it works. The general purpose of the audit is the same. Use this audit to measure when users are first able to interact with your page.

Perceptual Speed Index ➡ Speed Index

In Lighthouse 3.0 the Perceptual Speed Index audit is now Speed Index. This change aligns Lighthouse with how WebPageTest measures this metric. The purpose of the audit is the same, but the underlying metric is slightly different.

Using Lighthouse To Improve Page Load Performance

Lighthouse is an automated tool for improving the quality of your site. You give it a URL, and it provides a list of recommendations on how to improve page performance, make pages more accessible, adhere to best practices and more. You can run it from within Chrome DevTools, as a Chrome Extension, or even as a Node module, which is useful for continuous integration.

For a while now, Lighthouse has provided many tips for improving page load performance, such as enabling text compression or reducing render-blocking scripts. The Lighthouse team continues to ship new audits to give you even more useful advice for making your sites faster. This post is a roundup of newer performance audits that you may not be aware of, described in the sections below.

Main Thread Work Breakdown

If you've ever used the performance panel in DevTools, you know it can be a bit of a chore to get a breakdown of where CPU time was spent loading a page. We're pleased to announce that this information is now readily and conveniently available via the new Main Thread Work Breakdown audit.

Figure 1. A breakdown of main thread activity in Lighthouse.

This new diagnostic evaluates how much and what kind of activity occurs during page load, which you can use to get a handle on loading performance issues related to layout, script eval, parsing, and other activity.

Preload Key Requests

When browsers retrieve resources, they do so as they find references to them within the document and its subresources. This can be suboptimal at times, because some critical resources are discovered rather late in the page load process. Thankfully, rel=preload gives developers the ability to hint to compliant browsers which resources should be fetched as soon as possible. The new Preload Key Requests audit lets developers know what resources could benefit from being loaded sooner by rel=preload.

Figure 2. The Preload Key Requests Lighthouse audit recommending a list of resources to consider preloading.

It's super important you test and compare performance changes with and without rel=preload, as it can affect loading performance in ways you might not expect. For example, preloading a large image could delay initial render, but the tradeoff is that the preloaded image will appear sooner in the layout. Always make sure you're cool with the results!

JavaScript Boot-up Time is High

When too much JavaScript is loaded, the page can become unresponsive as the browser parses, compiles, and executes it. 3rd-party scripts and advertisements are a particular source of excessive script activity that can bog down even powerful devices. The new JavaScript Boot-up Time is High audit reveals how much CPU time each script on a page consumes, along with its URL:

Figure 3. Lighthouse displaying the amount of evaluation, parsing, and compiling time for scripts on a page.

When you run this audit, you can also enable third party badges in the network panel and filter the list to identify third party script resources. With the data from this audit, you'll be better equipped to find sources of excessive JavaScript activity that turn pages from snappy to laggy. For scripts specific to your application, you can employ techniques like code splitting and tree shaking to limit the amount of JavaScript on each page of your site.

Avoids Page Redirects

Sometimes when a browser requests a URL, the server can respond with a 300-level status code. This causes the browser to redirect to another URL. While redirects are necessary for SEO and convenience purposes, they add latency to requests. This is especially true if they redirect to other origins, which can incur additional DNS lookup and connection/TLS negotiation time.

Figure 4. A redirect chain as seen in the network panel of Chrome's developer tools.

Redirects are undesirable for landing pages on your site. To help you reduce latency and improve loading performance, Lighthouse now offers the Avoids Page Redirects audit, which lets you know when a navigation triggers any redirects.

Figure 5. A list of page redirects in Lighthouse.

Note that this audit is difficult to trigger in the DevTools version of Lighthouse, because it analyzes the current URL in the address bar of the page, which reflects the resolution of all redirects. You're likeliest to see this audit populated in the Node CLI.

Unused JavaScript

Dead code can be a serious problem in JavaScript-heavy applications. While it doesn't incur execution costs, since it's never invoked, it does carry other undesirable effects. Dead code is still downloaded, parsed, and compiled by the browser. This affects loading performance and JavaScript boot-up time. Similar to the coverage panel in DevTools, the Unused JavaScript audit reveals JavaScript that is downloaded by the current page but never used.

Figure 6. Lighthouse displaying the amount of unused JavaScript on a page.

With this audit, you can identify dead code in your applications and remove it to improve loading performance and reduce system resource usage. Pro tip: You can also use the code coverage panel in Chrome's dev tools to find this information!

Note: This audit is off by default! It can be enabled in the Node CLI by using the lighthouse:full configuration profile.

Uses Inefficient Cache Policy on Static Assets

While much performance advice tends to focus on boosting the speed of a website for first time users, it's also important to use caching to improve loading performance for returning users. The Uses Inefficient Cache Policy on Static Assets audit inspects caching headers for network resources, and notifies you if cache policies for static resources are substandard.

Figure 7. The Uses Inefficient Cache Policy on Static Assets audit in Lighthouse.

With the help of this audit, you'll be able to easily find and fix problems with your current cache policy. This will greatly improve performance for returning users, and they'll appreciate the extra speed!

Avoid Costly Multiple Round-Trips to Any Origin

When browsers retrieve resources from a server, it can take significant time to perform a DNS lookup and establish a connection to a server. rel=preconnect allows developers to mask this latency by establishing connections to other servers before the browser would otherwise get around to it. The Avoid Costly Multiple Round-Trips to Any Origin audit will help you discover opportunities to use rel=preconnect!

Figure 8. A list of origins recommended for rel=preconnect in Lighthouse.

When latency for cross-origin assets is reduced, users will perceive that things are moving along a bit quicker. With this new Lighthouse audit, you'll learn of new opportunities to use rel=preconnect to do just that.

Use Video Formats for Animated Content

Animated GIFs are huge, often consuming at least several hundred kilobytes if not several megabytes of data. If you care about loading performance, converting those GIFs to video is the way to go. Thankfully, the Use Video Formats for Animated Content audit has your back.

Figure 9. A recommendation to convert a GIF to video in Lighthouse.

If your site has any GIFs that are over 100 KB, this audit will automatically flag them and direct you to some guidance on how to convert them to video and embed them. Sites like Imgur have significantly improved loading performance by converting their GIFs to video. Additionally, if your site is on a hosting plan with metered bandwidth, the potential cost savings alone should be enough to persuade you!

Give Lighthouse a try!

If you're excited about these new audits, update Lighthouse and give them a try!

  • The Lighthouse Chrome extension should automatically update, but you can manually update it via chrome://extensions.
  • In DevTools, you can run Lighthouse in the audits panel. Chrome updates to a new version about every 6 weeks, so some newer audits may not be available. If you're antsy to use the latest audits available, you can run the latest Chrome code by downloading Chrome Canary.
  • For Node users: Run npm update lighthouse, or npm update lighthouse -g if you installed Lighthouse globally.

Progressive Web Apps on the Desktop

Dogfood: Desktop Progressive Web Apps are supported on Chrome OS 67, which is currently the beta branch. Work is already under way to support Mac and Windows.

Spotify's desktop progressive web app

Desktop progressive web apps can be 'installed' on the user's device much like native apps. They're fast. They feel integrated because they launch in the same way as other apps and run in an app window, without an address bar or tabs. They're reliable because service workers can cache all of the assets they need to run. And they create an engaging experience for users.

Desktop usage is important

Mobile has driven a lot of the evolution of Progressive Web Apps. But even with mobile's strong growth, desktop usage is still growing too. Mobile phone use peaks in the morning and evening, and tablet use is also significantly higher in the evening. Desktop usage is more evenly distributed throughout the day than mobile usage, with significant use during the day when most people are at work and at their desks.

Device usage by time

Having that 'installed', native feel is important to users; it gives them the confidence that the app will be fast, integrated, reliable, and engaging. Desktop Progressive Web Apps can be launched from the same place as other desktop apps, and they run in an app window, so they look and feel like other apps on the desktop.

Getting started

Dogfood: Desktop Progressive Web App support is available on Chrome OS 67 (currently beta), but work is underway to support Mac and Windows. To experiment with desktop progressive web apps in Chrome on other operating systems, enable the #enable-desktop-pwas flag.

Getting started isn't any different than what you're already doing today; it's not like this is a whole new class of apps. All of the work you've done for your existing Progressive Web App still applies. Service workers make it work fast and reliably; Web Push and Notifications keep users updated, and it can be 'installed' with the add to home screen prompt. The only real difference is that instead of running in a browser tab, it's running in an app window.

Add to home screen

Spotify's Add to Home Screen button

If the add to home screen criteria are met, Chrome will fire a beforeinstallprompt event. In the event handler, save the event, and update your user interface to indicate to the user that they can add your app to the home screen. For example, Spotify's desktop Progressive Web App adds an 'Install App' button just above the user's profile name.

See Add to Home Screen for more information about how to handle the event, update the UI and show the add to home screen prompt.
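
A minimal sketch of that flow is shown below; the installButton element is a hypothetical stand-in for whatever UI your app uses:

let deferredInstallPrompt = null;

window.addEventListener('beforeinstallprompt', (event) => {
  // Stash the event so it can be triggered later from our own UI.
  deferredInstallPrompt = event;
  // Reveal the hypothetical in-app install button.
  installButton.hidden = false;
});

installButton.addEventListener('click', () => {
  // Show the add to home screen prompt and log the user's choice.
  deferredInstallPrompt.prompt();
  deferredInstallPrompt.userChoice.then((choice) => {
    console.log('User response to the install prompt:', choice.outcome);
    deferredInstallPrompt = null;
  });
});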

The app window

With an app window, there are no tabs or address bar, it's just your app. It's optimized to support the needs of apps, with more flexible window organization and manipulation compared to browser tabs. App windows make it easy to uni-task with the window in full screen, or multi-task with multiple windows open. App windows also make it really easy to switch between apps using an app switcher or a keyboard shortcut such as alt-tab.

App window components on Chrome OS

As you’d expect, the app window has the standard title bar icons to minimize, maximize and close the window. On Chrome OS, the title bar is also themed, based on the theme_color defined in the web app manifest. And your app should be designed to take up the full width of the window.

App menu

Within the app window, there’s also the app menu (the button with the three dots), that gives you access to information about the app, makes it easy to access the URL, print the page, change the page zoom, or open the app in your browser.

Design considerations

There are some unique design considerations you need to take into account when building Desktop Progressive Web Apps, things that don’t necessarily apply to Progressive Web Apps on mobile devices.

Full screen app window

Apps on the desktop have access to significantly larger screen real-estate. Don’t just pad your content with extra margin, but use that additional space by creating new breakpoints for wider screens. Some applications really benefit from that wider view.

When thinking about your break-points, think about how users will use your app and how they may resize it. In a weather app, a large window might show a 7 day forecast, then, as the window gets smaller, instead of shrinking everything down, it might show a 5 day forecast. As it continues to get smaller, content might shuffle around so that it's optimized for the smaller display.

7 day forecast and 5 day forecast views.

For some apps, a mini-mode might be really helpful. This weather app shows only the current conditions. A music player might only show me the current song and the buttons to change to the next song.

You can take this idea of responsive design to the next level to support convertibles like the Pixelbook or the Surface. When switched to tablet mode, these devices make the active window full screen, and depending on how the user holds the device, may be either landscape or portrait.

Focus on getting responsive design right - and that’s what matters here. Whether the user has resized the window, or the device has done so because it's switched to tablet mode, responsive design is critical to a successful desktop progressive web app.

The app window on desktop opens up so many new possibilities. Work with your designer and take a responsive approach that adds new breakpoints for larger screens, supports landscape or portrait views, works when fullscreen - or not, and works nicely with virtual keyboards.

What's next?

We’re already working on support for Mac and Windows. For all of these platforms, we’re looking at:

  • Adding support for keyboard shortcuts, so you can provide your own functionality.
  • Badging for the launch icon, so you can let the user know about important events that you don’t want to display a full notification for.
  • And link capturing - opening the installed PWA when the user clicks on a link handled by that app.

Learn more

Check out my Google I/O talk, PWAs: building bridges to mobile, desktop, and native. It covers everything from Desktop PWAs to upcoming changes to the add to home screen prompt, and more.

Welcome to the immersive web

The immersive web means virtual world experiences hosted through the browser. This covers entire virtual reality (VR) experiences surfaced in the browser or in VR enabled headsets like Google's Daydream, Oculus Rift, Samsung Gear VR, HTC Vive, and Windows Mixed Reality Headsets, as well as augmented reality experiences developed for AR-enabled mobile devices.

Welcome to the immersive web.

Though we use two terms to describe immersive experiences, they should be thought of as a spectrum from complete reality to a completely immersive VR environment, with various levels of AR in between.

The immersive web is a spectrum from complete reality to completely immersive, with various levels in between.

Examples of immersive experiences include:

  • Immersive 360° videos
  • Traditional 2D (or 3D) videos presented in immersive surroundings
  • Data visualizations
  • Home shopping
  • Art
  • Something cool nobody's thought of yet

How do I get there?

The immersive web has been available for nearly a year now in embryonic form. This was done through the WebVR 1.1 API, which has been available in an origin trial since Chrome 62. That API is also supported by Firefox and Edge as well as a polyfill for Safari.

But it's time to move on.

The origin trial is ending on July 24, 2018, and the spec has been superseded by the WebXR Device API and a new origin trial.

Note: If you're participating in the WebVR origin trial, you need a separate registration for the WebXR Origin Trial (explainer, sign-up form).

What happened to WebVR 1.1?

We learned a lot from WebVR 1.1, but over time, it became clear that some major changes were needed to support the types of applications developers want to build. The full list of lessons learned is too long to go into here, but includes issues like the API being explicitly tied to the main JavaScript thread, too many opportunities for developers to set up obviously wrong configurations, and common uses like magic window being a side effect rather than an intentional feature. (Magic window is a technique for viewing immersive content without a headset wherein the app renders a single view based on the device's orientation sensor.)

The new design facilitates simpler implementations and large performance improvements. At the same time, AR and other use cases were emerging and it became important that the API be extensible to support those in the future.

The WebXR Device API was designed and named with these expanded use cases in mind and provides a better path forward. Implementers of WebVR have committed to migrating to the WebXR Device API.

What is the WebXR Device API?

Like the WebVR spec before it, the WebXR Device API is a product of the Immersive Web Community Group, which has contributors from Google, Microsoft, Mozilla, and others. The 'X' in XR is intended as a sort of algebraic variable that stands for anything in the spectrum of immersive experiences. It's available in the previously mentioned origin trial as well as through a polyfill.

Note: As of Chrome 67 only VR capabilities are enabled. AR capabilities will land in Chrome 68 (Canary) soon and I hope to tell you about them in six weeks or so.

There's more to this new API than I can go to in an article like this. I want to give you enough to start making sense of the WebXR samples. You can find more information in both the original explainer and our Immersive Web Early Adopters Guide. I'll be expanding the latter as the origin trial progresses. Feel free to open issues or submit pull requests. For this article, I'm going to discuss starting, stopping and running an XR session, plus a few basics about processing input.

What I'm not going to cover is how to draw AR/VR content to the screen. The WebXR Device API does not provide image rendering features. That's up to you. Drawing is done using WebGL APIs. You can do that directly if you're really ambitious, though we recommend using a framework. The immersive web samples use one created just for the demos called Cottontail. Three.js will support WebXR in early May. There is no official word yet on A-Frame.

Starting and running an app

The basic process is this:

  1. Request an XR device.
  2. If it's available, request an XR session. If you want the user to put their phone in a headset, it's called an exclusive session and requires a user gesture to enter.
  3. Use the session to run a render loop which provides 60 image frames per second. Draw appropriate content to the screen in each frame.
  4. Run the render loop until the user decides to exit.
  5. End the XR session.

Let's look at this in a little more detail and include some code. You won't be able to run an app from what I'm about to show you. But again, this is just to give a sense of it.

Request an XR device

Here, you'll recognize the standard feature detection code. You could wrap this in a function called something like checkForXR().

If you're not using an exclusive session you can skip advertising the functionality and getting a user gesture and go straight to requesting a session. An exclusive session is one that requires a headset. A non-exclusive session simply shows content on the device screen. The former is what most people think of when you refer to virtual reality or augmented reality. The latter is sometimes called a 'magic window'.

if (navigator.xr) {
  navigator.xr.requestDevice()
  .then(xrDevice => {
    // Advertise the AR/VR functionality to get a user gesture.
  })
  .catch(err => {
    if (err.name === 'NotFoundError') {
      // No XRDevices available.
      console.error('No XR devices available:', err);
    } else {
      // An error occurred while requesting an XRDevice.
      console.error('Requesting XR device failed:', err);
    }
  })
} else {
  console.log("This browser does not support the WebXR API.");
}
A user gesture in a magic window.

Request an XR session

Now that we have our device and our user gesture, it's time to get a session. To create a session, the browser needs a canvas on which to draw.

xrPresentationContext = htmlCanvasElement.getContext('xrpresent');
let sessionOptions = {
  // The exclusive option is optional for non-exclusive sessions; the value
  //   defaults to false.
  exclusive: false,
  outputContext: xrPresentationContext
}
xrDevice.requestSession(sessionOptions)
.then(xrSession => {
  // Use a WebGL context as a base layer.
  xrSession.baseLayer = new XRWebGLLayer(xrSession, gl);
  // Start the render loop
})

Run the render loop

The code for this step takes a bit of untangling. To untangle it, I'm about to throw a bunch of words at you. If you want a peek at the final code, jump ahead to have a quick look then come back for the full explanation. There's quite a bit that you may not be able to infer.

An immersive image as rendered to each eye.

The basic process for a render loop is this:

  1. Request an animation frame.
  2. Query for the position of the device.
  3. Draw content based on the device's position.
  4. Do work needed for the input devices.
  5. Repeat 60 times a second until the user decides to quit.

Request a presentation frame

The word 'frame' has several meanings in a WebXR context. The first is the frame of reference, which defines where the origin of the coordinate system is calculated from, and what happens to that origin when the device moves. (Does the view stay the same when the user moves, or does it shift as it would in real life?)

The second type of frame is the presentation frame, represented by an XRPresentationFrame object. This object contains the information needed to render a single frame of an AR/VR scene to the device. This is a bit confusing because a presentation frame is retrieved by calling requestAnimationFrame(). This makes it compatible with window.requestAnimationFrame() which comes in useful when ending an XR session. More about that later.

Before I give you any more to digest, I'll offer some code. The sample below shows how the render loop is started and maintained. Notice the dual use of the word frame. And notice the recursive call to requestAnimationFrame(). This function will be called 60 times a second.

xrSession.requestFrameOfReference('eyeLevel')
.then(xrFrameOfRef => {
  xrSession.requestAnimationFrame(function onFrame(time, xrPresFrame) {
    // The time argument is for future use and not implemented at this time.
    // Process the frame.
    xrPresFrame.session.requestAnimationFrame(onFrame);
  });
});

Poses

Before drawing anything to the screen, you need to know where the display device is pointing and you need access to the screen. In general, the position and orientation of a thing in AR/VR is called a pose. Both viewers and input devices have a pose. (I cover input devices later.) Both viewer and input device poses are defined as a 4 by 4 matrix stored in a Float32Array in column major order. You get the viewer's pose by calling XRPresentationFrame.getDevicePose() on the current animation frame object. Always test to see if you got a pose back. If something went wrong you don't want to draw to the screen.

let pose = xrPresFrame.getDevicePose(xrFrameOfRef);
if (pose) {
  // Draw something to the screen.
}

Views

After checking the pose, it's time to draw something. The object you draw to is called a view (XRView). This is where the session type becomes important. Views are retrieved from the XRPresentationFrame object as an array. If you're in a non-exclusive session the array has one view. If you're in an exclusive session, the array has two, one for each eye.

for (let view of xrPresFrame.views) {
  // Draw something to the screen.
}

This is an important difference between WebXR and other immersive systems. Though it may seem pointless to iterate through one view, doing so allows you to have a single rendering path for a variety of devices.

The whole render loop

If I put all this together, I get the code below. I've left a placeholder for the input devices, which I'll cover in a later section.

xrSession.requestFrameOfReference('eyeLevel')
.then(xrFrameOfRef => {
  xrSession.requestAnimationFrame(function onFrame(time, xrPresFrame) {
    // The time argument is for future use and not implemented at this time.
    let pose = xrPresFrame.getDevicePose(xrFrameOfRef);
    if (pose) {
      for (let view of xrPresFrame.views) {
        // Draw something to the screen.
      }
    }
    // Input device code will go here.
    xrPresFrame.session.requestAnimationFrame(onFrame);
  });
});

End the XR session

An XR session may end for several reasons, including ending by your own code through a call to XRSession.end(). Other causes include the headset being disconnected or another application taking control of it. This is why a well-behaved application should monitor the end event and when it occurs, discard the session and renderer objects. An XR session once ended cannot be resumed.

xrDevice.requestSession(sessionOptions)
.then(xrSession => {
  // Create a WebGL layer and initialize the render loop.
  xrSession.addEventListener('end', onSessionEnd);
});

// Restore the page to normal after exclusive access has been released.
function onSessionEnd() {
  xrSession = null;

  // Ending the session stops executing callbacks passed to the XRSession's
  // requestAnimationFrame(). To continue rendering, use the window's
  // requestAnimationFrame() function.
  window.requestAnimationFrame(onDrawFrame);
}

How does interaction work?

As with the application lifetime, I'm just going to give you a taste for how to interact with objects in AR or VR.

The WebXR Device API adopts a "point and click" approach to user input. With this approach every input source has a defined pointer ray to indicate where an input device is pointing and events to indicate when something was selected. Your app draws the pointer ray and shows where it's pointed. When the user clicks the input device, events are fired—select, selectStart, and selectEnd, specifically. Your app determines what was clicked and responds appropriately.

Selecting in VR.

The input device and the pointer ray

To users, the pointer ray is just a faint line between the controller and whatever they're pointing at. But your app has to draw it. That means getting the pose of the input device and drawing a line from its location to an object in AR/VR space. That process looks roughly like this:

let inputSources = xrSession.getInputSources();
for (let xrInputSource of inputSources) {
  let inputPose = xrPresFrame.getInputPose(xrInputSource, xrFrameOfRef);
  if (!inputPose) {
    continue;
  }
  if (inputPose.gripMatrix) {
    // Render a virtual version of the input device
    //   at the correct position and orientation.
  }
  if (inputPose.pointerMatrix) {
    // Draw a ray from the gripMatrix to the pointerMatrix.
  }
}

This is a stripped down version of the Input Tracking sample from the Immersive Web Community Group. As with frame rendering, drawing the pointer ray and the device is up to you. As alluded to earlier, this code must be run as part of the render loop.

Selecting items in virtual space

Merely pointing at things in AR/VR is pretty useless. To do anything useful, users need the ability to select things. The WebXR Device API provides three events for responding to user interactions: select, selectStart, and selectEnd. They have a quirk I didn't expect: they only tell you that an input device was clicked. They don't tell you what item in the environment was clicked. Event handlers are added to the XRSession object and should be added as soon as it's available.

xrDevice.requestSession(sessionOptions)
.then(xrSession => {
  // Create a WebGL layer and initialize the render loop.
  xrSession.addEventListener('selectstart', onSelectStart);
  xrSession.addEventListener('selectend', onSelectEnd);
  xrSession.addEventListener('select', onSelect);
});

This code is based on an Input Selection example, in case you want more context.

To figure out what was clicked you use a pose. (Are you surprised? I didn't think so.) The details of that are specific to your app or whatever framework you're using, and hence beyond the scope of this article. Cottontail's approach is in the Input Selection example.

function onSelect(ev) {
  let inputPose = ev.frame.getInputPose(ev.inputSource, xrFrameOfRef);
  if (!inputPose) {
    return;
  }
  if (inputPose.pointerMatrix) {
    // Figure out what was clicked and respond.
  }
}

Conclusion: looking ahead

As I said earlier, augmented reality should land in Chrome 68 (Canary as of May 2018) any day now. Nevertheless, I encourage you to try what we've got so far. We need feedback to make it better. Follow its progress by watching ChromeStatus.com for WebXR Hit Test. You can also follow WebXR Anchors, which will improve pose tracking.


First Input Delay

We all know how important it is to make a good first impression. It's important when meeting new people, and it's also important when building experiences on the web.

On the web, a good first impression can make the difference between someone becoming a loyal user or them leaving and never coming back. The question is, what makes for a good impression, and how do you measure what kind of impression you're likely making on your users?

On the web, first impressions can take a lot of different forms—we have first impressions of a site's design and visual appeal as well as first impressions of its speed and responsiveness.

While measuring how much users like a site's design is hard to do with web APIs, measuring its speed and responsiveness is not!

The first impression users have of how fast your site loads can be measured with metrics like First Paint (FP) and First Contentful Paint (FCP). But how fast your site can paint pixels to the screen is just part of the story. Equally important is how responsive your site is when users try to interact with those pixels!

To help measure your user's first impression of your site's interactivity and responsiveness, we're introducing a new metric called First Input Delay.

What is first input delay?

First Input Delay (FID) measures the time from when a user first interacts with your site (i.e. when they click a link, tap on a button, or use a custom, JavaScript-powered control) to the time when the browser is actually able to respond to that interaction.

As developers who write code that responds to events, we often assume our code is going to be run immediately—as soon as the event happens. But as users, we've all frequently experienced the opposite—we've loaded a web page on our phone, tried to interact with it, and then been frustrated when nothing happened.

In general, input delay (or input latency) happens because the browser's main thread is busy doing something else, so it can't (yet) respond to the user. One common reason this might happen is the browser is busy parsing and executing a large JavaScript file loaded by your app. While it's doing that, it can't run any event listeners because the JavaScript it's loading might tell it to do something else.
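
To make that concrete, here is a rough sketch (not the exact code the tooling discussed later uses) of how you could approximate that delay yourself: the event's timeStamp records when the browser received the input, and the gap until your listener actually runs is the delay.

// A simplified sketch: measure how long a click had to wait before
// this listener could run on the main thread.
addEventListener('click', (event) => {
  // event.timeStamp is when the input was received; performance.now() is
  // when the main thread finally got around to handling it.
  const delay = performance.now() - event.timeStamp;
  console.log(`This click waited roughly ${Math.round(delay)}ms`);
}, {once: true, capture: true});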

Consider the following timeline of a typical web page load:

Page load trace

The above chart shows a page that's making a couple of network requests for resources (most likely CSS and JS files), and, after those resources are finished downloading, they're being processed on the main thread. This results in periods where the main thread is momentarily busy (which is indicated by the red color on the chart).

Now let's add two other metrics: First Contentful Paint (FCP), which I mentioned above, and Time to Interactive (TTI), which you've probably seen in tools like Lighthouse or WebPageTest:

Page load trace with FCP and TTI

As you can see, FCP measures the time from Navigation Start until the browser paints content to the screen (in this case not until after the stylesheets are downloaded and processed). And TTI measures the time from Navigation Start until the page's resources are loaded and the main thread is idle (for at least 5 seconds).

But you might have noticed that there's a fair amount of time between when content first paints to the screen and when the browser's main thread is consistently idle and thus reliably capable of responding quickly to user input.

If a user tries to interact with the page during that time (e.g. click on a link), there will likely be a delay between when the click happens and when the main thread is able to respond.

Let's add FID to the chart, so you can see what that might look like:

Page load trace with FCP, TTI, and FID

Here, the browser receives the input when the main thread is busy, so it has to wait until it's not busy to respond to the input. The time it must wait is the FID value for this user on this page.

Why do we need another interactivity metric?

Time to interactive (TTI) is a metric that measures how long it takes your app to load and become capable of quickly responding to user interactions, and First Input Delay (FID) is a metric that measures the delay that users experience when they interact with the page while it's not yet interactive.

So why do we need two metrics that measure similar things? The answer is that both metrics are important, but they're important in different contexts.

TTI is a metric that can be measured without any users present, which means it's ideal for lab environments like Lighthouse or WebPageTest. Unfortunately, lab metrics, by their very nature, cannot measure real user pain.

FID, on the other hand, directly represents user pain—every single FID measurement is an instance of a user having to wait for the browser to respond to an event. And when that wait time is long, users will get frustrated and often leave.

For these reasons we recommend both metrics, but we recommend you measure TTI in lab and you measure FID in the wild, with your analytics tool.

In fact, we plan to do the same with our performance tooling here at Google. Our lab tools like Lighthouse and WebPageTest already report TTI and we're exploring adding FID to our Real User Monitoring (RUM) tools like the Chrome User Experience Report (CrUX).

Also, while these are different metrics, our research has found that they correlate well with each other, meaning any work you do to improve your TTI will likely improve your FID as well.

Why only consider the first input?

While a delay from any input can lead to a bad user experience, we primarily recommend measuring the first input delay for a few reasons:

  1. The first input delay will be the user's first impression of your site's responsiveness, and first impressions are critical in shaping our overall impression of a site's quality and reliability.
  2. The biggest interactivity issues we see on the web today occur during page load. Therefore, we believe initially focusing on improving a site's first user interaction will have the greatest impact on improving the overall interactivity of the web.
  3. The recommended solutions for how sites should fix high first input delays (code splitting, loading less JavaScript upfront, etc.) are not necessarily the same solutions for fixing slow input delays after page load. By separating out these metrics we'll be able to provide more specific performance guidelines to web developers.

What counts as a first input?

First Input Delay is a metric that measures a page's responsiveness during load. As such, it only focuses on input events from discrete actions like clicks, taps, and key presses.

Other interactions, like scrolling and zooming, are continuous actions and have completely different performance constraints (also, browsers are often able to hide their latency by running them on a separate thread).

To put this another way, FID focuses on the R (responsiveness) in the RAIL performance model, whereas scrolling and zooming are more related to A (animation), and their performance qualities should be evaluated separately.

What if a user never interacts with your site?

Not all users will interact with your site every time they visit. And not all interactions are relevant to FID (as mentioned in the previous section). In addition, some users' first interactions will come at bad times (when the main thread is busy for an extended period of time), and some users' first interactions will come at good times (when the main thread is completely idle).

This means some users will have no FID values, some users will have low FID values, and some users will probably have high FID values.

How you track, report on, and analyze FID will probably be quite a bit different from other metrics you may be used to. The next section explains how best to do this.

Tracking FID in JavaScript

FID can be measured in JavaScript in all browsers today. To make it easy, we've even created a JavaScript library that tracks and calculates it for you: GoogleChromeLabs/first-input-delay.

Refer to the library's README for full usage and installation instructions, but the gist is you:

  1. Include a minified snippet of code in the <head> of your document that adds the relevant event listeners (this code must be added as early as possible or you may miss events).
  2. In your application code, register a callback that will get invoked with the FID value (as well as the event itself) as soon as the first relevant input is detected.

Once you have your FID value, you can send it to whatever analytics tool you use. For example, with Google Analytics, your code might look something like this:

// The perfMetrics object is created by the code that goes in <head>.
perfMetrics.onFirstInputDelay(function(delay, evt) {
  ga('send', 'event', {
    eventCategory: 'Perf Metrics',
    eventAction: 'first-input-delay',
    eventLabel: evt.type,
    // Event values must be an integer.
    eventValue: Math.round(delay),
    // Exclude this event from bounce rate calculations.
    nonInteraction: true,
  });
});

Analyzing and reporting on FID data

Due to the expected variance in FID values, it's critical that when reporting on FID you look at the distribution of values and focus on the higher percentiles. In fact, we recommend specifically focusing on the 99th percentile, as that will correspond to the particularly bad first experiences users are having with your site. And it will show you the areas that need the most improvement.

This is true even if you segment your reports by device category or type. For example, if you run separate reports for desktop and mobile, the FID value you care most about on desktop should be the 99th percentile of desktop users, and the FID value you care about most on mobile should be the 99th percentile of mobile users.

Unfortunately, many analytics tools do not support reporting on data at specific quantiles without custom configuration and manual data processing/analysis.
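
If you do end up processing the raw values yourself, the math is straightforward. Here's a small sketch that computes the 99th percentile from an array of FID values; the sample data is made up.

// Nearest-rank percentile over raw FID values (in milliseconds).
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const index = Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[index];
}

const fidValues = [3, 12, 4, 250, 7, 1100, 16]; // placeholder data
console.log(`99th percentile FID: ${percentile(fidValues, 99)}ms`);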

For example, it's possible to report on specific quantiles in Google Analytics, but it takes a little extra work. In my article The Google Analytics Setup I Use on Every Site I Build, I have a section on setting hit-level dimensions. When you do this you unlock the ability to filter and segment by individual event values as well as create distributions, so you can calculate the 99th percentile. If you want to track FID with Google Analytics, I'd definitely recommend this approach.

Who should be tracking FID?

FID is a metric that any site could benefit from tracking, but there are a few types of sites that I think could particularly benefit from knowing what kinds of first input delays their users are actually experiencing:

Server-side rendered (SSR) JavaScript apps

Sites that send a server-rendered version of their page to the client along with a lot of JavaScript that needs to get loaded, parsed, and executed before the page is interactive are particularly susceptible to high FID values.

The reason is that the time between when they look interactive and when they actually are interactive is often large, especially on low-end devices that take longer to parse and execute JavaScript.

To be clear, it's absolutely possible to build a server-side rendered app that also gets interactive quickly. SSR in and of itself is not a bad pattern; the problem occurs when developers optimize for a fast first paint and then ignore interactivity. This is where tracking FID can be particularly eye-opening!

Sites with lots of third-party iframes

Third-party ads and social widgets have a history of not being particularly considerate of their host pages. They tend to run expensive DOM operations on the host page's main thread with no regard for how it will affect the user.

Also, third-party iframes can change their code at any time, so regressions can happen even if your application code doesn't change. Tracking FID on your production site can alert you to problems like this that might not get caught during your release process.

The future

First Input Delay is a brand new metric we're experimenting with on the Chrome team. It's particularly exciting to me because it's the first metric we've introduced that directly corresponds to the pain users experience with real-life interactions on the web.

Going forward we hope to standardize this metric within the W3C WebPerf Working Group, so it can be more easily accessed by asynchronously loaded JavaScript and third-party analytics tools (since right now it requires developers to add synchronous code to the head of their pages).

If you have feedback on the metric or the current implementation, we'd love to hear it! Please file issues or submit pull requests on GitHub.

Enabling Strong Authentication with WebAuthn

The problem

Phishing is the #1 security problem on the web: 81% of account breaches last year were because of weak or stolen passwords. The industry's collective response to this problem has been multi-factor authentication, but implementations are fragmented and most still don't adequately address phishing. We have been working with the FIDO Alliance since 2013 and, more recently, with the W3C to implement a standardized phishing-resistant protocol that can be used by any Web application.

What is WebAuthn?

The Web Authentication API gives Web applications user-agent-mediated access to authenticators – which are often hardware tokens accessed over USB/BLE/NFC or modules built directly into the platform – for the purposes of generating and challenging application-scoped (eTLD+k) public-key credentials. This enables a variety of use-cases, such as:

  • Low friction and phishing-resistant 2FA (to be used in conjunction with a password)
  • Passwordless, biometrics-based re-authorization
  • Low friction and phishing-resistant 2FA without a password (to be used for passwordless accounts)

The API is on track to be implemented by most major browsers, and is intended to both simplify the UI encountered when having to prove your identity online and significantly reduce phishing.

WebAuthn extends the Credential Management API and adds a new credential type called PublicKeyCredential. WebAuthn abstracts the communication between the browser and an authenticator and allows a user to:

  1. Create and register a public key credential for a website
  2. Authenticate to a website by proving possession of the corresponding private key

Authenticators are devices that can generate private/public key pairs and gather consent. Consent for signing can be granted with a simple tap, a successful fingerprint read, or by other methods as long as they comply with FIDO2 requirements (there's a certification program for authenticators by the FIDO Alliance). Authenticators can either be built into the platform (such as fingerprint scanners on smartphones) or attached through USB, Bluetooth Low Energy (BLE), or Near-Field Communication (NFC).

How it works

Creating a key pair and registering a user

When a user wants to register a credential to a website (referred to by WebAuthn as the "relying party"):

  1. The relying party generates a challenge.
  2. The relying party asks the browser, through the Credential Manager API, to generate a new credential for the relying party, specifying device capabilities, e.g., whether the device provides its own user authentication (with biometrics, etc).
  3. After the authenticator obtains user consent, the authenticator generates a key pair and returns the public key and optional signed attestation to the website.
  4. The web app forwards the public key to the server.
  5. The server stores the public key, coupled with the user identity, to remember the credential for future authentications.

let credential = await navigator.credentials.create({ publicKey: {
  // The challenge is random bytes generated by the relying party's server.
  challenge: new Uint8Array([117, 61, 252, 231, 191, 241, /* … */]),
  rp: { id: "acme.com", name: "ACME Corporation" },
  user: {
    id: new Uint8Array([79, 252, 83, 72, 214, 7, 89, 26]),
    name: "jamiedoe",
    displayName: "Jamie Doe"
  },
  pubKeyCredParams: [ {type: "public-key", alg: -7} ]
}});

Warning: Attestation provides a way for a relying party to determine the provenance of an authenticator. Google strongly recommends that relying parties not attempt to maintain whitelists of authenticators.

Authenticating a user

When a website needs to obtain proof that it is interacting with the correct user:

  1. The relying party generates a challenge and supplies the browser with a list of credentials that are registered to the user. It can also indicate where to look for the credential, e.g., on a local built-in authenticator, or on an external one over USB, BLE, etc.
  2. The browser asks the authenticator to sign the challenge.
  3. If the authenticator contains one of the given credentials, the authenticator returns a signed assertion to the web app after receiving user consent.
  4. The web app forwards the signed assertion to the server for the relying party to verify.
  5. Once verified by the server, the authentication flow is considered successful.

let credential = await navigator.credentials.get({ publicKey: {
  // The challenge is random bytes generated by the relying party's server.
  challenge: new Uint8Array([139, 66, 181, 87, 7, 203, /* … */]),
  rpId: "acme.com",
  allowCredentials: [{
    type: "public-key",
    id: new Uint8Array([64, 66, 25, 78, 168, 226, 174, /* … */])
  }],
  userVerification: "required",
}});

Try WebAuthn yourself at https://webauthndemo.appspot.com/.

What's ahead?

Chrome 67 beta ships with support for navigator.credentials.get({publicKey: ...}) and navigator.credentials.create({publicKey: ...}) and enables using U2F/CTAP 1 authenticators over USB transport on desktop.

Upcoming releases will add support for more transports such as BLE and NFC and the newer CTAP 2 wire protocol. We are also working on more advanced flows enabled by CTAP 2 and WebAuthn, such as PIN protected authenticators, local selection of accounts (instead of typing a username or password), and fingerprint enrollment.

Note that Microsoft Edge also supports the API and Firefox will be supporting WebAuthn as of Firefox 60.

Resources

We are working on more detailed documentation.

The session "What's new with sign-up and sign-in on the web" at Google I/O 2018 will cover WebAuthn. Come by the Web Sandbox to talk to the experts.

What's New In DevTools (Chrome 68)

Note: The video version of these release notes will be published around late July 2018.

New to DevTools in Chrome 68:

Note: Check what version of Chrome you're running at chrome://version. If you're running an earlier version, these features won't exist. If you're running a later version, these features may have changed. Chrome auto-updates to a new major version about every 6 weeks.

Assistive Console

Chrome 68 ships with a few new Console features related to autocompletion and previewing.

Eager Evaluation

When you type an expression in the Console, the Console can now show a preview of the result of that expression below your cursor.

Figure 1. The Console is printing the result of the sort() operation before it has been explicitly executed

To enable Eager Evaluation:

  1. Open the Console.
  2. Open Console Settings.
  3. Enable the Eager evaluation checkbox.

DevTools does not eager evaluate if the expression causes side effects.
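
For example (these expressions are just illustrations): a side-effect-free expression like the sort() call in Figure 1 gets a preview, while an expression that would change state does not.

// Previewed eagerly: evaluating this doesn't change any state.
['banana', 'cherry', 'apple'].sort()

// Not previewed eagerly: evaluating this would have a side effect.
document.title = 'Changed!'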

Argument hints

As you're typing out functions, the Console now shows you the arguments that the function expects.

Figure 2. Various examples of argument hints in the Console

Notes:

  • A question mark before an arg, such as ?options, represents an optional arg.
  • An ellipsis before an arg, such as ...items, represents a spread.
  • Some functions, such as CSS.supports(), accept multiple argument signatures.

Autocomplete after function executions

Note: This feature depends on Eager Evaluation, which needs to be enabled from Console Settings.

After enabling Eager Evaluation, the Console now also shows you which properties and functions are available after you type out a function.

After running document.querySelector('p'), the Console can now show you the available
            properties and functions for that element.
Figure 3. The top screenshot represents the old behavior, and the bottom screenshot represents the new behavior that supports function autocompletion

ES2017 keywords in autocomplete

ES2017 keywords, such as await, are now available in the Console's autocomplete UI.

Figure 4. The Console now suggests await in its autocomplete UI

Faster, more reliable audits, a new UI, and new audits

Chrome 68 ships with Lighthouse 3.0. The next sections are a roundup of some of the biggest changes. See Announcing Lighthouse 3.0 for the full story.

Faster, more reliable audits

Lighthouse 3.0 has a new internal auditing engine, codenamed Lantern, which completes your audits faster, and with less variance between runs.

New UI

Lighthouse 3.0 also brings a new UI, thanks to a collaboration between the Lighthouse and Chrome UX (Research & Design) teams.

Figure 5. The new report UI in Lighthouse 3.0

New audits

Lighthouse 3.0 also ships with 4 new audits:

  • First Contentful Paint
  • robots.txt is not valid
  • Use video formats for animated content
  • Avoid multiple, costly round trips to any origin

BigInt support

Note: This isn't a DevTools feature per se, but it is a new JavaScript capability that you can try out in the Console.

Chrome 68 supports a new numeric primitive called BigInt. BigInt lets you represent integers with arbitrary precision. Try it out in the Console:

Figure 6. An example of BigInt in the Console
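
If you want something to paste into the Console, here's a quick comparison; the values are arbitrary:

// Regular Numbers silently lose precision past Number.MAX_SAFE_INTEGER.
2 ** 53 === 2 ** 53 + 1;      // → true

// BigInts (note the trailing n) keep arbitrary precision.
2n ** 53n === 2n ** 53n + 1n; // → false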

Add property path to watch

While paused on a breakpoint, right-click a property in the Scope pane and select Add property path to watch to add that property to the Watch pane.

Figure 7. An example of Add property path to watch

Thanks to PhistucK for the contribution.

"Show timestamps" moved to settings

The Show timestamps checkbox, previously in Console Settings, has moved to Settings.

Feedback

Was this page helpful?

To discuss the new features and changes in this post, or anything else related to DevTools:

  • File bug reports at Chromium Bugs.
  • Discuss features and changes on the Mailing List. Please don't use the mailing list for support questions. Use Stack Overflow, instead.
  • Get help on how to use DevTools on Stack Overflow. Please don't file bugs on Stack Overflow. Use Chromium Bugs, instead.
  • Tweet us at @ChromeDevTools.
  • File bugs on this doc in the Web Fundamentals repository.

Consider Canary

If you're on Mac or Windows, please consider using Chrome Canary as your default development browser. If you report a bug or a change that you don't like while it's still in Canary, the DevTools team can address your feedback significantly faster.

Note: Canary is the bleeding-edge version of Chrome. It's released as soon as it's built, without testing. This means that Canary breaks from time to time, about once a month, and it's usually fixed within a day. You can go back to using Chrome Stable while Canary is broken.

Previous release notes

See the devtools-whatsnew tag for links to all previous DevTools release notes.

Beyond SPAs: alternative architectures for your PWA

Note: Prefer a video to an article? You can watch the presentation on which this was based instead:

Let's talk about... architecture?

I'm going to cover an important, but potentially misunderstood topic: The architecture that you use for your web app, and specifically, how your architectural decisions come into play when you're building a progressive web app.

"Architecture" can sound vague, and it may not be immediately clear why this matters. Well, one way to think about architecture is to ask yourself the following questions: When a user visits a page on my site, what HTML is loaded? And then, what's loaded when they visit another page?

The answers to those questions are not always straightforward, and once you start thinking about progressive web apps, they can get even more complicated. So my goal is to walk you through one possible architecture that I found effective. Throughout this article, I'll label the decisions that I made as being "my approach" to building a progressive web app.

You're free to use my approach when building your own PWA, but at the same time, there are always other valid alternatives. My hope is that seeing how all the pieces fit together will inspire you, and that you will feel empowered to customize this to suit your needs.

Stack Overflow PWA

To accompany this article I built a Stack Overflow PWA. I spend a lot of time reading and contributing to Stack Overflow, and I wanted to build a web app that would make it easy to browse frequently asked questions for a given topic. It's built on top of the public Stack Exchange API. It's open source, and you can learn more by visiting the GitHub project.

Multi-page Apps (MPAs)

Before I get into specifics, let's define some terms and explain pieces of underlying technology. First, I'm going to be covering what I like to call "Multi Page Apps", or "MPAs".

MPA is a fancy name for the traditional architecture used since the beginning of the web. Each time a user navigates to a new URL, the browser progressively renders HTML specific to that page. There's no attempt to preserve the page's state or the content in between navigations. Each time you visit a new page, you're starting fresh.

This is in contrast to the single-page app (SPA) model for building web apps, in which the browser runs JavaScript code to update the existing page when the user visits a new section. Both SPAs and MPAs are equally valid models to use, but for this post, I wanted to explore PWA concepts within the context of a multi-page app.

Reliably fast

You've heard me (and countless others) use the phrase "progressive web app", or PWA. You might already be familiar with some of the background material, elsewhere on this site.

You can think of a PWA as a web app that provides a first-class user experience, and that truly earns a place on the user's home screen. The acronym "FIRE", standing for Fast, Integrated, Reliable, and Engaging, sums up all the attributes to think about when building a PWA.

In this article, I'm going to focus on a subset of those attributes: Fast and Reliable.

Fast: While "fast" means different things in different contexts, I'm going to cover the speed benefits of loading as little as possible from the network.

Reliable: But raw speed isn't enough. In order to feel like a PWA, your web app should be reliable. It needs to be resilient enough to always load something, even if it's just a customized error page, regardless of the state of the network.

Reliably fast: And finally, I'm going to rephrase the PWA definition slightly and look at what it means to build something that's reliably fast. It's not good enough to be fast and reliable only when you're on a low-latency network. Being reliably fast means that your web app's speed is consistent, regardless of the underlying network conditions.

Enabling Technologies: Service Workers + Cache Storage API

PWAs introduce a high bar for speed and resilience. Fortunately, the web platform offers some building blocks to make that type of performance a reality. I'm referring to service workers and the Cache Storage API.

You can build a service worker that listens for incoming requests, passing some on to the network, and storing a copy of the response for future use, via the Cache Storage API.

A service worker using the Cache Storage API to save a copy of a
          network response.

The next time the web app makes the same request, its service worker can check its caches and just return the previously cached response.

A service worker using the Cache Storage API to respond, bypassing
          the network.

Avoiding the network whenever possible is a crucial part of offering reliably fast performance.
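
As a rough illustration of that pattern (not the Workbox-based approach I describe later), a hand-rolled fetch handler might look something like this; the cache name is arbitrary:

self.addEventListener('fetch', (event) => {
  event.respondWith((async () => {
    const cache = await caches.open('runtime-cache');
    const cached = await cache.match(event.request);
    if (cached) {
      // Return the previously stored copy, skipping the network entirely.
      return cached;
    }
    const response = await fetch(event.request);
    // Store a copy of the network response for next time.
    cache.put(event.request, response.clone());
    return response;
  })());
});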

"Isomorphic" JavaScript

One more concept that I want to cover is what's sometimes referred to as "isomorphic", or "universal" JavaScript. Simply put, it's the idea that the same JavaScript code can be shared between different runtime environments. When I built my PWA, I wanted to share JavaScript code between my back-end server, and the service worker.

There are lots of valid approaches to sharing code in this way, but my approach was to use ES modules as the definitive source code. I then transpiled and bundled those modules for the server and the service worker using a combination of Babel and Rollup. In my project, files with an .mjs extension contain code that lives in an ES module.
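
As a sketch of what that kind of build setup can look like (the file names here are illustrative, not my project's actual configuration):

// rollup.config.js: bundle shared ES modules into a CommonJS build
// that the Node-based server can require().
import babel from 'rollup-plugin-babel';

export default {
  input: 'src/lib/templates.mjs', // hypothetical entry point
  output: {
    file: 'build/lib/templates.js',
    format: 'cjs',
  },
  plugins: [babel()],
};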

The server

Keeping those concepts and terminology in mind, let's dive into how I actually built my Stack Overflow PWA. I'm going to start by covering our backend server, and explain how that fits into the overall architecture.

I was looking for a combination of a dynamic backend along with static hosting, and my approach was to use the Firebase platform.

Firebase Cloud Functions will automatically spin up a Node-based environment when there's an incoming request, and integrate with the popular Express HTTP framework, which I was already familiar with. It also offers out-of-the-box hosting for all of my site's static resources. Let's take a look at how the server handles requests.

When a browser makes a navigation request against our server, it goes through the following flow:

An overview of generating a navigation response, server-side.

The server routes the request based on the URL, and uses templating logic to create a complete HTML document. I use a combination of data from the Stack Exchange API, as well as partial HTML fragments that the server stores locally. Once the server knows how to respond, it can start streaming HTML back to the browser.

There are two pieces of this picture worth exploring in more detail: routing, and templating.

Routing

When it comes to routing, my approach was to use the Express framework's native routing syntax. It's flexible enough to match simple URL prefixes, as well as URLs that include parameters as part of the path. Here, I create a mapping between route names and the underlying Express patterns to match against.

const routes = new Map([
  ['about', '/about'],
  ['questions', '/questions/:questionId'],
  ['index', '/'],
]);

export default routes;

I can then reference this mapping directly from the server's code. When there's a match for a given Express pattern, the appropriate handler responds with templating logic specific to the matching route.

import routes from './lib/routes.mjs';
app.get(routes.get('index'), async (req, res) => {
  // Templating logic.
});

Server-side templating

And what does that templating logic look like? Well, I went with an approach that pieced together partial HTML fragments in sequence, one after another. This model lends itself well to streaming.

The server sends back some initial HTML boilerplate immediately, and the browser is able to render that partial page right away. As the server pieces together the rest of the data sources, it streams them to the browser until the document is complete.

To see what I mean, take a look at the Express code for one of our routes:

app.get(routes.get('index'), async (req, res) => {
  res.write(headPartial + navbarPartial);
  const tag = req.query.tag || DEFAULT_TAG;
  const data = await requestData(...);
  res.write(templates.index(tag, data.items));
  res.write(footPartial);
  res.end();
});

By using the response object's write() method, and referencing locally stored partial templates, I'm able to start the response stream immediately, without blocking any external data source. The browser takes this initial HTML and renders a meaningful interface and loading message right away.

The next portion of our page uses data from the Stack Exchange API. Getting that data means that our server needs to make a network request. The web app can't render anything else until it gets a response back and processes it, but at least users aren't staring at a blank screen while they wait.

Once the server receives the response from the Stack Exchange API, it calls a custom templating function to translate the data from the API into its corresponding HTML.

Templating language

Templating can be a surprisingly contentious topic, and what I went with is just one approach among many. You'll want to substitute your own solution, especially if you have legacy ties to an existing templating framework.

What made sense for my use case was to just rely on JavaScript's template literals, with some logic broken out into helper functions. One of the nice things about building an MPA is that you don't have to keep track of state updates and re-render your HTML, so a basic approach that produced static HTML worked for me.

So here's an example of how I'm templating the dynamic HTML portion of my web app's index. As with my routes, the templating logic is stored in an ES module that can be imported into both the server and the service worker.

export function index(tag, items) {
  const title = `<h3>Top "${escape(tag)}" Questions</h3>`;
  const form = `<form method="GET">...</form>`;
  const questionCards = items.map((item) => questionCard({
    id: item.question_id,
    title: item.title,
  })).join('');
  const questions = `<div id="questions">${questionCards}</div>`;
  return title + form + questions;
}

Warning: Whenever you're taking user-provided input and converting it to HTML, it's crucial that you take care to properly escape potentially dangerous character sequences. If you're using an existing templating solution rather than rolling your own, that might already be taken care of for you.

These template functions are pure JavaScript, and it's useful to break out the logic into smaller, helper functions when appropriate. Here, I pass each of the items returned in the API response into one such function, which creates a standard HTML element with all of the appropriate attributes set.

function questionCard({id, title}) {
  return `<a class="card"
             href="/questions/${id}"
             data-cache-url="${questionUrl(id)}">${title}</a>`;
}

Of particular note is a data attribute that I add to each link, data-cache-url, set to the Stack Exchange API URL that I need in order to display the corresponding question. Keep that in mind. I'll revisit it later.

Jumping back to my route handler, once templating is complete, I stream the final portion of my page's HTML to the browser, and end the stream. This is the cue to the browser that the progressive rendering is complete.

app.get(routes.get('index'), async (req, res) => {
  res.write(headPartial + navbarPartial);
  const tag = req.query.tag || DEFAULT_TAG;
  const data = await requestData(...);
  res.write(templates.index(tag, data.items));
  res.write(footPartial);
  res.end();
});

So that's a brief tour of my server setup. Users who visit my web app for the first time will always get a response from the server, but when a visitor returns to my web app, my service worker will start responding. Let's dive in there.

The service worker

An overview of generating a navigation response, in the service
          worker.

This diagram should look familiar—many of the same pieces I've previously covered are here in a slightly different arrangement. Let's walk through the request flow, taking the service worker into account.

Our service worker handles an incoming navigation request for a given URL, and just like my server did, it uses a combination of routing and templating logic to figure out how to respond.

The approach is the same as before, but with different low-level primitives, like fetch() and the Cache Storage API. I use those data sources to construct the HTML response, which the service worker passes back to the web app.

Workbox

Rather than starting from scratch with low-level primitives, I'm going to build my service worker on top of a set of high-level libraries called Workbox. It provides a solid foundation for any service worker's caching, routing, and response generation logic.

Routing

Just as with my server-side code, my service worker needs to know how to match an incoming request with the appropriate response logic.

My approach was to translate each Express route into a corresponding regular expression, making use of a helpful library called regexparam. Once that translation is performed, I can take advantage of Workbox's built-in support for regular expression routing.

After importing the module that has the regular expressions, I register each regular expression with Workbox's router. Inside each route I'm able to provide custom templating logic to generate a response. Templating in the service worker is a bit more involved than it was in my backend server, but Workbox helps with a lot of the heavy lifting.

import regExpRoutes from './regexp-routes.mjs';

workbox.routing.registerRoute(regExpRoutes.get('index'),
  // Templating logic.
);

Static asset caching

One key part of the templating story is making sure that my partial HTML templates are locally available via the Cache Storage API, and are kept up to date when I deploy changes to the web app. Cache maintenance can be error prone when done by hand, so I turn to Workbox to handle precaching as part of my build process.

I tell Workbox which URLs to precache using a configuration file, pointing to the directory that contains all of my local assets along with a set of patterns to match. This file is automatically read by Workbox's CLI, which is run each time I rebuild the site.

module.exports = {
  globDirectory: 'build',
  globPatterns: ['**/*.{html,js,svg}'],
  // Other options...
};

Workbox takes a snapshot of each file's contents, and automatically injects that list of URLs and revisions into my final service worker file. Workbox now has everything it needs to make the precached files always available, and kept up to date. The result is a service-worker.js file that contains something similar to the following:

workbox.precaching.precacheAndRoute([
  {
    url: 'partials/about.html',
    revision: '518747aad9d7e'
  }, {
    url: 'partials/foot.html',
    revision: '69bf746a9ecc6'
  },
  // etc.
]);

For folks who use a more complex build process, Workbox has both a webpack plugin and a generic node module, in addition to its command line interface.

Streaming

Next, I want the service worker to stream that precached partial HTML back to the web app immediately. This is a crucial part of being "reliably fast"—I always get something meaningful on the screen right away. Fortunately, using the Streams API within our service worker makes that possible.

Now, you might have heard about the Streams API before. My colleague Jake Archibald has been singing its praises for years. He made the bold prediction that 2016 would be the year of web streams. And the Streams API is just as awesome today as it was two years ago, but with a crucial difference.

While only Chrome supported Streams back then, the Streams API is more widely supported now. The overall story is positive, and with appropriate fallback code, there's nothing stopping you from using streams in your service worker today.

Well... there might be one thing stopping you, and that's wrapping your head around how the Streams API actually works. It exposes a very powerful set of primitives, and developers who are comfortable using it can create complex data flows, like the following:

const stream = new ReadableStream({
  pull(controller) {
    return sources[0].then((r) => r.read())
    .then((result) => {
      if (result.done) {
        sources.shift();
        if (sources.length === 0) return controller.close();
        return this.pull(controller);
      } else {
        controller.enqueue(result.value);
      }
    })
  }
});

But understanding the full implications of this code might not be for everyone. Rather than parse through this logic, let's talk about my approach to service worker streaming.

I'm using a brand new, high-level wrapper, workbox-streams. With it, I can pass in a mix of streaming sources, both from caches and runtime data that might come from the network. Workbox takes care of coordinating the individual sources and stitching them together into a single, streaming response.

Additionally, Workbox automatically detects whether the Streams API is supported, and when it's not, it creates an equivalent, non-streaming response. This means that you don't have to worry about writing fallbacks, as streams inch closer to 100% browser support.

Runtime caching

Let's check out how my service worker deals with runtime data, from the Stack Exchange API. I'm making use of Workbox's built-in support for a stale-while-revalidate caching strategy, along with expiration to ensure that the web app's storage doesn't grow unbounded.

I set up two strategies in Workbox to handle the different sources that will make up the streaming response. With a few function calls and a bit of configuration, Workbox lets us do what would otherwise take hundreds of lines of handwritten code.

const cacheStrategy = workbox.strategies.cacheFirst({
  cacheName: workbox.core.cacheNames.precache,
});

const apiStrategy = workbox.strategies.staleWhileRevalidate({
  cacheName: API_CACHE_NAME,
  plugins: [
    new workbox.expiration.Plugin({maxEntries: 50}),
  ],
});

The first strategy reads data that's been precached, like our partial HTML templates.

The other strategy implements the stale-while-revalidate caching logic, along with least-recently-used cache expiration once we reach 50 entries.

Now that I have those strategies in place, all that's left is to tell Workbox how to use them to construct a complete, streaming response. I pass in an array of sources as functions, and each of those functions will be executed immediately. Workbox takes the result from each source and streams it to the web app, in sequence, only delaying if the next function in the array hasn't completed yet.

workbox.streams.strategy([
  () => cacheStrategy.makeRequest({request: '/head.html'}),
  () => cacheStrategy.makeRequest({request: '/navbar.html'}),
  async ({event, url}) => {
    const tag = url.searchParams.get('tag') || DEFAULT_TAG;
    const listResponse = await apiStrategy.makeRequest(...);
    const data = await listResponse.json();
    return templates.index(tag, data.items);
  },
  () => cacheStrategy.makeRequest({request: '/foot.html'}),
]);

The first two sources are precached partial templates read directly from the Cache Storage API, so they'll always be available immediately. This ensures that our service worker implementation will be reliably fast in responding to requests, just like my server-side code.

Our next source function fetches data from the Stack Exchange API, and processes the response into the HTML that the web app expects.

The stale-while-revalidate strategy means that if I have a previously cached response for this API call, I'll be able to stream it to the page immediately, while updating the cache entry "in the background" for the next time it's requested.

Finally, I stream a cached copy of my footer and close the final HTML tags, to complete the response.

Sharing code keeps things in sync

You'll notice that certain bits of the service worker code look familiar. The partial HTML and templating logic used by my service worker is identical to what my server-side handler uses. This code sharing ensures that users get a consistent experience, whether they're visiting my web app for the first time or returning to a page rendered by the service worker. That's the beauty of isomorphic JavaScript.

Dynamic, progressive enhancements

I've walked through both the server and service worker for my PWA, but there's one last bit of logic to cover: there's a small amount of JavaScript that runs on each of my pages, after they're fully streamed in.

This code progressively enhances the user experience, but isn't crucial—the web app will still work if it's not run.

Page metadata

My app uses client-side JavaScript to update a page's metadata based on the API response. Because I use the same initial bit of cached HTML for each page, the web app ends up with generic tags in my document's head. But through coordination between my templating and client-side code, I can update the window's title using page-specific metadata.

As part of the templating code, my approach is to include a script tag containing the properly escaped string.

const metadataScript = `<script>
  self._title = '${escape(item.title)}';
</script>`;

Then, once my page has loaded, I read that string and update the document title.

if (self._title) {
  document.title = unescape(self._title);
}

If there are other pieces of page-specific metadata you want to update in your own web app, you can follow the same approach.

Offline UX

The other progressive enhancement I've added is used to bring attention to our offline capabilities. I've built a reliable PWA, and I want users to know that when they're offline, they can still load previously visited pages.

First, I use the Cache Storage API to get a list of all the previously cached API requests, and I translate that into a list of URLs.

Remember those special data attributes I talked about, each containing the URL for the API request needed to display a question? I can cross-reference those data attributes against the list of cached URLs, and create an array of all the question links that don't match.

When the browser enters an offline state, I loop through the list of uncached links, and dim out the ones that won't work. Keep in mind that this is just a visual hint to the user about what they should expect from those pages—I'm not actually disabling the links, or preventing the user from navigating.

const apiCache = await caches.open(API_CACHE_NAME);
const cachedRequests = await apiCache.keys();
const cachedUrls = cachedRequests.map((request) => request.url);

const cards = document.querySelectorAll('.card');
const uncachedCards = [...cards].filter((card) => {
  return !cachedUrls.includes(card.dataset.cacheUrl);
});

const offlineHandler = () => {
  for (const uncachedCard of uncachedCards) {
    uncachedCard.style.opacity = '0.3';
  }
};

const onlineHandler = () => {
  for (const uncachedCard of uncachedCards) {
    uncachedCard.style.opacity = '1.0';
  }
};

window.addEventListener('online', onlineHandler);
window.addEventListener('offline', offlineHandler);

Common pitfalls

I've now gone through a tour of my approach to building a multi-page PWA. There are many factors that you'll have to consider when coming up with your own approach, and you may end up making different choices than I did. That flexibility is one of the great things about building for the web.

There are a few common pitfalls that you may encounter when making your own architectural decisions, and I want to save you some pain.

Don't cache full HTML

I recommend against storing complete HTML documents in your cache. For one thing, it's a waste of space. If your web app uses the same basic HTML structure for each of its pages, you'll end up storing copies of the same markup again and again.

More importantly, if you deploy a change to your site's shared HTML structure, every one of those previously cached pages is still stuck with your old layout. Imagine the frustration of a returning visitor seeing a mix of old and new pages.

Server / service worker drift

The other pitfall to avoid involves your server and service worker getting out of sync. My approach was to use isomorphic JavaScript, so that the same code was run in both places. Depending on your existing server architecture, that's not always possible.

Whatever architectural decisions you make, you should have some strategy for running the equivalent routing and templating code in your server and your service worker.

Worst case scenarios

Inconsistent layout / design

What happens when you ignore those pitfalls? Well, all sorts of failures are possible, but the worst case scenario is that a returning user visits a cached page with a very stale layout—perhaps one with out of date header text, or that uses CSS class names that are no longer valid.

Worst case scenario: Broken routing

Alternatively, a user might come across a URL that's handled by your server, but not your service worker. A site full of zombie layouts and dead ends is not a reliable PWA.

Tips for success

But you're not in this alone! The following tips can help you avoid those pitfalls:

Use templating and routing libraries that have multi-language implementations

Try to use templating and routing libraries that have JavaScript implementations. Now, I know that not every developer out there has the luxury of migrating off your current web server and templating language.

But a number of popular templating and routing frameworks have implementations in multiple languages. If you can find one that works with JavaScript as well as your current server's language, you're one step closer to keeping your service worker and server in sync.

Prefer sequential, rather than nested, templates

Next, I recommend using a series of sequential templates that can be streamed in one after another. It's okay if later portions of your page use more complicated templating logic, as long as you can stream in the initial part of your HTML as quickly as possible.

Cache both static and dynamic content in your service worker

For best performance, you should precache all of your site's critical static resources. You should also set up runtime caching logic to handle dynamic content, like API requests. Using Workbox means that you can build on top of well-tested, production-ready strategies instead of implementing it all from scratch.

Only block on the network when absolutely necessary

And related to that, you should only block on the network when it's not possible to stream a response from the cache. Displaying a cached API response immediately can often lead to a better user experience than waiting for fresh data.

Resources

New in Chrome 67

  • Progressive Web Apps are coming to the desktop
  • The generic sensor API makes it way easier to get access to device sensors like the accelerometer, gyroscope and more.
  • And BigInts make dealing with big integers way easier.

And there’s plenty more!

I’m Pete LePage. Let’s dive in and see what’s new for developers in Chrome 67!

Note: Want the full list of changes? Check out the Chromium source repository change list.

Desktop PWAs

Spotify's desktop progressive web app

Desktop Progressive Web Apps are now supported on Chrome OS 67, and we’ve already started working on support for Mac and Windows. Once installed, they’re launched in the same way as other apps, and run in an app window, without an address bar or tabs. Service workers ensure that they’re fast and reliable, the app window experience makes them feel integrated, and they create an engaging experience for your users.

Getting started isn't any different than what you're already doing today. All of the work you've done for your existing Progressive Web App still applies; you simply need to consider some additional breakpoints.

If your app meets the standard PWA criteria, Chrome will fire the beforeinstallprompt event, but it won’t automatically prompt the user. Instead, save the event; then, add some UI - like an install app button - to your app to tell the user your app can be installed. Then, when the user clicks the button, call prompt on the saved event; Chrome will then show the prompt to the user. If they click add, Chrome will add your PWA to their shelf and launcher.
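
Here's a rough sketch of that flow; the install button is assumed to already exist in your page's markup:

let deferredInstallPrompt = null;
const installButton = document.querySelector('#install-button'); // assumed UI

window.addEventListener('beforeinstallprompt', (event) => {
  // Save the event and surface our own install UI.
  deferredInstallPrompt = event;
  installButton.hidden = false;
});

installButton.addEventListener('click', async () => {
  installButton.hidden = true;
  // Show the browser's install prompt using the saved event.
  deferredInstallPrompt.prompt();
  const choice = await deferredInstallPrompt.userChoice;
  console.log(`User ${choice.outcome} the install prompt`);
  deferredInstallPrompt = null;
});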

Check out my Google I/O talk where Jenny and I go into detail about the technical and special design considerations you need to think about when building a desktop progressive web app.

And, if you want to start playing with this on Mac or Windows - check out the full Desktop Progressive Web App post for details on how to enable support with a flag.

Generic Sensor API

Sensor data is used in many apps to enable experiences like immersive gaming, fitness tracking, and augmented or virtual reality. This data is now available to web apps using the Generic Sensor API.

The API consists of a base Sensor interface with a set of concrete sensor classes built on top. Having a base interface simplifies the implementation and specification process for the concrete sensor classes. For example, the Gyroscope class is super tiny!

const sensor = new Gyroscope({frequency: 500});
sensor.start();

sensor.onreading = () => {
    console.log("X-axis " + sensor.x);
    console.log("Y-axis " + sensor.y);
    console.log("Z-axis " + sensor.z);
};

The core functionality is specified by the base interface, and Gyroscope merely extends it with three attributes representing angular velocity. Chrome 67 supports the accelerometer, gyroscope, orientation sensor, and motion sensor.
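
The Accelerometer class follows the same pattern. This sketch also adds the kind of error handling you’d probably want in practice:

const accelerometer = new Accelerometer({frequency: 60});

accelerometer.addEventListener('reading', () => {
  console.log(`Acceleration along X, Y, Z: ${accelerometer.x}, ${accelerometer.y}, ${accelerometer.z}`);
});

accelerometer.addEventListener('error', (event) => {
  // For example, NotReadableError if the device has no accelerometer,
  // or NotAllowedError if access to the sensor is blocked.
  console.error(`${event.error.name}: ${event.error.message}`);
});

accelerometer.start();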

Intel has put together several demos of the generic sensors API and sample code, and they’ve also updated the Sensors for the Web! post from September with everything you need to know.

BigInts

BigInts are a new numeric primitive in JavaScript that can represent integers with arbitrary precision. Large integer IDs and high-accuracy timestamps can’t be safely represented as Numbers in JavaScript, which often leads to real-world bugs because we end up representing them as strings instead.

let max = Number.MAX_SAFE_INTEGER;
// → 9_007_199_254_740_991
max = max + 1;
// → 9_007_199_254_740_992 - Yay!
max = max + 1;
// → 9_007_199_254_740_992 - Uh, no?

With BigInts, we can safely store and perform integer arithmetic without overflowing. Today, dealing with large integers typically means we have to resort to a library that emulates BigInt-like functionality.

let max = BigInt(Number.MAX_SAFE_INTEGER);
// → 9_007_199_254_740_991n
max = max + 9n;
// → 9_007_199_254_741_000n - Yay!

When BigInt becomes widely available, we’ll be able to drop these run-time dependencies in favor of native BigInts. Not only is the native implementation faster, it’ll help to reduce load time, parse time, and compile time because we won’t have to load those extra libraries.

And more!

These are just a few of the changes in Chrome 67 for developers, of course, there’s plenty more.

The Credential Management API has been supported since Chrome 51, and provides a framework for creating, retrieving and storing credentials. It did this through two credential types: PasswordCredential and FederatedCredential. The Web Authentication API adds a third credential type, PublicKeyCredential, which allows browsers to authenticate a user with a private/public key pair generated by an authenticator such as a security key, fingerprint reader, or any other device that can authenticate a user. Chrome 67 enables the API using U2F/CTAP 1 authenticators over USB transport on desktop.

Learn more about it in Eiji's Enabling Strong Authentication with WebAuthn post.

Google I/O is a wrap

If you didn’t make it to I/O, or maybe you did but didn’t see all the web talks, check out the Chrome and Web playlist to get caught up on all the latest from Google I/O!

New in DevTools

Be sure to check out New in Chrome DevTools, to learn what’s new in DevTools for Chrome 67.

Subscribe

Then, click the subscribe button on our YouTube channel, and you’ll get an email notification whenever we launch a new video, or add our RSS feed to your feed reader.

I’m Pete LePage, and as soon as Chrome 68 is released, I’ll be right here to tell you -- what’s new in Chrome!

Fresher service workers, by default


tl;dr

Starting in Chrome 68, HTTP requests that check for updates to the service worker script will no longer be fulfilled by the HTTP cache by default. This works around a common developer pain point, in which inadvertently setting a Cache-Control: header on your service worker script could lead to delayed updates.

If you've already opted-out of HTTP caching for your /service-worker.js script by serving it with Cache-Control: max-age=0, then you shouldn't see any changes due to the new default behavior.

Background

Every time you navigate to a new page that's under a service worker's scope, explicitly call registration.update() from JavaScript, or a service worker is "woken up" via a push or sync event, the browser will, in parallel, request the JavaScript resource that was originally passed to the navigator.serviceWorker.register() call, to check for updates to the service worker script.
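
For instance, a minimal sketch of triggering that update check manually from the page, assuming a service worker is already registered:

navigator.serviceWorker.ready.then((registration) => {
  // Asks the browser to re-fetch the service worker script and check for updates.
  registration.update();
});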

For the purposes of this article, let's assume its URL is /service-worker.js and that it contains a single call to importScripts(), which loads additional code that's run inside the service worker:

// Inside our /service-worker.js file:
importScripts('path/to/import.js');

// Other top-level code goes here.

What's changing?

Prior to Chrome 68, the update request for /service-worker.js would be made via the HTTP cache (as most fetches are). This meant if the script was originally sent with Cache-Control: max-age=600, updates within the next 600 seconds (10 minutes) would not go to the network, so the user may not receive the most up-to-date version of the service worker. However, if max-age was greater than 86400 (24 hours), it would be treated as if it were 86400, to avoid users being stuck with a particular version forever.

Starting in 68, the HTTP cache will be ignored when requesting updates to the service worker script, so existing web applications may see an increase in the frequency of requests for their service worker script. Requests for importScripts will still go via the HTTP cache. But this is just the default: a new registration option, updateViaCache, is available that offers control over this behavior.

updateViaCache

Developers can now pass in a new option when calling navigator.serviceWorker.register(): the updateViaCache parameter. It takes one of three values: 'imports', 'all', or 'none'.

The values determine if and how the browser's standard HTTP cache comes into play when making the HTTP request to check for updated service worker resources.

  • When set to 'imports', the HTTP cache will never be consulted when checking for updates to the /service-worker.js script, but will be consulted when fetching any imported scripts (path/to/import.js, in our example). This is the default, and it matches the behavior starting in Chrome 68.

  • When set to 'all', the HTTP cache will be consulted when making requests for both the top-level /service-worker.js script, as well as any scripts imported inside of the service worker, like path/to/import.js. This option corresponds to the previous behavior in Chrome, prior to Chrome 68.

  • When set to 'none', the HTTP cache will not be consulted when making requests for either the top-level /service-worker.js or for any imported scripts, such as the hypothetical path/to/import.js.

For example, the following code will register a service worker, and ensure that the HTTP cache is never consulted when checking for updates to either the /service-worker.js script, or for any scripts that are referenced via importScripts() inside of /service-worker.js:

if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('/service-worker.js', {
    updateViaCache: 'none',
    // Optionally, set 'scope' here, if needed.
  });
}

What do developers need to do?

If you've effectively opted-out of HTTP caching for your /service-worker.js script by serving it with Cache-Control: max-age=0 (or a similar value), then you shouldn't see any changes due to the new default behavior.

If you do serve your /service-worker.js script with HTTP caching enabled, either intentionally or because it's just the default for your hosting environment, you may start seeing an uptick of additional HTTP requests for /service-worker.js made against your server; these are requests that used to be fulfilled by the HTTP cache. If you want to continue allowing the Cache-Control header value to influence the freshness of your /service-worker.js, you'll need to start explicitly setting updateViaCache: 'all' when registering your service worker.
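
A minimal sketch of opting back into the pre-Chrome 68 behavior looks like this:

if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('/service-worker.js', {
    // Consult the HTTP cache for both the top-level script and its imports.
    updateViaCache: 'all',
  });
}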

Given that there may be a long-tail of users on older browser versions, it's still a good idea to continue setting the Cache-Control: max-age=0 HTTP header on service worker scripts, even though newer browsers might ignore them.

Developers can use this opportunity to decide whether they want to explicitly opt their imported scripts out of HTTP caching now, and add in updateViaCache: 'none' to their service worker registration if appropriate.

Further reading

"The Service Worker Lifecycle" and "Caching best practices & max-age gotchas", both by Jake Archibald, are recommended reading for all developers who deploy anything to the web.

Changes to Add to Home Screen Behavior


Since we first launched the add to home screen banner, we’ve been working to label Progressive Web Apps more clearly and simplify the way users can install them. Our eventual goal is to provide an install button in the omnibox across all platforms, and in Chrome 68 we are making changes towards that goal.

What’s changing?

Starting in Chrome 68 on Android (Beta in June 2018), Chrome will no longer show the add to home screen banner. If the site meets the add to home screen criteria, Chrome will show the mini-infobar. Then, if the user clicks on the mini-infobar, or you call prompt() on the beforeinstallprompt event from within a user gesture, Chrome will show a modal add to home screen dialog.

A2HS banner
Chrome 67 and before

Shown automatically when the site meets the add to home screen criteria, and the site does not call preventDefault() on the beforeinstallprompt event

OR

Shown by calling prompt() on the beforeinstallprompt event.

Mini-infobar
Chrome 68 and later

Shown when the site meets the add to home screen criteria

If dismissed by a user, it will not be shown until a sufficient period of time (~3 months) has passed.

Shown regardless of whether preventDefault() was called on the beforeinstallprompt event.

This UI treatment will be removed in a future version of Chrome when the omnibox install button is introduced.

 
A2HS Dialog

Shown by calling prompt() from within a user gesture on the beforeinstallprompt event in Chrome 68 and later.

OR

Shown when a user taps the mini-infobar in Chrome 68 and later.

OR

Shown after the user clicks 'Add to Home screen' from the Chrome menu in all Chrome versions.

The mini-infobar


The mini-infobar is a Chrome UI component and is not controllable by the site, but it can be easily dismissed by the user. Once dismissed, it will not appear again until a sufficient amount of time has passed (currently 3 months). The mini-infobar will appear when the site meets the add to home screen criteria, regardless of whether or not you call preventDefault() on the beforeinstallprompt event.

Early concept of the install button in the omnibox
The mini-infobar is an interim experience for Chrome on Android as we work towards creating a consistent experience across all platforms that includes an install button into the omnibox.
Triggering the add to home screen dialog
Install button on a Desktop Progressive Web App

Instead of prompting the user on page load (an anti-pattern for permission requests), you can indicate that your app can be installed with some UI, which will then show the modal install prompt. For example, this desktop PWA adds an ‘Install App’ button just above the user's profile name.

Prompting to install your app on a user gesture feels less spammy to the user and increases the likelihood that they’ll click ‘Add’ instead of ‘Cancel’. Incorporating an Install button into your app means that even if the user chooses not to install your app today, the button will still be there tomorrow, or whenever they’re ready to install.

Listening for the beforeinstallprompt event

If your site meets the add to home screen criteria, Chrome will fire a beforeinstallprompt event. Save a reference to the event, and update your user interface to indicate that the user can add your app to their home screen.

let installPromptEvent;

window.addEventListener('beforeinstallprompt', (event) => {
  // Prevent Chrome <= 67 from automatically showing the prompt
  event.preventDefault();
  // Stash the event so it can be triggered later.
  installPromptEvent = event;
  // Update the install UI to notify the user app can be installed
  document.querySelector('#install-button').disabled = false;
});

Note: Your site must meet the add to home screen criteria in order for the beforeinstallprompt event to be fired and your app installed.

The beforeinstallprompt event will not be fired if the app is already installed (see the add to home screen criteria). But if the user later uninstalls the app, the beforeinstallprompt event will again be fired on each page navigation.

Showing the dialog with prompt()

Add to home screen dialog

To show the add to home screen dialog, call prompt() on the saved event from within a user gesture. Chrome will show the modal dialog, prompting the user to add your app to their home screen. Then, listen for the promise returned by the userChoice property of the beforeinstallprompt event. The promise returns an object with an outcome property after the prompt has shown and the user has responded to it.

btnInstall.addEventListener('click', () => {
  // Update the install UI to remove the install button
  document.querySelector('#install-button').disabled = true;
  // Show the modal add to home screen dialog
  installPromptEvent.prompt();
  // Wait for the user to respond to the prompt
  installPromptEvent.userChoice.then((choice) => {
    if (choice.outcome === 'accepted') {
      console.log('User accepted the A2HS prompt');
    } else {
      console.log('User dismissed the A2HS prompt');
    }
    // Clear the saved prompt since it can't be used again
    installPromptEvent = null;
  });
});

Note: Although the beforeinstallprompt event may be fired without a user gesture, calling prompt() requires one.

You can only call prompt() on the deferred event once. If the user clicks cancel on the dialog, you'll need to wait until the beforeinstallprompt event is fired on the next page navigation. Unlike traditional permission requests, clicking cancel will not block future calls to prompt(), because prompt() must be called within a user gesture.

Additional Resources

Check out App Install Banners for more information, including:

  • Details on the beforeinstallprompt event
  • Tracking the user's response to the add home screen prompt
  • Tracking if the app has been installed
  • Determining if your app is running as an installed app

Bring your payment method to the web with the Payment Handler API


What is the Payment Handler API?

The Payment Request API introduced an open, standards-based way to accept payments in a browser. It can collect payment credentials as well as shipping and contact information from the payer through a quick and easy user interface.

The Payment Handler API opens up a whole new ecosystem to payment providers. It allows a web-based payment application (using an installed service worker) to act as a payment method and to be integrated into merchant websites through the standard Payment Request API.

User experience

From a user's point of view, the user experience looks like this:

  1. A user decides to purchase an item and presses the "Buy Now" button on the product detail page.
  2. The Payment Request sheet opens.
  3. The user chooses a payment method (Payment Handlers have a URL listed below the payment method name).
  4. The payment app opens in a separate window where the user authenticates and authorizes the payment.
  5. The payment app window closes and the payment is processed.
  6. The payment is complete and the Payment Request sheet is closed.
  7. The website can display an order confirmation at this point.

Try it yourself here using Chrome 68 beta.

Notice there are three parties involved: an end user, a merchant website, and a payment handler provider.

Merchants' developer experience

For a merchant website, integrating an existing payment app is as easy as adding an entry to supportedMethods (the payment method identifier) and, optionally, accompanying data to the first argument of the Payment Request API. For example, to add a payment app called BobPay with the payment method identifier https://bobpay.xyz/pay, the code would be:

const request = new PaymentRequest([{
  supportedMethods: 'https://bobpay.xyz/pay'
}], {
  total: {
    label: 'total',
    amount: { value: '10', currency: 'USD' }
  }
});

If a service worker that can handle the BobPay payment method is installed, the app will show up in the Payment Request UI and the user can pay by selecting it. In some cases, Chrome will skip ahead to the Payment Handler, providing a swift payment experience!
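
From there, the merchant shows the sheet and finishes the transaction with the standard Payment Request API. A minimal sketch, with error handling omitted and the actual server-side processing left to the merchant and payment provider:

(async () => {
  // Opens the Payment Request sheet listing BobPay (and any other methods).
  const response = await request.show();
  // Send response.details to your server for processing here.
  await response.complete('success');
})();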

Chrome also supports a non-standard feature we call just-in-time (JIT) installation. In our example, this would allow BobPay's trusted Payment Handler to be installed on the fly without the user having visited BobPay's website in advance. Note that the installation only happens after the user explicitly selects BobPay as their payment method within the Payment Request UI. Also note that a payment method (BobPay) can only specify a maximum of one trusted Payment Handler that can be installed just-in-time.

How to build a Payment Handler

To build a payment handler, you'll need to do a little more than just implement the Payment Handler API.

Install a service worker and add payment instruments

The heart of the Payment Handler API is the service worker. On your payment app website, register a service worker and add payment instruments through paymentManager under a ServiceWorkerRegistration object.

(async () => {
  if ('serviceWorker' in navigator) {
    // Register a service worker
    const registration = await navigator.serviceWorker.register(
      // A service worker JS file is separate
      'service-worker.js'
    );
    // Check if Payment Handler is available
    if (!registration.paymentManager) return;

    registration.paymentManager.instruments.set(
      // Payment instrument key can be any string.
      'https://bobpay.xyz',
      // Payment instrument detail
      {
        name: 'Payment Handler Example',
        method: 'https://bobpay.xyz/pay'
      }
    );
  }
})();

To handle actual payment requests, listen to paymentrequest events in the service worker. When one is received, open a separate window and return a payment credential after getting the user's authorization for a payment.

const origin = 'https://bobpay.xyz';
const methodName = `${origin}/pay`;
const checkoutURL = `${origin}/checkout`;
let resolver;
let payment_request_event;

self.addEventListener('paymentrequest', e => {
  // Preserve the event for future use
  payment_request_event = e;
  // You'll need a polyfill for `PromiseResolver`
  // As it's not implemented in Chrome yet.
  resolver = new PromiseResolver();

  e.respondWith(resolver.promise);
  e.openWindow(checkoutURL).then(client => {
    if (client === null) {
      resolver.reject('Failed to open window');
    }
  }).catch(err => {
    resolver.reject(err);
  });
});

self.addEventListener('message', e => {
  console.log('A message received:', e);
  if (e.data === "payment_app_window_ready") {
    sendPaymentRequest();
    return;
  }

  if (e.data.methodName === methodName) {
    resolver.resolve(e.data);
  } else {
    resolver.reject(e.data);
  }
});

// Get the user's authorization

const sendPaymentRequest = () => {
  if (!payment_request_event) return;
  clients.matchAll({
    includeUncontrolled: false,
    type: 'window'
  }).then(clientList => {
    for (let client of clientList) {
      client.postMessage(payment_request_event.total);
    }
  });
}

Identifying a payment app

To identify the payment app from a URL-based payment method identifier (e.g., https://bobpay.xyz/pay), you'll need to include the following declarative materials.

  1. A payment method identifier: points to a payment method manifest.
  2. A payment method manifest: points to a web app manifest and supported origins.
  3. A web app manifest: describes a website that hosts a service worker that handles payment requests.

To learn more about how to implement a payment app, see Quick guide to implementing a payment app with the Payment Handler API.

Example payment methods

Because the Payment Handler API is designed to be flexible enough to accept any kind of payment method, supported methods can include:

  • Bank transfers
  • Cryptocurrencies
  • E-money
  • Carrier billings
  • Merchant's point system
  • Cash on delivery (Merchant's self-served)

Resources

Deprecations and removals in Chrome 68


Deprecate and Remove Negative Brightness Values in Filter

For compliance with the specification, filter's brightness() function no longer accepts negative values.

Chromestatus Tracker | Chromium Bug

Remove document.createTouch

The document.createTouch() method is being removed because the Touch() constructor has been supported since Chrome 48. This follows a long-standing trend in JavaScript APIs of moving away from factory functions and toward constructors. The closely-related document.createTouchList() method is expected to be removed in Chrome 69.
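
A minimal sketch of the constructor-based replacement; the element and coordinates below are placeholders:

// Instead of document.createTouch(), construct a Touch directly.
const touch = new Touch({
  identifier: Date.now(),
  target: document.querySelector('#some-element'),  // placeholder element
  clientX: 100,
  clientY: 100,
});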

Intent to Remove | Chromestatus Tracker | Chromium Bug

Remove Document.selectedStylesheetSet and Document.preferredStylesheetSet

The Document.selectedStylesheetSet and Document.preferredStylesheetSet attributes are removed because they are non-standard and only implemented by Chrome and WebKit. The standard versions of these attributes were removed from the spec in 2016.

Document.styleSheets provides some of the same functionality, though not all. Fortunately, the risk to websites is low, as usage of these items appears to be in the single digits. (See the Intent to Remove for exact numbers.)

Intent to Remove | Chromestatus Tracker | Chromium Bug

WEBGL_compressed_texture_atc

Previously, Chrome provided the AMD_compressed_ATC_texture_atc formats. These formats were widely supported at the time the extension was created. Hardware support has since dwindled to near-zero, with implementation currently possible only on Qualcomm devices. This extension has been rejected by the WebGL Working Group and support for it is now removed from Chrome.

Chromestatus Tracker | Chromium Bug

Chacmool: Augmented reality in Chrome Canary


When preparing for Google I/O, we wanted to highlight the exciting possibilities of augmented reality (AR) on the web. Chacmool is an educational web experience demo we built to show how easily web based AR can help users engage with AR experiences. The web makes AR convenient and accessible everywhere.

We have now enabled this demo on Chrome Canary on ARCore-compatible Android devices with Android O or later. You'll also need to install ARCore. This work relies on a new WebXR proposal (the WebXR Hit Test API), so it is under a flag and intended to stay in Canary as we test and refine the new API proposal with other members of the Immersive Web Community Group. In fact, to access the demo you'll need to enable two flags in chrome://flags: #webxr and #webxr-hit-test. Once you have these both enabled and have restarted Canary, you can check out the Chacmool demo.

The Chacmool AR experience is centered around education, leveraging AR's immersive and interactive nature to help users learn about ancient Chacmool sculptures. Users can place a life size statue in their reality and move around to see the sculpture from various different angles and distances. The immersive nature of AR allows users to freely explore, discover and play with content, just like they can in the real world. When viewing an object in AR, as opposed to seeing it on a flat 2D screen, we are able to get a deep understanding of what we are looking at because we can see it from many different angles and distances using a very intuitive interaction model: walking around the object, and getting physically closer or further away. Also, in this experience, there are annotations placed directly on the sculpture. This enables users to directly connect what is described in text and where those features are on the sculpture.

This demo took about a month to build, leveraging some of the components from the WebXR team's first web based AR demo, WebAR-Article. Information about the sculpture was sourced from its Google Arts & Culture page, and the 3D model was provided by Google Arts & Culture's partner, CyArk. To get the 3D model ready for the web, a combination of Meshlab and Mesh Mixer was used to repair the model and decimate its mesh to decrease its file size. Then Draco, a library for compressing and decompressing 3D geometric meshes and point clouds, was used to reduce the model's file size from 44.3 megabytes to a mere 225 kilobytes. Finally, a web worker is used to load the model on a background thread so the page remains interactive while the model is loaded and decompressed, an operation that would typically cause jank and prevent the page from being scrolled.

We can't stress enough how helpful Chrome's remote debugging tools were: since we were developing on desktop and deploying onto a phone, they gave us a fast iteration cycle between code changes, and Chrome's developer tools for debugging and checking performance are excellent.

Best practices for AR/VR experiences

Most design and engineering guidelines for designing for native AR experiences apply for making web based AR experiences. If you'd like to learn more about general best practices, check out the AR design guidelines we recently released.

In particular, when designing web based AR experiences, it's best to use the entire screen (i.e. go fullscreen, similar to how video players go fullscreen on mobile) when in AR. This prevents users from scrolling the view or getting distracted by other elements on the web page. The transition into AR should be smooth and seamless, showing the camera view before entering AR onboarding (e.g. drawing a reticle). What is important to note about web based AR is that, unlike native, developers do not have access to the camera frame, lighting estimation, anchors, or planes (yet), so it's important that designers and developers keep these limitations in mind when designing a web based AR experience.

In addition, due to the variety of devices used for web experiences, it's important to optimize performance to create the best user experience. So: keep poly counts low, try to get away with as few lights as possible, precompute shadows if possible, and minimize draw calls. When displaying text in AR, use modern (i.e. signed distance field based) text rendering techniques to make sure the text is clear and readable from all distances and angles. When placing annotations, think about the user's gaze as another input and only show annotations when they are relevant (i.e. proximity-based annotations that show up when a user is focused on an area of interest).

Lastly, because this content is web based, it's important to also apply general best design practices for the web. Be sure the site provides a good experience across devices (desktop, tablet, mobile, headset, etc.). Supporting progressive enhancement also means designing for non-AR-capable devices (i.e. showing a 3D model viewer on non-AR devices).

If you are interested in developing your own web-based AR experiences, we have a companion post here that will give more details about how to get started building AR on the Web yourself. (You can also check out the source to the Chacmool demo.) The WebXR Device API is actively in development and we'd love feedback so we can ensure it enables all types of applications and use cases, so please head over to GitHub and join the conversation!

Augmented reality for the web


In Chrome 67, we announced the WebXR Device API for both augmented reality (AR) and virtual reality (VR), though only the VR features were enabled. VR is an experience based purely on what's in a computing device. AR on the other hand allows you to render virtual objects in the real world. To allow placement and tracking of those objects, we just added the WebXR Hit Test API to Chrome Canary, a new method that helps immersive web code place objects in the real world.

Where can I get it?

This API is intended to stay in Canary for the immediate future. We want a protracted testing period because this is a very new API proposal and we want to make sure it's both robust and right for developers.

Aside from Chrome Canary, you'll also need:

  • An ARCore-compatible Android device running Android O or later
  • ARCore installed
  • The #webxr and #webxr-hit-test flags enabled in chrome://flags (restart the browser after enabling them)

With these, you can dive into the demos or try out our codelab.

Note: Some of the Immersive Web Community Group's existing demos, specifically the ones using magic windows, do not work with the WebXR Hit Test turned on. Please excuse our construction debris.

It's just the web

At Google IO this year, we demonstrated augmented reality with an early build of Chrome. I said something repeatedly to developers and non-developers alike during those three days that I wish I had known to put in my immersive web article: "It's just the web."

"What Chrome extension do I need to install?" "There's no extension. It's just the web."

"Do I need a special browser?" "It's just the web."

"What app do I need to install?" "There is no special app. It's just the web."

This may be obvious to you since you're reading this on a website devoted to the web. If you build demonstrations with this new API, prepare for this question. You'll get it a lot.

Speaking of IO, if you want to hear more about the immersive web in general, where it is, and where it's going, check out this video.

What's it useful for?

Augmented reality will be a valuable addition to a lot of existing web pages. For example, it can help people learn on education sites, and allow potential buyers to visualize objects in their home while shopping.

Our demos illustrate this. They allow users to place a life-size representation of an object as if in reality. Once placed, the image stays on the selected surface, appears the size it would be if the actual item were on that surface, and allows the user to move around it as well as closer to it or farther from it. This gives viewers a deeper understanding of the object than is possible with a two-dimensional image.

If you're not sure what I mean by all of that, it will become clear when you use the demos. If you don't have a device that can run the demo, check out the video link at the top of this article.

One thing the demo and video don't show is how AR can convey the size of a real object. The video here shows an educational demo that we built called Chacmool. A companion article describes this demo in detail. The important thing for this discussion is that when you place the Chacmool statue in augmented reality, you're seeing its size as though it were actually in the room with you.

The Chacmool example is educational but it could just as easily be commercial. Imagine a furniture shopping site that lets you place a couch in your living room. The AR application tells you whether the couch fits your space and how it will look next to your other furniture.

Ray casts, hit tests, and reticles

A key problem to solve when implementing augmented reality is how to place objects in a real-world view. The method for doing this is called ray casting. Ray casting means calculating the intersection between the pointer ray and a surface in the real world. That intersection is called a hit and determining whether a hit has occurred is a hit test.

This is a good time to try out the new code sample in Chrome Canary. Before doing anything, double-check that you have the correct flags enabled. Now load the sample and click "Start AR".

Notice a few things. First, the speed meter, which you may recognize from the other immersive samples, shows 30 frames per second instead of 60. This is the rate at which the web page receives images from the camera.

AR runs at 30 frames per second

The AR Hit Test demo

The other thing you should notice is the sunflower image. It moves as you move and snaps to surfaces such as floors and table tops. If you tap the screen, a sunflower will be placed on a surface and a new sunflower will move with your device.

The image that moves with your device, and that attempts to lock to surfaces is called a reticle. A reticle is a temporary image that aids in placing an object in augmented reality. In this demo, the reticle is a copy of the image to be placed. But it doesn't need to be. In the Chacmool demo, for example, it's a rectangular box roughly the same shape as the base of the object being placed.

Down to the code

The Chacmool demo shows what AR might look like in a production app. Fortunately, there is a much simpler demo in the WebXR samples repo. My sample code comes from the AR Hit Test demo in that repository. FYI, I like to simplify code examples for the sake of helping you understand what's going on.

The basics of entering an AR session and running a render loop are the same for AR as they are for VR. You can read my previous article if you're unfamiliar. To be more specific, entering and running an AR session looks almost exactly like entering a VR magic window session. As with a magic window, the session type must be non-exclusive and the frame of reference type must be 'eye-level'.

The new API

Now I'll show you how to use the new API. Recall that in AR, the reticle attempts to find a surface before an item is placed. This is done with a hit test. To do a hit test, call XRSession.requestHitTest(). It looks like this:

xrSession.requestHitTest(origin, direction, frameOfReference)
.then(xrHitResult => {
  //
});

The three arguments to this method represent a ray cast. The ray cast is defined by an origin point and a direction vector, plus the frame of reference those values are calculated from (frameOfReference). The origin and direction are both 3D vectors. Regardless of what values you submit, they will be normalized (converted) to a length of 1.
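
For example, here is a minimal sketch that casts a ray straight out from the viewer; the vectors are illustrative values, while the real sample computes them from the device pose:

// Origin at the device, direction pointing forward in view space.
const rayOrigin = new Float32Array([0, 0, 0]);
const rayDirection = new Float32Array([0, 0, -1]);

xrSession.requestHitTest(rayOrigin, rayDirection, xrFrameOfRef)
  .then((results) => {
    if (results.length) {
      // Each result describes a surface the ray intersected.
    }
  });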

Moving the reticle

As you move your device the reticle needs to move with it as it tries to find a location where an object can be placed. This means that the reticle must be redrawn in every frame.

Start with the requestAnimationFrame() callback. As with VR, you need a session and a pose.

function onXRFrame(t, frame) {
  let xrSession = frame.session;
  // The frame of reference, which was set elsewhere, is 'eye-level'.
  // See onSessionStarted() in the sample code for details.
  let xrPose = frame.getDevicePose(xrFrameOfRef);
  if (xrPose && xrPose.poseModelMatrix) {
    // Do the hit test and draw the reticle.
  }
}

Once you have the session and the pose, determine where the ray is casting. The sample code uses the gl-matrix math library. But gl-matrix is not a requirement. The important thing is knowing what you're calculating with it and that it is based on the position of the device. Retrieve the device position from XRPose.poseModalMatrix. With your ray cast in hand, call requestHitTest().

function onXRFrame(t, frame) {
  let xrSession = frame.session;
  // The frame of reference, which was set elsewhere, is 'eye-level'.
  // See onSessionStarted() in the sample code for details.
  let xrPose = frame.getDevicePose(xrFrameOfRef);
  if (xrPose && xrPose.poseModelMatrix) {
    // Calculate the origin and direction for the raycast.
    xrSession.requestHitTest(rayOrigin, rayDirection, xrFrameOfRef)
    .then((results) => {
      if (results.length) {
        // Draw for each view.
      }
    });
  }
  xrSession.requestAnimationFrame(onXRFrame);
}

Though not as obvious in the hit test sample, you still need to loop through the views to draw the scene. Drawing is done using WebGL APIs. You can do that directly if you're really ambitious, though we recommend using a framework. The immersive web samples use one created just for the demos called Cottontail, and Three.js has supported WebXR since May.

Placing an object

An object is placed in AR when the user taps the screen. For that you use the select event. The important thing in this step is knowing where to place it. Since the moving reticle gives you a constant source of hit tests, the simplest way to place an object is to draw it at the location of the reticle at the last hit test. If you need to (say, because you have a legitimate reason not to show a reticle), you can call requestHitTest() in the select event, as shown in the sample.

Conclusion

The best way to get a handle on this is to step through the sample code or try out the codelab. I hope I've given you enough background to make sense of both.

We're not done building immersive web APIs, not by a long shot. We'll publish new articles here as we make progress.

AudioWorklet Design Pattern


The previous article on AudioWorklets detailed the basic concepts and usage. Since its launch in Chrome 66 there have been many requests for more examples of how it can be used in actual applications. The AudioWorklet unlocks the full potential of WebAudio, but taking advantage of it can be challenging because it requires understanding concurrent programming wrapped with several JS APIs. Even for developers who are familiar with WebAudio, integrating the AudioWorklet with other APIs (e.g. WebAssembly) can be difficult.

This article will give the reader a better understanding of how to use the AudioWorklet in real-world settings and offer tips for drawing on its fullest power. Be sure to check out the code examples and live demos as well!

Recap: AudioWorklet

Before diving in, let's quickly recap terms and facts around the AudioWorklet system which was previously introduced in this post.

  • BaseAudioContext: Web Audio API's primary object.
  • AudioWorklet: A special script file loader for the AudioWorklet operation. Belongs to BaseAudioContext. A BaseAudioContext can have one AudioWorklet. The loaded script file is evaluated in the AudioWorkletGlobalScope and is used to create the AudioWorkletProcessor instances.
  • AudioWorkletGlobalScope: A special JS global scope for the AudioWorklet operation. Runs on a dedicated rendering thread for WebAudio. A BaseAudioContext can have one AudioWorkletGlobalScope.
  • AudioWorkletNode: An AudioNode designed for the AudioWorklet operation. Instantiated from a BaseAudioContext. A BaseAudioContext can have multiple AudioWorkletNodes, similarly to the native AudioNodes.
  • AudioWorkletProcessor: A counterpart of the AudioWorkletNode. The actual guts of the AudioWorkletNode, processing the audio stream with user-supplied code. It is instantiated in the AudioWorkletGlobalScope when an AudioWorkletNode is constructed. An AudioWorkletNode can have one matching AudioWorkletProcessor.

Design Patterns

Using AudioWorklet with WebAssembly

WebAssembly is a perfect companion for AudioWorkletProcessor. The combination of these two features brings a variety of advantages to audio processing on the web, but the two biggest benefits are: a) bringing existing C/C++ audio processing code into the WebAudio ecosystem and b) avoiding the overhead of JS JIT compilation and garbage collection in the audio processing code.

The former is important to developers with an existing investment in audio processing code and libraries, but the latter is critical for nearly all users of the API. In the world of WebAudio, the timing budget for a stable audio stream is quite demanding: it is only about 3 ms at a sample rate of 44.1 kHz. Even a slight hiccup in the audio processing code can cause glitches. The developer must optimize the code for faster processing, but also minimize the amount of JS garbage being generated. Using WebAssembly can be a solution that addresses both problems at the same time: it is faster and generates no garbage from the code.

The next section describes how WebAssembly can be used with an AudioWorklet, and the accompanying code example can be found here. For a basic tutorial on how to use Emscripten and WebAssembly (especially the Emscripten glue code), please take a look at this article.

Setting Up

It all sounds great, but we need a bit of structure to set things up properly. The first design question to ask is how and where to instantiate a WebAssembly module. After fetching Emscripten's glue code, there are two paths for the module instantiation:

  1. Instantiate a WebAssembly module by loading the glue code into the AudioWorkletGlobalScope via audioContext.audioWorklet.addModule().
  2. Instantiate a WebAssembly module in the main scope, then transfer the module via the AudioWorkletNode's constructor options.

The decision largely depends on your design and preference, but the idea is that the WebAssembly module can generate a WebAssembly instance in the AudioWorkletGlobalScope, which becomes an audio processing kernel within an AudioWorkletProcessor instance.

WebAssembly module instantiation pattern A: Using .addModule() call

For pattern A to work correctly, Emscripten needs a couple of options to generate the correct WebAssembly glue code for our configuration:

-s BINARYEN_ASYNC_COMPILATION=0 -s SINGLE_FILE=1 --post-js mycode.js

These options ensure the synchronous compilation of a WebAssembly module in the AudioWorkletGlobalScope. They also append the AudioWorkletProcessor's class definition to mycode.js so it can be loaded after the module is initialized. The primary reason to use synchronous compilation is that the promise resolution of audioWorklet.addModule() does not wait for the resolution of promises in the AudioWorkletGlobalScope. Synchronous loading or compilation in the main thread is not generally recommended because it blocks other tasks in the same thread, but here we can bypass the rule because the compilation happens in the AudioWorkletGlobalScope, which runs off of the main thread. (See this for more info.)

WASM module instantiation pattern B: Using AudioWorkletNode constructor's cross-thread transfer

Pattern B can be useful if asynchronous heavy lifting is required. It utilizes the main thread for fetching the glue code from the server and compiling the module, and then transfers the WASM module via the constructor of AudioWorkletNode. This pattern makes even more sense when you have to load the module dynamically after the AudioWorkletGlobalScope starts rendering the audio stream. Depending on the size of the module, compiling it in the middle of rendering can cause glitches in the stream.

Currently, pattern B is only supported behind an experimental flag because it requires WebAssembly structured cloning. (chrome://flags/#enable-webassembly) If a WASM module should be part of your AudioWorkletNode design, passing it through the AudioWorkletNode constructor can definitely be useful.

WASM Heap and Audio Data

WebAssembly code only works on the memory allocated within a dedicated WASM heap. In order to take advantage of it, the audio data needs to be cloned back and forth between the WASM heap and the audio data arrays. The HeapAudioBuffer class in the example code handles this operation nicely.

HeapAudioBuffer class for the easier usage of WASM heap
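
To illustrate the kind of copying involved, here is a minimal sketch that assumes an Emscripten-generated Module with _malloc and a HEAPF32 view, plus a hypothetical exported _process function; the HeapAudioBuffer class wraps this kind of bookkeeping for you:

const FRAMES = 128;  // one render quantum
const BYTES = Float32Array.BYTES_PER_ELEMENT;

// Allocate space for one channel of one render quantum on the WASM heap.
const heapPtr = Module._malloc(FRAMES * BYTES);
const heapOffset = heapPtr / BYTES;

function processQuantum(channelData) {
  // Clone JS audio data into the WASM heap...
  Module.HEAPF32.set(channelData, heapOffset);
  // ...run the (hypothetical) WASM kernel in place...
  Module._process(heapPtr, FRAMES);
  // ...and clone the result back out.
  channelData.set(Module.HEAPF32.subarray(heapOffset, heapOffset + FRAMES));
}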

There is an early proposal under discussion to integrate the WASM heap directly into the AudioWorklet system. Getting rid of this redundant data cloning between the JS memory and the WASM heap seems natural, but the specific details need to be worked out.

Handling Buffer Size Mismatch

An AudioWorkletNode and AudioWorkletProcessor pair is designed to work like a regular AudioNode; the AudioWorkletNode handles the interaction with other code while the AudioWorkletProcessor takes care of internal audio processing. Because a regular AudioNode processes 128 frames at a time, the AudioWorkletProcessor must do the same to become a first-class citizen. This is one of the advantages of the AudioWorklet design, ensuring that no additional latency due to internal buffering is introduced within the AudioWorkletProcessor, but it can be a problem if a processing function requires a buffer size different from 128 frames. The common solution for such a case is to use a ring buffer, also known as a circular buffer or a FIFO.

Here's a diagram of AudioWorkletProcessor using two ring buffers inside to accommodate a WASM function that takes 512 frames in and out. (The number 512 here is arbitrarily picked.)

Using RingBuffer inside of AudioWorkletProcessor's process() method

The algorithm for the diagram would be as follows (a minimal ring buffer sketch follows the list):

  1. AudioWorkletProcessor pushes 128 frames into the Input RingBuffer from its Input.
  2. Perform the following steps only if the Input RingBuffer has greater than or equal to 512 frames.
    1. Pull 512 frames from the Input RingBuffer.
    2. Process 512 frames with the given WASM function.
    3. Push 512 frames to the Output RingBuffer.
  3. AudioWorkletProcessor pulls 128 frames from the Output RingBuffer to fill its Output.
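
The sketch below shows the general shape of such a ring buffer for a single channel. It is an illustration only, not the RingBuffer class linked at the end of this section:

class SimpleRingBuffer {
  constructor(capacity) {
    this.buffer = new Float32Array(capacity);
    this.writeIndex = 0;
    this.readIndex = 0;
    this.framesAvailable = 0;
  }

  // Push a block of samples (e.g. 128 frames from process()).
  push(samples) {
    for (let i = 0; i < samples.length; i++) {
      this.buffer[this.writeIndex] = samples[i];
      this.writeIndex = (this.writeIndex + 1) % this.buffer.length;
    }
    this.framesAvailable =
        Math.min(this.framesAvailable + samples.length, this.buffer.length);
  }

  // Pull a block of samples (e.g. 512 frames for the WASM function).
  pull(output) {
    for (let i = 0; i < output.length; i++) {
      output[i] = this.buffer[this.readIndex];
      this.readIndex = (this.readIndex + 1) % this.buffer.length;
    }
    this.framesAvailable = Math.max(this.framesAvailable - output.length, 0);
  }
}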

As shown in the diagram, input frames always get accumulated into the Input RingBuffer, which handles buffer overflow by overwriting the oldest frame block in the buffer. That is a reasonable thing to do for a real-time audio application. Similarly, the output frame block will always get pulled by the system. Buffer underflow (not enough data) in the Output RingBuffer will result in silence, causing a glitch in the stream.

This pattern is useful when replacing ScriptProcessorNode (SPN) with AudioWorkletNode. Since SPN allows the developer to pick a buffer size between 256 and 4096 frames, a drop-in substitution of SPN with AudioWorkletNode can be difficult, and using a ring buffer provides a nice workaround. An audio recorder would be a great example of an application that can be built on top of this design.

However, it is important to understand that this design only reconciles the buffer size mismatch; it does not give the script more time to run. If the code cannot finish the task within the timing budget of the render quantum (~3 ms at 44.1 kHz), it will affect the onset timing of the subsequent callback and eventually cause glitches.

Mixing this design with WebAssembly can be complicated because of memory management around the WASM heap. At the time of writing, the data going in and out of the WASM heap must be cloned, but we can utilize the HeapAudioBuffer class to make memory management slightly easier. The idea of using user-allocated memory to reduce redundant data cloning will be discussed in the future.

The RingBuffer class can be found here.

WebAudio Powerhouse: AudioWorklet and SharedArrayBuffer

Note: SharedArrayBuffer is disabled by default at the time of writing. Go to chrome://flags and enable SharedArrayBuffer to play with this feature.

The last design pattern in this article puts several cutting-edge APIs into one place: AudioWorklet, SharedArrayBuffer, Atomics, and Worker. This non-trivial setup unlocks a path for existing audio software written in C/C++ to run in a web browser while maintaining a smooth user experience.

An overview of the last design pattern: AudioWorklet, SharedArrayBuffer and Worker

The biggest advantage of this design is being able to use a DedicatedWorkerGlobalScope solely for audio processing. In Chrome, WorkerGlobalScope runs on a lower-priority thread than the WebAudio rendering thread, but it has several advantages over AudioWorkletGlobalScope. DedicatedWorkerGlobalScope is less constrained in terms of the API surface available in the scope. Also, you can expect better support from Emscripten because the Worker API has existed for some years.

SharedArrayBuffer plays a critical role for this design to work efficiently. Although both Worker and AudioWorkletProcessor are equipped with asynchronous messaging (MessagePort), it is suboptimal for real-time audio processing because of repetitive memory allocation and messaging latency. So we allocate a memory block up front that can be accessed from both threads for fast bidirectional data transfer.

From a Web Audio API purist's viewpoint, this design might look suboptimal because it uses the AudioWorklet as a simple "audio sink" and does everything in the Worker. But considering that rewriting C/C++ projects in JavaScript can be prohibitively costly or even impossible, this design can be the most efficient implementation path for such projects.

Shared States and Atomics

When using shared memory for audio data, access from both sides must be coordinated carefully. Sharing atomically accessible states is a solution for this problem. We can take advantage of an Int32Array backed by a SharedArrayBuffer (SAB) for this purpose.

Synchronization mechanism: SharedArrayBuffer and Atomics

Each field of the States array represents vital information about the shared buffers. The most important one is the field for synchronization (REQUEST_RENDER). The idea is that the Worker waits for this field to be touched by the AudioWorkletProcessor and processes the audio when it wakes up. Along with SharedArrayBuffer, the Atomics API makes this mechanism possible.

Note that the synchronization of the two threads is rather loose. The onset of Worker.process() is triggered by the AudioWorkletProcessor.process() method, but the AudioWorkletProcessor does not wait until Worker.process() is finished. This is by design; the AudioWorkletProcessor is driven by the audio callback, so it must not be synchronously blocked. In the worst-case scenario the audio stream might suffer from duplicated or dropped frames, but it will eventually recover when the rendering performance stabilizes.
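
A minimal sketch of that handshake, assuming a shared Int32Array of states where index 0 is the hypothetical REQUEST_RENDER field (sharedStatesBuffer and renderAudio are placeholder names; the surrounding setup is omitted):

// Both sides create a view over the same SharedArrayBuffer.
const states = new Int32Array(sharedStatesBuffer);
const REQUEST_RENDER = 0;

// AudioWorkletProcessor side: ask the Worker for more audio.
Atomics.store(states, REQUEST_RENDER, 1);
Atomics.notify(states, REQUEST_RENDER);

// Worker side: sleep until signaled, render, then mark the request handled.
// Atomics.wait is only allowed in a Worker; the audio thread never blocks.
while (true) {
  Atomics.wait(states, REQUEST_RENDER, 0);
  renderAudio();  // hypothetical rendering function
  Atomics.store(states, REQUEST_RENDER, 0);
}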

Setting Up and Running

As shown in the diagram above, this design has several components to arrange: DedicatedWorkerGlobalScope (DWGS), AudioWorkletGlobalScope (AWGS), SharedArrayBuffer and the main thread. The following steps describe what should happen in the initialization phase.

Initialization
  1. [Main] AudioWorkletNode constructor gets called.
    1. Create Worker.
    2. The associated AudioWorkletProcessor will be created.
  2. [DWGS] Worker creates 2 SharedArrayBuffers. (one for shared states and the other for audio data)
  3. [DWGS] Worker sends SharedArrayBuffer references to AudioWorkletNode.
  4. [Main] AudioWorkletNode sends SharedArrayBuffer references to AudioWorkletProcessor.
  5. [AWGS] AudioWorkletProcessor notifies AudioWorkletNode that the setup is completed.

Once the initialization is completed, AudioWorkletProcessor.process() starts to get called. The following is what should happen in each iteration of the rendering loop.

Rendering Loop
Multi-threaded rendering with SharedArrayBuffers
  1. [AWGS] AudioWorkletProcessor.process(inputs, outputs) gets called for every render quantum.
    1. inputs are pushed into the Input SAB.
    2. outputs are filled by consuming audio data in the Output SAB.
    3. The States SAB is updated with new buffer indexes accordingly.
    4. If the Output SAB gets close to the underflow threshold, the Worker is woken to render more audio data.
  2. [DWGS] The Worker waits (sleeps) for the wake signal from AudioWorkletProcessor.process(). When it wakes up, it:
    1. Fetches buffer indexes from the States SAB.
    2. Runs the process function with data from the Input SAB to fill the Output SAB.
    3. Updates the States SAB with buffer indexes accordingly.
    4. Goes to sleep and waits for the next signal.

The example code can be found here, but note that the SharedArrayBuffer experimental flag must be enabled for this demo to work. The code was written in pure JS for simplicity, but it can be replaced with WebAssembly code if needed. Such a case should be handled with extra care by wrapping memory management with the HeapAudioBuffer class.

Conclusion

The ultimate goal of the AudioWorklet is to make the Web Audio API truly "extensible". A multi-year effort went into its design to make it possible to implement the rest of the Web Audio API with the AudioWorklet. In turn, its design now has higher complexity, and this can be an unexpected challenge.

Fortunately, the reason for such complexity is purely to empower developers. Being able to run WebAssembly on AudioWorkletGlobalScope unlocks huge potential for high-performance audio processing on the web. For large-scale audio applications written in C or C++, using an AudioWorklet with SharedArrayBuffers and Workers can be an attractive option to explore.

Credits

Special thanks to Chris Wilson, Jason Miller, Joshua Bell and Raymond Toy for reviewing a draft of this article and giving insightful feedback.
