JAVASCRIPT CHAPTER 4 2018-12-19T12:51:53+00:00


Topics:- (Project: A Programming Language, The Web, HTML, HTML and JavaScript, In the sandbox, Document structure, Trees, The standard, Finding the elements, Changing the document, Creating nodes, Attributes, Layouts, Styling, Query Selector, Positioning and animating, Event handlers, Event objects, Propagation, Default actions, Events, Timers, Debouncing)

Project: A Programming Language

Building your own programming language is surprisingly easy (as long as you do not aim too high) and very enlightening. The main thing we want to show in this chapter is that there is no magic involved in building your own language. We’ve often felt that some human inventions were so immensely clever and complicated that we’d never be able to understand them. But with a little reading and experimenting, they often turn out to be quite mundane. We will build a programming language called Egg. It will be a tiny, simple language—but one that is powerful enough to express any computation you can think of. It will allow simple abstraction based on functions.


The most immediately visible part of a programming language is its syntax, or notation. A parser is a program that reads a piece of text and produces a data structure that reflects the structure of the program contained in that text. If the text does not form a valid program, the parser should point out the error. Our language will have a simple and uniform syntax. Everything in Egg is an expression. An expression can be the name of a binding, a number, a string, or an application. Applications are used for function calls but also for constructs such as if or while.

To keep the parser simple, strings in Egg do not support anything like backslash escapes. A string is simply a sequence of characters that are not double quotes, wrapped in double quotes. A number is a sequence of digits. Binding names can consist of any character that is not whitespace and that does not have a special meaning in the syntax. Applications are written the way they are in JavaScript, by putting parentheses after an expression and having any number of arguments between those parentheses, separated by commas.

do(define(x, 10),
   if(>(x, 5),
      print("large"),
      print("small")))

The uniformity of the Egg language means that things that are operators in JavaScript (such as >) are normal bindings in this language, applied just like other functions. And since the syntax has no concept of a block, we need a do construct to represent doing multiple things in sequence. The data structure that the parser will use to describe a program consists of expression objects, each of which has a type property indicating the kind of expression it is and other properties to describe its content.

Expressions of type “value” represent literal strings or numbers. Their value property contains the string or number value that they represent. Expressions of type “word” are used for identifiers (names). Such objects have a name property that holds the identifier’s name as a string. Finally, “apply” expressions represent applications. They have an operator property that refers to the expression that is being applied, as well as an args property that holds an array of argument expressions. The >(x, 5) part of the previous program would be represented like this:


{
  type: "apply",
  operator: {type: "word", name: ">"},
  args: [
    {type: "word", name: "x"},
    {type: "value", value: 5}
  ]
}



Such a data structure is called a syntax tree. If you imagine the objects as dots and the links between them as lines between those dots, it has a treelike shape. The fact that expressions contain other expressions, which in turn might contain more expressions, is similar to the way tree branches split and split again.
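Because every node in such a tree is itself an expression object, code that processes the tree tends to be recursive as well. The following is a minimal sketch and not part of the interpreter built here: countNodes is a hypothetical helper that counts the expression objects in a tree, applied to a hand-written copy of the >(x, 5) structure shown above.

```javascript
// Hypothetical helper (not from the original text): counts every
// expression object in an Egg syntax tree by following the same links
// (operator and args) that give the tree its shape.
function countNodes(expr) {
  if (expr.type === "apply") {
    let count = 1 + countNodes(expr.operator);
    for (const arg of expr.args) count += countNodes(arg);
    return count;
  }
  return 1; // "value" and "word" expressions are leaves
}

// The tree for >(x, 5), written out by hand.
const greaterThan = {
  type: "apply",
  operator: {type: "word", name: ">"},
  args: [
    {type: "word", name: "x"},
    {type: "value", value: 5}
  ]
};

console.log(countNodes(greaterThan));
// → 4
```

The count is four: the application itself, its operator, and its two arguments.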

Contrast this to the parser we wrote for the configuration file format, which had a simple structure: it split the input into lines and handled those lines one at a time. There were only a few simple forms that a line was allowed to have. Here we must find a different approach. Expressions are not separated into lines, and they have a recursive structure. Application expressions contain other expressions. Fortunately, this problem can be solved very well by writing a parser function that is recursive in a way that reflects the recursive nature of the language.

We define a function parseExpression, which takes a string as input and returns an object containing the data structure for the expression at the start of the string, along with the part of the string left after parsing this expression. When parsing subexpressions (the argument to an application, for example), this function can be called again, yielding the argument expression as well as the text that remains. This text may in turn contain more arguments or may be the closing parenthesis that ends the list of arguments. This is the first part of the parser:

function parseExpression(program) {
  program = skipSpace(program);
  let match, expr;
  if (match = /^"([^"]*)"/.exec(program)) {
    // A string: anything but double quotes, wrapped in double quotes
    expr = {type: "value", value: match[1]};
  } else if (match = /^\d+\b/.exec(program)) {
    // A number: a sequence of digits
    expr = {type: "value", value: Number(match[0])};
  } else if (match = /^[^\s(),"]+/.exec(program)) {
    // A word: any run of characters without special meaning
    expr = {type: "word", name: match[0]};
  } else {
    throw new SyntaxError("Unexpected syntax: " + program);
  }

  return parseApply(expr, program.slice(match[0].length));
}

function skipSpace(string) {
  let first = string.search(/\S/);
  if (first == -1) return "";
  return string.slice(first);
}


Because Egg, like JavaScript, allows any amount of whitespace between its elements, we have to repeatedly cut the whitespace off the start of the program string. That is what the skipSpace function helps with. After skipping any leading space, parseExpression uses three regular expressions to spot the three atomic elements that Egg supports: strings, numbers, and words. The parser constructs a different kind of data structure depending on which one matches. If the input does not match one of these three forms, it is not a valid expression, and the parser throws an error. We use SyntaxError instead of Error as the exception constructor, which is another standard error type, because it is a little more specific—it is also the error type thrown when an attempt is made to run an invalid JavaScript program. We then cut off the part that was matched from the program string and pass that, along with the object for the expression, to parseApply, which checks whether the expression is an application. If so, it parses a parenthesized list of arguments.

function parseApply(expr, program) {
  program = skipSpace(program);
  if (program[0] != "(") {
    return {expr: expr, rest: program};
  }

  program = skipSpace(program.slice(1));
  expr = {type: "apply", operator: expr, args: []};
  while (program[0] != ")") {
    let arg = parseExpression(program);
    expr.args.push(arg.expr);
    program = skipSpace(arg.rest);
    if (program[0] == ",") {
      program = skipSpace(program.slice(1));
    } else if (program[0] != ")") {
      throw new SyntaxError("Expected ',' or ')'");
    }
  }
  return parseApply(expr, program.slice(1));
}


If the next character in the program is not an opening parenthesis, this is not an application, and parseApply returns the expression it was given. Otherwise, it skips the opening parenthesis and creates the syntax tree object for this application expression. It then recursively calls parseExpression to parse each argument until a closing parenthesis is found. The recursion is indirect, through parseApply and parseExpression calling each other. Because an application expression can itself be applied (such as in multiplier(2)(1)), parseApply must, after it has parsed an application, call itself again to check whether another pair of parentheses follows.

This is all we need to parse Egg. We wrap it in a convenient parse function that verifies that it has reached the end of the input string after parsing the expression (an Egg program is a single expression), and that gives us the program’s data structure.

function parse(program) {
  let {expr, rest} = parseExpression(program);
  if (skipSpace(rest).length > 0) {
    throw new SyntaxError("Unexpected text after program");
  }
  return expr;
}

console.log(parse("+(a, 10)"));
// → {type: "apply",
//    operator: {type: "word", name: "+"},
//    args: [{type: "word", name: "a"},
//           {type: "value", value: 10}]}

It works! It doesn’t give us very helpful information when it fails and doesn’t store the line and column on which each expression starts, which might be helpful when reporting errors later, but it’s good enough for our purposes.
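If we did want positions in error messages, one possible approach is to turn a character offset into a line and column number. The sketch below assumes a hypothetical helper named positionAt; it is not part of the parser above. Since the parser keeps slicing the program string, an offset could be recovered as the length of the original text minus the length of the text that remains.

```javascript
// Hypothetical helper: converts a character offset in the source text
// into a 1-based line and column, the kind of information a parser
// could attach to each expression for use in error messages.
function positionAt(source, offset) {
  let line = 1, column = 1;
  for (let i = 0; i < offset && i < source.length; i++) {
    if (source[i] === "\n") {
      line++;
      column = 1;
    } else {
      column++;
    }
  }
  return {line, column};
}

const source = "do(define(x, 10),\n   print(x))";
console.log(positionAt(source, source.indexOf("print")));
// → {line: 2, column: 4}
```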

The evaluator

What can we do with the syntax tree for a program? Run it, of course! And that is what the evaluator does. You give it a syntax tree and a scope object that associates names with values, and it will evaluate the expression that the tree represents and return the value that this produces.

const specialForms = Object.create(null);

function evaluate(expr, scope) {
  if (expr.type == "value") {
    return expr.value;
  } else if (expr.type == "word") {
    if (expr.name in scope) {
      return scope[expr.name];
    } else {
      throw new ReferenceError(
        `Undefined binding: ${expr.name}`);
    }
  } else if (expr.type == "apply") {
    let {operator, args} = expr;
    if (operator.type == "word" &&
        operator.name in specialForms) {
      // Special forms receive their arguments unevaluated
      return specialForms[operator.name](expr.args, scope);
    } else {
      let op = evaluate(operator, scope);
      if (typeof op == "function") {
        return op(...args.map(arg => evaluate(arg, scope)));
      } else {
        throw new TypeError("Applying a non-function.");
      }
    }
  }
}





The evaluator has code for each of the expression types. A literal value expression produces its value. (For example, the expression 100 just evaluates to the number 100.) For a binding, we must check whether it is actually defined in the scope and, if it is, fetch the binding’s value. Applications are more involved. If they are a special form, like if, we do not evaluate anything and pass the argument expressions, along with the scope, to the function that handles this form. If it is a normal call, we evaluate the operator, verify that it is a function, and call it with the evaluated arguments.

We use plain JavaScript function values to represent Egg’s function values. We will come back to this later, when the special form called fun is defined. The recursive structure of evaluate resembles the similar structure of the parser, and both mirror the structure of the language itself. It would also be possible to integrate the parser with the evaluator and evaluate during parsing, but splitting them up this way makes the program clearer. This is really all that is needed to interpret Egg. It is that simple. But without defining a few special forms and adding some useful values to the environment, you can’t do much with this language yet.
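The distinction between special forms and normal calls is easiest to see in plain JavaScript. The following sketch is only an illustration with made-up names (branch, eagerIf, lazyIf). A conditional written as an ordinary function has both of its branches evaluated before it even starts running, while a version that receives its branches unevaluated, here as functions standing in for argument expressions, runs only one of them.

```javascript
// Records which branches actually get evaluated.
const evaluated = [];
function branch(label, value) {
  evaluated.push(label);
  return value;
}

// As an ordinary function: JavaScript evaluates both branch arguments
// before eagerIf is called.
function eagerIf(condition, thenValue, elseValue) {
  return condition !== false ? thenValue : elseValue;
}
eagerIf(true, branch("then", 1), branch("else", 2));
console.log(evaluated);
// → ["then", "else"]

// In the style of a special form: the branches arrive unevaluated and
// only the chosen one is run.
evaluated.length = 0;
function lazyIf(condition, thenBranch, elseBranch) {
  return condition !== false ? thenBranch() : elseBranch();
}
lazyIf(true, () => branch("then", 1), () => branch("else", 2));
console.log(evaluated);
// → ["then"]
```

This is the reason evaluate hands the raw argument expressions, together with the scope, to the functions stored in specialForms.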

Special forms

The specialForms object is used to define special syntax in Egg. It associates words with functions that evaluate such forms. It is currently empty. Let’s add if.

specialForms.if = (args, scope) => {
  if (args.length != 3) {
    throw new SyntaxError("Wrong number of args to if");
  } else if (evaluate(args[0], scope) !== false) {
    return evaluate(args[1], scope);
  } else {
    return evaluate(args[2], scope);
  }
};



Egg’s if construct expects exactly three arguments. It will evaluate the first, and if the result isn’t the value false, it will evaluate the second. Otherwise, the third gets evaluated. This if form is more similar to JavaScript’s ternary ?: operator than to JavaScript’s if. It is an expression, not a statement, and it produces a value, namely, the result of the second or third argument. Egg also differs from JavaScript in how it handles the condition value to if. It will not treat things like zero or the empty string as false, only the precise value false.

The reason we need to represent if as a special form, rather than a regular function, is that all arguments to functions are evaluated before the function is called, whereas if should evaluate only either its second or its third argument, depending on the value of the first. The while form is similar.

specialForms.while = (args, scope) => {
  if (args.length != 2) {
    throw new SyntaxError("Wrong number of args to while");
  }
  while (evaluate(args[0], scope) !== false) {
    evaluate(args[1], scope);
  }

  // Since undefined does not exist in Egg, we return false,
  // for lack of a meaningful result.
  return false;
};


Another basic building block is do, which executes all its arguments from top to bottom. Its value is the value produced by the last argument.

specialForms.do = (args, scope) => {
  let value = false;
  for (let arg of args) {
    value = evaluate(arg, scope);
  }
  return value;
};


To be able to create bindings and give them new values, we also create a form called define. It expects a word as its first argument and an expression producing the value to assign to that word as its second argument. Since define, like everything, is an expression, it must return a value. We’ll make it return the value that was assigned (just like JavaScript’s = operator).

specialForms.define = (args, scope) => {
  if (args.length != 2 || args[0].type != "word") {
    throw new SyntaxError("Incorrect use of define");
  }
  let value = evaluate(args[1], scope);
  scope[args[0].name] = value;
  return value;
};


The environment

The scope accepted by evaluate is an object with properties whose names correspond to binding names and whose values correspond to the values those bindings are bound to. Let’s define an object to represent the global scope. To be able to use the if construct we just defined, we must have access to Boolean values. Since there are only two Boolean values, we do not need special syntax for them. We simply bind two names to the values true and false and use them.

const topScope = Object.create(null);

topScope.true = true;

topScope.false = false;

We can now evaluate a simple expression that negates a Boolean value.

let prog = parse(`if(true, false, true)`);

console.log(evaluate(prog, topScope));

// → false

To supply basic arithmetic and comparison operators, we will also add some function values to the scope. In the interest of keeping the code short, we’ll use Function to synthesize a bunch of operator functions in a loop, instead of defining them individually.

for (let op of ["+", "-", "*", "/", "==", "<", ">"]) {
  topScope[op] = Function("a, b", `return a ${op} b;`);
}
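For readers who have not used Function this way, here is a small illustration of what such a loop produces. The names synthesized, written, and lessThan are made up for this example. Function takes a parameter list and a function body as strings and returns a function, so the synthesized version should behave like the hand-written one.

```javascript
// Function(parameters, body) builds a function from source text.
const synthesized = Function("a, b", "return a + b;");

// A hand-written function with the same behavior.
const written = (a, b) => a + b;

console.log(synthesized(2, 3), written(2, 3));
// → 5 5

// The same pattern with a comparison operator spliced into the body.
const op = "<";
const lessThan = Function("a, b", `return a ${op} b;`);
console.log(lessThan(1, 2));
// → true
```

Because Function compiles its string argument as code, it should only be given text that the program itself controls, as is the case with the fixed list of operators here.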


A way to output values is also useful, so we’ll wrap console.log in a function and call it print.

topScope.print = value => {
  console.log(value);
  return value;
};


That gives us enough elementary tools to write simple programs. The following function provides a convenient way to parse a program and run it in a fresh scope:

function run(program) {
  return evaluate(parse(program), Object.create(topScope));
}


We’ll use object prototype chains to represent nested scopes so that the program can add bindings to its local scope without changing the top-level scope.
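The effect of a prototype chain on lookups and assignments can be seen without any Egg code at all. This is a minimal sketch using two made-up objects, outer and inner, that stand in for a top-level scope and a scope nested inside it.

```javascript
const outer = Object.create(null);
outer.x = 1;

// The inner scope has outer as its prototype, so it can read x...
const inner = Object.create(outer);
console.log(inner.x);
// → 1

// ...but assigning creates a property on inner itself and leaves the
// outer scope untouched.
inner.x = 2;
inner.y = 3;
console.log(inner.x, outer.x);
// → 2 1
console.log("y" in outer);
// → false
```

Reading a binding walks up the chain, while writing always lands on the innermost object. That matches the way define assigns directly to the scope object it is given.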


run(`
do(define(total, 0),
   define(count, 1),
   while(<(count, 11),
         do(define(total, +(total, count)),
            define(count, +(count, 1)))),
   print(total))
`);
// → 55

This is the program we’ve seen several times before, which computes the sum of the numbers 1 to 10, expressed in Egg. It is clearly uglier than the equivalent JavaScript program—but not bad for a language implemented in less than 150 lines of code.


A programming language without functions is a poor programming language indeed. Fortunately, it isn’t hard to add a fun construct, which treats its last argument as the function’s body and uses all arguments before that as the names of the function’s parameters.

specialForms.fun = (args, scope) => {
  if (!args.length) {
    throw new SyntaxError("Functions need a body");
  }
  let body = args[args.length - 1];
  let params = args.slice(0, args.length - 1).map(expr => {
    if (expr.type != "word") {
      throw new SyntaxError("Parameter names must be words");
    }
    return expr.name;
  });

  return function() {
    if (arguments.length != params.length) {
      throw new TypeError("Wrong number of arguments");
    }
    let localScope = Object.create(scope);
    for (let i = 0; i < arguments.length; i++) {
      localScope[params[i]] = arguments[i];
    }
    return evaluate(body, localScope);
  };
};



Functions in Egg get their own local scope. The function produced by the fun form creates this local scope and adds the argument bindings to it. It then evaluates the function body in this scope and returns the result.


run(`
do(define(plusOne, fun(a, +(a, 1))),
   print(plusOne(10)))
`);
// → 11


run(`
do(define(pow, fun(base, exp,
     if(==(exp, 0),
        1,
        *(base, pow(base, -(exp, 1)))))),
   print(pow(2, 10)))
`);
// → 1024


What we have built is an interpreter. During evaluation, it acts directly on the representation of the program produced by the parser. Compilation is the process of adding another step between the parsing and the running of a program, which transforms the program into something that can be evaluated more efficiently by doing as much work as possible in advance. For example, in well-designed languages it is obvious, for each use of a binding, which binding is being referred to, without actually running the program. This can be used to avoid looking up the binding by name every time it is accessed, instead directly fetching it from some predetermined memory location.

Traditionally, compilation involves converting the program to machine code, the raw format that a computer’s processor can execute. But any process that converts a program to a different representation can be thought of as compilation. It would be possible to write an alternative evaluation strategy for Egg, one that first converts the program to a JavaScript program, uses Function to invoke the JavaScript compiler on it, and then runs the result. When done right, this would make Egg run very fast while still being quite simple to implement. If you are interested in this topic and willing to spend some time on it, we encourage you to try to implement such a compiler as an exercise.
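To give a rough idea of what such a compiler could look like, here is a deliberately incomplete sketch. It is an assumption about one possible design, not the intended solution to the exercise: the hypothetical compile function only handles literal values, words, a fixed set of binary operators, and if, and it assumes that the operator names have not been rebound by the program. Bindings are turned into lookups on a scope object that is passed in when the compiled code runs.

```javascript
// A sketch only: do, define, while, and fun are left out entirely.
const binaryOperators = new Set(["+", "-", "*", "/", "==", "<", ">"]);

// Turns a syntax tree into the text of a JavaScript expression.
function compile(expr) {
  if (expr.type === "value") {
    return JSON.stringify(expr.value);
  } else if (expr.type === "word") {
    // Bindings become property lookups on a run-time scope object.
    return `scope[${JSON.stringify(expr.name)}]`;
  } else if (expr.type === "apply") {
    const {operator, args} = expr;
    const compiledArgs = args.map(compile);
    const name = operator.type === "word" ? operator.name : null;
    if (name === "if" && args.length === 3) {
      const [test, yes, no] = compiledArgs;
      // Like Egg's if, only the precise value false counts as false.
      return `((${test}) !== false ? (${yes}) : (${no}))`;
    }
    if (binaryOperators.has(name) && args.length === 2) {
      return `(${compiledArgs[0]} ${name} ${compiledArgs[1]})`;
    }
    return `${compile(operator)}(${compiledArgs.join(", ")})`;
  }
  throw new TypeError("Unknown expression type: " + expr.type);
}

// Hand-written tree for if(>(x, 5), "large", "small").
const tree = {
  type: "apply",
  operator: {type: "word", name: "if"},
  args: [
    {type: "apply",
     operator: {type: "word", name: ">"},
     args: [{type: "word", name: "x"}, {type: "value", value: 5}]},
    {type: "value", value: "large"},
    {type: "value", value: "small"}
  ]
};

const code = compile(tree);
console.log(code);
// → (((scope["x"] > 5)) !== false ? ("large") : ("small"))

// Function invokes the JavaScript compiler on the generated text.
const compiled = Function("scope", "return " + code + ";");
console.log(compiled({x: 10}), compiled({x: 1}));
// → large small
```

The translation happens once, and the resulting function can then be called any number of times without walking the syntax tree again.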


When we defined if and while, you probably noticed that they were more or less trivial wrappers around JavaScript’s own if and while. Similarly, the values in Egg are just regular old JavaScript values. If you compare the implementation of Egg, built on top of JavaScript, with the amount of work and complexity required to build a programming language directly on the raw functionality provided by a machine, the difference is huge. Regardless, this example ideally gave you an impression of the way programming languages work.

And when it comes to getting something done, cheating is more effective than doing everything yourself. Though the toy language in this chapter doesn’t do anything that couldn’t be done better in JavaScript, there are situations where writing small languages helps get real work done. Such a language does not have to resemble a typical programming language. If JavaScript didn’t come equipped with regular expressions, for example, you could write your own parser and evaluator for regular expressions. Or imagine you are building a giant robotic dinosaur and need to program its behavior. JavaScript might not be the most effective way to do this. You might instead opt for a language that looks like this:

behavior walk
  perform when
    destination ahead
  actions
    move left-foot
    move right-foot

behavior attack
  perform when
    Godzilla in-view
  actions
    fire laser-eyes
    launch arm-rockets

This is what is usually called a domain-specific language, a language tailored to express a narrow domain of knowledge. Such a language can be more expressive than a general-purpose language because it is designed to describe exactly the things that need to be described in its domain, and nothing else.

JavaScript and the Browser

The next sections will talk about web browsers. Without web browsers, there would be no JavaScript. Or even if there were, no one would ever have paid any attention to it. Web technology has, from the start, been decentralized, not just technically but also in the way it evolved. Various browser vendors have added new functionality in ad hoc and sometimes poorly thought-out ways, which then, sometimes, ended up being adopted by others and finally set down in standards. This is both a blessing and a curse. On the one hand, it is empowering to not have a central party control a system but have it be improved by various parties working in loose collaboration (or occasionally open hostility). On the other hand, the haphazard way in which the Web was developed means that the resulting system is not exactly a shining example of internal consistency. Some parts of it are downright confusing and poorly conceived.

Networks and the Internet

Computer networks have been around since the 1950s. If you put cables between two or more computers and allow them to send data back and forth through these cables, you can do all kinds of wonderful things. And if connecting two machines in the same building allows us to do wonderful things, connecting machines all over the planet should be even better. The technology to start implementing this vision was developed in the 1980s, and the resulting network is called the Internet. It has lived up to its promise.

A computer can use this network to shoot bits at another computer. For any effective communication to arise out of this bit-shooting, the computers on both ends must know what the bits are supposed to represent. The meaning of any given sequence of bits depends entirely on the kind of thing that it is trying to express and on the encoding mechanism used. A network protocol describes a style of communication over a network. There are protocols for sending email, for fetching email, for sharing files, and even for controlling computers that happen to be infected by malicious software.

For example, the Hypertext Transfer Protocol (HTTP) is a protocol for retrieving named resources (chunks of information, such as web pages or pictures). It specifies that the side making the request should start with a line like this, naming the resource and the version of the protocol that it is trying to use:

GET /index.html HTTP/1.1
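As a rough illustration of how the receiving side might read that first line, the sketch below splits it into its three parts. The helper parseRequestLine is hypothetical and only covers this one line of an HTTP/1.x request. Real servers handle many more cases.

```javascript
// Hypothetical helper: splits an HTTP/1.x request line into the
// method, the resource being requested, and the protocol version.
function parseRequestLine(line) {
  const match = /^(\S+) (\S+) HTTP\/(\d+\.\d+)$/.exec(line);
  if (!match) throw new SyntaxError("Malformed request line: " + line);
  const [, method, resource, version] = match;
  return {method, resource, version};
}

console.log(parseRequestLine("GET /index.html HTTP/1.1"));
// → {method: "GET", resource: "/index.html", version: "1.1"}
```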

There are a lot more rules about the way the requester can include more information in the request and the way the other side, which returns the resource, packages up its content. Most protocols are built on top of other protocols. HTTP treats the network as a streamlike device into which you can put bits and have them arrive at the correct destination in the correct order. Ensuring those things is already a rather difficult problem. The Transmission Control Protocol (TCP) is a protocol that addresses this problem. All Internet-connected devices “speak” it, and most communication on the Internet is built on top of it.

A TCP connection works as follows: one computer must be waiting, or listening, for other computers to start talking to it. To be able to listen for different kinds of communication at the same time on a single machine, each listener has a number (called a port) associated with it. Most protocols specify which port should be used by default. For example, when we want to send an email using the SMTP protocol, the machine through which we send it is expected to be listening on port 25. Another computer can then establish a connection by connecting to the target machine using the correct port number. If the target machine can be reached and is listening on that port, the connection is successfully created. The listening computer is called the server, and the connecting computer is called the client.

Such a connection acts as a two-way pipe through which bits can flow—the machines on both ends can put data into it. Once the bits are successfully transmitted, they can be read out again by the machine on the other side. This is a convenient model. You could say that TCP provides an abstraction of the network.

The Web

The World Wide Web (not to be confused with the Internet as a whole) is a set of protocols and formats that allow us to visit web pages in a browser. The “Web” part in the name refers to the fact that such pages can easily link to each other, thus connecting into a huge mesh that users can move through. To become part of the Web, all you need to do is connect a machine to the Internet and have it listen on port 80 with the HTTP protocol so that other computers can ask it for documents. Each document on the Web is named by a Uniform Resource Locator (URL), which looks something like this:

  http://eloquentjavascript.net/13_browser.html
 |      |                      |               |
 protocol       server               path

The first part tells us that this URL uses the HTTP protocol (as opposed to, for example, encrypted HTTP, which would be https://). Then comes the part that identifies which server we are requesting the document from. Last is a path string that identifies the specific document (or resource) we are interested in. Machines connected to the Internet get an IP address, which is a number that can be used to send messages to that machine, and looks something like 2001:4860:4860::8888. But lists of more or less random numbers are hard to remember and awkward to type, so you can instead register a domain name for a specific address or set of addresses. We registered eloquentjavascript.net to point at the IP address of a machine we control and can thus use that domain name to serve web pages.

If you type this URL into your browser’s address bar, the browser will try to retrieve and display the document at that URL. First, your browser has to find out what address eloquentjavascript.net refers to. Then, using the HTTP protocol, it will make a connection to the server at that address and ask for the resource /13_browser.html. If all goes well, the server sends back a document, which your browser then displays on your screen.
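The three parts of a URL can also be pulled apart in code. This small sketch uses the standard URL class, available in browsers and in Node.js, and assumes the URL under discussion is http://eloquentjavascript.net/13_browser.html, pieced together from the domain name and the resource path mentioned in this section.

```javascript
const url = new URL("http://eloquentjavascript.net/13_browser.html");

console.log(url.protocol);
// → http:
console.log(url.hostname);
// → eloquentjavascript.net
console.log(url.pathname);
// → /13_browser.html
```

The hostname is what the browser has to resolve to an IP address, and the pathname is what it asks the server for once the connection is made.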


HTML

HTML, which stands for Hypertext Markup Language, is the document format used for web pages. An HTML document contains text, as well as tags that give structure to the text, describing things such as links, paragraphs, and headings. A short HTML document might look like this:

<!doctype html>
<html>
  <head>
    <meta charset="utf-8">
    <title>My home page</title>
  </head>
  <body>
    <h1>My home page</h1>
    <p>Hello, I am Marijn and this is my home page.</p>
    <p>I also wrote a book! Read it
      <a href="http://eloquentjavascript.net">here</a>.</p>
  </body>
</html>



This is what such a document would look like in the browser:

The tags, wrapped in angle brackets (< and >, the symbols for less than and greater than), provide information about the structure of the document. The other text is just plain text.

The document starts with <!doctype html>, which tells the browser to interpret the page as modern HTML, as opposed to various dialects that were in use in the past. HTML documents have a head and a body. The head contains information about the document, and the body contains the document itself. In this case, the head declares that the title of this document is “My home page” and that it uses the UTF-8 encoding, which is a way to encode Unicode text as binary data. The document’s body contains a heading (<h1>, meaning “heading 1”; <h2> to <h6> produce subheadings) and two paragraphs (<p>).

Tags come in several forms. An element, such as the body, a paragraph, or a link, is started by an opening tag like <p> and ended by a closing tag like </p>. Some opening tags, such as the one for the link (<a>), contain extra information in the form of name="value" pairs. These are called attributes. In this case, the destination of the link is indicated with href="http://eloquentjavascript.net", where href stands for “hypertext reference”.

Some kinds of tags do not enclose anything and thus do not need to be closed. The metadata tag <meta charset="utf-8"> is an example of this.

To be able to include angle brackets in the text of a document, even though they have a special meaning in HTML, yet another form of special notation has to be introduced. A plain opening angle bracket is written as &lt; (“less than”), and a closing bracket is written as &gt; (“greater than”). In HTML, an ampersand (&) character followed by a name or character code and a semicolon (;) is called an entity and will be replaced by the character it encodes.

This is analogous to the way backslashes are used in JavaScript strings. Since this mechanism gives ampersand characters a special meaning, too, they need to be escaped as &amp;. Inside attribute values, which are wrapped in double quotes, &quot; can be used to insert an actual quote character. HTML is parsed in a remarkably error-tolerant way. When tags that should be there are missing, the browser reconstructs them. The way in which this is done has been standardized, and you can rely on all modern browsers to do it in the same way. The following document will be treated just like the one shown previously:

<!doctype html>

<meta charset=utf-8>
<title>My home page</title>

<h1>My home page</h1>
<p>Hello, I am Marijn and this is my home page.
<p>I also wrote a book! Read it
  <a href=http://eloquentjavascript.net>here</a>.

The <html>, <head>, and <body> tags are gone completely. The browser knows that <meta> and <title> belong in the head and that <h1> means the body has started. Furthermore, we are no longer explicitly closing the paragraphs since opening a new paragraph or ending the document will close them implicitly. The quotes around the attribute values are also gone. We will usually omit the <html>, <head>, and <body> tags from examples to keep them short and free of clutter. But we will close tags and include quotes around attributes. We will also usually omit the doctype and charset declaration. This is not to be taken as an encouragement to drop these from HTML documents. Browsers will often do ridiculous things when you forget them. You should consider the doctype and the charset metadata to be implicitly present in examples, even when they are not actually shown in the text.

HTML and JavaScript

For our purposes, the most important HTML tag is <script>. This tag allows us to include a piece of JavaScript in a document.

<h1>Testing alert</h1>
<script>alert("hello!");</script>


Such a script will run as soon as its <script> tag is encountered while the browser reads the HTML. This page will pop up a dialog when opened—the alert function resembles prompt, in that it pops up a little window, but only shows a message without asking for input. Including large programs directly in HTML documents is often impractical. The <script> tag can be given an src attribute to fetch a script file (a text file containing a JavaScript program) from a URL.

<h1>Testing alert</h1>

<script src="code/hello.js"></script>

The code/hello.js file included here contains the same program: alert("hello!"). When an HTML page references other URLs as part of itself (for example, an image file or a script), web browsers will retrieve them immediately and include them in the page. A script tag must always be closed with </script>, even if it refers to a script file and doesn’t contain any code. If you forget this, the rest of the page will be interpreted as part of the script.

You can load ES modules in the browser by giving your script tag a type="module" attribute. Such modules can depend on other modules by using URLs relative to themselves as module names in import declarations.

Some attributes can also contain a JavaScript program. The <button> tag shown next (which shows up as a button) has an onclick attribute. The attribute’s value will be run whenever the button is clicked.

<button onclick="alert('Boom!');">DO NOT PRESS</button>

Note that we had to use single quotes for the string in the onclick attribute because double quotes are already used to quote the whole attribute. We could also have used &quot;.
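The entity notation described earlier (&lt;, &gt;, &amp;, and &quot;) can be applied mechanically when text has to be placed inside HTML. The following is a minimal sketch with a hypothetical helper named escapeHtml; it covers only these four characters and is not meant as a complete sanitizer.

```javascript
// Hypothetical helper: replaces the characters that have a special
// meaning in HTML text and double-quoted attributes with entities.
function escapeHtml(text) {
  const entities = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;"};
  // A single pass, so ampersands produced by the replacement itself
  // are not escaped a second time.
  return text.replace(/[&<>"]/g, ch => entities[ch]);
}

console.log(escapeHtml('if (a < b && c > d) alert("Boom!");'));
// → if (a &lt; b &amp;&amp; c &gt; d) alert(&quot;Boom!&quot;);
```

Text escaped this way could be placed in a double-quoted attribute such as onclick without ending the attribute early.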

In the sandbox

Running programs downloaded from the Internet is potentially dangerous. You do not know much about the people behind most sites you visit, and they do not necessarily mean well. Running programs by people who do not mean well is how you get your computer infected by viruses, your data stolen, and your accounts hacked. Yet the attraction of the Web is that you can browse it without necessarily trusting all the pages you visit. This is why browsers severely limit the things a JavaScript program may do: it can’t look at the files on your computer or modify anything not related to the web page it was embedded in.

Isolating a programming environment in this way is called sandboxing, the idea being that the program is harmlessly playing in a sandbox. But you should imagine this particular kind of sandbox as having a cage of thick steel bars over it so that the programs playing in it can’t actually get out.

The hard part of sandboxing is allowing the programs enough room to be useful yet at the same time restricting them from doing anything dangerous. Lots of useful functionality, such as communicating with other servers or reading the content of the copy-paste clipboard, can also be used to do problematic, privacy-invading things. Every now and then, someone comes up with a new way to circumvent the limitations of a browser and do something harmful, ranging from leaking minor private information to taking over the whole machine that the browser runs on. The browser developers respond by fixing the hole, and all is well again, until the next problem is discovered, and hopefully publicized, rather than secretly exploited by some government agency or mafia.

Compatibility and the browser wars

In the early stages of the Web, a browser called Mosaic dominated the market. After a few years, the balance shifted to Netscape, which was then, in turn, largely supplanted by Microsoft’s Internet Explorer. At any point where a single browser was dominant, that browser’s vendor would feel entitled to unilaterally invent new features for the Web. Since most users used the most popular browser, websites would simply start using those features—never mind the other browsers. This was the dark age of compatibility, often called the browser wars. Web developers were left with not one unified Web but two or three incompatible platforms. To make things worse, the browsers in use around 2003 were all full of bugs, and of course the bugs were different for each browser. Life was hard for people writing web pages.

Mozilla Firefox, a not-for-profit offshoot of Netscape, challenged Internet Explorer’s position in the late 2000s. Because Microsoft was not particularly interested in staying competitive at the time, Firefox took a lot of market share away from it. Around the same time, Google introduced its Chrome browser, and Apple’s Safari browser gained popularity, leading to a situation where there were four major players, rather than one. The new players had a more serious attitude toward standards and better engineering practices, giving us less incompatibility and fewer bugs. Microsoft, seeing its market share crumble, came around and adopted these attitudes in its Edge browser, which replaces Internet Explorer. If you are starting to learn web development today, consider yourself lucky. The latest versions of the major browsers behave quite uniformly and have relatively few bugs.

The Document Object Model

When you open a web page in your browser, the browser retrieves the page’s HTML text and parses it, much like the way our parser parsed programs. The browser builds up a model of the document’s structure and uses this model to draw the page on the screen. This representation of the document is one of the toys that a JavaScript program has available in its sandbox. It is a data structure that you can read or modify. It acts as a live data structure: when it’s modified, the page on the screen is updated to reflect the changes.

Document structure

You can imagine an HTML document as a nested set of boxes. Tags such as <body> and </body> enclose other tags, which in turn contain other tags or text. Here’s the example document:

<!doctype html>
<html>
  <head>
    <title>My home page</title>
  </head>
  <body>
    <h1>My home page</h1>
    <p>Hello, I am Marijn and this is my home page.</p>
    <p>I also wrote a book! Read it
      <a href="">here</a>.</p>
  </body>
</html>


This page has the following structure:

The data structure the browser uses to represent the document follows this shape. For each box, there is an object, which we can interact with to find out things such as what HTML tag it represents and which boxes and text it contains. This representation is called the Document Object Model, or DOM for short. The global binding document gives us access to these objects. Its documentElement property refers to the object representing the <html> tag. Since every HTML document has a head and a body, it also has head and body properties, pointing at those elements.


Think back to the syntax trees for a moment. Their structures are strikingly similar to the structure of a browser’s document. Each node may refer to other nodes, children, which in turn may have their own children. This shape is typical of nested structures where elements can contain subelements that are similar to themselves. We call a data structure a tree when it has a branching structure, has no cycles (a node may not contain itself, directly or indirectly), and has a single, well-defined root. In the case of the DOM, document.documentElement serves as the root. Trees come up a lot in computer science. In addition to representing recursive structures such as HTML documents or programs, they are often used to maintain sorted sets of data because elements can usually be found or inserted more efficiently in a tree than in a flat array.

A typical tree has different kinds of nodes. The syntax tree for the Egg language had identifiers, values, and application nodes. Application nodes may have children, whereas identifiers and values are leaves, or nodes without children. The same goes for the DOM. Nodes for elements, which represent HTML tags, determine the structure of the document. These can have child nodes. An example of such a node is document.body. Some of these children can be leaf nodes, such as pieces of text or comment nodes. Each DOM node object has a nodeType property, which contains a code (number) that identifies the type of node. Elements have code 1, which is also defined as the constant property Node.ELEMENT_NODE. Text nodes, representing a section of text in the document, get code 3 (Node.TEXT_NODE). Comments have code 8 (Node.COMMENT_NODE).
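To make the node type codes concrete, here is a minimal sketch that walks a small tree and counts nodes per type. The tree is built from plain objects that only imitate DOM nodes (an assumption made so the sketch runs without a browser), and countTypes is a made-up name; the numeric codes are the ones given above.

```javascript
// Same codes as Node.ELEMENT_NODE, Node.TEXT_NODE, and
// Node.COMMENT_NODE in a browser.
const ELEMENT_NODE = 1, TEXT_NODE = 3, COMMENT_NODE = 8;

// Plain objects standing in for DOM nodes.
const body = {
  nodeType: ELEMENT_NODE, nodeName: "BODY",
  childNodes: [
    {nodeType: ELEMENT_NODE, nodeName: "H1", childNodes: [
      {nodeType: TEXT_NODE, nodeValue: "My home page", childNodes: []}
    ]},
    {nodeType: COMMENT_NODE, nodeValue: " a comment ", childNodes: []},
    {nodeType: ELEMENT_NODE, nodeName: "P", childNodes: [
      {nodeType: TEXT_NODE, nodeValue: "Hello", childNodes: []}
    ]}
  ]
};

// Count nodes per type code by walking the tree recursively.
function countTypes(node, counts = {}) {
  counts[node.nodeType] = (counts[node.nodeType] || 0) + 1;
  for (const child of node.childNodes) countTypes(child, counts);
  return counts;
}

console.log(countTypes(body));
// → {1: 3, 3: 2, 8: 1}
```

Only the element nodes have children here; the text and comment nodes are leaves.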

Another way to visualize our document tree is as follows:

The leaves are text nodes, and the arrows indicate parent-child relationships between nodes.

The standard

Using cryptic numeric codes to represent node types is not a very JavaScript-like thing to do. Later, we’ll see that other parts of the DOM interface also feel cumbersome and alien. The reason for this is that the DOM wasn’t designed for just JavaScript. Rather, it tries to be a language-neutral interface that can be used in other systems as well—not just for HTML but also for XML, which is a generic data format with an HTML-like syntax. This is unfortunate. Standards are often useful. But in this case, the advantage (cross-language consistency) isn’t all that compelling. Having an interface that is properly integrated with the language you are using will save you more time than having a familiar interface across languages. As an example of this poor integration, consider the childNodes property that element nodes in the DOM have. This property holds an array-like object, with a length property and properties labeled by numbers to access the child nodes. But it is an instance of the NodeList type, not a real array, so it does not have methods such as slice and map.

Then there are issues that are simply poor design. For example, there is no way to create a new node and immediately add children or attributes to it. Instead, you have to first create it and then add the children and attributes one by one, using side effects. Code that interacts heavily with the DOM tends to get long, repetitive, and ugly. But these flaws aren’t fatal. Since JavaScript allows us to create our own abstractions, it is possible to design improved ways to express the operations you are performing. Many libraries intended for browser programming come with such tools.

Moving through the tree

DOM nodes contain a wealth of links to other nearby nodes. The following diagram illustrates these:

Although the diagram shows only one link of each type, every node has a parentNode property that points to the node it is part of, if any. Likewise, every element node (node type 1) has a childNodes property that points to an array-like object holding its children. In theory, you could move anywhere in the tree using just these parent and child links. But JavaScript also gives you access to a number of additional convenience links. The firstChild and lastChild properties point to the first and last child elements or have the value null for nodes without children.

Similarly, previousSibling and nextSibling point to adjacent nodes, which are nodes with the same parent that appear immediately before or after the node itself. For a first child, previousSibling will be null, and for a last child, nextSibling will be null. There’s also the children property, which is like childNodes but contains only element (type 1) children, not other types of child nodes. This can be useful when you aren’t interested in text nodes. When dealing with a nested data structure like this one, recursive functions are often useful. The following function scans a document for text nodes containing a given string and returns true when it has found one:

function talksAbout(node, string) {
  if (node.nodeType == Node.ELEMENT_NODE) {
    // Search all children; stop as soon as one of them matches.
    for (let i = 0; i < node.childNodes.length; i++) {
      if (talksAbout(node.childNodes[i], string)) {
        return true;
      }
    }
    return false;
  } else if (node.nodeType == Node.TEXT_NODE) {
    return node.nodeValue.indexOf(string) > -1;
  }
}

console.log(talksAbout(document.body, "book"));
// → true

Because childNodes is not a real array, this example runs over the index range with a regular for loop; converting the collection with Array.from is another option. (Current browsers also let you iterate over a NodeList with for/of, though older ones did not.) The nodeValue property of a text node holds the string of text that it represents.

Finding elements

Navigating these links among parents, children, and siblings is often useful. But if we want to find a specific node in the document, reaching it by starting at document.body and following a fixed path of properties is a bad idea. Doing so bakes assumptions into our program about the precise structure of the document—a structure you might want to change later. Another complicating factor is that text nodes are created even for the whitespace between nodes. The example document’s <body> tag does not have just three children (<h1> and two <p> elements) but actually has seven: those three, plus the spaces before, after, and between them. So if we want to get the href attribute of the link in that document, we don’t want to say something like “Get the second child of the sixth child of the document body”. It’d be better if we could say “Get the first link in the document”. And we can.

let link = document.body.getElementsByTagName("a")[0];


All element nodes have a getElementsByTagName method, which collects all elements with the given tag name that are descendants (direct or indirect children) of that node and returns them as an array-like object. To find a specific single node, you can give it an id attribute and use document.getElementById instead.

<p>My ostrich Gertrude:</p>
<p><img id="gertrude" src="img/ostrich.png"></p>

let ostrich = document.getElementById("gertrude");



A third, similar method is getElementsByClassName, which, like getElementsByTagName, searches through the contents of an element node and retrieves all elements that have the given string in their class attribute.

Changing the document

Almost everything about the DOM data structure can be changed. The shape of the document tree can be modified by changing parent-child relationships. Nodes have a remove method to remove them from their current parent node. To add a child node to an element node, we can use appendChild, which puts it at the end of the list of children, or insertBefore, which inserts the node given as the first argument before the node given as the second argument.





<p>One</p>
<p>Two</p>
<p>Three</p>

let paragraphs = document.body.getElementsByTagName("p");
document.body.insertBefore(paragraphs[2], paragraphs[0]);


A node can exist in the document in only one place. Thus, inserting paragraph Three in front of paragraph One will first remove it from the end of the document and then insert it at the front, resulting in Three/One/Two. All operations that insert a node somewhere will, as a side effect, cause it to be removed from its current position (if it has one). The replaceChild method is used to replace a child node with another one. It takes as arguments two nodes: a new node and the node to be replaced. The replaced node must be a child of the element the method is called on. Note that both replaceChild and insertBefore expect the new node as their first argument.
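The rule that a node lives in only one place can be sketched with a tiny stand-in class. MiniNode below is hypothetical and far simpler than a real DOM node; it exists only to show that inserting first detaches the node from wherever it currently is.

```javascript
// MiniNode is a made-up stand-in for a DOM node.
class MiniNode {
  constructor(name) {
    this.name = name;
    this.parentNode = null;
    this.childNodes = [];
  }
  remove() {
    if (!this.parentNode) return;
    const siblings = this.parentNode.childNodes;
    siblings.splice(siblings.indexOf(this), 1);
    this.parentNode = null;
  }
  appendChild(node) {
    node.remove();               // a node lives in one place only
    this.childNodes.push(node);
    node.parentNode = this;
    return node;
  }
  insertBefore(node, reference) {
    node.remove();               // detach first, then re-insert
    const index = reference
      ? this.childNodes.indexOf(reference)
      : this.childNodes.length;  // no reference node: add at the end
    this.childNodes.splice(index, 0, node);
    node.parentNode = this;
    return node;
  }
}

const body = new MiniNode("body");
const [one, two, three] = ["One", "Two", "Three"]
  .map(name => body.appendChild(new MiniNode(name)));

body.insertBefore(three, one);
console.log(body.childNodes.map(node => node.name).join("/"));
// → Three/One/Two
```

Because insertBefore calls remove on the node before splicing it in, the body still has three children afterward rather than four.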

Creating nodes

Say we want to write a script that replaces all images (<img> tags) in the document with the text held in their alt attributes, which specify an alternative textual representation of the image. This involves not only removing the images but adding a new text node to replace them. Text nodes are created with the document.createTextNode method.

<p>The <img src="img/cat.png" alt="Cat"> in the
  <img src="img/hat.png" alt="Hat">.</p>

<p><button onclick="replaceImages()">Replace</button></p>

function replaceImages() {
  let images = document.body.getElementsByTagName("img");
  // Walk the live collection from the end toward the start.
  for (let i = images.length - 1; i >= 0; i--) {
    let image = images[i];
    if (image.alt) {
      let text = document.createTextNode(image.alt);
      image.parentNode.replaceChild(text, image);
    }
  }
}





Given a string, createTextNode gives us a text node that we can insert into the document to make it show up on the screen. The loop that goes over the images starts at the end of the list. This is necessary because the node list returned by a method like getElementsByTagName (or a property like childNodes) is live. That is, it is updated as the document changes. If we started from the front, removing the first image would cause the list to lose its first element so that the second time the loop repeats, where i is 1, it would stop because the length of the collection is now also 1. If you want a solid collection of nodes, as opposed to a live one, you can convert the collection to a real array by calling Array.from.
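The effect of looping over a live collection in the wrong direction can be imitated without a browser. In this sketch a plain array stands in for the live node list (an assumption: real node lists shrink on their own when the document changes, whereas here splice does the shrinking), and both function names are made up.

```javascript
// Removing an item shrinks the array right away, much as a live
// node list shrinks when a node leaves the document.
function removeFrontToBack(items) {
  const live = [...items];
  for (let i = 0; i < live.length; i++) {
    live.splice(i, 1);   // the remaining items shift down by one
  }
  return live;           // whatever the loop skipped
}

function removeBackToFront(items) {
  const live = [...items];
  for (let i = live.length - 1; i >= 0; i--) {
    live.splice(i, 1);   // earlier indexes are unaffected
  }
  return live;
}

console.log(removeFrontToBack(["cat", "hat"]));
// → ["hat"]
console.log(removeBackToFront(["cat", "hat"]));
// → []
```

Going front to back, the loop stops after one step because i is 1 and the length has also dropped to 1, which is the situation the text describes.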

let arrayish = {0: "one", 1: "two", length: 2};
let array = Array.from(arrayish);
console.log(array.map(s => s.toUpperCase()));
// → ["ONE", "TWO"]

To create element nodes, you can use the document.createElement method. This method takes a tag name and returns a new empty node of the given type. The following example defines a utility elt, which creates an element node and treats the rest of its arguments as children to that node. This function is then used to add an attribution to a quote.

<blockquote id="quote">
  No book can ever be finished. While working on it we learn
  just enough to find it immature the moment we turn away
  from it.
</blockquote>

function elt(type, ...children) {
  let node = document.createElement(type);
  for (let child of children) {
    // Strings become text nodes; anything else is added as-is.
    if (typeof child != "string") node.appendChild(child);
    else node.appendChild(document.createTextNode(child));
  }
  return node;
}

document.getElementById("quote").appendChild(
  elt("footer", "—",
      elt("strong", "Karl Popper"),
      ", preface to the second edition of ",
      elt("em", "The Open Society and Its Enemies"),
      ", 1950"));


This is what the resulting document looks like:


Attributes

Some element attributes, such as href for links, can be accessed through a property of the same name on the element’s DOM object. This is the case for most commonly used standard attributes. But HTML allows you to set any attribute you want on nodes. This can be useful because it allows you to store extra information in a document. If you make up your own attribute names, though, such attributes will not be present as properties on the element’s node. Instead, you have to use the getAttribute and setAttribute methods to work with them.

<p data-classified="secret">The launch code is 00000000.</p>
<p data-classified="unclassified">I have two feet.</p>

let paras = document.body.getElementsByTagName("p");
for (let para of Array.from(paras)) {
  if (para.getAttribute("data-classified") == "secret") {
    para.remove();
  }
}





It is recommended to prefix the names of such made-up attributes with data- to ensure they do not conflict with any other attributes. There is a commonly used attribute, class, which is a keyword in the JavaScript language. For historical reasons (some old JavaScript implementations could not handle property names that matched keywords), the property used to access this attribute is called className. You can also access it under its real name, “class”, by using the getAttribute and setAttribute methods.


Layout

You may have noticed that different types of elements are laid out differently. Some, such as paragraphs (<p>) or headings (<h1>), take up the whole width of the document and are rendered on separate lines. These are called block elements. Others, such as links (<a>) or the <strong> element, are rendered on the same line with their surrounding text. Such elements are called inline elements. For any given document, browsers are able to compute a layout, which gives each element a size and position based on its type and content. This layout is then used to actually draw the document.

The size and position of an element can be accessed from JavaScript. The offsetWidth and offsetHeight properties give you the space the element takes up in pixels. A pixel is the basic unit of measurement in the browser. It traditionally corresponds to the smallest dot that the screen can draw, but on modern displays, which can draw very small dots, that may no longer be the case, and a browser pixel may span multiple display dots. Similarly, clientWidth and clientHeight give you the size of the space inside the element, ignoring border width.

<p style="border: 3px solid red">
  I’m boxed in
</p>

let para = document.body.getElementsByTagName("p")[0];
console.log("clientHeight:", para.clientHeight);
console.log("offsetHeight:", para.offsetHeight);


Giving a paragraph a border causes a rectangle to be drawn around it.

The most effective way to find the precise position of an element on the screen is the getBoundingClientRect method. It returns an object with top, bottom, left, and right properties, indicating the pixel positions of the sides of the element relative to the top left of the screen. If you want them relative to the whole document, you must add the current scroll position, which you can find in the pageXOffset and pageYOffset bindings. Laying out a document can be quite a lot of work. In the interest of speed, browser engines do not immediately relayout a document every time you change it but wait as long as they can. When a JavaScript program that changed the document finishes running, the browser will have to compute a new layout to draw the changed document to the screen. When a program asks for the position or size of something by reading properties such as offsetHeight or calling getBoundingClientRect, providing correct information also requires computing a layout.
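The scroll adjustment described above comes down to two additions. The sketch below uses a hypothetical helper, toDocumentCoordinates, and made-up numbers; in a browser you would pass it the result of getBoundingClientRect together with pageXOffset and pageYOffset.

```javascript
// rect is expected to have the shape of a getBoundingClientRect
// result; scrollX and scrollY are the current scroll offsets.
function toDocumentCoordinates(rect, scrollX, scrollY) {
  return {
    top: rect.top + scrollY,
    bottom: rect.bottom + scrollY,
    left: rect.left + scrollX,
    right: rect.right + scrollX
  };
}

// Made-up example: an element 40 pixels below the top of the visible
// area while the page is scrolled down by 300 pixels.
const rect = {top: 40, bottom: 90, left: 10, right: 210};
console.log(toDocumentCoordinates(rect, 0, 300));
// → {top: 340, bottom: 390, left: 10, right: 210}
```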

A program that repeatedly alternates between reading DOM layout information and changing the DOM forces a lot of layout computations to happen and will consequently run very slowly. The following code is an example of this. It contains two different programs that build up a line of X characters 2,000 pixels wide and measure the time each one takes.

<p><span id="one"></span></p>
<p><span id="two"></span></p>

<script>
  function time(name, action) {
    let start = Date.now(); // Current time in milliseconds
    action();
    console.log(name, "took", Date.now() - start, "ms");
  }

  time("naive", () => {
    let target = document.getElementById("one");
    // Reads layout (offsetWidth) after every single change.
    while (target.offsetWidth < 2000) {
      target.appendChild(document.createTextNode("X"));
    }
  });
  // → naive took 32 ms

  time("clever", function() {
    let target = document.getElementById("two");
    // Measures five characters once, then computes the total needed.
    target.appendChild(document.createTextNode("XXXXX"));
    let total = Math.ceil(2000 / (target.offsetWidth / 5));
    target.firstChild.nodeValue = "X".repeat(total);
  });
  // → clever took 1 ms
</script>


Styling

We have seen that different HTML elements are drawn differently. Some are displayed as blocks, others inline. Some add styling: <strong> makes its content bold, and <a> makes it blue and underlines it. The way an <img> tag shows an image or an <a> tag causes a link to be followed when it is clicked is strongly tied to the element type. But we can change the styling associated with an element, such as the text color or underline. Here is an example that uses the style property:

<p><a href=".">Normal link</a></p>
<p><a href="." style="color: green">Green link</a></p>

The second link will be green instead of the default link color.

A style attribute may contain one or more declarations, which are a property (such as color) followed by a colon and a value (such as green). When there is more than one declaration, they must be separated by semicolons, as in "color: red; border: none".
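To illustrate how a style attribute breaks down into declarations, here is a rough sketch. parseDeclarations is a made-up helper, not something the browser exposes, and it ignores edge cases such as semicolons inside quoted values.

```javascript
// Hypothetical helper: split the text of a style attribute into
// property/value pairs.
function parseDeclarations(styleText) {
  const result = {};
  for (const declaration of styleText.split(";")) {
    const colon = declaration.indexOf(":");
    if (colon == -1) continue;   // skip empty pieces
    const property = declaration.slice(0, colon).trim();
    const value = declaration.slice(colon + 1).trim();
    if (property) result[property] = value;
  }
  return result;
}

console.log(parseDeclarations("color: red; border: none"));
// → {color: "red", border: "none"}
```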

A lot of aspects of the document can be influenced by styling. For example, the display property controls whether an element is displayed as a block or an inline element.

This text is displayed <strong>inline</strong>,

<strong style="display: block">as a block</strong>, and
<strong style="display: none">not at all</strong>.

The block tag will end up on its own line since block elements are not displayed inline with the text around them. The last tag is not displayed at all—display: none prevents an element from showing up on the screen. This is a way to hide elements. It is often preferable to removing them from the document entirely because it makes it easy to reveal them again later.

JavaScript code can directly manipulate the style of an element through the element’s style property. This property holds an object that has properties for all possible style properties. The values of these properties are strings, which we can write to in order to change a particular aspect of the element’s style.

<p id="para" style="color: purple">
  Nice text
</p>

let para = document.getElementById("para");
console.log(para.style.color);
para.style.color = "magenta";


Some style property names contain hyphens, such as font-family. Because such property names are awkward to work with in JavaScript (you’d have to say style[“font-family”]), the property names in the style object for such properties have their hyphens removed and the letters after them capitalized (style.fontFamily).
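The renaming rule is mechanical enough to sketch in a few lines. toStyleProperty is a made-up name, and the sketch covers only ordinary hyphenated names such as font-family.

```javascript
// Drop each hyphen and capitalize the letter that follows it.
function toStyleProperty(cssName) {
  return cssName.replace(/-([a-z])/g, (_, letter) => letter.toUpperCase());
}

console.log(toStyleProperty("font-family"));
// → fontFamily
console.log(toStyleProperty("border-top-width"));
// → borderTopWidth
console.log(toStyleProperty("color"));
// → color
```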

Cascading styles

The styling system for HTML is called CSS, for Cascading Style Sheets. A style sheet is a set of rules for how to style elements in a document. It can be given inside a <style> tag.


<style>
  strong {
    font-style: italic;
    color: gray;
  }
</style>

<p>Now <strong>strong text</strong> is italic and gray.</p>

The cascading in the name refers to the fact that multiple such rules are combined to produce the final style for an element. In the example, the default styling for <strong> tags, which gives them font-weight: bold, is overlaid by the rule in the <style> tag, which adds font-style and color. When multiple rules define a value for the same property, the most recently read rule gets a higher precedence and wins. So if the rule in the <style> tag included font-weight: normal, contradicting the default font-weight rule, the text would be normal, not bold. Styles in a style attribute applied directly to the node have the highest precedence and always win.
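The overlaying of rules can be imitated with plain objects. This is only a sketch of the "later rule wins" idea: each rule is modeled as an object of declarations, cascade is a made-up function, and specificity is left out entirely.

```javascript
// Rules in the order the browser would read them.
const defaultStrong = {"font-weight": "bold"};
const styleTagRule = {"font-style": "italic", "color": "gray"};
const overridingRule = {"font-weight": "normal"};

// Later objects overwrite earlier ones, property by property.
function cascade(...rules) {
  return Object.assign({}, ...rules);
}

console.log(cascade(defaultStrong, styleTagRule));
// → {"font-weight": "bold", "font-style": "italic", color: "gray"}
console.log(cascade(defaultStrong, styleTagRule, overridingRule)["font-weight"]);
// → normal
```

Properties that only one rule mentions pass through unchanged, which is why the text stays bold until some later rule sets font-weight itself.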

It is possible to target things other than tag names in CSS rules. A rule for .abc applies to all elements with “abc” in their class attribute. A rule for #xyz applies to the element with an id attribute of “xyz” (which should be unique within the document).

.subtle {
  color: gray;
  font-size: 80%;
}
#header {
  background: blue;
  color: white;
}
/* p elements with id main and with classes a and b */
p#main.a.b {
  margin-bottom: 20px;
}


The precedence rule favoring the most recently defined rule applies only when the rules have the same specificity. A rule’s specificity is a measure of how precisely it describes matching elements, determined by the number and kind (tag, class, or ID) of element aspects it requires. For example, a rule that targets p.a is more specific than rules that target p or just .a and would thus take precedence over them. The notation p > a {…} applies the given styles to all <a> tags that are direct children of <p> tags. Similarly, p a {…} applies to all <a> tags inside <p> tags, whether they are direct or indirect children.
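One way to picture specificity is as a triple of counts (IDs, classes, tag names) that is compared from left to right. The sketch below is a rough approximation under that assumption: both function names are made up, and it handles only tag names, .classes, and #ids joined by spaces or >, not attribute selectors or pseudo-classes.

```javascript
// Count the IDs, classes, and tag names a simple selector requires.
function specificity(selector) {
  const ids = (selector.match(/#[\w-]+/g) || []).length;
  const classes = (selector.match(/\.[\w-]+/g) || []).length;
  const tags = selector
    .split(/[\s>]+/)                         // one piece per element
    .filter(part => /^[a-zA-Z]/.test(part))  // piece starts with a tag name
    .length;
  return [ids, classes, tags];
}

// Positive when a is more specific than b, negative when less.
function compareSpecificity(a, b) {
  const sa = specificity(a), sb = specificity(b);
  for (let i = 0; i < 3; i++) {
    if (sa[i] != sb[i]) return sa[i] - sb[i];
  }
  return 0;
}

console.log(specificity("p#main.a.b"));
// → [1, 2, 1]
console.log(compareSpecificity("p.a", ".a") > 0);
// → true
console.log(compareSpecificity("p.a", "p") > 0);
// → true
```

When the comparison returns 0, the two rules are equally specific and the most recently defined one wins.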

Query selectors

We won’t be using style sheets all that much. Understanding them is helpful when programming in the browser, but they are complicated enough to deserve separate, dedicated treatment. The main reason we introduced selector syntax (the notation used in style sheets to determine which elements a set of styles applies to) is that we can use this same mini-language as an effective way to find DOM elements. The querySelectorAll method, which is defined both on the document object and on element nodes, takes a selector string and returns a NodeList containing all the elements that it matches.

<p>And if you go chasing
  <span class="animal">rabbits</span></p>
<p>And you know you’re going to fall</p>
<p>Tell ’em a <span class="character">hookah smoking
  <span class="animal">caterpillar</span></span></p>
<p>Has given you the call</p>


function count(selector) {
  return document.querySelectorAll(selector).length;
}
console.log(count("p"));           // All <p> elements
// → 4
console.log(count(".animal"));     // Class animal
// → 2
console.log(count("p .animal"));   // Animal inside of <p>
// → 2
console.log(count("p > .animal")); // Direct child of <p>
// → 1


Unlike methods such as getElementsByTagName, the object returned by querySelectorAll is not live. It won’t change when you change the document. It is still not a real array, though, so you still need to call Array.from if you want to treat it like one. The querySelector method (without the All part) works in a similar way. This one is useful if you want a specific, single element. It will return only the first matching element or null when no element matches.

Positioning and animating

The position style property influences layout in a powerful way. By default it has a value of static, meaning the element sits in its normal place in the document. When it is set to relative, the element still takes up space in the document, but now the top and left style properties can be used to move it relative to that normal place. When position is set to absolute, the element is removed from the normal document flow—that is, it no longer takes up space and may overlap with other elements. Also, its top and left properties can be used to absolutely position it relative to the top-left corner of the nearest enclosing element whose position property isn’t static, or relative to the document if no such enclosing element exists. We can use this to create an animation. The following document displays a picture of a cat that moves around in an ellipse:

<p style="text-align: center">
  <img src="img/cat.png" style="position: relative">
</p>

let cat = document.querySelector("img");
let angle = Math.PI / 2;
function animate(time, lastTime) {
  if (lastTime != null) {
    // Advance in proportion to the time elapsed since the last frame.
    angle += (time - lastTime) * 0.001;
  }
  cat.style.top = (Math.sin(angle) * 20) + "px";
  cat.style.left = (Math.cos(angle) * 200) + "px";
  requestAnimationFrame(newTime => animate(newTime, time));
}
requestAnimationFrame(animate);




The gray arrow shows the path along which the image moves.

Our picture is centered on the page and given a position of relative. We’ll repeatedly update that picture’s top and left styles to move it. The script uses requestAnimationFrame to schedule the animate function to run whenever the browser is ready to repaint the screen. The animate function itself again calls requestAnimationFrame to schedule the next update. When the browser window (or tab) is active, this will cause updates to happen at a rate of about 60 per second, which tends to produce a good-looking animation.

If we just updated the DOM in a loop, the page would freeze, and nothing would show up on the screen. Browsers do not update their display while a JavaScript program is running, nor do they allow any interaction with the page. This is why we need requestAnimationFrame—it lets the browser know that we are done for now, and it can go ahead and do the things that browsers do, such as updating the screen and responding to user actions. The animation function is passed the current time as an argument. To ensure that the motion of the cat per millisecond is stable, it bases the speed at which the angle changes on the difference between the current time and the last time the function ran. If it just moved the angle by a fixed amount per step, the motion would stutter if, for example, another heavy task running on the same computer were to prevent the function from running for a fraction of a second. Moving in circles is done using the trigonometry functions Math.cos and Math.sin. For those who aren’t familiar with these, we’ll briefly introduce them since we will occasionally use them.

Math.cos and Math.sin are useful for finding points that lie on a circle around point (0,0) with a radius of one. Both functions interpret their argument as the position on this circle, with zero denoting the point on the far right of the circle, going clockwise until 2π (about 6.28) has taken us around the whole circle. Math.cos tells you the x-coordinate of the point that corresponds to the given position, and Math.sin yields the y-coordinate. Positions (or angles) greater than 2π or less than 0 are valid: the rotation repeats so that a+2π refers to the same angle as a. This unit for measuring angles is called radians: a full circle is 2π radians, similar to how it is 360 degrees when measuring in degrees. The constant π is available as Math.PI in JavaScript.
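A few lines are enough to check these properties. pointAt and close are names invented for this sketch; close compares with a small tolerance because floating-point results are rarely exact.

```javascript
// The point on the unit circle for a given angle in radians.
function pointAt(angle) {
  return {x: Math.cos(angle), y: Math.sin(angle)};
}

// Compare two numbers while allowing for rounding error.
const close = (a, b) => Math.abs(a - b) < 1e-9;

console.log(pointAt(0));
// → {x: 1, y: 0}

// A quarter turn ends up one unit along the y axis.
const quarter = pointAt(Math.PI / 2);
console.log(close(quarter.x, 0) && close(quarter.y, 1));
// → true

// Adding a full turn (2π) lands on the same point.
const a = 1.2;
const before = pointAt(a), after = pointAt(a + 2 * Math.PI);
console.log(close(before.x, after.x) && close(before.y, after.y));
// → true
```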

The cat animation code keeps a counter, angle, for the current angle of the animation and increments it every time the animate function is called. It can then use this angle to compute the current position of the image element. The top style is computed with Math.sin and multiplied by 20, which is the vertical radius of our ellipse. The left style is based on Math.cos and multiplied by 200 so that the ellipse is much wider than it is high. Note that styles usually need units. In this case, we have to append “px” to the number to tell the browser that we are counting in pixels (as opposed to centimeters, “ems”, or other units). This is easy to forget. Using numbers without units will result in your style being ignored—unless the number is 0, which always means the same thing, regardless of its unit.
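The two calculations in the animation, advancing the angle by elapsed time and turning the angle into style values, can be pulled apart and tried without a browser. nextAngle and ellipseStyle are hypothetical helper names; the constants 0.001, 20, and 200 are the ones from the example.

```javascript
// Advance the angle in proportion to the elapsed time.
function nextAngle(angle, time, lastTime) {
  if (lastTime == null) return angle;   // first frame: nothing elapsed yet
  return angle + (time - lastTime) * 0.001;
}

// Style values for a point on the ellipse, with the "px" unit added.
function ellipseStyle(angle) {
  return {
    top: (Math.sin(angle) * 20) + "px",
    left: (Math.cos(angle) * 200) + "px"
  };
}

console.log(ellipseStyle(0));
// → {top: "0px", left: "200px"}

// One 32 ms frame moves the angle as far as two 16 ms frames.
const oneStep = nextAngle(0, 32, 0);
const twoSteps = nextAngle(nextAngle(0, 16, 0), 32, 16);
console.log(Math.abs(oneStep - twoSteps) < 1e-12);
// → true
```

Because the step depends on elapsed time rather than on the number of calls, a dropped frame makes the cat jump ahead slightly instead of slowing it down.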

Handling Events

Some programs work with direct user input, such as mouse and keyboard actions. That kind of input isn’t available as a well-organized data structure—it comes in piece by piece, in real time, and the program is expected to respond to it as it happens.

Event handlers

Imagine an interface where the only way to find out whether a key on the keyboard is being pressed is to read the current state of that key. To be able to react to keypresses, you would have to constantly read the key’s state so that you’d catch it before it’s released again. It would be dangerous to perform other time-intensive computations since you might miss a keypress. Some primitive machines do handle input like that. A step up from this would be for the hardware or operating system to notice the keypress and put it in a queue. A program can then periodically check the queue for new events and react to what it finds there.

Of course, it has to remember to look at the queue, and to do it often, because any time between the key being pressed and the program noticing the event will cause the software to feel unresponsive. This approach is called polling. Most programmers prefer to avoid it. A better mechanism is for the system to actively notify our code when an event occurs. Browsers do this by allowing us to register functions as handlers for specific events.

<p>Click this document to activate the handler.</p>
<script>
  window.addEventListener("click", () => {
    console.log("You knocked?");
  });
</script>



The window binding refers to a built-in object provided by the browser. It represents the browser window that contains the document. Calling its addEventListener method registers the second argument to be called whenever the event described by its first argument occurs.
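The registration interface is not tied to the window object. As a minimal sketch, assuming an environment that provides the standard EventTarget and Event constructors as globals (current browsers and recent versions of Node.js do), we can register a handler on a standalone object and fire the event by hand with dispatchEvent. The names target and knocks are made up for this illustration.

```javascript
// A standalone object that supports addEventListener (illustrative name).
const target = new EventTarget();
let knocks = 0;

// Register a handler for "click" events on this object.
target.addEventListener("click", () => {
  knocks++;
  console.log("You knocked?");
});

// Fire two "click" events by hand; the handler runs once per event.
target.dispatchEvent(new Event("click"));
target.dispatchEvent(new Event("click"));

// An event of another type does not reach the "click" handler.
target.dispatchEvent(new Event("keydown"));

console.log(knocks); // 2
```

In a page, the browser does the dispatching for us whenever the user actually clicks.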

Events and DOM nodes

Each browser event handler is registered in a context. In the previous example we called addEventListener on the window object to register a handler for the whole window. Such a method can also be found on DOM elements and some other types of objects. Event listeners are called only when the event happens in the context of the object they are registered on.

<button>Click me</button>
<p>No handler here.</p>
<script>
  let button = document.querySelector("button");
  button.addEventListener("click", () => {
    console.log("Button clicked.");
  });
</script>



That example attaches a handler to the button node. Clicks on the button cause that handler to run, but clicks on the rest of the document do not. Giving a node an onclick attribute has a similar effect. This works for most types of events—you can attach a handler through the attribute whose name is the event name with on in front of it.

But a node can have only one onclick attribute, so you can register only one handler per node that way. The addEventListener method allows you to add any number of handlers so that it is safe to add handlers even if there is already another handler on the element. The removeEventListener method, called with arguments similar to addEventListener, removes a handler.

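<button>Act-once button</button>
<script>
  let button = document.querySelector("button");
  function once() {
    console.log("Done.");
    // Unregister this same function so that it runs only one time.
    button.removeEventListener("click", once);
  }
  button.addEventListener("click", once);
</script>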


The function given to removeEventListener has to be the same function value that was given to addEventListener. So, to unregister a handler, you’ll want to give the function a name (once, in the example) to be able to pass the same function value to both methods.
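To see why the function value matters, the following sketch uses standalone EventTarget objects as stand-ins for the button (an assumption made so that the code also runs outside a page; all variable names are invented). It also shows the once option of addEventListener, which may be a simpler choice when a handler should run a single time.

```javascript
// Stand-ins for DOM nodes, so the sketch also runs outside a browser.
const first = new EventTarget();
let anonymousCalls = 0;

// Two arrow functions that look identical are still different values,
// so this removeEventListener call removes nothing.
first.addEventListener("click", () => { anonymousCalls++; });
first.removeEventListener("click", () => { anonymousCalls++; });
first.dispatchEvent(new Event("click"));
console.log(anonymousCalls); // 1: the handler is still registered

// Passing the same named function to both methods does remove it.
const second = new EventTarget();
let namedCalls = 0;
function counted() { namedCalls++; }
second.addEventListener("click", counted);
second.removeEventListener("click", counted);
second.dispatchEvent(new Event("click"));
console.log(namedCalls); // 0: the handler was removed

// For a handler that should run once, the once option is an alternative.
const third = new EventTarget();
let onceCalls = 0;
third.addEventListener("click", () => { onceCalls++; }, {once: true});
third.dispatchEvent(new Event("click"));
third.dispatchEvent(new Event("click"));
console.log(onceCalls); // 1: removed automatically after the first call
```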

Event objects

Though we have ignored it so far, event handler functions are passed an argument: the event object. This object holds additional information about the event. For example, if we want to know which mouse button was pressed, we can look at the event object’s button property.

<button>Click me any way you want</button>
<script>
  let button = document.querySelector("button");
  button.addEventListener("mousedown", event => {
    if (event.button == 0) {
      console.log("Left button");
    } else if (event.button == 1) {
      console.log("Middle button");
    } else if (event.button == 2) {
      console.log("Right button");
    }
  });
</script>




The information stored in an event object differs per type of event. We’ll discuss different types later in the chapter. The object’s type property always holds a string identifying the event (such as “click” or “mousedown”).


Propagation

For most event types, handlers registered on nodes with children will also receive events that happen in the children. If a button inside a paragraph is clicked, event handlers on the paragraph will also see the click event. But if both the paragraph and the button have a handler, the more specific handler (the one on the button) gets to go first. The event is said to propagate outward, from the node where it happened to that node’s parent node and on to the root of the document. Finally, after all handlers registered on a specific node have had their turn, handlers registered on the whole window get a chance to respond to the event.

At any point, an event handler can call the stopPropagation method on the event object to prevent handlers further up from receiving the event. This can be useful when, for example, you have a button inside another clickable element and you don’t want clicks on the button to activate the outer element’s click behavior. The following example registers “mousedown” handlers on both a button and the paragraph around it. When clicked with the right mouse button, the handler for the button calls stopPropagation, which will prevent the handler on the paragraph from running. When the button is clicked with another mouse button, both handlers will run.

<p>A paragraph with a <button>button</button>.</p>
<script>
  let para = document.querySelector("p");
  let button = document.querySelector("button");
  para.addEventListener("mousedown", () => {
    console.log("Handler for paragraph.");
  });
  button.addEventListener("mousedown", event => {
    console.log("Handler for button.");
    // A right-click stops here and never reaches the paragraph.
    if (event.button == 2) event.stopPropagation();
  });
</script>
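The order in which handlers run can be modeled without a browser. The following sketch is a simplified model, not how browsers implement it: nodes are plain objects with a parent link and a list of handlers, and a hypothetical dispatch function walks from the target outward until it reaches the root or a handler calls stopPropagation. The names makeNode and dispatch are invented for illustration.

```javascript
// Simplified model of outward event propagation (not the real DOM).
function makeNode(name, parent = null) {
  return {name, parent, handlers: []};
}

function dispatch(target, type) {
  let stopped = false;
  const event = {
    type,
    target, // the node where the event originated
    stopPropagation() { stopped = true; }
  };
  // Visit the target first, then each ancestor in turn.
  for (let node = target; node && !stopped; node = node.parent) {
    // Every handler on the current node runs, even if one of them
    // stops propagation; only nodes further out are skipped.
    for (const handler of node.handlers) handler(event);
  }
}

const log = [];
const body = makeNode("body");
const para = makeNode("p", body);
const button = makeNode("button", para);

body.handlers.push(() => log.push("body"));
para.handlers.push(event => {
  log.push("p (from " + event.target.name + ")");
  event.stopPropagation();
});
button.handlers.push(() => log.push("button"));

dispatch(button, "mousedown");
console.log(log); // ["button", "p (from button)"]
```

The handler on the body never runs because the paragraph stopped the event on its way out. The paragraph's handler can still tell, through the target property, that the event started at the button.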


Most event objects have a target property that refers to the node where they originated. You can use this property to ensure that you’re not accidentally handling something that propagated up from a node you do not want to handle. It is also possible to use the target property to cast a wide net for a specific type of event. For example, if you have a node containing a long list of buttons, it may be more convenient to register a single click handler on the outer node and have it use the target property to figure out whether a button was clicked, rather than register individual handlers on all of the buttons.





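<button>A</button>
<button>B</button>
<button>C</button>
<script>
  document.body.addEventListener("click", event => {
    // Only react to clicks that originated on a button.
    if (event.target.nodeName == "BUTTON") {
      console.log("Clicked", event.target.textContent);
    }
  });
</script>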





Default actions

Many events have a default action associated with them. If you click a link, you will be taken to the link’s target. If you press the down arrow, the browser will scroll the page down. If you right-click, you’ll get a context menu. And so on.

For most types of events, the JavaScript event handlers are called before the default behavior takes place. If the handler doesn’t want this normal behavior to happen, typically because it has already taken care of handling the event, it can call the preventDefault method on the event object. This can be used to implement your own keyboard shortcuts or context menu. It can also be used to obnoxiously interfere with the behavior that users expect. For example, here is a link that cannot be followed:

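<a href="">MDN</a>
<script>
  let link = document.querySelector("a");
  link.addEventListener("click", event => {
    // Cancel the default action, which is following the link.
    event.preventDefault();
  });
</script>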





Try not to do such things unless you have a really good reason to. It’ll be unpleasant for people who use your page when expected behavior is broken. Depending on the browser, some events can’t be intercepted at all. On Chrome, for example, the keyboard shortcut to close the current tab (control-W or command-W) cannot be handled by JavaScript.

Key events

When a key on the keyboard is pressed, your browser fires a “keydown” event. When it is released, you get a “keyup” event.

<p>This page turns violet when you hold the V key.</p>
<script>
  window.addEventListener("keydown", event => {
    if (event.key == "v") {
      document.body.style.background = "violet";
    }
  });
  window.addEventListener("keyup", event => {
    if (event.key == "v") {
      document.body.style.background = "";
    }
  });
</script>




Despite its name, “keydown” fires not only when the key is physically pushed down. When a key is pressed and held, the event fires again every time the key repeats. Sometimes you have to be careful about this. For example, if you add a button to the DOM when a key is pressed and remove it again when the key is released, you might accidentally add hundreds of buttons when the key is held down longer. The example looked at the key property of the event object to see which key the event is about. This property holds a string that, for most keys, corresponds to the thing that pressing that key would type. For special keys such as enter, it holds a string that names the key (“Enter”, in this case). If you hold shift while pressing a key, that might also influence the name of the key—“v” becomes “V”, and “1” may become “!”, if that is what pressing shift-1 produces on your keyboard.
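One way to guard against repeated “keydown” events is to remember which keys are already down. The sketch below is a minimal illustration, assuming event objects that carry the key and repeat properties that browsers put on keyboard events; the createKeyTracker name and its interface are invented here, and plain objects stand in for real events.

```javascript
// Track which keys are currently held, ignoring auto-repeat events.
function createKeyTracker() {
  const held = new Set();
  let presses = 0; // counts physical presses, not repeats
  return {
    keydown(event) {
      if (event.repeat || held.has(event.key)) return; // key is already down
      held.add(event.key);
      presses++;
    },
    keyup(event) {
      held.delete(event.key);
    },
    isDown(key) { return held.has(key); },
    get presses() { return presses; }
  };
}

const tracker = createKeyTracker();
// Plain objects stand in for the event objects a browser would pass.
tracker.keydown({key: "v", repeat: false});
tracker.keydown({key: "v", repeat: true});
tracker.keydown({key: "v", repeat: true});
console.log(tracker.isDown("v"), tracker.presses); // true 1
tracker.keyup({key: "v"});
console.log(tracker.isDown("v")); // false
```

In a page, the tracker could be wired up with window.addEventListener("keydown", event => tracker.keydown(event)) and likewise for "keyup". Because holding shift changes the key name, a more careful version might normalize names before storing them.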

Modifier keys such as shift, control, alt, and meta (command on Mac) generate key events just like normal keys. But when looking for key combinations, you can also find out whether these keys are held down by looking at the shiftKey, ctrlKey, altKey, and metaKey properties of keyboard and mouse events.

<p>Press Control-Space to continue.</p>
<script>
  window.addEventListener("keydown", event => {
    if (event.key == " " && event.ctrlKey) {
      console.log("Continuing!");
    }
  });
</script>
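When a page has several shortcuts, the modifier checks can be gathered in one place. The following is a rough sketch of such a helper; the function name matchesShortcut and the shape of the shortcut description object are assumptions made for this example, and plain objects stand in for real keyboard events.

```javascript
// Returns true when the event matches the described key combination.
// `shortcut` has an illustrative shape: {key, ctrl, shift, alt, meta}.
function matchesShortcut(event, shortcut) {
  return event.key === shortcut.key &&
         event.ctrlKey === Boolean(shortcut.ctrl) &&
         event.shiftKey === Boolean(shortcut.shift) &&
         event.altKey === Boolean(shortcut.alt) &&
         event.metaKey === Boolean(shortcut.meta);
}

const ctrlSpace = {key: " ", ctrl: true};

// Plain objects standing in for keyboard events.
const plainSpace = {key: " ", ctrlKey: false, shiftKey: false,
                    altKey: false, metaKey: false};
const withCtrl = {...plainSpace, ctrlKey: true};

console.log(matchesShortcut(plainSpace, ctrlSpace)); // false
console.log(matchesShortcut(withCtrl, ctrlSpace));   // true
```

Unlike the example above, this helper also requires that no other modifier is held, so Control-Shift-Space would not count as Control-Space. Whether that strictness is wanted depends on the application.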





The DOM node where a key event originates depends on the element that has focus when the key is pressed. Most nodes cannot have focus unless you give them a tabindex attribute, but things like links, buttons, and form fields can. When nothing in particular has focus, document.body acts as the target node of key events.

When the user is typing text, using key events to figure out what is being typed is problematic. Some platforms, most notably the virtual keyboard on Android phones, don’t fire key events. But even when you have an old-fashioned keyboard, some types of text input don’t match key presses in a straightforward way, such as input method editor (IME) software used by people whose scripts don’t fit on a keyboard, where multiple key strokes are combined to create characters. To notice when something was typed, elements that you can type into, such as the <input> and <textarea> tags, fire “input” events whenever the user changes their content. To get the actual content that was typed, it is best to directly read it from the focused field.

Pointer events

There are currently two widely used ways to point at things on a screen: mice (including devices that act like mice, such as touchpads and trackballs) and touchscreens. These produce different kinds of events.

Mouse clicks

Pressing a mouse button causes a number of events to fire. The “mousedown” and “mouseup” events are similar to “keydown” and “keyup” and fire when the button is pressed and released. These happen on the DOM nodes that are immediately below the mouse pointer when the event occurs. After the “mouseup” event, a “click” event fires on the most specific node that contained both the press and the release of the button. For example, if I press down the mouse button on one paragraph and then move the pointer to another paragraph and release the button, the “click” event will happen on the element that contains both those paragraphs. If two clicks happen close together, a “dblclick” (double-click) event also fires, after the second click event.

To get precise information about the place where a mouse event happened, you can look at its clientX and clientY properties, which contain the event’s coordinates (in pixels) relative to the top-left corner of the window, or pageX and pageY, which are relative to the top-left corner of the whole document (which may be different when the window has been scrolled). The following implements a primitive drawing program. Every time you click the document, it adds a dot under your mouse pointer.


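<style>
  body {
    height: 200px;
    background: beige;
  }
  .dot {
    height: 8px; width: 8px;
    border-radius: 4px; /* rounds corners */
    background: blue;
    position: absolute;
  }
</style>
<script>
  window.addEventListener("click", event => {
    let dot = document.createElement("div");
    dot.className = "dot";
    // Subtract half the dot's size to center it under the pointer.
    dot.style.left = (event.pageX - 4) + "px";
    dot.style.top = (event.pageY - 4) + "px";
    document.body.appendChild(dot);
  });
</script>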




Mouse motion

Every time the mouse pointer moves, a “mousemove” event is fired. This event can be used to track the position of the mouse. A common situation in which this is useful is when implementing some form of mouse-dragging functionality. As an example, the following program displays a bar and sets up event handlers so that dragging to the left or right on this bar makes it narrower or wider:

<p>Drag the bar to change its width:</p>
<div style="background: orange; width: 60px; height: 20px">
</div>
<script>
  let lastX; // Tracks the last observed mouse X position
  let bar = document.querySelector("div");
  bar.addEventListener("mousedown", event => {
    if (event.button == 0) {
      lastX = event.clientX;
      window.addEventListener("mousemove", moved);
      event.preventDefault(); // Prevent selection
    }
  });

  function moved(event) {
    if (event.buttons == 0) {
      // No button is held anymore, so stop resizing.
      window.removeEventListener("mousemove", moved);
    } else {
      let dist = event.clientX - lastX;
      let newWidth = Math.max(10, bar.offsetWidth + dist);
      bar.style.width = newWidth + "px";
      lastX = event.clientX;
    }
  }
</script>




[Figure: the resulting page, showing the orange bar.]

Note that the “mousemove” handler is registered on the whole window. Even if the mouse goes outside of the bar during resizing, as long as the button is held we still want to update its size. We must stop resizing the bar when the mouse button is released. For that, we can use the buttons property (note the plural), which tells us about the buttons that are currently held down. When this is zero, no buttons are down. When buttons are held, its value is the sum of the codes for those buttons: the left button has code 1, the right button 2, and the middle one 4. With the left and right buttons held, for example, the value of buttons will be 3. Because each code is a distinct power of two, you can check whether a given button is pressed by combining the value of buttons with that button’s code using the bitwise AND operator (&) and checking whether the result is nonzero. Note that the order of these codes is different from the one used by button, where the middle button came before the right one. As mentioned, consistency isn’t really a strong point of the browser’s programming interface.
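A reliable way to test for a single button in the buttons value is the bitwise AND operator, since each code occupies its own bit. Here is a small sketch; the constant names and the isHeld helper are invented for illustration, while the codes themselves are the ones given in the text.

```javascript
// Button codes used by the `buttons` property.
const LEFT = 1, RIGHT = 2, MIDDLE = 4;

// True when the button with the given code is among those held down.
function isHeld(buttons, code) {
  return (buttons & code) !== 0;
}

const leftAndRight = LEFT + RIGHT; // 3
console.log(isHeld(leftAndRight, LEFT));   // true
console.log(isHeld(leftAndRight, RIGHT));  // true
console.log(isHeld(leftAndRight, MIDDLE)); // false
console.log(isHeld(0, LEFT));              // false
```

In the drag example, a test such as isHeld(event.buttons, LEFT) would keep resizing only while the left button specifically is down.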

Touch events

The style of graphical browser that we use was designed with mouse interfaces in mind, at a time when touchscreens were rare. To make the Web “work” on early touchscreen phones, browsers for those devices pretended, to a certain extent, that touch events were mouse events. If you tap your screen, you’ll get “mousedown”, “mouseup”, and “click” events. But this illusion isn’t very robust. A touchscreen works differently from a mouse: it doesn’t have multiple buttons, you can’t track the finger when it isn’t on the screen (to simulate “mousemove”), and it allows multiple fingers to be on the screen at the same time.

Mouse events cover touch interaction only in straightforward cases—if you add a “click” handler to a button, touch users will still be able to use it. But something like the resizeable bar in the previous example does not work on a touchscreen. There are specific event types fired by touch interaction. When a finger starts touching the screen, you get a “touchstart” event. When it is moved while touching, “touchmove” events fire. Finally, when it stops touching the screen, you’ll see a “touchend” event. Because many touchscreens can detect multiple fingers at the same time, these events don’t have a single set of coordinates associated with them. Rather, their event objects have a touches property, which holds an array-like object of points, each of which has its own clientX, clientY, pageX, and pageY properties. You could do something like this to show red circles around every touching finger:


<style>
  dot { position: absolute; display: block;
        border: 2px solid red; border-radius: 50px;
        height: 100px; width: 100px; }
</style>
<p>Touch this page</p>
<script>
  function update(event) {
    // Remove the circles drawn for the previous event.
    for (let dot; dot = document.querySelector("dot");) {
      dot.remove();
    }
    // Draw one circle per finger that is currently touching.
    for (let i = 0; i < event.touches.length; i++) {
      let {pageX, pageY} = event.touches[i];
      let dot = document.createElement("dot");
      dot.style.left = (pageX - 50) + "px";
      dot.style.top = (pageY - 50) + "px";
      document.body.appendChild(dot);
    }
  }
  window.addEventListener("touchstart", update);
  window.addEventListener("touchmove", update);
  window.addEventListener("touchend", update);
</script>


You’ll often want to call preventDefault in touch event handlers to override the browser’s default behavior (which may include scrolling the page on swiping) and to prevent the mouse events from being fired, for which you may also have a handler.
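Because touches is array-like rather than a real array, methods such as map are not directly available on it. A possible way around that, sketched here with an invented touchPoints helper and a hand-built object standing in for a real touch event, is to convert it with Array.from.

```javascript
// Convert the array-like `touches` collection into a real array of points.
function touchPoints(event) {
  return Array.from(event.touches,
                    touch => ({x: touch.pageX, y: touch.pageY}));
}

// A plain array-like object stands in for a real touch event here.
const fakeEvent = {
  touches: {
    length: 2,
    0: {pageX: 10, pageY: 20},
    1: {pageX: 30, pageY: 40}
  }
};

const points = touchPoints(fakeEvent);
console.log(points.length, points[1].x); // 2 30
```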

Scroll events

Whenever an element is scrolled, a “scroll” event is fired on it. This has various uses, such as knowing what the user is currently looking at (for disabling off-screen animations or sending spy reports to your evil headquarters) or showing some indication of progress (by highlighting part of a table of contents or showing a page number). The following example draws a progress bar above the document and updates it to fill up as you scroll down:


<style>
  #progress {
    border-bottom: 2px solid blue;
    width: 0;
    position: fixed;
    top: 0; left: 0;
  }
</style>
<div id="progress"></div>
<script>
  // Create some content
  document.body.appendChild(document.createTextNode(
    "supercalifragilisticexpialidocious ".repeat(1000)));

  let bar = document.querySelector("#progress");
  window.addEventListener("scroll", () => {
    let max = document.body.scrollHeight - innerHeight;
    bar.style.width = `${(pageYOffset / max) * 100}%`;
  });
</script>



Giving an element a position of fixed acts much like an absolute position but also prevents it from scrolling along with the rest of the document. The effect is to make our progress bar stay at the top. Its width is changed to indicate the current progress. We use %, rather than px, as a unit when setting the width so that the element is sized relative to the page width.

The global innerHeight binding gives us the height of the window, which we have to subtract from the total scrollable height—you can’t keep scrolling when you hit the bottom of the document. There’s also an innerWidth for the window width. By dividing pageYOffset, the current scroll position, by the maximum scroll position and multiplying by 100, we get the percentage for the progress bar. Calling preventDefault on a scroll event does not prevent the scrolling from happening. In fact, the event handler is called only after the scrolling takes place.
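The percentage calculation can be pulled out into a plain function, which makes it easy to try with made-up numbers. This is only a sketch: the scrollProgress name and its parameters are invented, and the guard for documents that are too short to scroll is an addition that the example above does not have.

```javascript
// Percentage of the way through the document, from 0 to 100.
function scrollProgress(scrollTop, scrollHeight, viewportHeight) {
  const max = scrollHeight - viewportHeight; // furthest possible scroll position
  if (max <= 0) return 100;                  // nothing to scroll: treat as complete
  const fraction = scrollTop / max;
  return Math.min(1, Math.max(0, fraction)) * 100;
}

console.log(scrollProgress(0, 3000, 600));    // 0
console.log(scrollProgress(1200, 3000, 600)); // 50
console.log(scrollProgress(2400, 3000, 600)); // 100
console.log(scrollProgress(0, 500, 600));     // 100
```

In the handler, the call would look like scrollProgress(pageYOffset, document.body.scrollHeight, innerHeight).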

Focus events

When an element gains focus, the browser fires a “focus” event on it. When it loses focus, the element gets a “blur” event. Unlike the events discussed earlier, these two events do not propagate. A handler on a parent element is not notified when a child element gains or loses focus.

The following example displays help text for the text field that currently has focus:

<p>Name: <input type="text" data-help="Your full name"></p>
<p>Age: <input type="text" data-help="Your age in years"></p>
<p id="help"></p>
<script>
  let help = document.querySelector("#help");
  let fields = document.querySelectorAll("input");
  for (let field of Array.from(fields)) {
    field.addEventListener("focus", event => {
      let text = event.target.getAttribute("data-help");
      help.textContent = text;
    });
    field.addEventListener("blur", event => {
      help.textContent = "";
    });
  }
</script>




[Screenshot: the help text shown for the age field.]

The window object will receive “focus” and “blur” events when the user moves from or to the browser tab or window in which the document is shown.

Load event

When a page finishes loading, the “load” event fires on the window and the document body objects. This is often used to schedule initialization actions that require the whole document to have been built. Remember that the content of <script> tags is run immediately when the tag is encountered. This may be too soon, for example when the script needs to do something with parts of the document that appear after the <script> tag. Elements such as images and script tags that load an external file also have a “load” event that indicates the files they reference were loaded. Like the focus-related events, loading events do not propagate.

When a page is closed or navigated away from (for example, by following a link), a “beforeunload” event fires. The main use of this event is to prevent the user from accidentally losing work by closing a document. If a handler registered with addEventListener calls preventDefault on this event, current browsers will show the user a dialog asking if they are sure they want to leave the page. Older browsers instead expect the handler to set the event’s returnValue property to a string, and returning a non-null value has this effect only for a handler assigned through the onbeforeunload property. Either way, the user is always able to leave, even on malicious pages that would prefer to keep them there forever and force them to look at dodgy weight-loss ads.

Events and the event loop

In the context of the event loop, browser event handlers behave like other asynchronous notifications. They are scheduled when the event occurs but must wait for other scripts that are running to finish before they get a chance to run. The fact that events can be processed only when nothing else is running means that, if the event loop is tied up with other work, any interaction with the page (which happens through events) will be delayed until there’s time to process it. So if you schedule too much work, either with long-running event handlers or with lots of short-running ones, the page will become slow and cumbersome to use.
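A small experiment makes this waiting visible. The sketch below, which should run in any JavaScript environment that has setTimeout, asks for a callback after 0 milliseconds and then keeps the script busy for about 50 milliseconds. The order array and the 50 millisecond figure are arbitrary choices for illustration.

```javascript
const order = [];

// Ask for a callback as soon as possible (0 ms).
setTimeout(() => order.push("timeout handler"), 0);

// Keep the script busy for roughly 50 ms. The callback cannot interrupt this.
const start = Date.now();
while (Date.now() - start < 50) { /* busy wait */ }

order.push("script finished");

// Only "script finished" is in the array at this point. The timeout
// handler runs later, once the current script has ended.
console.log(order);
```

Event handlers in a page are held up in the same way while a long computation runs, which is why the page feels frozen.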

For cases where you really do want to do some time-consuming thing in the background without freezing the page, browsers provide something called web workers. A worker is a JavaScript process that runs alongside the main script, on its own timeline. Imagine that squaring a number is a heavy, long-running computation that we want to perform in a separate thread. We could write a file called code/squareworker.js that responds to messages by computing a square and sending a message back.

addEventListener("message", event => {
  // event.data holds the number that was sent to the worker.
  postMessage(event.data * event.data);
});


To avoid the problems of having multiple threads touching the same data, workers do not share their global scope or any other data with the main script’s environment. Instead, you have to communicate with them by sending messages back and forth. This code spawns a worker running that script, sends it a few messages, and outputs the responses.

let squareWorker = new Worker("code/squareworker.js");
squareWorker.addEventListener("message", event => {
  console.log("The worker responded:", event.data);
});
squareWorker.postMessage(10);
squareWorker.postMessage(24);




The postMessage function sends a message, which will cause a “message” event to fire in the receiver. The script that created the worker sends and receives messages through the Worker object, whereas the worker talks to the script that created it by sending and listening directly on its global scope. Messages are copied rather than shared: the other side receives a copy of the value, not the value itself. Any value that can be represented as JSON can be sent this way. Browsers copy messages with the structured clone algorithm, which also handles some other built-in types but cannot copy functions.
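What it means to receive a copy can be shown without a worker. In this sketch a JSON round trip stands in for the copying that happens when a message is sent; that substitution is an assumption made to keep the example self-contained, and the copyMessage name is invented.

```javascript
// Imitate the copying of a message with a JSON round trip (a stand-in
// only; browsers use their own copying mechanism for postMessage).
function copyMessage(value) {
  return JSON.parse(JSON.stringify(value));
}

const original = {numbers: [1, 2, 3]};
const received = copyMessage(original);
received.numbers.push(4); // the receiver changes its copy

console.log(original.numbers.length); // 3: the sender's value is untouched
console.log(received.numbers.length); // 4
```

Since nothing is shared, a worker and the main script cannot step on each other's data, at the cost of copying every message.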


Timers

We saw the setTimeout function before. It schedules another function to be called later, after a given number of milliseconds. Sometimes you need to cancel a function you have scheduled. This is done by storing the value returned by setTimeout and calling clearTimeout on it.

let bombTimer = setTimeout(() => {
  console.log("BOOM!");
}, 500);

if (Math.random() < 0.5) { // 50% chance
  console.log("Defused.");
  clearTimeout(bombTimer);
}




The cancelAnimationFrame function works in the same way as clearTimeout: calling it on a value returned by requestAnimationFrame will cancel that frame (assuming it hasn’t already been called). A similar set of functions, setInterval and clearInterval, are used to set timers that should repeat every X milliseconds.

let ticks = 0;
let clock = setInterval(() => {
  console.log("tick", ticks++);
  if (ticks == 10) {
    clearInterval(clock);
    console.log("stop.");
  }
}, 200);


Debouncing

Some types of events have the potential to fire rapidly, many times in a row (the “mousemove” and “scroll” events, for example). When handling such events, you must be careful not to do anything too time-consuming or your handler will take up so much time that interaction with the document starts to feel slow. If you do need to do something nontrivial in such a handler, you can use setTimeout to make sure you are not doing it too often. This is usually called debouncing the event. There are several slightly different approaches to this.

In the first example, we want to react when the user has typed something, but we don’t want to do it immediately for every input event. When they are typing quickly, we just want to wait until a pause occurs. Instead of immediately performing an action in the event handler, we set a timeout. We also clear the previous timeout (if any) so that when events occur close together (closer than our timeout delay), the timeout from the previous event will be canceled.

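<textarea>Type something here…</textarea>
<script>
  let textarea = document.querySelector("textarea");
  let timeout;
  textarea.addEventListener("input", () => {
    // Cancel the timeout set by the previous input event, if any.
    clearTimeout(timeout);
    timeout = setTimeout(() => console.log("Typed!"), 500);
  });
</script>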
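The same idea can be wrapped in a reusable helper. What follows is one possible sketch rather than a standard function: the debounce name, its parameters, and the typed array used to observe the result are all invented for this example.

```javascript
// Returns a wrapper that runs `fn` only after `delay` ms without new calls.
function debounce(fn, delay) {
  let timeout; // undefined until the first call
  return (...args) => {
    clearTimeout(timeout); // cancel the pending call, if any
    timeout = setTimeout(() => fn(...args), delay);
  };
}

const typed = [];
const report = debounce(text => typed.push(text), 500);

// Three calls in quick succession: the first two are canceled, and only
// the last one runs, half a second after the final call.
report("h");
report("he");
report("hel");
```

With such a helper, the handler above could be registered as textarea.addEventListener("input", debounce(() => console.log("Typed!"), 500)).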


Giving an undefined value to clearTimeout or calling it on a timeout that has already fired has no effect. Thus, we don’t have to be careful about when to call it, and we simply do so for every event.

We can use a slightly different pattern if we want to space responses so that they’re separated by at least a certain length of time but want to fire them during a series of events, not just afterward. For example, we might want to respond to “mousemove” events by showing the current coordinates of the mouse but only every 250 milliseconds.


<script>
  let scheduled = null;
  window.addEventListener("mousemove", event => {
    if (!scheduled) {
      setTimeout(() => {
        document.body.textContent =
          `Mouse at ${scheduled.pageX}, ${scheduled.pageY}`;
        scheduled = null;
      }, 250);
    }
    // Remember the latest event; the pending timeout will display it.
    scheduled = event;
  });
</script>
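This second pattern can also be turned into a helper. The sketch below is one way to write it, with invented names (throttleLatest, shown, show). It keeps a separate flag for "a timeout is pending" instead of reusing the stored value for that purpose, so it also works when the value passed in is falsy.

```javascript
// Calls `fn` with the most recent argument at most once per `interval` ms.
function throttleLatest(fn, interval) {
  let latest;          // most recent argument seen
  let waiting = false; // true while a timeout is pending
  return arg => {
    latest = arg;
    if (waiting) return;
    waiting = true;
    setTimeout(() => {
      waiting = false;
      fn(latest);
    }, interval);
  };
}

const shown = [];
const show = throttleLatest(position => shown.push(position), 250);

// A burst of calls schedules a single timeout. When it fires, it reports
// the last position of the burst.
show({x: 1});
show({x: 2});
show({x: 3});
```

A “mousemove” handler could then be registered as window.addEventListener("mousemove", throttleLatest(event => { /* show coordinates */ }, 250)).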




Summary

JavaScript programs may inspect and interfere with the document that the browser is displaying through a data structure called the DOM. This data structure represents the browser’s model of the document, and a JavaScript program can modify it to change the visible document. The DOM is organized like a tree, in which elements are arranged hierarchically according to the structure of the document. The objects representing elements have properties such as parentNode and childNodes, which can be used to navigate through this tree. The way a document is displayed can be influenced by styling, both by attaching styles to nodes directly and by defining rules that match certain nodes. There are many different style properties, such as color or display. JavaScript code can manipulate an element’s style directly through its style property.

Event handlers make it possible to detect and react to events happening in our web page. The addEventListener method is used to register such a handler. Each event has a type (“keydown”, “focus”, and so on) that identifies it. Most events are called on a specific DOM element and then propagate to that element’s ancestors, allowing handlers associated with those elements to handle them. When an event handler is called, it is passed an event object with additional information about the event. This object also has methods that allow us to stop further propagation (stopPropagation) and prevent the browser’s default handling of the event (preventDefault).

Pressing a key fires “keydown” and “keyup” events. Pressing a mouse button fires “mousedown”, “mouseup”, and “click” events. Moving the mouse fires “mousemove” events. Touchscreen interaction will result in “touchstart”, “touchmove”, and “touchend” events. Scrolling can be detected with the “scroll” event, and focus changes can be detected with the “focus” and “blur” events. When the document finishes loading, a “load” event fires on the window.
