Grokking NodeJS: The V8 Engine, Run-time Environments and NPM
You have to be able to imagine where everything is, and why it's there.
Note: I’m now taking the Google technical documentation course, so I can better structure my writing here to be as clear as possible, while cutting down on flowery expression. This will mark a departure from too much creativity — and I mourn that — in favor of the best signal to noise ratio I can possibly muster — and I for one am excited for that.
On with the show!
Introduction
If you write enough JavaScript, you’ll eventually write some Node.js. On the surface, JavaScript and Node can seem indistinguishable from one another (though there are major differences, as we’ll soon see), and that’s one of the reasons Node is popular: it is, for all intents and purposes, just JavaScript, extracted from its original home in the browser.
Untethered to the browser, JavaScript suddenly found its way everywhere — into your servers, desktop apps, IoT devices, games, etc. It’s easy to appreciate this ubiquity, but it’s much harder to understand it spatially.
Spatial: (definition 2) — of, relating to, or involved in the perception of relationships (as of objects) in space. In this newsletter, we’re attempting to understand what it means for JavaScript to live outside its original space: the browser.
When we say NodeJS is JavaScript out of the browser, we could benefit from a mental sketch of the browserspace:
The image above is a simplified view of the internals of a web browser (specifically, the Chrome Browser). In this drawing, we’re representing some of the browser internals such as:
The V8 engine, which compiles and executes JavaScript in the browser,
JavaScript, which is compiled and executed in the V8 engine,
Browser APIs, which are provided by the browser to help JavaScript accomplish certain tasks, such as DOM manipulation.
The V8 engine, developed by the Chrome team, ‘is Google’s open source high-performance JavaScript and WebAssembly engine, written in C++’. According to the V8 docs, it ‘compiles and executes JavaScript source code, handles memory allocation for objects, and garbage collects objects it no longer needs.’
In other words, it is the engine that runs JavaScript in the browser. So what happens if you pull it out of the browser? You can port V8 into new environments and run JavaScript there. And that’s exactly what Node.js did.
Node.js is described on its official homepage as a free, open-source, cross-platform JavaScript run-time environment that lets developers write command line tools and server-side scripts outside of a browser.
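To see that V8 really does travel with Node, here is a minimal sketch you can run with the node command. It reads `process.versions`, an object Node exposes that lists the versions of its bundled components. The exact version strings depend on your installation, so I'm not showing any output here.

```javascript
// process.versions lists the components bundled into this Node.js binary.
const { node, v8 } = process.versions;

// The V8 engine is reported alongside Node's own version.
console.log(`Node.js version: ${node}`);
console.log(`Embedded V8 version: ${v8}`);
```

If you run this on two machines with different Node releases, you'll likely see two different V8 versions, because each Node release ships with its own copy of the engine.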
Since this newsletter is all about understanding Node.js spatially, we cannot proceed further without understanding a crucial aspect of the definition of Node.js above: run-time environments.
Run-time environments
This is where things get a little tricky, but we can have great success if we focus on what we know, and where the things we know live in the general space of things that interact for our programs to run.
For starters, we know that V8 is an engine, written in C++, that originally compiled and ran JavaScript in the browser. We know that Node.js runs V8 outside the browser, which is how we get JavaScript outside the browser. Now we have to understand how it runs JavaScript outside the browser, and the answer to that question is ‘by being a run-time environment’.
Which raises the question: what is a run-time environment?
Wikipedia says it’s ‘a sub-system that exists both in the computer where a program is created, as well as in the computers where the program is intended to be run.’
In the diagram above, we have the same program in different contexts: when it was created (eg, on your local machine) and when it’s executed at the intended destination (eg, on your user’s computer). The run-time itself remains the same.
An obvious example is when you write a small client-side web app with HTML, JavaScript and CSS. You deploy that app to the web, and a user in Madagascar can fire up a browser and the web app will run with the same assumptions about its ‘environment’ (eg, that there’s a window object, and that event listeners exist, along with the DOM tree). In other words, we’re counting on the browser run-time whenever we write a client-side web application.
It stands to reason that when we yank V8 out of Chrome, we lose the sweet benefits the browser run-time supplied us. No more window object, for starters. This is where Node.js steps in: it becomes the new run-time, back-filling those missing items, and providing some super-charged features for us to make the most of our brave new browserless paradigm.
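Here is a small sketch of what that swap of run-times looks like from inside a program, assuming you run it with Node. The helper name describeRuntime is my own invention for illustration. It checks for the globals a browser would provide, then uses the built-in `node:os` module, which is the kind of capability Node supplies in their place.

```javascript
// Built-in Node module: access to the operating system, which browsers do not expose.
const os = require('node:os');

// Hypothetical helper: report which run-time globals exist where this code runs.
function describeRuntime() {
  return {
    hasWindow: typeof window !== 'undefined',     // supplied by the browser run-time
    hasDocument: typeof document !== 'undefined', // the DOM, also browser-only
    hasProcess: typeof process !== 'undefined' && Boolean(process.versions?.node),
  };
}

const runtime = describeRuntime();
console.log(runtime);

// Something the Node run-time gives us instead: the host platform name.
console.log(`Running on: ${os.platform()}`);
```

Under Node, the first two checks come back false and the third comes back true: same language, different environment.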
NPM
NPM, which stands for Node Package Manager (and not, sadly, Non-deterministic Palindrome Machine), is as advertised on the box: it’s a platform (and a tool) for managing your Node application’s packages.
I am adding this section because I know beginner developers struggle to understand just exactly what NPM does, and how it works under the hood. I definitely did struggle. For way too long I didn’t understand the point of package.json, package-lock.json, and node_modules. Hopefully I can help to detangle this neatly.
One of the most powerful properties of software (or at least, good software) is that it is composable. Composable software means that I can take smaller bits of code (‘components’) and combine them or swap them out in ways that yield impressively powerful systems. I have written about composability before, so I’ll link it below:
Imagine that you were writing a web server in Node.js.
Now imagine that a user could send their birthday (in text form, of course) to your server, and your server could respond with their zodiac reading for the day.
You could spend a lot of time writing a small tool in your program to do this, or you could look for a zodiac package on NPM. NPM is essentially a platform that allows people to create and distribute ‘packages’ (individual software files bundled together to achieve a coherent goal, like the zodiac one mentioned above), which significantly improve your quality of life as a developer.
Remember what I said about composability? Instead of writing every little piece of functionality from scratch (like a UUID generator), you can instead download a package from npm directly into your project, import it, and use it.
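As a rough sketch of the ‘download, import, use’ flow, the snippet below tries to load the `uuid` package and call its `v4` function. It assumes you have already run `npm install uuid` in the project. The fallback to Node's built-in `crypto.randomUUID()` is my own addition, so the example still runs if the package isn't installed. The helper name loadUuidGenerator is hypothetical.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: prefer the npm package, fall back to Node's built-in.
function loadUuidGenerator() {
  try {
    // Resolved from node_modules, so this only works after: npm install uuid
    const { v4 } = require('uuid');
    if (typeof v4 === 'function') return v4;
  } catch (err) {
    // Package not installed (or not loadable this way): use the built-in below.
  }
  return () => crypto.randomUUID();
}

const generateId = loadUuidGenerator();
const id = generateId();
console.log(id); // a random version-4 UUID, different on every run
```

Either way, you didn't have to write the UUID algorithm yourself, and that is the whole point.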
This ecosystem makes it possible for several things to happen:
Developers don’t have to rewrite the same functionality from scratch every time they need it.
More time is spent on the mission-critical parts of a software project.
Developers are able to know what the best packages are, based on package usage activity, information and analytics, such as:
How many people downloaded this package?
What’s the size of this package (the smaller the better)?
What’s the package’s source code?
The way to get a package onto your machine is by typing `npm install <PACKAGE_NAME>` or `yarn add <PACKAGE_NAME>` (yarn, by the way, is simply an alternative to npm). There’s also pnpm, which means ‘performant NPM’.
Now, let’s dive into the design of NPM:
Node_modules
Apart from being the heaviest thing in the universe, `node_modules` is the local folder that holds the files of every package your Node app needs to run.
When you run, say, `npm install uuid`, all the modules that make UUID available to your application need to go somewhere, and that ‘somewhere’ is the node_modules folder.
Why is it so large? Well, in much the same way you now depend on a package for functionality, that package often depends on other packages to do its thing, so there’s a recursive download process involved. You may even see this happen if you watch npm while it’s installing your packages. (Pro-tip: they’re called dependencies because, haha, you are depending on them.)
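The recursion can be sketched with a toy model. Everything here is made up for illustration: the registry object, the package names in it, and the collectDependencies function. Real npm resolution is far more involved (versions, deduplication, and so on), but the shape of the problem is the same.

```javascript
// Hypothetical registry: package name -> names of its direct dependencies.
const registry = {
  'my-app': ['web-framework', 'id-generator'],
  'web-framework': ['router', 'body-parser'],
  'router': ['path-matcher'],
  'body-parser': ['path-matcher'],
  'path-matcher': [],
  'id-generator': [],
};

// Walk the tree recursively, recording each package the first time we meet it.
function collectDependencies(name, seen = new Set()) {
  for (const dep of registry[name] ?? []) {
    if (!seen.has(dep)) {
      seen.add(dep);
      collectDependencies(dep, seen); // dependencies of dependencies
    }
  }
  return seen;
}

const installed = collectDependencies('my-app');
console.log(`Direct dependencies: ${registry['my-app'].length}`); // 2
console.log(`Packages that end up installed: ${installed.size}`); // 5
```

Two direct dependencies turned into five installed packages, and this toy tree is only three levels deep.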
This is why one of the first things you have to do before you run a Node app is run `npm install`. That is the command for ‘look in package.json, in the dependencies and devDependencies sections, and install every single dependency I need’. Which leads us to…
Package.json
This file is the control system of your Node dependencies. It contains the name of your Node app, its version, a description of the app, and the packages it depends on (or simply put: its dependencies).
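To make that concrete, here is a sketch of a manifest written as the JavaScript object you would get back from running `JSON.parse` on a package.json file. The app name, description, and version ranges are invented for illustration, and packagesToInstall is a hypothetical helper that mimics, very loosely, which entries a plain `npm install` looks at.

```javascript
// A made-up manifest, shaped like a parsed package.json.
const manifest = {
  name: 'zodiac-server',
  version: '1.0.0',
  description: 'Responds to a birthday with a zodiac reading',
  dependencies: { uuid: '^9.0.0' },      // needed when the app runs
  devDependencies: { eslint: '^8.0.0' }, // needed only while developing
};

// Hypothetical helper: merge both sections into one name -> range map.
function packagesToInstall(pkg) {
  return { ...pkg.dependencies, ...pkg.devDependencies };
}

// What the file itself would look like on disk.
console.log(JSON.stringify(manifest, null, 2));

// The package names an install would go and fetch.
console.log(Object.keys(packagesToInstall(manifest))); // [ 'uuid', 'eslint' ]
```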
Package-lock.json
Dependencies are tricky things. Since we’re in recursive, composable territory, packages depend on packages, which depend on packages, which depend on packages. It’s actually now a sprawling, convoluted beast that can break if some package buried deep within the bowels of your node_modules folder stops working the way it used to.
And, spoiler alert: packages change all the time. Sometimes they ship breaking changes.
That’s why package-lock.json exists. It’s a file that keeps a manifest of all the packages your app depends on, and locks their exact versions so that even if the package maintainers update them in a breaking way, that instability won’t cascade into your app. If you delete your package-lock.json file, the next install will more or less get the latest versions of your dependencies’ dependencies that their declared version ranges allow.
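Here is a toy sketch of what ‘locking’ buys you. It is not npm's real resolution algorithm. The list of published versions is invented, and satisfiesCaret implements only a simplified version of the caret rule (same major version, at least the base version), ignoring 0.x releases and pre-release tags.

```javascript
// Hypothetical versions of some package that have been published so far.
const published = ['1.2.0', '1.3.0', '1.4.1', '2.0.0'];

const parse = (version) => version.split('.').map(Number);

// Compare two versions numerically: negative, zero, or positive.
function compare(a, b) {
  const [pa, pb] = [parse(a), parse(b)];
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// Simplified caret rule (an assumption, not the full semver spec).
function satisfiesCaret(version, range) {
  const base = range.slice(1); // drop the leading '^'
  return parse(version)[0] === parse(base)[0] && compare(version, base) >= 0;
}

// With a locked version, use it as-is; without one, take the newest match.
function resolve(range, lockedVersion) {
  if (lockedVersion) return lockedVersion;
  const candidates = published.filter((v) => satisfiesCaret(v, range)).sort(compare);
  return candidates[candidates.length - 1];
}

console.log(resolve('^1.2.0', '1.2.0')); // 1.2.0 (pinned by the lock file)
console.log(resolve('^1.2.0'));          // 1.4.1 (newest version in range)
```

Without the lock, the same package.json can produce a different node_modules folder depending on the day you install. With it, everyone gets the same versions.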
These are the basics. A lot more can be said, and a lot has: for example, Dan Abramov’s article on the broken design of npm audit is a doozy. However, this is running long, so consider this newsletter a soft introduction to some core Node.js ideas and concepts.
We’ll dive deeper in subsequent editions.