Asynchronous flow control
The material in this post is heavily inspired by Mixu's Node.js Book.
At its core, JavaScript is designed to be non-blocking on the "main" thread, this is where views are rendered. You can imagine the importance of this in the browser. When the main thread becomes blocked it results in the infamous "freezing" that end users dread, and no other events can be dispatched resulting in the loss of data acquisition, for example.
This creates some unique constraints that only a functional style of programming can cure. This is where callbacks come in to the picture.
However, callbacks can become challenging to handle in more complicated procedures. This often results in "callback hell" where multiple nested functions with callbacks make the code more challenging to read, debug, organize, etc.
async1(function (input, result1) {
async2(function (result2) {
async3(function (result3) {
async4(function (result4) {
async5(function (output) {
// do something with output
});
});
});
});
});
Of course, in real life there would most likely be additional lines of code to handle result1
, result2
, etc., thus, the length and complexity of this issue usually results in code that looks much more messy than the example above.
This is where functions come in to great use. More complex operations are made up of many functions:
- initiator style / input
- middleware
- terminator
The "initiator style / input" is the first function in the sequence. This function will accept the original input, if any, for the operation. The operation is an executable series of functions, and the original input will primarily be:
- variables in a global environment
- direct invocation with or without arguments
- values obtained by file system or network requests
Network requests can be incoming requests initiated by a foreign network, by another application on the same network, or by the app itself on the same or foreign network.
A middleware function will return another function, and a terminator function will invoke the callback. The following illustrates the flow to network or file system requests. Here the latency is 0 because all these values are available in memory.
function final(someInput, callback) {
callback(`${someInput} and terminated by executing callback `);
}
function middleware(someInput, callback) {
return final(`${someInput} touched by middleware `, callback);
}
function initiate() {
const someInput = 'hello this is a function ';
middleware(someInput, function (result) {
console.log(result);
// requires callback to `return` result
});
}
initiate();
State management
Functions may or may not be state dependent. State dependency arises when the input or other variable of a function relies on an outside function.
In this way there are two primary strategies for state management:
- passing in variables directly to a function, and
- acquiring a variable value from a cache, session, file, database, network, or other outside source.
Note, I did not mention global variable. Managing state with global variables is often a sloppy anti-pattern that makes it difficult or impossible to guarantee state. Global variables in complex programs should be avoided when possible.
Control flow
If an object is available in memory, iteration is possible, and there will not be a change to control flow:
function getSong() {
let _song = '';
let i = 100;
for (i; i > 0; i -= 1) {
_song += `${i} beers on the wall, you take one down and pass it around, ${
i - 1
} bottles of beer on the wall\n`;
if (i === 1) {
_song += "Hey let's get some more beer";
}
}
return _song;
}
function singSong(_song) {
if (!_song) throw new Error("song is '' empty, FEED ME A SONG!");
console.log(_song);
}
const song = getSong();
// this will work
singSong(song);
However, if the data exists outside of memory the iteration will no longer work:
function getSong() {
let _song = '';
let i = 100;
for (i; i > 0; i -= 1) {
/* eslint-disable no-loop-func */
setTimeout(function () {
_song += `${i} beers on the wall, you take one down and pass it around, ${
i - 1
} bottles of beer on the wall\n`;
if (i === 1) {
_song += "Hey let's get some more beer";
}
}, 0);
/* eslint-enable no-loop-func */
}
return _song;
}
function singSong(_song) {
if (!_song) throw new Error("song is '' empty, FEED ME A SONG!");
console.log(_song);
}
const song = getSong('beer');
// this will not work
singSong(song);
// Uncaught Error: song is '' empty, FEED ME A SONG!
Why did this happen? setTimeout
instructs the CPU to store the instructions elsewhere on the bus, and instructs that the data is scheduled for pickup at a later time. Thousands of CPU cycles pass before the function hits again at the 0 millisecond mark, the CPU fetches the instructions from the bus and executes them. The only problem is that song ('') was returned thousands of cycles prior.
The same situation arises in dealing with file systems and network requests. The main thread simply cannot be blocked for an indeterminate period of time-- therefore, we use callbacks to schedule the execution of code in time in a controlled manner.
You will be able to perform almost all of your operations with the following 3 patterns:
- In series: functions will be executed in a strict sequential order, this one is most similar to
for
loops.
// operations defined elsewhere and ready to execute
const operations = [
{ func: function1, args: args1 },
{ func: function2, args: args2 },
{ func: function3, args: args3 },
];
function executeFunctionWithArgs(operation, callback) {
// executes function
const { args, func } = operation;
func(args, callback);
}
function serialProcedure(operation) {
if (!operation) process.exit(0); // finished
executeFunctionWithArgs(operation, function (result) {
// continue AFTER callback
serialProcedure(operations.shift());
});
}
serialProcedure(operations.shift());
- Full parallel: when ordering is not an issue, such as emailing a list of 1,000,000 email recipients.
let count = 0;
let success = 0;
const failed = [];
const recipients = [
{ name: 'Bart', email: 'bart@tld' },
{ name: 'Marge', email: 'marge@tld' },
{ name: 'Homer', email: 'homer@tld' },
{ name: 'Lisa', email: 'lisa@tld' },
{ name: 'Maggie', email: 'maggie@tld' },
];
function dispatch(recipient, callback) {
// `sendEmail` is a hypothetical SMTP client
sendMail(
{
subject: 'Dinner tonight',
message: 'We have lots of cabbage on the plate. You coming?',
smtp: recipient.email,
},
callback
);
}
function final(result) {
console.log(`Result: ${result.count} attempts \
& ${result.success} succeeded emails`);
if (result.failed.length)
console.log(`Failed to send to: \
\n${result.failed.join('\n')}\n`);
}
recipients.forEach(function (recipient) {
dispatch(recipient, function (err) {
if (!err) {
success += 1;
} else {
failed.push(recipient.name);
}
count += 1;
if (count === recipients.length) {
final({
count,
success,
failed,
});
}
});
});
- Limited parallel: parallel with limit, such as successfully emailing 1,000,000 recipients from a list of 10 million users.
let successCount = 0;
function final() {
console.log(`dispatched ${successCount} emails`);
console.log('finished');
}
function dispatch(recipient, callback) {
// `sendEmail` is a hypothetical SMTP client
sendMail(
{
subject: 'Dinner tonight',
message: 'We have lots of cabbage on the plate. You coming?',
smtp: recipient.email,
},
callback
);
}
function sendOneMillionEmailsOnly() {
getListOfTenMillionGreatEmails(function (err, bigList) {
if (err) throw err;
function serial(recipient) {
if (!recipient || successCount >= 1000000) return final();
dispatch(recipient, function (_err) {
if (!_err) successCount += 1;
serial(bigList.pop());
});
}
serial(bigList.pop());
});
}
sendOneMillionEmailsOnly();
Each has its own use cases, benefits, and issues you can experiment and read about in more detail. Most importantly, remember to modularize your operations and use callbacks! If you feel any doubt, treat everything as if it were middleware!