Running Node.js modules from the command line

A Node.js module is typically intended to be imported and called from other modules of your application. Sometimes, however, it is practical to call the module's functionality directly from the command line. For example, a Repository that offers methods for manipulating data can be used from the command line to clean the database while you are testing your application. Or a Parser, normally used internally by the app, can read a file, parse it and write the result to the console or to a file.

This article describes how to amend a module so that it also serves as a command-line tool you can use during development. We show how to deal with command-line parameters and how to play well with other CLI tools you might be using.

Check: imported or called?

The first thing is to distinguish whether the code is being imported by using require() or called directly from the command line like $ node ./script.js.

if (require.main === module)
    run();

The function run() will be called only when the module has been executed directly from the command line. An explanation can be found in the Node.js documentation.

Command line arguments

The globally accessible variable process.argv contains an array of the arguments that were used when calling the script. For example, if you run $ node script.js -ab -c=1 --long 1 2 3, process.argv will contain an array like ['node', 'c:\temp\script.js', '-ab', '-c=1', '--long', '1', '2', '3']. The first element refers to the node executable (current Node.js versions store its absolute path there) and the second is the path of the script. The problem is that it is not trivial to find out whether, for example, the switch -b has been used or what the value of the parameter -c was.
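To make the difficulty concrete, here is a rough sketch of hand-written parsing. The helper names userArgs() and hasShortSwitch() are hypothetical, and the sample array is hard-coded instead of being read from process.argv so that the result is predictable.

```javascript
'use strict';

// Drop the first two entries (node executable and script path).
function userArgs(argv) {
    return argv.slice(2);
}

// Naive check for a single-letter switch such as -b, also inside a group like -ab.
// It deliberately ignores forms such as -c=1 or --long, which is exactly
// where hand-written parsing starts to grow.
function hasShortSwitch(args, letter) {
    return args.some(function (arg) {
        return /^-[A-Za-z]+$/.test(arg) && arg.indexOf(letter) > 0;
    });
}

var sample = ['node', 'script.js', '-ab', '-c=1', '--long', '1', '2', '3'];
var args = userArgs(sample);

console.log(hasShortSwitch(args, 'b')); // true
console.log(hasShortSwitch(args, 'c')); // false (the -c=1 form is not handled)
```

Even this small helper already misses values, long options and positional parameters, which is the gap an argument-parsing library fills.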

Even for the simplest cases it is worth using a dedicated library for argument parsing. One of the established ones is yargs, the official successor of the now abandoned optimist that was loved by the Node community. Yargs parses the arguments and transforms them into an easily usable data structure. It supports short/long parameter aliases (-v, --verbose), grouping of switches (-xzvf) and positional parameters (copy file.txt ./there). For a description of all features see the project page on GitHub. Here is just a short overview to give you a feel for it.

var argv = require('yargs').argv;  
console.log('%d + %d = %d', argv.x, argv.y, argv.x+argv.y);  
---
$ node add.js -x 10 -y 21
10 + 21 = 31

var argv = require('yargs').argv;
if(argv.x) console.log('Extract');  
if(argv.z) console.log('Decompress');  
if(argv.v) console.log('Verbose');  
if(argv.f) console.log('File "%s"', argv.f);  
---
$ node tar.js -xzvf file.tar.gz
Extract  
Decompress  
Verbose  
File "file.tar.gz"  

You can declaratively define the accepted arguments and yargs will automatically compose the usage information.

var argv = require('yargs')  
    .usage('Extract or unpack a tarball file')
    .example('$0 -xzvf file.tar.gz', 'Extracts a gzip file')
    .alias('x', 'extract').describe('x', 'Extract a tar ball')
    .alias('z', 'ungzip').describe('z', 'Decompress a gzip file')
    .alias('v', 'verbose').describe('v', 'Verbose output')
    .required('f', 'File must be provided').alias('f', 'file').describe('f', 'File to be extracted')
    .help('h').alias('h', 'help')
    .argv;
---
$ node tar.js --help
Extract or unpack a tarball file

Examples:  
  node c:\temp\test\tar.js -xzvf file.tar.gz    Extracts a gzip file

Options:  
  -x, --extract  Extract a tar ball
  -z, --ungzip   Decompress a gzip file
  -v, --verbose  Verbose output
  -f, --file     File to be extracted    [required]
  -h, --help     Show help

In most cases it is not necessary to check the input parameters yourself because yargs does the validation automatically. Calling strict() makes yargs reject any unknown parameters, and arguments declared with required() become mandatory. In the example above the call would fail if the user didn't provide the -f parameter.

The validation and the whole argument parsing make sense only when the module is used as a script. Therefore require('yargs') should not sit at the top of the file with the other imports; it belongs inside the run() function.

Return value

It is a common convention that a script returns exit code 0 in case of success and a non-zero code in case of an error. Because of its asynchronous nature, Node.js doesn't automatically propagate the program's last return value as an exit code. Unless the execution ends with an exception, Node returns exit code 0. To set an error code, the method process.exit([code]) must be called.

if(success)  
    process.exit() //exit code 0
else  
    process.exit(1)
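Note that process.exit() ends the process immediately, so output that is still being written asynchronously may be cut off. An alternative, sketched below, is to set the process.exitCode property and let Node.js exit on its own once pending work is done. The helper name finish() is hypothetical and only stands for the place where your script learns its outcome.

```javascript
'use strict';

// Hypothetical helper: record the outcome without terminating the process.
function finish(err) {
    if (err) {
        console.error(err.message);
        process.exitCode = 1; // Node.js uses this value when it exits naturally
    }
    // On success nothing is set, so the exit code stays 0.
}

// Successful run: the exit code is left untouched.
finish(null);
```

Passing an Error to finish() prints its message to the standard error output and makes the script end with exit code 1 after all remaining work has completed.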

Working with pipes

If you need to pass the script output to another program using shell pipes, all you need to do is write the output to process.stdout. You can simply use console.log() because this method just formats the output and writes it to the standard output.

If you need to read the input data from a pipe, read it from process.stdin. The following example uses concat-stream to read the whole input at once:

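var concat = require('concat-stream');

// isTTY is truthy when stdin is an interactive terminal, i.e. nothing is piped in
if (!process.stdin.isTTY) {
    process.stdin.setEncoding('utf8');
    process.stdin.pipe(concat(function (input) {
        // handle() stands for your own function; do not name it "process",
        // that would clash with the global process object
        handle(input);
    }));
} else {
    handle('something else');
}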
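If you prefer not to add a dependency, the same idea can be sketched with plain stream events. This is an assumption-laden illustration rather than a replacement for concat-stream: readAll() is a hypothetical helper name, and it works with any readable stream that carries text, process.stdin included.

```javascript
'use strict';

// Collect everything a readable stream delivers and hand it over as one string.
// readAll() is a hypothetical helper name, not part of Node.js.
function readAll(stream, callback) {
    var chunks = [];
    stream.setEncoding('utf8'); // receive strings instead of Buffers
    stream.on('data', function (chunk) {
        chunks.push(chunk);
    });
    stream.on('error', callback);
    stream.on('end', function () {
        callback(null, chunks.join(''));
    });
}

// Intended use in a script (not executed here):
//   if (!process.stdin.isTTY) {
//       readAll(process.stdin, function (err, input) { /* work with input */ });
//   }
```

The callback follows the usual Node.js convention of (err, result), so it can be combined with the exit code handling described earlier.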

Example

The following code is written in the CoffeeScript language. Let's assume the name of the script is parse.coffee.

parse = (text, callback) ->  
    result = { content: text, length: text.length }
    callback(null, result)

run = () ->  
    fs = require('fs')
    argv = require('yargs')
        .required('f').alias('f', 'file')
        .argv
    fs.readFile argv.file, 'ASCII', (err, content) ->
        if(err?) then process.exit(1)
        parse content, (err, result) ->
            if(err?) then process.exit(2)
            console.log(result)

module.exports = parse  
if(require.main == module)  
    run()

Calling the parse() from other code:

parse = require('./parse')
parse 'content', (err, result) ->  
    #...

Using it from the command line:

$ coffee parse.coffee -f file.txt