$ cnpm install minitask
A standard/convention for running tasks over a list of files based around Node core streams2.
minitask is a library for processing tasks on files. It is used in many of my projects, such as gluejs and generate-markdown.
Most file processing tasks can be divided into three phases, and minitask provides tools for each phase:
1. Directory iteration: selecting a set of files to operate on, using the List class
2. Task definition:
   - defining operations on files using the Task class
   - making use of cached results using the Cache class
3. Task execution:
   - executing operations in parallel using the Runner class
   - storing cached results using the Cache class
Separating these into distinct phases has several advantages. The main advantage is that each of these operations can be written independently of the other two: e.g. no task definition during iteration and no execution parallelism concerns during task definition.
Further, separating task definition from execution allows for much greater execution parallelism compared to a naive sequential stream processing implementation. This means faster builds.
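To make the parallelism point concrete, here is a minimal sketch in plain JavaScript. It is not minitask's Runner class: `runParallel` and its `limit` parameter are hypothetical names chosen for illustration. The idea is that tasks are first defined as plain functions taking a callback, and only later executed with a bounded number in flight at once.

```javascript
// Hypothetical helper, not part of minitask: runs queued tasks with at most
// `limit` of them in flight at the same time (assumes limit >= 1).
function runParallel(tasks, limit, onDone) {
  var results = new Array(tasks.length);
  var next = 0;      // index of the next task to start
  var running = 0;   // number of tasks currently in flight
  var finished = 0;  // number of tasks that have completed
  var failed = false;

  if (tasks.length === 0) { return onDone(null, results); }

  function startMore() {
    while (!failed && running < limit && next < tasks.length) {
      launch(next++);
    }
  }

  function launch(index) {
    running++;
    tasks[index](function(err, value) {
      if (failed) { return; }
      if (err) { failed = true; return onDone(err); }
      running--;
      finished++;
      results[index] = value; // keep results in definition order
      if (finished === tasks.length) { return onDone(null, results); }
      startMore();
    });
  }

  startMore();
}

// Task definition: build the tasks without running anything yet.
var tasks = ['a.js', 'b.js', 'c.js'].map(function(name) {
  return function(done) {
    setImmediate(function() { done(null, name.toUpperCase()); });
  };
});

// Task execution: run them, two at a time.
runParallel(tasks, 2, function(err, results) {
  if (err) { throw err; }
  console.log(results);
});
```

Because the task functions know nothing about how they will be scheduled, the concurrency limit can be changed without touching any task definition.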
The List API essentially consists of:
- an add function, which adds path targets
- exclude and find, which select files
- an exec function, which performs the actual traversal
A few notes:
- [...] [].filter on the result)
- There are separate add and exec functions because this allows the same List object to be run multiple times against a changing directory structure, which is nice if you are running the same operations multiple times (e.g. in a server).
The list API is documented in docs/list.md.
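The benefit of keeping add and exec separate can be sketched with Node core modules only. The `FileList` class below is a hypothetical stand-in written for this illustration; the real List API is the one described in docs/list.md, and its signatures may differ.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical stand-in for a List-like object, not minitask's List class.
function FileList() {
  this.targets = [];   // paths registered via add()
  this.excludes = [];  // predicates registered via exclude()
}

// Register a file or directory to traverse later. Nothing is read yet.
FileList.prototype.add = function(target) {
  this.targets.push(path.resolve(target));
  return this;
};

// Register a predicate; paths for which it returns true are skipped.
FileList.prototype.exclude = function(predicate) {
  this.excludes.push(predicate);
  return this;
};

// Walk the registered targets now and return the matching file paths.
FileList.prototype.exec = function() {
  var self = this;
  var files = [];
  function visit(p) {
    if (self.excludes.some(function(fn) { return fn(p); })) { return; }
    var stat = fs.statSync(p);
    if (stat.isDirectory()) {
      fs.readdirSync(p).sort().forEach(function(name) {
        visit(path.join(p, name));
      });
    } else if (stat.isFile()) {
      files.push(p);
    }
  }
  this.targets.forEach(visit);
  return files;
};
```

In this sketch the same object can be configured once and then re-run whenever the directory may have changed, for example on each request in a server. Exclusions are checked during the walk, so an excluded directory is never read at all.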
The Task API provides a way to express a set of transformations using an array of [...] without having to worry about the details of how these things are connected. Node's duplex streams are a bit tedious for simple transforms, and Node's child_process returns something that's not quite a duplex stream. The Task API works around those limitations by providing some plumbing, and returns a queueable task object that can be run later.
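The kind of plumbing involved can be sketched with Node core only. This is not minitask's Task implementation: `fromSync` and `connect` are hypothetical helpers invented for this example, and error handling is omitted. The sketch wraps a plain function as a Transform stream, and treats a child process as a step by writing to its stdin and reading from its stdout.

```javascript
var stream = require('stream');
var spawn = require('child_process').spawn;

// Hypothetical helper: wrap a synchronous string -> string function as a
// Transform stream that buffers all input and applies the function at the end.
function fromSync(fn) {
  var chunks = [];
  return new stream.Transform({
    transform: function(chunk, encoding, done) {
      chunks.push(chunk);
      done();
    },
    flush: function(done) {
      this.push(fn(Buffer.concat(chunks).toString()));
      done();
    }
  });
}

// Hypothetical helper: connect the steps in order. Streams are piped
// directly; a child process is fed via stdin and read back via stdout.
function connect(input, steps) {
  return steps.reduce(function(readable, step) {
    if (step.stdin && step.stdout) { // looks like a child process
      readable.pipe(step.stdin);
      return step.stdout;
    }
    return readable.pipe(step);
  }, input);
}

var source = new stream.PassThrough();
var output = connect(source, [
  fromSync(function(text) { return text.toUpperCase(); }),
  // A child process that simply echoes stdin to stdout.
  spawn(process.execPath, ['-e', 'process.stdin.pipe(process.stdout)']),
  fromSync(function(text) { return text + '!'; })
]);

var result = '';
output.on('data', function(chunk) { result += chunk; });
output.on('end', function() { console.log(result); });
source.end('hello');
```

The caller only supplies an ordered array of steps; the differences between a function, a stream, and a child process are handled in one place.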
A few notes:
The task API is documented in docs/task.md.
Tasks are often run multiple times without the underlying file changing, which means we can skip the work and use a cached version. The cache API handles:
The cache API supports storing result files and file metadata in a way that ensures that if the underlying file changes, the related cached data is invalidated. The input file can be checked using size + date modified, or by running a hash algorithm such as md5 on the file.
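The invalidation rule described above can be illustrated with a small in-memory sketch. `SimpleCache` and `fingerprint` are hypothetical names, not minitask's Cache class, and a real cache would persist results to disk. The point is only that a cached value is returned while the file's fingerprint (size plus modification time, or an md5 hash) is unchanged.

```javascript
var fs = require('fs');
var crypto = require('crypto');

// Fingerprint a file either cheaply (size + modification time) or
// thoroughly (md5 of the contents).
function fingerprint(file, method) {
  if (method === 'md5') {
    return crypto.createHash('md5').update(fs.readFileSync(file)).digest('hex');
  }
  var stat = fs.statSync(file);
  return stat.size + '-' + stat.mtime.getTime();
}

// Hypothetical in-memory cache, for illustration only.
function SimpleCache(method) {
  this.method = method || 'stat';
  this.entries = {}; // file path -> { fingerprint, value }
}

// Store a value together with the file's current fingerprint.
SimpleCache.prototype.set = function(file, value) {
  this.entries[file] = {
    fingerprint: fingerprint(file, this.method),
    value: value
  };
};

// Return the cached value, or undefined if there is no entry or the
// underlying file has changed since the value was stored.
SimpleCache.prototype.get = function(file) {
  var entry = this.entries[file];
  if (entry && entry.fingerprint === fingerprint(file, this.method)) {
    return entry.value;
  }
  delete this.entries[file];
  return undefined;
};
```

The size and date check avoids reading the file, while the md5 check still works when timestamps are unreliable, at the cost of reading the whole file.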
A few notes:
The cache API is documented in docs/cache.md.
The runner API is documented in docs/runner.md.