# FilterBase
All provided filters use `FilterBase` as their foundation. It is a base class meant to be extended, which provides common facilities used by filters.

`FilterBase` is based on Transform. It operates in object mode, editing token streams produced by a parser or another filter.
This document describes the user-facing interface only. If you want to build your own filter, feel free to inspect the code to gain more insights.

Internally, `FilterBase` keeps track of objects by building a stack. Items of the stack can be:
- Number. In this case, a corresponding object is an array, and the number is the current index.
- String. In this case, a corresponding object is an object, and the string is the current property key.
- `null`. In this case, the corresponding object is an object, but keys are not tracked. `FilterBase` keeps track of keys only if the previous stream returns packed keys. When keys are not tracked, a filter assumes that only the object's shape will be used for filtering.

The stack is used to perform the filtering.
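
As an illustration (not actual library code), here is a sketch of how the stack mirrors a position in a streamed document; the document and the stack values shown are made up for this example:

```js
// Illustrative only: how the stack corresponds to the current position.
const doc = {a: [10, 20], b: {c: true}};

// While processing 10:   the stack is ['a', 0]   (property key, then array index)
// While processing 20:   the stack is ['a', 1]
// While processing true: the stack is ['b', 'c']

// If the upstream stream does not return packed keys, keys are not tracked:
// While processing true: the stack is [null, null]
```
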
`options` is an optional object described in detail in Node.js' Stream documentation. Additionally, the following optional custom properties are recognized:
- `pathSeparator` is a string that separates stack values when the stack is converted to a string. The algorithm is straightforward: `stack.join(pathSeparator)`. The default: `'.'`.

  ```js
  const obj = [{a: 1}, {b: 2}];
  // stack when filtering 1: [0, 'a']
  // converted to a string: '0.a'
  // stack when filtering 2: [1, 'b']
  // converted to a string: '1.b'
  ```

- `filter` is a way to accept or reject a data item. The interpretation of its returned value is up to concrete filter objects. Its value can be one of the following types:
  - String. The stack is converted to a string using `pathSeparator`, then it should be equal to the `filter` value, or it should be longer and the `filter` value should end on a boundary of the `pathSeparator` value.

    ```js
    const obj = {a: [1, 2], ab: null};
    const filter = 'a';
    // it fits ['a'], ['a', 0], and ['a', 1], but not ['ab']
    ```

  - RegExp. The stack is converted to a string using `pathSeparator`, then the filter is applied using `filter.test(path)`.

    ```js
    const obj = {a: [1, 2], ab: null};

    const filter = /^a\b/;
    // it fits ['a'], ['a', 0], and ['a', 1], but not ['ab']

    const filter2 = /^a/;
    // it fits ['a'], ['a', 0], ['a', 1], and ['ab']
    ```

  - Function. The filter is applied using `filter(stack, chunk)`, where `chunk` is the data item being filtered. The function is called in the context of the current filter object. It should return a truthy/falsy value (a sketch of a function filter follows this list).
  - The default: `() => true`.
- `once` is a flag. When it is truthy, a filter object will make a selection (depending on its definition of a selection) only once. Otherwise, all selections are included. The default: `false`.
  - It can be used as an optimization when we know that our stream contains exactly one object we want to act on.
- `replacement` is what should be used instead of skipped objects. Not all filters use this option. Its value can be one of the following types:
  - Function. The replacement is produced by calling `replacement(stack, chunk)`, where `chunk` is the data item being filtered. The function is called in the context of the current filter object. It should return an array of semantically valid data items.
  - Otherwise, it is assumed to be a static array of semantically valid data items.
  - The default: a `null` data item, i.e., `[{name: 'nullValue', value: null}]`.
- `allowEmptyReplacement` is a flag. It explicitly allows or disallows replacing removed values with an empty array.
  - The problem is that when streaming an object, a key will already have been streamed by the time a filter decides to remove the corresponding value (replace it with an empty array), which produces an invalid JSON stream. In order to avoid that, when `allowEmptyReplacement` is falsy, a filter checks the length of a replacement array and replaces it with the default (usually the `null` data item) if it is empty.
  - If a source stream packs keys, the problem can be avoided by delaying streaming keys. When `allowEmptyReplacement` is `true`, a filter will use this algorithm to stream keys.
- Streaming flags. They are used only when a filter streams delayed keys (`allowEmptyReplacement` is `true`). Both of them are here for compatibility with `Parser`. See details in Parser's options.
  - `streamValues` is assigned first.
  - `streamKeys` is assigned next. When its effective value is falsy, no `startKey`, `stringChunk`, nor `endKey` tokens are produced; only `keyValue` is issued.
  - The default: `true`.
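
To tie the options above together, here is a hedged sketch of a function filter combined with `once`; the selection logic and the property name `data` are made up for illustration, `Filter` is the concrete filter used in the example further below, and its import is omitted just as it is in that example:

```js
// Illustrative sketch: select only data items under the top-level 'data' key
// and stop after the first selection. The predicate receives the current stack
// and the data item (chunk) being considered.
const dataOnly = Filter({
  filter: (stack, chunk) => stack[0] === 'data',
  once: true
});
```
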
When using a string or a regular expression as a `filter` value, the stack is converted to a path string before the filter can be applied. It should be noted that when a source stream does not produce `keyValue` data items, the stack uses `null` to denote an undefined property key, which is converted to an empty string in the path:

```js
[].join('.');                             // produces: ''
[null].join('.');                         // produces: ''
[null, null].join('.');                   // produces: '.'
[null, 1, null].join('.');                // produces: '.1.'
[1, null, null, null, 2, null].join('.'); // produces: '1....2.'
```

Be aware of this behavior when crafting filters.
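
For instance, a purely illustrative pattern that accounts for an untracked (empty) leading segment could look like this; the pattern and the targeted index are assumptions for the sake of the example:

```js
// Illustrative: when keys are not tracked, object levels appear as empty
// path segments, so a pattern may need to allow them explicitly.
// The stack [null, 1] produces the path '.1', not '1'.
const filter = /^(?:[^.]*\.)?1\b/; // matches index 1 at the first or second level
```
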
Property keys can be arbitrary strings. Sometimes they can mess up paths and textual filters. In order to avoid that, you can choose a different `pathSeparator`. It can be any string you like; just make sure it works with your filters.

```js
const filter = Filter({pathSeparator: '->'});
// it will produce paths like these:
// [1, 'a'] => '1->a'
// [1, 0, 'ab', 0] => '1->0->ab->0'
```

Filters do not check whether an array of replacement items is valid. Malformed arrays will produce invalid substreams, which can break the rest of the data pipeline. Be extra careful with the `replacement` and `allowEmptyReplacement` options.
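
For reference, a valid replacement is an array of data items like the one shown for the `replacement` option above; the sketch below is illustrative, and the `stringValue` token name is an assumption about the parser's output rather than something defined in this document:

```js
// Illustrative sketch: replacement values must be arrays of semantically valid
// data items (tokens). Verify token names against your actual token stream.

// A static replacement: every removed value becomes a fixed string.
const staticReplacement = [{name: 'stringValue', value: 'REDACTED'}];

// A function replacement: it receives the stack and the current data item
// and must return an array of semantically valid data items.
const dynamicReplacement = (stack, chunk) => [
  {name: 'stringValue', value: 'removed at ' + stack.join('.')}
];
```
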