ARROW-1652: [JS] housekeeping, vector cleanup
This PR addresses the first few issues in the [JS roadmap doc](https://docs.google.com/document/d/142dek89oM2TVI2Yql106Zo8IB1Ff_9zDg_EG6jPWS0M) I sent out a week or so ago. Sorry for the big PR; the housekeeping and vector cleanup work were pretty co-dependent.

JIRA issues addressed by this PR:
[ARROW-1032](https://issues.apache.org/jira/browse/ARROW-1032) - Support custom_metadata
[ARROW-1651](https://issues.apache.org/jira/browse/ARROW-1651) - Lazy row accessor in Table
[ARROW-1652](https://issues.apache.org/jira/browse/ARROW-1652) - Separate Vector into BatchVector and CompositeVector

Tasks from the roadmap (some not in JIRA):

##### Housekeeping
1. Enable the strict-mode tsc compiler settings in the build
2. Compile mjs files for node 8.x ESModules
3. Compile ES6 UMD target with native iterators/generators

##### Vector
1. Refactor Vector types to primitive forms representing the portion of a column in a single RecordBatch
2. Add Column Vector that represents primitive Vectors across RecordBatches as an entire column
3. Refactor linear column-to-batch-index lookup in `Vector.get(i)`
4. Simplify inheritance hierarchy/generic types with Traits (e.g. Nullable, Iterable, and Typed numeric variants)
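
As a rough illustration of item 3 above, one common way to replace a linear scan over batches is a binary search over cumulative batch offsets. The sketch below is not taken from this PR; `buildOffsets` and `findBatch` are hypothetical names, and plain batch lengths stand in for real RecordBatch vectors.

```javascript
// Hypothetical sketch, not the Arrow JS implementation.
// offsets[k] is the global index of the first element of batch k;
// the last entry is the total column length.
function buildOffsets(batchLengths) {
    const offsets = new Uint32Array(batchLengths.length + 1);
    for (let i = 0; i < batchLengths.length; ++i) {
        offsets[i + 1] = offsets[i] + batchLengths[i];
    }
    return offsets;
}

// Map a global column index to { batch, localIndex } in O(log n).
function findBatch(offsets, index) {
    if (index < 0 || index >= offsets[offsets.length - 1]) {
        return null; // out of range
    }
    let lo = 0, hi = offsets.length - 1;
    // Invariant: offsets[lo] <= index < offsets[hi]
    while (hi - lo > 1) {
        const mid = (lo + hi) >>> 1;
        if (offsets[mid] <= index) { lo = mid; } else { hi = mid; }
    }
    return { batch: lo, localIndex: index - offsets[lo] };
}
```

Because the search keeps the right-most batch whose offset is at or below the index, empty batches are skipped naturally.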

##### Table
1. Implement lazy row accessor
2. Share API/row logic with StructVector
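
To make the idea of a lazy row accessor concrete, here is a minimal sketch under stated assumptions: `LazyRow` and `TinyTable` are hypothetical names, and plain arrays stand in for column vectors. It is not the Row type added in this PR; it only shows the general pattern of a row that stores a column list and a row index and reads values on demand instead of copying them.

```javascript
// Hypothetical sketch, not the Arrow JS Row/Table implementation.
class LazyRow {
    constructor(columns, rowIndex) {
        this.columns = columns;   // shared reference, nothing is copied
        this.rowIndex = rowIndex;
    }
    // Read a single cell only when it is asked for.
    get(columnIndex) {
        return this.columns[columnIndex][this.rowIndex];
    }
    // Materialize the whole row only on explicit request.
    toArray() {
        return this.columns.map((column) => column[this.rowIndex]);
    }
}

class TinyTable {
    constructor(columns) {
        this.columns = columns;
        this.length = columns.length ? columns[0].length : 0;
    }
    // Takes a numeric row index and returns a lightweight row view.
    get(rowIndex) {
        return new LazyRow(this.columns, rowIndex);
    }
}

const table = new TinyTable([[1, 2, 3], ['a', 'b', 'c']]);
const row = table.get(1);
```

Creating `row` costs one small object regardless of the number of columns; the per-cell work happens in `row.get(columnIndex)`.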

cc: @wesm @TheNeuralBit

Author: Paul Taylor <[email protected]>

Closes apache#1273 from trxcllnt/vector-cleanup and squashes the following commits:

c53d6de [Paul Taylor] refactor: rename vector mixins
2c83c82 [Paul Taylor] update to [email protected]
48c6ca4 [Paul Taylor] refactor: StructVector/Table#get always take numeric index, Table extends StructVector
18671ed [Paul Taylor] fix lint
04e9941 [Paul Taylor] refactor: use new compilation targets in perf tests
bd7a837 [Paul Taylor] refactor: update test's Arrow imports for new types
37b7f61 [Paul Taylor] refactor: update vector tests for new types
15ab8d4 [Paul Taylor] refactor: update table tests for new types
db04a0b [Paul Taylor] refactor: export new Arrow types
84233de [Paul Taylor] refactor reader to use new arrow types, fix strict TS compilation errors
af4845d [Paul Taylor] refactor: add Arrow vector mixins
54fa2fd [Paul Taylor] refactor: break out virtual vector, move to types folder
2121bf1 [Paul Taylor] refactor: break out table, add Row type, move to types folder
abc9331 [Paul Taylor] refactor: move struct to types folder
2a4127c [Paul Taylor] refactor: move dictionary to types folder
607be42 [Paul Taylor] refactor: break out list/fixedsizelist/utf8, move to types folder
b8a6866 [Paul Taylor] refactor: break out Typed vectors, move into types folder
0f8de75 [Paul Taylor] refactor: rename vector folder to types, move vector base class
d2def19 [Paul Taylor] clean up build scripts, add ES2015 UMD and mjs targets
84b2c50 [Paul Taylor] use strict typescript compiler settings
trxcllnt authored and wesm committed Nov 3, 2017
1 parent 0373541 commit 527af63
Showing 55 changed files with 2,195 additions and 1,398 deletions.
9 changes: 2 additions & 7 deletions js/.gitignore
@@ -18,6 +18,7 @@
# Logs
logs
*.log
+.esm-cache
npm-debug.log*
yarn-debug.log*
yarn-error.log*
@@ -57,10 +58,6 @@ build/Release
node_modules/
jspm_packages/

-# Typescript declaration files
-types/
-typings/
-
# Optional npm cache directory
.npm

@@ -85,6 +82,4 @@ package-lock.json

# compilation targets
dist
-targets/es5
-targets/es2015
-targets/esnext
+targets
2 changes: 1 addition & 1 deletion js/closure-compiler-scripts/text-encoding.js
@@ -11,7 +11,7 @@
// Utilities
//

-goog.module("module$text_encoding");
+goog.module("module$text_encoding_utf_8");
goog.module.declareLegacyNamespace();
/**
* @param {number} a The number to test.
36 changes: 36 additions & 0 deletions js/gulp/argv.js
@@ -0,0 +1,36 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

const argv = require(`command-line-args`)([
    { name: `all`, alias: `a`, type: Boolean },
    { name: 'update', alias: 'u', type: Boolean },
    { name: 'verbose', alias: 'v', type: Boolean },
    { name: `target`, type: String, defaultValue: `` },
    { name: `module`, type: String, defaultValue: `` },
    { name: `coverage`, type: Boolean, defaultValue: false },
    { name: `targets`, alias: `t`, type: String, multiple: true, defaultValue: [] },
    { name: `modules`, alias: `m`, type: String, multiple: true, defaultValue: [] }
]);

const { targets, modules } = argv;

argv.target && !targets.length && targets.push(argv.target);
argv.module && !modules.length && modules.push(argv.module);
(argv.all || !targets.length) && targets.push(`all`);
(argv.all || !modules.length) && modules.push(`all`);

module.exports = { argv, targets, modules };
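
The defaulting rules in `argv.js` above can be restated as a small pure function. This is only an illustrative sketch: `resolveList` is a hypothetical helper that does not exist in the build scripts, and it copies the list instead of mutating it as the original does.

```javascript
// Hypothetical restatement of the target/module defaulting in argv.js.
// `all`    : value of the --all flag
// `single` : value of --target (or --module), `` when absent
// `list`   : values of --targets (or --modules), [] when absent
function resolveList(all, single, list) {
    const out = list.slice();
    // A lone --target is used only when no --targets were given.
    if (single && !out.length) { out.push(single); }
    // `all` is appended when --all is set or nothing was selected.
    if (all || !out.length) { out.push(`all`); }
    return out;
}

const nothingSelected = resolveList(false, ``, []);
const singleTarget = resolveList(false, `es5`, []);
const listWins = resolveList(false, `es5`, [`es2015`]);
const allPlusList = resolveList(true, ``, [`es5`]);
```

Note that `--all` appends `all` to any explicitly listed targets instead of replacing them, which matches the two `push` lines in the original.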
57 changes: 57 additions & 0 deletions js/gulp/arrow-task.js
@@ -0,0 +1,57 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

const {
    mainExport, gCCLanguageNames,
    targetDir, observableFromStreams
} = require('./util');

const gulp = require('gulp');
const path = require('path');
const gulpRename = require(`gulp-rename`);
const { memoizeTask } = require('./memoize-task');
const { Observable, ReplaySubject } = require('rxjs');

const arrowTask = ((cache) => memoizeTask(cache, function copyMain(target, format) {
    const out = targetDir(target), srcGlob = `src/**/*`;
    const es5Glob = `${targetDir(`es5`, `cjs`)}/**/*.js`;
    const esmGlob = `${targetDir(`es2015`, `esm`)}/**/*.js`;
    const es5UmdGlob = `${targetDir(`es5`, `umd`)}/**/*.js`;
    const es5UmdMaps = `${targetDir(`es5`, `umd`)}/**/*.map`;
    const es2015UmdGlob = `${targetDir(`es2015`, `umd`)}/**/*.js`;
    const es2015UmdMaps = `${targetDir(`es2015`, `umd`)}/**/*.map`;
    const ch_ext = (ext) => gulpRename((p) => { p.extname = ext; });
    const append = (ap) => gulpRename((p) => { p.basename += ap; });
    return Observable.forkJoin(
        observableFromStreams(gulp.src(srcGlob), gulp.dest(out)), // copy src ts files
        observableFromStreams(gulp.src(es5Glob), gulp.dest(out)), // copy es5 cjs files
        observableFromStreams(gulp.src(esmGlob), ch_ext(`.mjs`), gulp.dest(out)), // copy es2015 esm files and rename to `.mjs`
        observableFromStreams(gulp.src(es5UmdGlob), append(`.es5.min`), gulp.dest(out)), // copy es5 umd files and add `.es5.min`
        observableFromStreams(gulp.src(es5UmdMaps), gulp.dest(out)), // copy es5 umd sourcemap files, but don't rename
        observableFromStreams(gulp.src(es2015UmdGlob), append(`.es2015.min`), gulp.dest(out)), // copy es2015 umd files and add `.es2015.min`
        observableFromStreams(gulp.src(es2015UmdMaps), gulp.dest(out)), // copy es2015 umd sourcemap files, but don't rename
    ).publish(new ReplaySubject()).refCount();
}))({});

const arrowTSTask = ((cache) => memoizeTask(cache, function copyTS(target, format) {
    return observableFromStreams(gulp.src(`src/**/*`), gulp.dest(targetDir(target, format)));
}))({});


module.exports = arrowTask;
module.exports.arrowTask = arrowTask;
module.exports.arrowTSTask = arrowTSTask;
35 changes: 35 additions & 0 deletions js/gulp/build-task.js
@@ -0,0 +1,35 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

const { npmPkgName } = require('./util');
const { memoizeTask } = require('./memoize-task');

const uglifyTask = require('./uglify-task');
const closureTask = require('./closure-task');
const typescriptTask = require('./typescript-task');
const { arrowTask, arrowTSTask } = require('./arrow-task');

const buildTask = ((cache) => memoizeTask(cache, function build(target, format, ...args) {
    return target === npmPkgName ? arrowTask(target, format, ...args)()
         : target === `ts`       ? arrowTSTask(target, format, ...args)()
         : format === `umd`      ? target === `es5` ? closureTask(target, format, ...args)()
                                                    : uglifyTask(target, format, ...args)()
                                 : typescriptTask(target, format, ...args)();
}))({});

module.exports = buildTask;
module.exports.buildTask = buildTask;
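
The nested ternary in `build-task.js` above can be read as a decision table. The sketch below spells it out with `if` statements and returns the task name as a string instead of running anything. `pickTask` is a hypothetical helper, and the package name passed in is a placeholder, since the real value of `npmPkgName` comes from `./util` and is not shown in this diff.

```javascript
// Hypothetical restatement of the dispatch order in build-task.js.
function pickTask(target, format, npmPkgName) {
    if (target === npmPkgName) { return `arrowTask`; }   // assemble the publishable package
    if (target === `ts`) { return `arrowTSTask`; }       // copy the TypeScript sources
    if (format === `umd`) {
        // es5 UMD bundles go through Closure Compiler, other UMD targets through uglify
        return target === `es5` ? `closureTask` : `uglifyTask`;
    }
    return `typescriptTask`;                             // everything else is a plain tsc build
}

const PKG = `example-pkg`; // placeholder, not the real npmPkgName
const choices = [
    pickTask(PKG, ``, PKG),
    pickTask(`ts`, ``, PKG),
    pickTask(`es5`, `umd`, PKG),
    pickTask(`es2015`, `umd`, PKG),
    pickTask(`es2015`, `esm`, PKG),
];
```

The order of the checks matters: the package and `ts` targets are matched before the format is considered at all.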
31 changes: 31 additions & 0 deletions js/gulp/clean-task.js
@@ -0,0 +1,31 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

const del = require('del');
const { targetDir } = require('./util');
const { memoizeTask } = require('./memoize-task');
const { Observable, ReplaySubject } = require('rxjs');

const cleanTask = ((cache) => memoizeTask(cache, function clean(target, format) {
    return Observable
        .from(del(`${targetDir(target, format)}/**`))
        .catch((e) => Observable.empty())
        .multicast(new ReplaySubject()).refCount();
}))({});

module.exports = cleanTask;
module.exports.cleanTask = cleanTask;
91 changes: 91 additions & 0 deletions js/gulp/closure-task.js
@@ -0,0 +1,91 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

const {
    targetDir,
    mainExport,
    gCCLanguageNames,
    UMDSourceTargets,
    observableFromStreams
} = require('./util');

const gulp = require('gulp');
const path = require('path');
const sourcemaps = require('gulp-sourcemaps');
const { memoizeTask } = require('./memoize-task');
const { Observable, ReplaySubject } = require('rxjs');
const closureCompiler = require('google-closure-compiler').gulp();

const closureTask = ((cache) => memoizeTask(cache, function closure(target, format) {
    const src = targetDir(target, `cls`);
    const out = targetDir(target, format);
    const entry = path.join(src, mainExport);
    const externs = path.join(src, `${mainExport}.externs`);
    return observableFromStreams(
        gulp.src([
            /* external libs first --> */ `closure-compiler-scripts/*.js`,
            /*   then sources glob --> */ `${src}/**/*.js`,
            /* and exclusions last --> */ `!${src}/format/*.js`,
                                          `!${src}/Arrow.externs.js`,
        ], { base: `./` }),
        sourcemaps.init(),
        closureCompiler(createClosureArgs(entry, externs)),
        // rename the sourcemaps from *.js.map files to *.<target>.min.js.map
        sourcemaps.write(`.`, { mapFile: (mapPath) => mapPath.replace(`.js.map`, `.${target}.min.js.map`) }),
        gulp.dest(out)
    ).publish(new ReplaySubject()).refCount();
}))({});

const createClosureArgs = (entry, externs) => ({
    third_party: true,
    warning_level: `QUIET`,
    dependency_mode: `LOOSE`,
    rewrite_polyfills: false,
    externs: `${externs}.js`,
    entry_point: `${entry}.js`,
    // formatting: `PRETTY_PRINT`,
    compilation_level: `ADVANCED`,
    assume_function_wrapper: true,
    js_output_file: `${mainExport}.js`,
    language_in: gCCLanguageNames[`es2015`],
    language_out: gCCLanguageNames[`es5`],
    output_wrapper:
`// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
(function (global, factory) {
typeof exports === 'object' && typeof module !== 'undefined' ? factory(exports) :
typeof define === 'function' && define.amd ? define(['exports'], factory) :
(factory(global.Arrow = global.Arrow || {}));
}(this, (function (exports) {%output%}.bind(this))));`
});

module.exports = closureTask;
module.exports.closureTask = closureTask;
30 changes: 30 additions & 0 deletions js/gulp/memoize-task.js
@@ -0,0 +1,30 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

const { taskName } = require('./util');

const memoizeTask = ((cache, taskFn) => ((target, format, ...args) => {
    // Give the memoized fn a displayName so gulp's output is easier to follow.
    const fn = () => (
        cache[taskName(target, format)] || (
        cache[taskName(target, format)] = taskFn(target, format, ...args)));
    fn.displayName = `${taskFn.name || ``}:${taskName(target, format, ...args)}:task`;
    return fn;
}));

module.exports = memoizeTask;
module.exports.memoizeTask = memoizeTask;
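
To show what the memoization in `memoize-task.js` buys, here is a self-contained sketch of the same pattern. The `taskName` below is a stand-in that joins target and format with a colon; the real helper lives in `./util` and is not shown in this diff, so its exact output is an assumption.

```javascript
// Stand-in for the taskName helper from ./util (assumed format).
const taskName = (target, format) => `${target}:${format}`;

// Same shape as memoizeTask above: the returned thunk runs taskFn at
// most once per (target, format) key and then serves the cached result.
const memoizeTask = (cache, taskFn) => (target, format, ...args) => {
    const key = taskName(target, format);
    const fn = () => cache[key] || (cache[key] = taskFn(target, format, ...args));
    fn.displayName = `${taskFn.name || ``}:${key}:task`;
    return fn;
};

let calls = 0;
const build = memoizeTask({}, function build(target, format) {
    calls += 1; // count how often the underlying task really runs
    return { target, format };
});

const first = build(`es5`, `cjs`)();
const second = build(`es5`, `cjs`)();   // served from the cache
const other = build(`es2015`, `esm`)(); // different key, runs again
```

Two properties of this pattern are worth keeping in mind: the cache key ignores any extra `...args`, and because the lookup uses `||`, a task that returns a falsy value would be re-run on every call.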