Release 0.16.0 #1413

ianoc · 2015-08-10T17:30:57Z

What do we not have in yet, and what do we want done before we cut the next OSS release?

We have a pretty large backlog of things done since the last release:
-> Hyperdoop increased stability, ready for GA in its own serialization module
-> Collapsed the macro packages into their parents to improve dependency mgmt
-> Scalding macro based jdbc (this might not be merged?)
-> TypedText
-> Line numbers in a lot of descriptions
-> Parquet typed tuple improvements

Make sure Execution.zip fails fast #1412
add counter verification logic #1409
Add some return types #1407
add .groupWith method to TypedPipe #1406
Adds a function to test if a sink exists at the version we created #1404
Add the type in ScroogeReadSupport #1403
Change hash function in GroupRandomly #1401
Improve logging in runtime reducer estimators #1402
Fixes the scrooge generator tasks not to generate code in the compile… #1399
Ianoc/configure set converter #1400
RatioBasedEstimator - fix threshold edge case, add tests #1397
Inline parquet-scrooge #1395
Support nesting Options in TypeDescriptor #1387
Remove use of hadoop version in estimators #1391
Handle no history case in RatioBasedEstimator #1393
Upgrade parquet to 1.8.1 #1380
Set hadoop version to dummy value #1392
Enable Scalding-REPL for Scala 2.11 #1388
Updates for some upstream fixes/changes #1390
Ordered Serialization macros for thrift #1338
Don't publish maple when doing 2.11 so we only publish it once -- nee… #1386
Just move whitespace, add comments, simplify a few methods #1383
Upgrade sbt launcher script (sbt-extras) #1381
Add monoid and semigroup for Execution #1379
Add NullSink and test #1378
Runtime reducer estimator #1358
A serialization error we were seeing in repl usage #1376
Fix deprecation warnings in TypedDelimited #1371
Revert typed tsv behavior #1373
Fix TypedPipe.limit to be correct, if slighly slower #1366
Ianoc/revert changes around making file systems #1372
Fix scala.Function2 showing up in line numbers #1367
Drop with MacroGenerated from Fields macros #1370
Migrate typedtext #1356
Missing an extends Serializable, causes issues if capture Config's an… #1365
Allow overriding of hadoop configuration options for a single source/… #1362
Update Build.scala #1361
Merge scalding-macros into scalding-core #1355
Adding a source for the most recent good date path. #20
Consistent style in homepage example #1349
Serialization folding #1351
Collapses scalding-db packages #1353
Adds jdbc macros from internal #1267
make some repl components extensible #1342
Implement flatMapValues method #1348
Fix the execution test #1347
Remove the bootstrap section #1346
Execution id code #1334
Add line numbers at .group and .toPipe boundaries #1335
Change defaults for Scalding reducer estimator #1333
faster builds on travis by a bit. #1280
Change groupRandomly & groupAll to use OrderedSerialization #1307
Make SketchJoin ordered serialization aware #1316
Add secondary sorting using ordered serialization #1321
Hide the deprecated string error for getting ASCII bytes. #1332
Add a OrderedSerialization.viaTransform with no dependencies, and a B… #1329
Precompute int hashes #1330
Added a sealed trait ordered serializer. When it works its great. Not… #1320
Apply merge strategy for pom.xml files #1327
Bails out from the length calculation if we don't succeed often, leng… #1322
Apply merge strategy for pom.properties files #1325
Add withDescription for naming MR steps #1283
increased number of box instances to 250 #1323
Fix testing VersionedKeyValSource#toIterator for non-Array[Byte] types #1314
make LongThrift sources TypedSink #1313
Make test of Kmeans very very unlikely to fail #1310
Revert "Add UnitOrderedSerialization" #1306
Add UnitOrderedSerialization #1304
Improve TypedParquetTuple #1302 #1303
Add tests around hashcode collisions #1299
make serialization modules build on travis #1301
Fix performance bug in TypedPipeDiff #1300


ianoc · 2015-08-10T17:32:20Z

@rubanm , @johnynek ^^

johnynek · 2015-08-10T17:43:22Z

something is goofy with that list. It is including some very old pull requests (like #15)

rubanm · 2015-08-10T17:55:26Z

Looks like those are PR numbers from my fork that got mixed in here :-/

I think @sid-kap's updated runtime-based reducer estimator is almost ready #1411 so can be added here.

The jdbc one might not make it. I think it needs some more work.

johnynek · 2015-08-10T18:05:33Z

This looks pretty good to me. I'd rather either 1) organize the JDBC stuff a bit better or 2) not merge for this release, and push till next time. Just my opinion there.

Also, we should work on some release notes (maybe in a google doc) to get the high level things here.

Some things that pop to mind: separate new features, enhancements and bugfixes.

New Features:
OrderedSerialization deemed production ready (explain what it offers, describe how to use it, mention 30-50% perf wins we see).
TypedText: typesafe flattening of case-classes, tuples and primitives into TSV and CSV formats
Line numbers in config (for map/reduce boundaries, not all methods)
withDescription for naming steps

Enhancements:
Executions are monoids and semigroups.
Failed Executions combined with zip fail faster (a.zip(b) is failed as soon as either a, b fails).
LongThrift sources in scalding-commons are TypedSinks
Scalding REPL supported for 2.11
Added TypedPipe.groupWith, a cleaner way to be explicit about the ordering to use.
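
To make the fail-fast zip behaviour concrete for the release notes, here is a rough sketch of the idea using plain scala.concurrent.Future as a stand-in. This is not Scalding's Execution code; the object name FailFastZip and the helper zipFailFast are made up for illustration, and the only thing it models is "a.zip(b) is failed as soon as either a or b fails".

```scala
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.Try

// Illustration only: models fail-fast zip semantics with Futures.
// FailFastZip and zipFailFast are hypothetical names, not Scalding APIs.
object FailFastZip {
  def zipFailFast[A, B](a: Future[A], b: Future[B])(
      implicit ec: ExecutionContext): Future[(A, B)] = {
    val result = Promise[(A, B)]()
    // Whichever side fails first fails the combined result immediately,
    // without waiting for the other side to finish.
    a.failed.foreach(e => result.tryFailure(e))
    b.failed.foreach(e => result.tryFailure(e))
    // The success path still needs both values.
    a.zip(b).onComplete(r => result.tryComplete(r))
    result.future
  }

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    // One side never completes, the other has already failed.
    val neverFinishes: Future[Int] = Promise[Int]().future
    val failsAtOnce: Future[Int] = Future.failed(new RuntimeException("b failed"))
    val zipped = zipFailFast(neverFinishes, failsAtOnce)
    // The failure surfaces well within the timeout even though
    // neverFinishes is still pending.
    println(Try(Await.result(zipped, 1.second)))
  }
}
```

In this model, a zip that waits on the left side before looking at the right would hang in the example above until the timeout. The promise-based version reports the failure right away.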

Bugfixes:
VersionedKeyValSource#toIterator fixed for types other than Array[Byte]
TypedPipe.limit is now an exact computation. Before, .limit(n) returned AT MOST n items but could return 0 even when there were more than 0 items. The fix means it could be slightly slower, but it addresses issues we've seen with people misusing the API.

Also, I guess we should link to the relevant PRs in the above (the OrderedSerialization may have a ton, maybe that's a bad idea for that one).

Are these the main issues?

sriramkrishnan · 2016-04-06T23:11:39Z

FYI 0.16.0 RC - https://github.com/twitter/scalding/releases/tag/v0.16.0-RC6

CC @piyushnarang

johnynek · 2016-04-07T00:56:46Z

We need release notes... Maybe a PR so we can use github to suggest edits?

piyushnarang · 2016-05-05T00:53:21Z

Closing this as the 0.16.0 release is now out.

sriramkrishnan assigned piyushnarang Apr 7, 2016

piyushnarang closed this as completed May 5, 2016
