Release 0.16.0 #1413

Closed
ianoc opened this issue Aug 10, 2015 · 7 comments
@ianoc
Collaborator

ianoc commented Aug 10, 2015

What is not in yet, and what do we want done before we cut the next OSS release?

We have a pretty large backlog of work completed since the last release:
-> Hyperdoop increased stability, ready for GA in its own serialization module
-> Collapsed the macro packages into their parents to improve dependency management
-> Scalding macro based jdbc (this might not be merged?)
-> TypedText
-> Line numbers in a lot of descriptions
-> Parquet typed tuple improvements

@ianoc
Collaborator Author

ianoc commented Aug 10, 2015

@rubanm , @johnynek ^^

@johnynek
Collaborator

Something is goofy with that list: it includes some very old pull requests (like #15).

@rubanm
Contributor

rubanm commented Aug 10, 2015

Looks like those are PR numbers from my fork that got mixed in here :-/

I think @sid-kap's updated runtime-based reducer estimator (#1411) is almost ready, so it can be added here.

The jdbc one might not make it. I think it needs some more work.

@johnynek
Collaborator

This looks pretty good to me. I'd rather we either 1) organize the JDBC stuff a bit better or 2) not merge it for this release and push it to next time. Just my opinion there.

Also, we should work on some release notes (maybe in a google doc) to get the high level things here.

Some things that pop to mind: separate new features, enhancements and bugfixes.

New Features:
OrderedSerialization deemed production ready (explain what it offers, describe how to use it, mention 30-50% perf wins we see).
TypedText: typesafe flattening of case-classes, tuples and primitives into TSV and CSV formats
Line numbers in config (for map/reduce boundaries, not all methods)
withDescription for naming steps

Enhancements:
Executions are monoids and semigroups.
Failed Executions combined with zip fail faster (a.zip(b) is failed as soon as either a, b fails).
LongThrift sources in scalding-commons are TypedSinks
Scalding REPL supported for 2.11
Added TypedPipe.groupWith as a cleaner way to be explicit about the ordering to use.
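
To make the fail-fast zip item concrete, here is a minimal sketch of the same semantics using standard-library Futures as a stand-in for Execution. This is not Scalding's implementation, and `failFastZip` is a hypothetical helper name chosen for illustration. The idea is only that the combined result fails as soon as either side fails, instead of waiting for the other side to finish.

```scala
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object FailFastZip {
  // Hypothetical helper (not part of Scalding): succeeds with the pair when
  // both inputs succeed, and fails as soon as either input fails.
  def failFastZip[A, B](fa: Future[A], fb: Future[B])(
      implicit ec: ExecutionContext): Future[(A, B)] = {
    val result = Promise[(A, B)]()
    // Whichever failure arrives first completes the promise; later ones are ignored.
    fa.failed.foreach(e => result.tryFailure(e))
    fb.failed.foreach(e => result.tryFailure(e))
    // The success path still needs both values.
    fa.zip(fb).foreach(pair => result.trySuccess(pair))
    result.future
  }

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    // Stand-in for a long-running step: this future never completes.
    val neverFinishes: Future[Int] = Promise[Int]().future
    val failsAtOnce: Future[String] = Future.failed[String](new RuntimeException("boom"))

    val zipped = failFastZip(neverFinishes, failsAtOnce)
    // Returns promptly because the failure does not wait on neverFinishes.
    Await.ready(zipped, 5.seconds)
    println(zipped.value)
  }
}
```

With a zip built from a plain flatMap over the first input, the same call would block until the first input completed, which is the behavior the enhancement above moves away from.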

Bugfixes:
VersionedKeyValSource#toIterator fixed for types other than Array[Byte]
TypedPipe.limit is now an exact computation. Previously .limit(n) returned AT MOST n items, but could return 0 even when the input had more than 0 items. The fix may be slightly slower, but it resolves issues we've seen with people misusing the API.

Also, I guess we should link to the relevant PRs in the above (OrderedSerialization may have a ton of them, so maybe that's a bad idea for that one).

Are these the main issues?

@sriramkrishnan
Collaborator

@johnynek
Collaborator

johnynek commented Apr 7, 2016

We need release notes... Maybe a PR so we can use github to suggest edits?

@piyushnarang
Collaborator

Closing this as the 0.16.0 release is now out.
