This repository has been archived by the owner on Oct 18, 2019. It is now read-only.

Standardize algorithms #24

Open
fervic opened this issue Nov 2, 2015 · 1 comment

fervic commented Nov 2, 2015

I see that contributions have taken different approaches to solving the same problem, so in the end the benchmark is not comparing the languages themselves.

My suggestion would be to set a guideline for contributing which explains the standard approach, like:

  • It should use files.
  • It should take the number of workers/threads to use as a parameter.
  • It can buffer writes, but the buffer size has a fixed limit.
  • It should use regular expressions, or include both versions: with and without regexps.
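To make the guideline concrete, here is a minimal Python sketch of what a common command-line contract could look like. Everything named here is an assumption for illustration only: the flag names (`--workers`, `--matcher`, `--buffer-size`, `--term`), the 64 KiB cap, and the helper names are hypothetical and not part of any existing contribution.

```python
import argparse
import re

# Hypothetical cap on the write buffer; the guideline would fix the real value.
MAX_BUFFER_BYTES = 64 * 1024


def make_matcher(kind, term):
    """Return a line -> bool predicate, with or without regular expressions."""
    if kind == "regex":
        pattern = re.compile(re.escape(term), re.IGNORECASE)
        return lambda line: pattern.search(line) is not None
    if kind == "plain":
        needle = term.lower()
        return lambda line: needle in line.lower()
    raise ValueError(f"unknown matcher kind: {kind}")


def build_parser():
    parser = argparse.ArgumentParser(description="Standard benchmark entry point (sketch)")
    parser.add_argument("files", nargs="+", help="input files to scan")
    parser.add_argument("--workers", type=int, default=4, help="number of workers/threads")
    parser.add_argument("--matcher", choices=["regex", "plain"], default="regex")
    parser.add_argument("--buffer-size", type=int, default=MAX_BUFFER_BYTES)
    parser.add_argument("--term", default="knicks", help="text to look for")
    return parser


def parse_options(argv):
    """Parse arguments and enforce the limits the guideline asks for."""
    options = build_parser().parse_args(argv)
    if options.workers < 1:
        raise ValueError("--workers must be at least 1")
    # Clamp rather than reject, so oversized requests still run within the limit.
    options.buffer_size = min(options.buffer_size, MAX_BUFFER_BYTES)
    return options
```

With a contract like this, each language's standard entry would accept the same inputs, and the regex and non-regex variants could be timed separately by switching one flag.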

Maybe also allow submitting a non-standard approach that takes advantage of specific language features but keep that one marked as the special one.

So in the end there would be two sets of solutions: (1) the standard one that follows the rules and (2) the optimized or non-standard one.

dimroc (Owner) commented Nov 15, 2015

That's a fantastic suggestion, @fervic. I had similar thoughts that I was going to bring up in my next blog post. Here's what I was going to suggest.

Rules of Reference Implementation

  1. Stream input from files.
  2. Use Regular Expressions to check for the presence of knicks.
  3. Have multiple mappers, but one reducer.
  4. Each individual worker holds its results in a hash and sends that final hash back for reduction.
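As a rough illustration of how the four rules fit together, here is a minimal Python sketch. It is not taken from any submitted solution. The input layout (the first tab-separated field of each line is the grouping key) is an assumption, as are the function names `map_file`, `reduce_counts`, and `count_matches`. A thread pool is used only to keep the sketch short; a process pool would be the likelier choice for CPU-bound work in Python.

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Rule 2: a regular expression checks for the presence of "knicks".
KNICKS = re.compile(r"knicks", re.IGNORECASE)


def map_file(path):
    """Mapper: stream one file and tally matching lines in a local hash."""
    counts = Counter()
    # Rule 1: iterate over the file handle so lines are streamed, not loaded at once.
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if KNICKS.search(line):
                # Assumed layout: the first tab-separated field is the grouping key.
                key = line.split("\t", 1)[0]
                counts[key] += 1
    # Rule 4: the worker returns its final hash for reduction.
    return counts


def reduce_counts(partials):
    """Reducer: merge the per-worker hashes into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def count_matches(paths, workers=4):
    """Rule 3: several mappers run concurrently, and a single reducer merges them."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(map_file, paths)
        return reduce_counts(partials)
```

Calling `count_matches(["part1.tsv", "part2.tsv"], workers=2)` (hypothetical file names) would return a single `Counter` keyed by the first field of each matching line.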

One suggestion of yours that I didn't include was limiting the number of workers/threads; that's not always simple, depending on the language and framework. Any other suggestions?
