Can this library be used with Java ? #11

Open
jdk2588 opened this issue Oct 3, 2017 · 14 comments

Comments

@jdk2588

jdk2588 commented Oct 3, 2017

I see the examples are in Scala. Can this library be used with Java?

@MrPowers
Collaborator

MrPowers commented Oct 5, 2017

Thanks for the great question @jdk2588 😄

It looks like it's possible to run a Scala JAR file in Java, but I don't know because I've never used Java.

You can download the latest JAR file here if you'd like to give it a shot. Let me know what you find!

@jdk2588
Author

jdk2588 commented Oct 5, 2017

The question should be "Is the library tested with Java?" Both languages run on the JVM, so it can be used.

@MrPowers
Collaborator

MrPowers commented Oct 5, 2017

Unfortunately, the library is not tested with Java 😢

I'd add the tests, but I unfortunately don't know Java. Sorry about that.

If you are able to test the methods with Java, let me know and I'll be happy to merge any code with master 😄

@MrPowers
Collaborator

If anyone is using this library with Java, please let me know how it is going for you! Adding a help wanted tag!

@gregbrowndev

@MrPowers I recently picked up your Testing Spark Applications book. Unfortunately, my company insists on using Java to write our Spark apps, so I will let you know if this library works out!

This ticket is a couple of years old. Do you know if anyone has had any success in Java?

@almogtavor

Any news about this? Support for JUnit would be great.

@aggubin

aggubin commented Apr 25, 2022

Hi,

Better late than never :)

I just started using spark-fast-tests in a Java project at work.

Steps:

  1. Maven dependency

     <dependency>
         <groupId>com.github.mrpowers</groupId>
         <artifactId>spark-fast-tests_2.12</artifactId>
         <version>1.2.0</version>
         <scope>test</scope>
     </dependency>
    
  2. public class MySparkTest implements DatasetComparer

  3. write the test

  4. assertSmallDatasetEquality(actual, expected, false, false, true, 10);
    Since Java has no default parameters, you have to pass ignoreNullable, ignoreColumnNames, orderedComparison, and truncate explicitly.
    I've set them to the Scala defaults, except for truncate, since my Datasets are indeed small.

  5. I haven't used other asserts yet, might update when I have ...

Cheers,
Alexander
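
For readers who want to avoid repeating the four trailing arguments from step 4 in every test, here is a minimal sketch of one workaround: Java overloads standing in for Scala's default arguments. Everything named here (`SixArgAssertion`, `EqualityDefaults`, `assertEqual`) is hypothetical and not part of spark-fast-tests. The stand-in interface only mirrors the argument order of the call in step 4, so the sketch runs without Spark on the classpath. The constant values are the ones passed in step 4.

```java
/** Hypothetical stand-in that mirrors the six-argument call from step 4. */
interface SixArgAssertion<T> {
    void check(T actual, T expected, boolean ignoreNullable,
               boolean ignoreColumnNames, boolean orderedComparison, int truncate);
}

/** Hypothetical helper: overloads play the role of Scala's default arguments. */
final class EqualityDefaults<T> {
    // Values copied from the call in step 4; truncate = 10 was that author's choice.
    static final boolean IGNORE_NULLABLE = false;
    static final boolean IGNORE_COLUMN_NAMES = false;
    static final boolean ORDERED_COMPARISON = true;
    static final int TRUNCATE = 10;

    private final SixArgAssertion<T> delegate;

    EqualityDefaults(SixArgAssertion<T> delegate) {
        this.delegate = delegate;
    }

    /** Two-argument form: every flag falls back to the constants above. */
    void assertEqual(T actual, T expected) {
        assertEqual(actual, expected, ORDERED_COMPARISON);
    }

    /** Three-argument form: only the ordering flag is chosen by the caller. */
    void assertEqual(T actual, T expected, boolean orderedComparison) {
        delegate.check(actual, expected, IGNORE_NULLABLE, IGNORE_COLUMN_NAMES,
                orderedComparison, TRUNCATE);
    }
}
```

In a real test class, the delegate would presumably be a lambda that forwards to the six-argument `assertSmallDatasetEquality` call from step 4. That wiring is an assumption and is not tested here.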

@almogtavor

@aggubin can you add the code sample?

@aggubin

aggubin commented Apr 25, 2022

@almogtavor
See (2) and (4) above; those are your code samples. For (3), create your test Dataset from CSV, TXT, or a String, and compare it to the actual Dataset returned by your method under test.

@MrPowers
Collaborator

@aggubin - this is great! Thank you!!!

Any chance you can send me a PR with README instructions for Java users? Adding a little example to the JavaSpark example project would be awesome too. There are a lot of users that would appreciate this info!

@aggubin

aggubin commented Apr 27, 2022 via email

@MrPowers
Collaborator

@aggubin - here are responses:

not sure what you mean by PR, but here is a .zip with sample code

A PR is a "pull request"

why do your asserts have "actual" first and "expected" second, whereas JUnit asserts are "expected" then "actual"?

Some test frameworks have actual first then expected second. The junit syntax wasn't considered when building this library.

"assertSmallDatasetEquality()" shows only the first two columns when a test fails. Is there a way to print all columns, like DF.show(false)?

Feel free to open up a separate issue to discuss the output of assertSmallDatasetEquality in more detail. For purposes of this discussion, we're focusing on adding documentation for Java users. Changing the output of the lib would be a separate conversation.

Thanks for the questions.
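
On the argument-order point above: a team used to JUnit's expected-then-actual convention could wrap an actual-first assertion in a small adapter. The following is only a sketch. `ExpectedFirst` is a hypothetical helper, not something the library provides, and it is shown with a plain `BiConsumer` rather than a real Dataset assertion so that it runs without Spark.

```java
import java.util.function.BiConsumer;

/** Hypothetical adapter between the two argument-order conventions. */
final class ExpectedFirst {
    private ExpectedFirst() {}

    /**
     * Wraps an assertion that takes (actual, expected) and returns one
     * that takes (expected, actual), swapping the arguments on the way in.
     */
    static <T> BiConsumer<T, T> of(BiConsumer<T, T> actualFirst) {
        return (expected, actual) -> actualFirst.accept(actual, expected);
    }
}
```

The wrapped assertion still sees its arguments in the order it was written for; only the call site changes.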

@aggubin

aggubin commented May 16, 2022

Figured out that "smallDataset" in "assertSmallDatasetEquality" really means a "narrow" dataset: 1-2 columns.

Using "assertApproximateDataFrameEquality" now to see row diffs:

assertApproximateDataFrameEquality(actual, expected, 1.0, false, false, false);

That produces an OK-looking DataFrame that can be inspected.

@aggubin

aggubin commented Oct 11, 2022 via email
