Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test refactor #864

Merged
merged 49 commits into from
Aug 14, 2024
Merged

Test refactor #864

merged 49 commits into from
Aug 14, 2024

Conversation

Nitish1814
Copy link
Contributor

Refactored ZFrame, Block and StopWord test classes

…zingg-Nitish into testRefactor

# Conflicts:
#	common/client/src/main/java/zingg/common/client/util/PojoToArrayConverter.java
#	common/client/src/main/java/zingg/common/client/util/StructTypeFromPojoClass.java
#	common/client/src/main/java/zingg/common/client/util/WithSession.java
#	common/core/src/test/java/zingg/common/core/block/TestBlockBase.java
#	common/core/src/test/java/zingg/common/core/preprocess/TestStopWordsBase.java
#	spark/client/src/main/java/zingg/spark/client/util/SparkDFObjectUtil.java
#	spark/core/src/test/java/zingg/common/core/block/TestSparkBlock.java
#	spark/core/src/test/java/zingg/common/core/preprocess/TestSparkStopWords.java
#	spark/core/src/test/java/zingg/common/core/util/SparkStopWordRemoverUtility.java
Copy link
Member

@sonalgoyal sonalgoyal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Common naming for test data. We can define model package inside the package of the class we are testing. Can you come up with good naming here?
  2. Assertions should not be repeated. Move to common utility and use throughout.
  3. Other changes requested at different places

common/client/pom.xml Outdated Show resolved Hide resolved

ZFrame<D, R, C> zFrame = dfObjectUtil.getDFFromObjectList(sampleDataSet, Person.class);

List<C> columnList = (List<C>) Arrays.asList("recid", "surname", "postcode");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wont this give List ?

import java.util.ArrayList;
import java.util.List;

public class TestData {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to add the test class which is using this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -0,0 +1,23 @@
package zingg.common.core.model;

public class EventCluster {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to EventPair

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use base class EventBase and exetnd in both Event and EventPair?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
public class TestSparkFrame extends TestZFrameBase<SparkSession, Dataset<Row>, Row, Column, DataType> {
public static final Log LOG = LogFactory.getLog(TestSparkFrame.class);
public static IArguments args;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do oyu need args here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

public static final Log LOG = LogFactory.getLog(TestSparkFrame.class);
public static IArguments args;
public static JavaSparkContext ctx;
public static SparkSession spark;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should be part of the base class - when you add the concrete class here, they will automatically have the right strucgture.

}

protected static void setUpSpark() {
try {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have a generic sparkbasetester which sets the session, we should use that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I dont see you using the common base class which sets up Spark session. Please move it from enterprise and use it here.

Copy link
Member

@sonalgoyal sonalgoyal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please check comments

stmtArgs.setFieldDefinition(fdList);
sparkStopWordsRemovers.add(new SparkStopWordsRemover(context,stmtArgs));

//add second stopWordRemover
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setting up the stop word removers should be in generic base class

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getStopWordRemovers() will provide different kind and number of stopword removers based on Spark or Snow implementation. IStopWordsRemoverutility interface only exposes getStopWordRemovers() method

@@ -8,8 +8,8 @@
<artifactId>zingg-spark-client</artifactId>
<packaging>jar</packaging>
<properties>
<fasterxml.jackson.version>2.15.2</fasterxml.jackson.version>
<fasterxml.jackson.databind.version>2.15.2</fasterxml.jackson.databind.version>
<fasterxml.jackson.version>2.12.0</fasterxml.jackson.version>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we changing this?

//sample data classes to be used for testing
public static List<Person> createEmptySampleData() {

return new ArrayList<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No empty diamond operators

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactored


import java.util.Arrays;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please avoid changes to classes you are not touching

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only removed unused import

public static List<Event> createSampleEventData() {

int row_id = 1;
List<Event> sample = new ArrayList<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please give class when you instantiate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a class static property for test purpose, there's no reason to make it instance dependent

ZFrame<Dataset<Row>,Row,Column> sf2 = sf.withColumn(newCol, newColVal);
assertTrueCheckingExceptOutput(sf2, df.withColumn(newCol, functions.lit(newColVal)), "SparkFrame.withColumn(c, double) output is not as expected");
}
protected static void setUpSpark() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to use the common testsparkbase or move the enterprise SparkSession weaving here which can inject the session. session for tests should not get created in multiple places

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

setUpSpark();
}

StructType schemaOfSample = new StructType(new StructField[]{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should be coming from the base test class. Why are they here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactored

ZFrame<Dataset<Row>, Row, Column> filteredData = clusterData.filterNotNullCond(ColName.SOURCE_COL);
assertEquals(3,filteredData.count());
}
protected void assertTrueCheckingExceptOutput(ZFrame<Dataset<Row>, Row, Column> sf1, ZFrame<Dataset<Row>, Row, Column> sf2, String message) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be part of the base test class so it can be used in enterprise and SnowFrame directly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactored

}

protected static void setUpSpark() {
try {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I dont see you using the common base class which sets up Spark session. Please move it from enterprise and use it here.

fdList.add(eventFD);
IArguments stmtArgs = new Arguments();
stmtArgs.setFieldDefinition(fdList);
sparkStopWordsRemovers.add(new SparkStopWordsRemover(context,stmtArgs));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only this should be done here. the rest of the code is common and should move to an abstract class impl of IStopWordRemoverUtility

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@sonalgoyal sonalgoyal merged commit 257851e into zinggAI:main Aug 14, 2024
1 check failed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants