added singleton SparkSessionProvider #932

Merged
merged 5 commits into from
Nov 12, 2024

Conversation

Nitish1814
Contributor

No description provided.


public class TestDocumenter {

public static final Log LOG = LogFactory.getLog(TestDocumenter.class);

Check warning (Code scanning / PMD): Logger should be defined private static final and have the correct class
try {
args = argsUtil.createArgumentsFromJSON(getClass().getResource("/testDocumenter/config.json").getFile());
ArgumentsUtil argsUtil = new ArgumentsUtil();
IArguments args = argsUtil.createArgumentsFromJSON(getClass().getResource("/testDocumenter/config.json").getFile());

Check warning (Code scanning / PMD): Avoid unused local variables such as 'args'.
private JavaSparkContext javaSparkContext;
private ZinggSparkContext zinggSparkContext;
private IArguments args;
public static final Log LOG = LogFactory.getLog(SparkSessionProvider.class);

Check warning (Code scanning / PMD): Logger should be defined private static final and have the correct class
zinggSparkContext.init(sparkSession);
} catch (Throwable e) {
if (LOG.isDebugEnabled())
e.printStackTrace();

Check warning (Code scanning / PMD): This statement should have braces
Comment on lines +48 to +51
if (sparkSessionProvider == null) {
sparkSessionProvider = new SparkSessionProvider();
sparkSessionProvider.initializeSession();
}

Check warning (Code scanning / PMD): Singleton is not thread safe
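One common way to address this finding is the initialization-on-demand holder idiom. The sketch below is an assumption, not the code merged in this PR: `SessionProviderSketch` and its `Object` session are hypothetical stand-ins for `SparkSessionProvider` and a real `SparkSession`, so the example compiles without Spark on the classpath.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the PR's SparkSessionProvider (not the merged code).
final class SessionProviderSketch {

    // Counts constructor calls, only to show that initialization runs once.
    static final AtomicInteger INIT_COUNT = new AtomicInteger();

    // Placeholder for the real SparkSession (assumption for illustration).
    private final Object session;

    private SessionProviderSketch() {
        INIT_COUNT.incrementAndGet();
        this.session = new Object();
    }

    // The JVM initializes Holder lazily and exactly once, on first access,
    // so no explicit locking or null check is needed.
    private static final class Holder {
        static final SessionProviderSketch INSTANCE = new SessionProviderSketch();
    }

    static SessionProviderSketch getInstance() {
        return Holder.INSTANCE;
    }

    Object getSession() {
        return session;
    }
}
```

Keeping the existing null check and declaring the accessor `static synchronized` would also be thread safe, at the cost of taking a lock on every call.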
@@ -1,11 +1,11 @@
- package zingg.common.core.sparkFrame;
+ package zingg.spark.core.sparkFrame;

Check warning (Code scanning / PMD): Package name contains upper case characters
import java.util.List;
import java.util.stream.IntStream;

public class DataFrameUtility {

Check warning (Code scanning / PMD): This utility class has a non-private constructor
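A minimal sketch of what this rule asks for: a class that only exposes static helpers is made `final` and given a private constructor. `RowValueUtilitySketch` and `zeroRow` are hypothetical names used for illustration; they are not part of `DataFrameUtility`, and the method only mirrors the zero-filling loop from this file without any Spark types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical utility class, not the PR's DataFrameUtility.
final class RowValueUtilitySketch {

    // Private constructor: the class holds only static helpers,
    // so nothing should be able to instantiate it.
    private RowValueUtilitySketch() {
    }

    // Builds a list with one 0d entry per column.
    static List<Double> zeroRow(int numCols) {
        List<Double> rowValues = new ArrayList<>(numCols);
        for (int n = 0; n < numCols; ++n) {
            rowValues.add(0d);
        }
        return rowValues;
    }
}
```

Callers use it as `RowValueUtilitySketch.zeroRow(3)`; `new RowValueUtilitySketch()` no longer compiles outside the class.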
for (int n = 0; n < numCols; ++n) {
structType = structType.add("col" + n, DataTypes.DoubleType, false);
rowValues.add(0d);
};

Check warning (Code scanning / PMD): Unnecessary semicolon
@@ -91,7 +92,7 @@ public void testUDFArray() {
df.printSchema();
// register ArrayDoubleSimilarityFunction as a UDF
TestUDFDoubleArr testUDFDoubleArr = new TestUDFDoubleArr();
- SparkFnRegistrar.registerUDF2(spark, "testUDFDoubleArr", testUDFDoubleArr, DataTypes.DoubleType);
+ SparkFnRegistrar.registerUDF2(TestSparkBase.spark, "testUDFDoubleArr", testUDFDoubleArr, DataTypes.DoubleType);
Member

Why not inject the spark instance variable instead of making changes throughout? The whole idea is that the class code should not know or care how the Spark session was created.

Contributor Author

updated
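The review suggestion can be sketched with plain constructor injection. This is an illustration under stated assumptions, not the change that was actually pushed: `SessionSketch` and `UdfTestSketch` are hypothetical types standing in for `SparkSession` and the test class, so the example runs without Spark or JUnit.

```java
// Hypothetical stand-in for SparkSession (assumption for illustration).
interface SessionSketch {
    String appName();
}

// Hypothetical test class: it receives the session and never creates one.
class UdfTestSketch {

    // Injected once; method bodies keep referring to `spark` unchanged.
    private final SessionSketch spark;

    UdfTestSketch(SessionSketch spark) {
        this.spark = spark;
    }

    // Stands in for a test body that previously used the inherited `spark` field.
    String describeRegistration(String udfName) {
        return "registering " + udfName + " on " + spark.appName();
    }
}
```

Because the session arrives from outside, swapping how it is created (singleton provider, extension, or a fake) does not touch the code that uses `spark`.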



- public class TestDocumenterBase extends ZinggSparkTester {
+ @ExtendWith(TestSparkBase.class)
+ public class TestDocumenterBase {
Member

We should have a generic test and a Spark-specific test for these classes.

Contributor Author

issue created

@@ -227,7 +234,7 @@
});


- Dataset<Row> sample = spark.createDataFrame(Arrays.asList(
+ Dataset<Row> sample = sparkSession.createDataFrame(Arrays.asList(

Check warning (Code scanning / PMD): Consider simply returning the value vs storing it in local variable 'sample'

import static org.junit.jupiter.api.Assertions.fail;

import java.util.Arrays;
import java.util.List;

import org.apache.commons.io.input.TeeInputStream;

Check warning (Code scanning / PMD): Unused import 'org.apache.commons.io.input.TeeInputStream'
private static final int NO_OF_RECORDS = 5;
private final Dataset<Row> dataset;
private List<Row> stopwordRow;

Check warning (Code scanning / PMD): Perhaps 'stopwordRow' could be replaced by a local variable.
private static final int NO_OF_RECORDS = 5;
private final Dataset<Row> dataset;
private List<Row> stopwordRow;
private List<String> stopwordList;

Check warning (Code scanning / PMD): Perhaps 'stopwordList' could be replaced by a local variable.
private final Dataset<Row> dataset;
private List<Row> stopwordRow;
private List<String> stopwordList;
private Dataset<Row> stopWords;

Check warning (Code scanning / PMD): Perhaps 'stopWords' could be replaced by a local variable.
@@ -177,7 +198,7 @@
StructType structType = new StructType();
structType = structType.add(DataTypes.createStructField(COL_STOPWORDS, DataTypes.StringType, false));
//create dataframe with given records and schema
- Dataset<Row> recordDF = spark.createDataFrame(records, structType);
+ Dataset<Row> recordDF = sparkSession.createDataFrame(records, structType);

Check warning (Code scanning / PMD): Consider simply returning the value vs storing it in local variable 'recordDF'
@sonalgoyal sonalgoyal merged commit 04f3b2a into zinggAI:main Nov 12, 2024
5 checks passed

2 participants