Mapping main synced 24 feb #1061

Merged
merged 97 commits into zinggAI:main
Mar 5, 2025

Conversation

Nitish1814
Contributor

No description provided.

*/
PINCODE("PINCODE"),
private static final long serialVersionUID = 1L;
public String name;
Member

protected?

Contributor Author

updated

throws IOException, JsonProcessingException {
try {
- jsonGen.writeObject(getStringFromMatchType(matchType));
+ jsonGen.writeObject(getStringFromMatchType((List<IMatchType>) matchType));
Member

Why is this cast needed?

Contributor Author

Needed

C joinCondition = unmarkedRecords.and(joinCondition1, joinCondition2);

//we are selecting columns to bring back to original shape
return unmarkedRecords.join(zFieldsFromUpdatedLabelledRecords, joinCondition, "inner").select(unmarkedRecordColumns);
Member

Move this to the labeller or a separate class. "inner" should be a constant.

Contributor Author

updated

@@ -21,20 +22,20 @@ public void testGetFieldDefinitionWithStopwords(){
FieldDefinition def1 = new FieldDefinition();
def1.setFieldName("field1");
def1.setDataType("string");
- def1.setMatchTypeInternal(MatchType.FUZZY);
+ def1.setMatchTypeInternal((MatchType) MatchTypes.FUZZY);
Member

check casting

Member

FieldDefinition.setMatchTypeInternal should take IMatchType...

Contributor Author

updated
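
For readers following the thread, here is a minimal sketch of what the suggestion above could look like: if the setter accepts the interface type, the call site passes MatchTypes.FUZZY directly and the cast goes away. The types below are hypothetical stand-ins that only borrow the names seen in the diff (IMatchType, MatchTypes, FieldDefinition); their bodies, the varargs signature and the getName() method are assumptions, not the project's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the PR's interface; getName() is an assumption.
interface IMatchType {
    String getName();
}

// Hypothetical constants holder mirroring the MatchTypes name from the diff.
final class MatchTypes {
    static final IMatchType FUZZY = () -> "FUZZY";
    static final IMatchType EXACT = () -> "EXACT";

    private MatchTypes() {}
}

// Simplified FieldDefinition: only the match type handling is sketched here.
class FieldDefinition {
    private final List<IMatchType> matchTypes = new ArrayList<>();

    // Accepting the interface lets callers pass any IMatchType without a cast.
    void setMatchTypeInternal(IMatchType... types) {
        matchTypes.clear();
        matchTypes.addAll(Arrays.asList(types));
    }

    List<IMatchType> getMatchType() {
        return matchTypes;
    }
}

final class MatchTypeSketch {
    public static void main(String[] args) {
        FieldDefinition def1 = new FieldDefinition();
        def1.setMatchTypeInternal(MatchTypes.FUZZY); // no cast at the call site
        System.out.println(def1.getMatchType().get(0).getName()); // prints "FUZZY"
    }
}
```

Typing the parameter to the interface keeps the test code independent of whichever concrete class backs a given match type.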

try{
for(IPreprocType preprocType: getPreprocOrder().getOrder()) {
if (ProcessingType.SINGLE.equals(preprocType.getProcessingType())) {
for(FieldDefinition def:((IArguments) getArgs()).getFieldDefinition()){
Member

check izargs, iargs

if (ProcessingType.SINGLE.equals(preprocType.getProcessingType())) {
for(FieldDefinition def:((IArguments) getArgs()).getFieldDefinition()){
//creating new instance of the class
ISingleFieldPreprocessor<S,D,R,C,T> ip = (ISingleFieldPreprocessor<S, D, R, C, T>) getPreprocMap().get(preprocType).getDeclaredConstructor().newInstance();
Member

Do you need to cast this?

Contributor Author

required

}
} else {
//creating new instance of the class
IMultiFieldPreprocessor<S,D,R,C,T> ip = (IMultiFieldPreprocessor<S, D, R, C, T>) getPreprocMap().get(preprocType).getDeclaredConstructor().newInstance();
Member

Move construction to a separate method, and call setContext, init, setFieldDef and preprocess in one common place.

Contributor Author

updated
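
As an illustration of the refactoring requested above, the sketch below pulls the reflective construction and the shared setup calls into one helper. It is deliberately simplified and hypothetical: IPreprocessor, LowerCasePreprocessor and PreprocessorFactory are made-up names, the contract has only init, setFieldName and preprocess (the review also mentions setContext and setFieldDef), and the PR's real interfaces are generic over S, D, R, C, T, so a cast may still be required there even though this non-generic version needs none.

```java
import java.util.Locale;

// Hypothetical, simplified preprocessor contract (not the project's interface).
interface IPreprocessor {
    void init();
    void setFieldName(String fieldName);
    String preprocess(String input);
}

// Toy implementation used only to exercise the helper.
class LowerCasePreprocessor implements IPreprocessor {
    private String fieldName;
    private boolean initialised;

    @Override
    public void init() {
        initialised = true;
    }

    @Override
    public void setFieldName(String fieldName) {
        this.fieldName = fieldName;
    }

    @Override
    public String preprocess(String input) {
        if (!initialised || fieldName == null) {
            throw new IllegalStateException("init and setFieldName must be called first");
        }
        return input.toLowerCase(Locale.ROOT);
    }
}

final class PreprocessorFactory {
    private PreprocessorFactory() {}

    // One place for reflective construction plus the setup every preprocessor needs.
    static <P extends IPreprocessor> P create(Class<P> type, String fieldName) {
        try {
            P preprocessor = type.getDeclaredConstructor().newInstance();
            preprocessor.init();
            preprocessor.setFieldName(fieldName);
            return preprocessor;
        } catch (ReflectiveOperationException e) {
            // Wrap the checked reflection failures so callers are not forced to handle them.
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }
}

final class PreprocessorSketch {
    public static void main(String[] args) {
        IPreprocessor p = PreprocessorFactory.create(LowerCasePreprocessor.class, "field1");
        System.out.println(p.preprocess("ACME Corp")); // prints "acme corp"
    }
}
```

With a helper like this, the single-field and multi-field branches would differ only in what they pass in, not in how instances are built and initialised.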


@Override
protected ZFrame<Dataset<Row>, Row, Column> applyCaseNormalizer(ZFrame<Dataset<Row>, Row, Column> incomingDataFrame, List<String> relevantFields) {
String[] incomingDFColumns = incomingDataFrame.columns();
Member

Rethink this: the ZFrame abstraction, function calling, etc. Try not to have platform-specific code unless you have to.

Contributor Author

updated

this.udfName = registerUDF();

public SparkStopWordsRemover(){
super();
Member

Is this constructor needed?

Contributor Author

updated

@@ -49,7 +49,7 @@ public ZFrame<Dataset<Row>, Row, Column> getStopWordsDataset(ZFrame<Dataset<Row>

@Override
public String getStopWordColName() {
- return "z_word";
+ return "z_stopword";
Member

check doc

@Override
protected List<String> getStopWordFileNames() {
String fileName1 = Objects.requireNonNull(
StopWordRemoverUtility.class.getResource("../../../../preProcess/stopwords/stopWords.csv")).getFile();
Member

see the logic of common resources

@@ -0,0 +1,152 @@
package zingg.spark.core.util;
Member

what is this?

Contributor Author

removed

…4Feb

# Conflicts:
#	common/core/src/test/java/zingg/common/core/executor/validate/BlockerValidator.java
#	spark/core/src/test/java/zingg/spark/core/recommender/TestSparkStopWordsRecommender.java
@sonalgoyal sonalgoyal merged commit fd596bc into zinggAI:main Mar 5, 2025
1 check passed
3 participants