Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HSEARCH-3319 WIP: DRAFT: IDEA: TEST: Type-safe field references #4113

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

marko-bekhta
Copy link
Member

https://hibernate.atlassian.net/browse/HSEARCH-3319

These are just some tests I've tried so far to see what's what and what's where 😃

So far, I'm thinking about enclosing the field path, type and value-convert in a field reference.
An actual entity property would be an extension of this reference where it could be either an attribute (FieldAttributeReference) or an 'object" containing other attributes.
If it is a simple attribute, then it would provide the methods to get ValueConvert.NO or ValueConvert.PARSE variants of references...

I'm also thinking if we should open these details on the interface (the field path, type and value-convert) or hide them in the implementation and do something similar to LuceneSearchPredicate#from but HibernateSearchFieldReference#from and reject non-search references 🤔

I also suspect we do not want to break the API 😈 sooo maybe what I've tried on the match predicate can work?

Copy link
Member

@yrodiere yrodiere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it's going in the right direction but... there are a few more things to consider.

I'm afraid this review with leave you with more questions than answers, so... good luck :D


interface FieldObjectReference<T> extends FieldReference<T> {

}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely wouldn't make these nested interfaces FWIW; it makes code unnecessarily verbose.


Class<T> type();

interface FieldAttributeReference<T, V> extends FieldReference<T> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean "a field that is also an attribute", which should IMO be named AttributeFieldReference rather than FieldAttributeReference... I understand FieldAttributeReference as "a reference to an attribute of a field" :/

But regardless... In the metamodel we have a value/object split rather than attribute/object... So I think this should be ValueFieldReference? IndexValueFieldReference maybe? SearchValueFieldReference?

*
* @param <N> The type of the next step.
*/
public interface MatchPredicateFRMatchingStep<T, N extends MatchPredicateOptionsStep<?>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obviously you could simply add generic type parameters to existing DSL steps, but just as obviously this would break compatibility with existing apps, which technically we're not allowed to do.

I wish I had put things like this in every DSL step, but now it's too late for that :/

So, additional interfaces are the way to go... but. IMO we'd need to plan for merging the new interfaces with the existing ones eventually. We can't keep duplicating them forever.

I would suggest making the new interfaces superinterfaces of the existing ones.
I.e. MatchPredicateMatchingStep<N> extends MatchPredicateMatchingGenericStep<T, N, Object>. That way:

  1. MatchPredicateMatchingStep does not need to define any method. You're not duplicating that code and javadoc.
  2. You can deprecate MatchPredicateMatchingStep, suggesting to use MatchPredicateMatchingGenericStep instead, and in the next major you can remove MatchPredicateMatchingStep altogether.

* @param <S> The "self" type (the actual exposed type of this step).
* @param <N> The type of the next step.
*/
public interface MatchPredicateFRFieldMoreStep<T,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, existing DSL steps involving the type of a value put the corresponding parameter (T) in last position. See for example public interface FieldProjectionValueStep<N extends FieldProjectionOptionsStep<?, T>, T>.

* @param <S> The "self" type (the actual exposed type of this step).
* @param <N> The type of the next step.
*/
public interface MatchPredicateFRFieldMoreStep<T,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, existing DSL steps involving the type of a value put the corresponding parameter (T) in last position. See for example public interface FieldProjectionValueStep<N extends FieldProjectionOptionsStep<?, T>, T>.

* @param <T> The expected returned type.
*/
@Incubating
public interface FieldReference<T> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you look into how the static metamodel in JPA works? If not, I think you might want to; there are a few things we might want to take inspiration from.

One aspect that, I think, we must consider right away is the type holding the field. In JPA you have SingularAttribute<X, T>, where T is the type of the attribute, and X is the type holding the attribute (entity, embeddable).

In our case, implementing things like this raises a few questions:

  • For fields within object fields, what is X? Does it even make sense?
  • In Hibernate Search we often use chains of fields, i.e. authors.name. Would it be possible to continue to do this in a way that is not overly verbose? I.e. by default Book_.authors.name would have X = Book, and there would be a syntax to make the reference relative, e.g. Book_.authors.relative(a -> a.name) or Book_authors.name.relative() (simpler but quite limited) or ...
  • Would it even be possible for us to take advantage of X, i.e. to require that one only ever use a field in a context where the parent type/object field is X? Would it be possible without breaking APIs (I'm thinking of things such as org.hibernate.search.engine.search.predicate.dsl.GenericBooleanPredicateClausesStep#should(java.util.function.Function<? super org.hibernate.search.engine.search.predicate.dsl.SearchPredicateFactory,? extends org.hibernate.search.engine.search.predicate.dsl.PredicateFinalStep>))?
    Related: org.hibernate.search.engine.search.predicate.dsl.SearchPredicateFactory#withRoot.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks for taking a look!!! 😃

yeah, I was looking a bit at the model generated by the ORM. And I thought that it probably can be a good next step to validate some of these assumptions for the Search model. For now, I was thinking about something along the lines:

public abstract class MySearchEntity_ {
	//I'm not sure that we need the `MySearchEntity` here, but maybe we can make use of it in combination with the scope? i.e. we know the most common scope type and do some compile time checks to make sure that things match? 

	public static volatile SearchValueFieldReference<MySearchEntity, Long, String> id;
	public static volatile SearchValueFieldReference<MySearchEntity, LocalDateTime, Instant> dateTime;
	public static volatile SearchValueFieldReference<MySearchEntity, String, String> string;
	public static volatile SearchValueFieldReference<MySearchEntity, Integer, Integer> integer;
	public static volatile SearchValueFieldReference<MySearchEntity, SomeEnum, String> enuM;

	public static volatile MySearchEntityNestedObject_ nested;
}

public abstract class MySearchEntityNestedObject_ extends SearchObjectFieldReference<? not sure what or if needed at all> {
	public static volatile SearchValueFieldReference<MySearchEntityNestedObject, String, String> string;
	// ... 
}

So, the classes that are embedded "implement" the object reference interface, and we simply add them to the generated model. That way we'd be able to do something like MySearchEntity_.nested.string . We'd need to be careful to address the cycles 🙈

I'm thinking this way ^ we'd be able to say f -> f.nested(MySearchEntity_.nested) as well as f -> f.match().field(MySearchEntity_.nested.string)

I would probably prefer if we could make these "paths" to be simply chained through the dots... as path strings are 😃 and don't do method calls if possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

	//I'm not sure that we need the `MySearchEntity` here, but maybe we can make use of it in combination with the scope? i.e. we know the most common scope type and do some compile time checks to make sure that things match? 

Yes that's what I was suggesting. However, that would require adding generic type parameters to e.g. SearchPredicateFactory and resulting DSL steps, so that you can restrict what type of fields can be used:

* Would it even be possible for us to take advantage of `X`, i.e. to require that one only ever use a field in a context where the parent type/object field is `X`? Would it be possible without breaking APIs (I'm thinking of things such as `org.hibernate.search.engine.search.predicate.dsl.GenericBooleanPredicateClausesStep#should(java.util.function.Function<? super org.hibernate.search.engine.search.predicate.dsl.SearchPredicateFactory,? extends org.hibernate.search.engine.search.predicate.dsl.PredicateFinalStep>)`)?
  Related: `org.hibernate.search.engine.search.predicate.dsl.SearchPredicateFactory#withRoot`.

Alternatively, we could consider an alternative entry point into the DSL when using field references.

E.g. instead of:

f.match().field("foo").matching("sometext")
f.match().field("foo").boost(2.0f).field("bar").matching("sometext")

People would do:

f.field(MySearchEntity_.foo).matching("sometext")
f.field(MySearchEntity_.foo).boost(2.0f).field(MySearchEntity_.bar).matching("sometext")

Pro: more natural syntax, closer to what we do for sorts/projections; shorter syntax (see how there's no match() anymore?); no breaking change required.
Con: we now have two relatively different syntaxes in the DSL... And I'd personally feel bad deprecating the old one given the big change we did in 6.0. But maybe we can live with it?

public abstract class MySearchEntityNestedObject_ extends SearchObjectFieldReference<? not sure what or if needed at all> {

Quite. Given object fields do not necessarily have a Java equivalent, what would we put in the generic type argument?
Related: do we want to use these references for projections, and if so what would be the type of the resulting projection?

I'm thinking this way ^ we'd be able to say f -> f.nested(MySearchEntity_.nested) as well as f -> f.match().field(MySearchEntity_.nested.string)

Sure, makes sense.

However, ideally this should also allow us to do f.withRoot(MySearchEntity_.nested).match().field(MySearchEntityNestedObject_.string). I'm not saying people must do this, just that they should be able to do it.

Comment on lines 186 to 187
* TODO: for projections that require path, we return type does not depend on it we can prevent calling a projection if
* it is not applicable:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry what? xD

@@ -193,6 +208,10 @@ default FieldProjectionValueStep<?, Object> field(String fieldPath) {
*/
CompositeProjectionInnerStep object(String objectFieldPath);

default <T> CompositeProjectionInnerStep object(FieldReference.FieldObjectReference<T> objectFieldPath) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's T for an object field though? 🤔

@marko-bekhta marko-bekhta force-pushed the feat/HSEARCH-3319-Type-safe-field-references branch from 277a66e to e5c26f9 Compare April 23, 2024 08:50
Comment on lines 185 to 338
private static class IndexedEntity {
private int primitiveInteger;
private Double wrapperDouble;
private String string;
private LocalDate localDate;
private float[] vector;

private List<String> stringList;

private EmbeddedEntity nestedEmbeddedEntity;
private EmbeddedEntity flattenedEmbeddedEntity;

private List<EmbeddedEntity> embeddedEntityList;

}

private static class EmbeddedEntity {
private String string;
private SomeEnum someEnum;
}

private enum SomeEnum {
VALUE_1, VALUE_2, VALUE_3
}

public static class IndexedEntity_ {
public static ValueFieldReference<Integer, Integer> primitiveInteger;
public static ValueFieldReference<Double, Double> wrapperDouble;
public static ValueFieldReference<String, String> string;
public static ValueFieldReference<LocalDate, LocalDate> localDate;
public static ValueFieldReference<float[], float[]> vector;

public static ValueFieldReference<String, String> stringList;

public static EmbeddedEntity_.NESTED nestedEmbeddedEntity;
public static EmbeddedEntity_.FLATTENED flattenedEmbeddedEntity;

public static EmbeddedEntity_.NESTED embeddedEntityList;

static {
primitiveInteger = ValueFieldReference.of( "primitiveInteger", Integer.class, Integer.class );
wrapperDouble = ValueFieldReference.of( "wrapperDouble", Double.class, Double.class );
string = ValueFieldReference.of( "string", String.class, String.class );
localDate = ValueFieldReference.of( "localDate", LocalDate.class, LocalDate.class );
vector = ValueFieldReference.of( "vector", float[].class, float[].class );
stringList = ValueFieldReference.of( "stringList", String.class, String.class );
nestedEmbeddedEntity = EmbeddedEntity_.nested( "nestedEmbeddedEntity" );
flattenedEmbeddedEntity = EmbeddedEntity_.flattened( "flattenedEmbeddedEntity" );
embeddedEntityList = EmbeddedEntity_.nested( "embeddedEntityList" );
}
}

public static class EmbeddedEntity_ extends ObjectFieldReferenceImpl {

public ValueFieldReference<String, String> string;
public ValueFieldReference<SomeEnum, String> someEnum;

static NESTED nested(String absolutePath) {
return new NESTED( absolutePath );
}

static FLATTENED flattened(String absolutePath) {
return new FLATTENED( absolutePath );
}

private EmbeddedEntity_(String absolutePath) {
super( absolutePath );
this.string = ValueFieldReference.of( absolutePath + ".string", String.class, String.class );
this.someEnum = ValueFieldReference.of( absolutePath + ".someEnum", SomeEnum.class, String.class );
}

public static class NESTED extends EmbeddedEntity_ implements NestedObjectFieldReference {

private NESTED(String absolutePath) {
super( absolutePath );
}
}

public static class FLATTENED extends EmbeddedEntity_ implements FlattenedObjectFieldReference {

private FLATTENED(String absolutePath) {
super( absolutePath );
}
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @yrodiere 😃
I've been trying various ideas using this test. it is based on two entities -- indexed and embedded and two model classes with _.
It all seemed fine till I thought about the cyclic references, in which case this won't work 😔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is wrong:

		public static EmbeddedEntity_.NESTED nestedEmbeddedEntity;

What you should have here is not the metamodel of the embedded entity, but the metamodel of the object field. And the object field does not necessarily contain all fields of the embedded entity.

I think this also explains why you get cyclic dependencies.

Comment on lines +70 to +110
assertThatQuery( index.query()
.select( f -> f.field( IndexedEntity_.stringList ).multi() )
.where( f -> f.matchAll() ) )
.hasHitsAnyOrder(
List.of( "string_list_0_0", "string_list_1_0", "string_list_2_0" ),
List.of( "string_list_0_1", "string_list_1_1", "string_list_2_1" ),
List.of( "string_list_0_2", "string_list_1_2", "string_list_2_2" )
);


assertThatQuery( index.query()
.select(
f -> f.object( IndexedEntity_.nestedEmbeddedEntity )
.from( f.field( IndexedEntity_.nestedEmbeddedEntity.string ),
f.field( IndexedEntity_.nestedEmbeddedEntity.someEnum ) )
.asList()
)
.where( f -> f.matchAll() ) )
.hasHitsAnyOrder(
List.of( "string_n_0", SomeEnum.VALUE_1 ),
List.of( "string_n_1", SomeEnum.VALUE_2 ),
List.of( "string_n_2", SomeEnum.VALUE_3 )
);

assertThatQuery( index.query()
.select(
f -> f.field( IndexedEntity_.vector )
)
.where( f -> f.knn( 10 ).field( IndexedEntity_.vector )
.matching( new float[] { 1.0f, 1.0f, 1.0f } ) ) )
.hasHitsAnyOrder(
new float[] { 0.0f, 0.0f, 0.0f },
new float[] { 10.0f, 100.0f, 1000.0f },
new float[] { 20.0f, 200.0f, 2000.0f }
);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here I've been trying to use references for projections...
ValueFieldReference has two types, one is the type of the property in the entity and the other is the type in the index: ValueFieldReference<SomeEnum, String> someEnum; so we can use that to figure out what should be returned

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ValueFieldReference has two types, one is the type of the property in the entity and the other is the type in the index

This is an approximation though. Some fields don't even have a (single) corresponding property in the entity.

What you need is the type of the field in the index, the type accepted by the dslConverter, and the type returned by the projectionConverter. Which can all be configured independently from each other.

Also... the strategy for object fields will have to be completely different, since projection on object fields is completely different.

@marko-bekhta marko-bekhta force-pushed the feat/HSEARCH-3319-Type-safe-field-references branch from e5c26f9 to d763543 Compare April 24, 2024 12:02
Copy link
Member Author

@marko-bekhta marko-bekhta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this version, I've tried to cover the projection and dsl converts scenario + changed how the objects are represented in the "generated model" 🙈

Comment on lines +112 to +132
assertThatQuery( index.query()
.select( f -> f.field( IndexedEntity_.stringProjectedAsBytes ) )
.where( f -> f.matchAll() )
).hasHitsAnyOrder(
new byte[] { 97, 95, 48 },
new byte[] { 97, 95, 50 },
new byte[] { 97, 95, 49 }
);

assertThatQuery( index.query()
.select( f -> f.field( IndexedEntity_.stringProjectedAsBytes.noConverter() ) )
.where( f -> f.matchAll() )
).hasHitsAnyOrder(
"a_0", "a_1", "a_2"
);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one (IndexedEntity_.stringProjectedAsBytes) has a projection converter that returns bytes, so if we just pass it to the projection it'll work with byte[], while no-converter option will produce the string

// // ... (other paths so that only string and vector fields are included)
// }
// )
private IndexedEntity parent;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a self-reference so we can "model" the cycle here

Comment on lines 172 to 180
public static ValueFieldReference<String, String, String> stringList;
public static NestedEmbeddedEntity_ nestedEmbeddedEntity;
public static EmbeddedEntityList_ embeddedEntityList;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I got your idea correctly here, instead of working with EmbeddedEntity we are generating a type with references for each of the object fields depending on what will actually end up in them.

Comment on lines +312 to +336
public static class Parent__ extends ObjectFieldReferenceImpl implements NestedObjectFieldReference {

public ValueFieldReference<String, String, String> string;
public ValueFieldReference<float[], float[], float[]> vector;

private Parent__(String absolutePath) {
super( absolutePath );
this.string = ValueFieldReference.of( absolutePath + ".string", String.class, String.class, String.class );
this.vector =
ValueFieldReference.of( absolutePath + ".vector", float[].class, float[].class, float[].class );
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, this parent only has two fields since others are excluded. And we stop here because of the specified depth.

Comment on lines +15 to +21
public interface SearchValueFieldReference<T, P> extends ProjectionTypedFieldReference<P>, TypedFieldReference<T> {

default ValueConvert valueConvert() {
return ValueConvert.YES;
}

}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This SearchValueFieldReference takes TypedFieldReference to work in the predicates etc, and ProjectionTypedFieldReference to work in projections. This way, we can use the same reference in all places and have them use different types depending on the configured converter.

parent = new Parent_( "parent" );
}

public static class NestedEmbeddedEntity_ extends ObjectFieldReferenceImpl implements NestedObjectFieldReference {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with an approach to make all these object fields classes as nested classes.
I was considering making them all at the same level, but then the class names will grow the deeper the fields are e.g. IndexedEntity_Parent_Parent_NestedEmbeddedEntity_

@marko-bekhta marko-bekhta force-pushed the feat/HSEARCH-3319-Type-safe-field-references branch 2 times, most recently from d2c7e7c to 4ba16cb Compare April 29, 2024 17:28
@marko-bekhta
Copy link
Member Author

marko-bekhta commented Apr 30, 2024

It got me thinking... What if we create a set of interfaces:

interface AggregableTypedFieldReference<T>
interface SearchableTypedFieldReference<T>
interface ProjectableTypedFieldReference<T>
interface HighlightableTypedFieldReference<T>
interface SortableTypedFieldReference<T>
interface TypedFieldReference<T>

And then, for each field, we create a custom implementation of the field reference, adding the interfaces based on what the field allows...
This way, if the field is not projectable, the user will not be able to pass it to the field projection as the interface will be missing, and so on
Or it'll just generate too many classes and isn't really needed? (alternatively, we create classes for all possible combinations of the options and use them (Maybe we can generate these as part of the metamodel, rather than maintain them manually).


@yrodiere just pinging to append this to your TODOs 🙈 😃

@yrodiere
Copy link
Member

yrodiere commented May 2, 2024

Context for @gavinking : we're discussing the "metamodel" interfaces to be used in Hibernate Search for "type-safe" query building -- basically the equivalent of JPA's Attribute/SingularAttribute/etc.

It got me thinking... What if we create a set of interfaces:

interface AggregableTypedFieldReference<T>
interface SearchableTypedFieldReference<T>
interface ProjectableTypedFieldReference<T>
interface HighlightableTypedFieldReference<T>
interface SortableTypedFieldReference<T>
interface TypedFieldReference<T>

And then, for each field, we create a custom implementation of the field reference, adding the interfaces based on what the field allows... This way, if the field is not projectable, the user will not be able to pass it to the field projection as the interface will be missing, and so on

I had thought of that, and indeed it seems very nice... I'd be interested in @gavinking's opinion though. I suspect there's a very good reason not to have done this in JPA and we're simply missing it :)

One thing that makes this solution kind of weak is that the proposed interfaces are still quite broad-scoped. A field being "aggregable" doesn't necessarily mean you can run a terms aggregation on it -- it has to be a text field for that.
We could create one interface per possible operation -- basically one interface per trait -- but would this be wise?

Or it'll just generate too many classes and isn't really needed? (alternatively, we create classes for all possible combinations of the options and use them (Maybe we can generate these as part of the metamodel, rather than maintain them manually).

Definitely too many classes if done naively. We need to use the same class definition for all fields that share the same traits (not the same type, just the same traits!). I suspect that would only lead to about dozen classes even on complex applications, even if we went with one interface per trait.

@gavinking
Copy link
Member

I had thought of that, and indeed it seems very nice... I'd be interested in @gavinking's opinion though. I suspect there's a very good reason not to have done this in JPA and we're simply missing it :)

Well, we actually do have something like that in the Jakarta Data static metamodel.

The reason it's not particularly useful in JPA is that it maps to SQL and in SQL essentially every type is sortable, projectable, aggregable, etc.

That said, there is value in having things like TextualAttribute, NumericAttribute, etc, though it's very much specific to the type system you're trying to model.

@yrodiere
Copy link
Member

yrodiere commented May 2, 2024

Right, okay. So I think we could go with one interface per trait @marko-bekhta .

I'd suggest creating a follow-up issue for backend-specific traits, because that is going to be lots of fun. We may end up merging pojo-base into the engine. => Actually forget this, it's a really dumb suggestion, I was confused.

@gavinking
Copy link
Member

Note that I guess I kinda slightly oversimplified the situation with JPA, since actually there are some constraints on what you can do with different types of expressions, but they are encoded into constraints on type parameters of methods of CriteriaBuilder.

This works, but it's a solution that does involve some verbosity. @beikov has proposed introducing typed expression interfaces to JPA,though they're not really precisely the sort of "traits" you're thinking about here.

@marko-bekhta marko-bekhta force-pushed the feat/HSEARCH-3319-Type-safe-field-references branch from 4ba16cb to bdedc9b Compare May 7, 2024 09:00
Comment on lines +66 to +83
SearchScope<EntityB> scope = searchSession.scope( List.of( EntityB.class, EntityC.class ) );
SearchPredicate searchPredicate = scope.predicate().match().field( EntityB_.stringA ).matching( "b" ).toPredicate();
SearchPredicate searchPredicate2 =
scope.predicate().match().field( EntityC_.stringC ).matching( "c" ).toPredicate();

assertThat(
searchSession.search( scope )
.select( f -> f.field( EntityC_.stringA ) )
.where( searchPredicate )
.fetchHits( 20 )
).containsOnly( "b" );

assertThat(
searchSession.search( scope )
.select( f -> f.field( EntityC_.stringA ) )
.where( searchPredicate2 )
.fetchHits( 20 )
).containsOnly( "c" );
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This particular example worked nice, but ....

if we'd have a

@Entity
@Indexed
public static class EntityA {
    @Id
    Long id;

    @FullTextField(projectable = Projectable.YES)
    String stringA;
}

@Entity
@Indexed
public static class EntityB {
    @Id
    Long id;

    @FullTextField(projectable = Projectable.YES)
    String stringA;

    @FullTextField(projectable = Projectable.YES)
    String stringB;
}

@Entity
@Indexed
public static class EntityC {
    @Id
    Long id;

    @FullTextField(projectable = Projectable.YES)
    String stringA;

    @FullTextField(projectable = Projectable.YES)
    String stringC;
}

then the best scope limit we'd have would be an Object:

SearchScope<Object> scope = searchSession.scope( List.of( Entity2B.class, Entity2C.class ) );

and with that, anything can be applied:

// while the path is there, the EntityB isn't in the scope
SearchPredicate searchPredicate = scope.predicate().match().field( EntityB_.stringA ).matching( "b" ).toPredicate();

// while the path is there, what is EntityC ? 
.select( f -> f.field( EntityC_.stringA ) )

I was thinking about using something else instead of entity type, maybe the generated root type EntityA_ but for that I haven't found a nice entry point to pass the type. Unless we also introduce:

searchSession.search( EntityA_.self() )

// where self returns something like
interface ScopeReference<T, E> {
    Class<T> rootReferenceType(); // EntityA_
    Set<Class<? extends E> scopeClasses(); // EntityA.class ... 
}

and similar things for the scope methods receiving this generated "root" (which should also work with those unions we've been talking about on the call yesterday)...

WDYT @yrodiere ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd have expected field(...) to accept a *Reference<? super E>, see my other comment; but maybe I forgot what our conclusions were yesterday.
In any case, <? super E> instead of <? extends E> would solve your specific problem here.

For targeting multiple types in the same query, we would address that as a follow-up with the "generated union types".

For targeting fields that are only present in some subtypes of the target type, we would address that as part of https://hibernate.atlassian.net/browse/HSEARCH-3434 (though it would be about much more than predicates, obviously).

Copy link
Member Author

@marko-bekhta marko-bekhta May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look at this😃 !
+1 on ? super E

For targeting multiple types in the same query, we would address that as a follow-up with the "generated union types".

Yeah, that's what I've started to think about, and how we could do it, so we wouldn't need to go through all these interfaces one more time 🙈 😈
That's why I was thinking if we do this:

searchSession.search( EntityA_.scope() )

// where scope returns something like
interface ScopeReference<R, T> {
    Class<R> rootReferenceType(); // EntityA_
    Set<Class<T> scopeClasses(); // EntityA.class ... 
}

And then the "generated union type" is e.g.:

public static class Entity2A_union_Entity2B_ {
	public static ValueFieldReference1<Entity2A_union_Entity2B_, String, String, String> stringA;

	ReferenceScope<Entity2A_union_Entity2B_, Object> scope() {
		return new ReferenceScope<>() {
			@Override
			public Class<Entity2A_union_Entity2B_> rootReferenceType() {
				return Entity2A_union_Entity2B_.class;
			}

			@Override
			public Set<Class<?>> scopeClasses() {
				return Set.of( Entity2A.class, Entity2B.class );
			}
		};
	}
	
	static {
		stringA = ValueFieldReference1.of( "stringA", Entity2A_union_Entity2B_.class, String.class, String.class,
				String.class );
	}
}

and for non-union:

public static class Entity2A_ {
	public static ValueFieldReference1<Entity2A_, String, String, String> stringA;

	ReferenceScope<Entity2A_, Entity2A> scope() {
		return new ReferenceScope<>() {
			@Override
			public Class<Entity2A_> rootReferenceType() {
				return Entity2A_.class;
			}

			@Override
			public Set<Class<Entity2A>> scopeClasses() {
				return Set.of( Entity2A.class );
			}
		};
	}

	static {
		stringA = ValueFieldReference1.of( "stringA", Entity2A_.class, String.class, String.class, String.class );
	}
}

with that, we won't even need ? super R it'll all just be strictly limited to R itself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's why I was thinking if we do this:

Makes sense, but using the word scope is likely to cause confusion... We'll need to think about it.

And then the "generated union type" is e.g.:

LGTM.

with that, we won't even need ? super R it'll all just be strictly limited to R itself.

What about when an entity extends another, though? We may want to use fields from the supertype's static metamodel.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#4156

I've tried out this idea in this other PR ^.

searchSession.search( EntityC_.scope )
	.select( f -> f.field( EntityC_.stringA ) )
	.where( f -> f.match().field( EntityC_.stringC ).matching( "c" ) )
	.fetchHits( 20 )
SearchScope<EntityB_union_EntityC_, EntityB> scope = EntityB_union_EntityC_.scope.scope( searchSession );
searchSession.search( scope )
	.select( f -> f.field( EntityB_union_EntityC_.stringA ) )
	.where( searchPredicate )
	.fetchHits( 20 )

Makes sense, but using the word scope is likely to cause confusion... We'll need to think about it.

yeah ... that's true 😃
so maybe instead of EntityB_union_EntityC_.scope.scope( searchSession ); do EntityB_union_EntityC_.scope.create( searchSession );

What about when an entity extends another, though? We may want to use fields from the supertype's static metamodel.

In that other PR everything is tied to the generated class, e.g. EntityC_, and I was thinking that if we start with the EntityC_ there's no reason to switch to a different type as this EntityC_ should contain all that is accessible within the scope created by that same generated class EntityC_

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that other PR everything is tied to the generated class, e.g. EntityC_, and I was thinking that if we start with the EntityC_ there's no reason to switch to a different type as this EntityC_ should contain all that is accessible within the scope created by that same generated class EntityC_

There definitely is a reason: reuse. Think of people having entities that extend e.g. some AuditableEntity with a creationDate and lastUpdateDate. I can definitely see them define a util to add a predicate on any subtype to restrict by last update date, and that util will not be able to take advantage of the metamodel of the specific subtype, it'll have to rely on AuditableEntity_.

... Which I guess raises the question of @MappedSuperclass xD Not today, though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, ok 😃 we probably can make it work for the "root" types, the inheritance part, I mean.. but not for any embeddable. (I'll give it a try in that PR)

public Optional<ElasticsearchSearchAggregationFactory> extendOptional(
SearchAggregationFactory original) {
public Optional<ElasticsearchSearchAggregationFactory<E>> extendOptional(
SearchAggregationFactory<?> original) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SearchAggregationFactory<?> original) {
SearchAggregationFactory<E> original) {

See SearchProjectionFactoryExtension.

*/
@SuppressWarnings("unchecked")
default <T> MatchPredicateFieldMoreGenericStep<E, ?, ?, T, MatchPredicateFieldReference<? extends E, T>> field(
MatchPredicateFieldReference<? extends E, T> field) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be ? super E? I mean, logically?

This is precisely what we've been talking about, but I don't remember the conclusion we reached.


/**
* The initial step in a "match" predicate definition, where the target field can be set.
*/
public interface MatchPredicateFieldStep<N extends MatchPredicateFieldMoreStep<?, ?>> {
public interface MatchPredicateFieldStep<E, N extends MatchPredicateFieldMoreStep<E, ?, ?>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to pick a different name for the generic type parameter, as the root is not always an entity type.
Maybe R?

Comment on lines +66 to +83
SearchScope<EntityB> scope = searchSession.scope( List.of( EntityB.class, EntityC.class ) );
SearchPredicate searchPredicate = scope.predicate().match().field( EntityB_.stringA ).matching( "b" ).toPredicate();
SearchPredicate searchPredicate2 =
scope.predicate().match().field( EntityC_.stringC ).matching( "c" ).toPredicate();

assertThat(
searchSession.search( scope )
.select( f -> f.field( EntityC_.stringA ) )
.where( searchPredicate )
.fetchHits( 20 )
).containsOnly( "b" );

assertThat(
searchSession.search( scope )
.select( f -> f.field( EntityC_.stringA ) )
.where( searchPredicate2 )
.fetchHits( 20 )
).containsOnly( "c" );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd have expected field(...) to accept a *Reference<? super E>, see my other comment; but maybe I forgot what our conclusions were yesterday.
In any case, <? super E> instead of <? extends E> would solve your specific problem here.

For targeting multiple types in the same query, we would address that as a follow-up with the "generated union types".

For targeting fields that are only present in some subtypes of the target type, we would address that as part of https://hibernate.atlassian.net/browse/HSEARCH-3434 (though it would be about much more than predicates, obviously).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants