This repository has been archived by the owner on Nov 28, 2024. It is now read-only.

use an index name that is guaranteed to be no more than 255 chars #21

Open
wants to merge 3 commits into
base: main

Conversation

@zcourts zcourts commented Jun 30, 2018

Ignite indices have a char limit of 255 on the name.
The current naming scheme easily exceeds that in complex models because it uses the field names on the entity.
This PR uses a random UUID instead, ensuring that limit is never hit.
I considered using "tableName + UUID" but that just makes it less likely since it's possible to have entity names longer than the 127 chars that the UUID leaves us with.
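To make the failure mode concrete, here is a minimal sketch of how a name built from the table name plus the joined column names can grow past the limit. The class, table, and column names are hypothetical; only the 255 limit and the `tableName_col1_col2` scheme come from this PR.

```java
import java.util.ArrayList;
import java.util.List;

class IndexNameLengthDemo {

    // Limit on index names as stated in this PR description.
    static final int MAX_INDEX_NAME_LENGTH = 255;

    // Mirrors the naming scheme under discussion: table name, then each column name, joined by '_'.
    static String defaultIndexName(String tableName, List<String> columns) {
        return tableName + '_' + String.join("_", columns);
    }

    public static void main(String[] args) {
        // Hypothetical column names of the kind a complex model might produce.
        List<String> columns = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            columns.add("customerShippingAddress_line" + i);
        }
        String name = defaultIndexName("CustomerOrderHistory", columns);
        System.out.println(name.length() + " chars, over the limit: "
                + (name.length() > MAX_INDEX_NAME_LENGTH));
    }
}
```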

For reference an example of the current scheme failing manifests itself as in the attached file/stacktrace.
ignite-ogm-index-name-error.txt

@Hibernate-CI
Contributor

Can one of the admins add this person to the trusted builders? (reply with: "add to whitelist" or "ok to test")

@schernolyas
Contributor

schernolyas commented Jun 30, 2018

Hi @zcourts !

Thanks for the PR!
Before we merge it, could you please create an issue in the project's Jira?
The project's Jira URL is http://hibernate.atlassian.net

Also, do you use the dialect for your project? What are your impressions?
Thanks for the feedback!

@zcourts
Author

zcourts commented Jun 30, 2018

Hi @schernolyas
I've created https://hibernate.atlassian.net/browse/OGM-1507

We're migrating an in-house ORM-like library we built to do this with Ignite.
So far it has been ok. No major issue or complaint.
Having never used Hibernate ORM, or OGM at all before this project, it's been more straightforward than I'd expect.
The ignite dialect itself I've only just switched over to using, the initial spike was to investigate only and I did the spike using the Infinispan embedded dialect. The switch over was quick, having learnt what was needed using Infinispan.

The docs are a pain point. If I hadn't done Infinispan first, I suspect it'd be a lot harder to use this dialect. The AsciiDoc files in the documentation module were invaluable in learning what configs were available to start.

Having the ability to choose an existing Ignite instance started earlier in the JVM was crucial, I'd have had to add that option if it wasn't already present. We have a huge amount of infrastructure already built around Ignite and customising various pieces.

My initial test works fine, but having switched to the patch branch I'm actually now getting an IndexOutOfBoundsException, value.getColumnIterator() in https://github.com/hibernate/hibernate-ogm-ignite/blob/master/ignite/src/main/java/org/hibernate/ogm/datastore/ignite/impl/IgniteCacheInitializer.java#L284 returns more entries than ( (ComponentType) type ).getSubtypes().
I'm investigating, will either report a bug later or another PR/fix if it's not my fault.

My next port of call is investigating how I can provide a custom CacheConfiguration. I've already found https://github.com/hibernate/hibernate-ogm-ignite/blob/master/ignite/src/main/java/org/hibernate/ogm/datastore/ignite/impl/IgniteCacheInitializer.java#L196 but I'm still finding my way around the code, so it's unclear at the moment how I'll approach this. I definitely need to customise the cache configs, though. Potentially another PR (I keep thinking there's probably a means of exposing dialect-specific stuff but haven't found it. I know about unwrap but haven't investigated it as a potential route yet).

It's still early days (though I'm hoping a migration to this is completed and ready for integration tests next week). Once I've spent more time, happy to provide feedback.

EDIT:

  1. While I'm at it, is there a reason OGM-1472 Rework of rendering JPQL into Ignite SQL #19 hasn't been merged?
  2. The version in the pom is 5.3.0-SNAPSHOT but 5.3.1-Final is already in the tree. Is that intentional?

@DavideD
Member

DavideD commented Jul 4, 2018

While I'm at it, is there a reason #19 hasn't been merged?

There were some conflicts when I reviewed it the first time, and then I went on holiday. I've just come back and am going through the accumulated emails and new PRs.

The version in the pom is 5.3.0-SNAPSHOT but 5.3.1-Final is already in the tree. Is that intentional?

I will check; it looks like an error. Thanks for telling us.

Member

@DavideD DavideD left a comment

Thanks for providing a PR.
A few notes about the way we work:

  • A PR should also have a test case, in particular we would need a way to verify what happens when the problem occurs
  • For commit messages we use the guidelines here: https://chris.beams.io/posts/git-commit/ with the following template (issue code + first letter Uppercase): [OGM-1500] Use an index ...

I'm not convinced about this solution, can't the user choose a different index name using the javax.persistence.@Index annotation?

@@ -180,7 +181,7 @@ private void appendIndex(QueryEntity queryEntity, AssociationKeyMetadata associa
fields.put( realColumnName, true );
}
queryIndex.setFields( fields );
-queryIndex.setName( queryEntity.getTableName() + '_' + org.hibernate.ogm.util.impl.StringHelper.join( fields.keySet(), "_" ) );
+queryIndex.setName( UUID.randomUUID().toString().replace( "-", "_" ) );
Member

A random name will work, but it won't be very user-friendly, making it hard to debug errors or reproduce problems.
Let's keep the previous approach and do something different only if the string is longer than 255 chars. What happens if the index name is longer than 255 chars? An error?

@DavideD
Member

DavideD commented Jul 4, 2018

@zcourts I left a few comments about this PR, thanks a lot for your help.

Feel free to ask us further questions if something is not clear.
Cheers

@Salauyou
Contributor

Salauyou commented Jul 4, 2018

Agree with @DavideD. Index names should be more readable.
For example, if length > 255, you take first N chars and append an autoincremented number to ensure uniqueness, like it's done for aliases in queries:
TABLE_NAME_VERY_LONG...._COL_001
TABLE_NAME_VERY_LONG...._COL_002
etc.
If length <= 255, it would be left unchanged.
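A minimal sketch of the truncate-and-number idea described above, assuming the limit is checked in chars and that names are plain ASCII. The class and method names are hypothetical and not part of the dialect.

```java
import java.util.concurrent.atomic.AtomicInteger;

class TruncatingIndexNamer {

    static final int MAX_LENGTH = 255;

    // Shared counter so that two truncated names never end in the same number.
    private final AtomicInteger counter = new AtomicInteger();

    String shorten(String candidate) {
        if (candidate.length() <= MAX_LENGTH) {
            return candidate; // short names are left unchanged
        }
        String suffix = String.format("_%03d", counter.incrementAndGet());
        // Keep as much of the readable prefix as fits next to the suffix.
        return candidate.substring(0, MAX_LENGTH - suffix.length()) + suffix;
    }

    public static void main(String[] args) {
        TruncatingIndexNamer namer = new TruncatingIndexNamer();
        System.out.println(namer.shorten("TABLE_NAME_COL"));
    }
}
```

One consequence of this design is that the number depends on the order in which indexes are processed, so a name is only stable across restarts if that order is stable.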

@DavideD
Member

DavideD commented Jul 4, 2018

That would work. A simpler solution could be to throw an exception at startup and let the user rewrite the default name. I don't think this situation will happen often.

@DavideD
Member

DavideD commented Jul 4, 2018

In general, we prefer to give users enough information so that they can easily solve issues in whatever way is best for them.

@Salauyou
Contributor

Salauyou commented Jul 4, 2018

@DavideD agree. A better way would be to throw an exception proposing to reduce table/column names or define this index manually in @Table(indexes =...).

@zcourts
Author

zcourts commented Jul 10, 2018

That's fair enough. I'll add a test case for > 255 chars raising an exception and a success case for using javax.persistence.@Index to manually name the index. Will update the PR to address the comments raised.

@schernolyas
Contributor

Hi @zcourts !

What is the status of the PR?

@zcourts
Author

zcourts commented Aug 9, 2018

Hi @schernolyas
I was busy that week and this slipped off my radar. I just attempted the fix with some tests. It needs a bit of guidance - see the questions I've raised above

@zcourts
Author

zcourts commented Aug 15, 2018

@schernolyas / @DavideD ping ^^

@DavideD
Member

DavideD commented Aug 15, 2018

  • see the questions I've raised above

What questions?

String indexName = queryEntity.getTableName() + '_' + org.hibernate.ogm.util.impl.StringHelper.join( fields.keySet(), "_" );
Class<?> tableClass = context.getTableEntityTypeMapping().get( associationKeyMetadata.getAssociatedEntityKeyMetadata().getEntityKeyMetadata().getTable() );
javax.persistence.Table tableAnnotation = tableClass.getAnnotation( javax.persistence.Table.class );
if( tableAnnotation != null && tableAnnotation.indexes().length > 0) {
Author

@DavideD / @schernolyas how do you propose this should be handled?
Here I'm just taking the first non-empty @Index name and using it...I'm not convinced that's the way to go but @Index cannot be applied to the classes directly.

Also, should the "columnList" be used as part of the name, if so, how?

edit
What about org.hibernate.search.annotations.Index - it's allowed on classes?

Member

What about org.hibernate.search.annotations.Index - it's allowed on classes?

It's for full-text searches, and only if one wants to use Hibernate Search.
It's not suitable for this case, I think.

Member

I don't understand what's going on with this part of the code, why would you take the first empty index name? Isn't an index related to a field? What's the logic here?

Member

@DavideD DavideD Aug 16, 2018

It's possible to create indexes in two ways:

  • @Table( indexes = {@Index( columnList= "field1,field2", name="" )})
  • @Index( ... ) on a field of the class

*/
public class LongIndexNameTest extends OgmTestCase {

@Test(expected = HibernateException.class)
Author

@DavideD / @schernolyas - I need a bit of guidance on how to achieve this.
The HibernateException is raised as expected, but because OgmTestCase does its setup before calling the test itself, the exception isn't caught and causes the test to fail. How do I allow the exception to be raised and then verify that it was raised, so that the test passes?

Member

@DavideD DavideD Aug 16, 2018

There are several examples in our test base about checking for exceptions, look for the following line in test classes:

	@Rule
	public ExpectedException thrown = ExpectedException.none();

Member

@@ -180,7 +181,21 @@ private void appendIndex(QueryEntity queryEntity, AssociationKeyMetadata associa
fields.put( realColumnName, true );
}
queryIndex.setFields( fields );
-queryIndex.setName( queryEntity.getTableName() + '_' + org.hibernate.ogm.util.impl.StringHelper.join( fields.keySet(), "_" ) );
+String indexName = queryEntity.getTableName() + '_' + UUID.nameUUIDFromBytes( org.hibernate.ogm.util.impl.StringHelper.join( fields.keySet(), "_" ).getBytes() );
Author

Is this sufficient without the use of the @Index? It makes it extremely unlikely for this to happen and it was already a rare case.
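For reference, a small sketch of what the name-based UUID in the diff above provides: `UUID.nameUUIDFromBytes` is deterministic, so the same column list always yields the same 36-character suffix regardless of how many columns there are. The class and method names are hypothetical, and unlike the diff this sketch passes an explicit charset to `getBytes`.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

class HashedIndexName {

    // Table name stays readable; the column list is reduced to a fixed-length, name-based UUID.
    static String indexName(String tableName, String joinedColumns) {
        UUID hash = UUID.nameUUIDFromBytes(joinedColumns.getBytes(StandardCharsets.UTF_8));
        return tableName + '_' + hash;
    }

    public static void main(String[] args) {
        // The same input always produces the same index name.
        System.out.println(indexName("MyTable", "column23_column34"));
        System.out.println(indexName("MyTable", "column23_column34"));
    }
}
```

The trade-off is that the column names can no longer be read back from the index name, and the table name alone can still exceed the limit.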

Member

Why wouldn't you simply concatenate the fields? What's the benefit of adding the UUID as a suffix?

@zcourts
Author

zcourts commented Aug 16, 2018

Didn't realise "PENDING" meant I had to submit before anyone else could see the comments

}
}
}
if ( indexName.getBytes().length > 255 ) {
Member

How come you aren't using indexName.length()?

Author

Because Ignite's limit is of bytes not char length. String.length will give num of chars in the string but isn't necessarily the same as num of bytes.
Just verified in the Ignite code base as well and realised this still has a bug, Ignite specifically uses UTF_8 https://github.com/apache/ignite/blob/cc370d6cfef4a9d82761cc70fcb3bbeb0f91ab94/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/persistence/IndexStorageImpl.java#L108 so I'll update this to do the same
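The difference between the two measures can be sketched in a few lines. The class name and sample strings are hypothetical; the only point is that `String.length()` counts UTF-16 code units, while the check discussed here needs the UTF-8 encoded size.

```java
import java.nio.charset.StandardCharsets;

class IndexNameByteLength {

    // Size of the name once encoded as UTF-8, which is what the limit applies to per the comment above.
    static int utf8Length(String name) {
        return name.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "MyTable_column";
        String accented = "MaTable_pr\u00e9nom"; // U+00E9 takes two bytes in UTF-8
        System.out.println(ascii.length() + " chars, " + utf8Length(ascii) + " bytes");
        System.out.println(accented.length() + " chars, " + utf8Length(accented) + " bytes");
    }
}
```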

Member

Imagine a table MyTable with 100 columns, and at some point you see the error:
Encoded index name is too long for index: MyTable_189bbbb0-0c5f-3fb7-bba9-ad9285f193d1.

How do you know which column caused the issue?

Wouldn't you prefer something like:
Encoded index name is too long for index MyTable_Column23_column34

We could also add in the message the name of the table and the column used to create the index as additional information.

Member

You also didn't add the charset StandardCharsets.UTF_8:
byte[] idxNameBytes = idxName.getBytes(StandardCharsets.UTF_8);

This means that if OGM runs on a machine with a different charset the validation might not be correct.

}
}

@Entity(name = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec dapibus cursus vestibulum. Quisque eu justo non mi tincidunt sagittis. Donec tincidunt facilisis placerat. Sed placerat urna eget tristique faucibus. Curabitur maximus gravida enim, vitae sed")
Member

Shouldn't this be @Table( name = "...") ? I don't think the attribute name in @Entity will affect the mapping in Ignite

@OneToMany(targetEntity = LongEntityName.class)
private Set<LongEntityName> longEntityNames = new HashSet<>();

@javax.persistence.JoinTable(name = "joinLongEntityName")
Member

Is this annotation required?

try ( OgmSession session = openSession() ) {
Set<QueryIndex> indexes = IgniteTestHelper.getIndexes( session.getSessionFactory(), EntityWithCustomIndex.class );
assertThat( indexes.size() ).isEqualTo( 1 );
assertThat( indexes.iterator().next().getName() ).isEqualToIgnoringCase( "SimpleIndexName" );
Member

@DavideD DavideD Aug 23, 2018

Nice!

You could make this test even better using:
assertThat( indexes ).onProperty( "name" ).containsOnly( "SimpleIndexName" )

instead of

assertThat( indexes.size() ).isEqualTo( 1 );
assertThat( indexes.iterator().next().getName() ).isEqualToIgnoringCase( "SimpleIndexName" ); 

By the way, is Ignite case-insensitive? What happens if we have two indexes on two different columns with the same name but different case? For example: "IndexNAME" and "Indexname".

Member

I think we need some additional test cases:

  1. What happens when the index name looks something like:
    @Table(indexes = {@Index(name = "Idx @#\\@    weird chars", columnList = "id")})
    
    Is that even possible?
  2. What happens if there are two indexes created on different fields in the same class?
  3. What happens if two indexes have the same name except for the case? Example: "IndexNAME" and "Indexname"
  4. What happens when the index spans multiple columns? Will it work using @Index( columnList = "field1,field2")?

This is just a question because I'm not familiar with it: Can we support the unique property in the @Index annotation?
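Whether Ignite treats index names case-insensitively is left open in this thread. If it turns out to matter, a test or a startup check could detect names that differ only by case along these lines. This is a sketch with hypothetical class and method names, not code from the dialect.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class IndexNameCaseCollisions {

    // Groups names by their lower-cased form and keeps only the groups with more than one member.
    static Map<String, List<String>> collisions(List<String> indexNames) {
        Map<String, List<String>> byLowerCase = new LinkedHashMap<>();
        for (String name : indexNames) {
            byLowerCase.computeIfAbsent(name.toLowerCase(Locale.ROOT), k -> new ArrayList<>()).add(name);
        }
        byLowerCase.values().removeIf(group -> group.size() < 2);
        return byLowerCase;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("IndexNAME");
        names.add("Indexname");
        names.add("OtherIndex");
        System.out.println(collisions(names));
    }
}
```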

Member

  1. Test that the index with the default name has the name we expect


@Test(expected = HibernateException.class)
public void testLongEntityIndexName() throws Exception {
fail( "The length of the registered entity's cache name should've failed this already" );
Member

This test is not good enough; the failure might be caused by something else.
How do you know that it failed because of the longer index? Actually, I don't think that's the cause of the failure. I think it's because you use @Entity( name="..." ) instead of @Table( name="..." ). But I might be wrong because I didn't run the test.

Member

I think JPAAPIWrappingTest#testUndefinedPU() might look similar to what you need.
You might want to have the following assertions (using the JUnit rule):

thrown.expect( HibernateException.class );
thrown.expectMessage( ... );

There are multiple examples of tests using the ExpectedException JUnit rule in the sources of the hibernate-ogm core project.

Member

By the way, sometimes the JUnit rule is not easy to use. The try-catch block is still a valid option.
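A minimal sketch of the try-catch approach in plain Java, without JUnit. `startUp` is a hypothetical stand-in for whatever bootstraps the session factory, and `IllegalStateException` stands in for `HibernateException`. Checking the message as well as the type addresses the concern that the failure might come from something else.

```java
class ExpectedFailureCheck {

    // Hypothetical stand-in for the startup code that rejects an over-long index name.
    static void startUp(String indexName) {
        if (indexName.length() > 255) {
            throw new IllegalStateException("Encoded index name is too long for index: " + indexName);
        }
    }

    // Returns true only if the action fails with the expected exception type AND message fragment.
    static boolean failsWithMessage(Runnable action, String expectedFragment) {
        try {
            action.run();
        } catch (IllegalStateException e) {
            return e.getMessage() != null && e.getMessage().contains(expectedFragment);
        }
        return false; // no exception at all also counts as a failed expectation
    }

    public static void main(String[] args) {
        StringBuilder longName = new StringBuilder();
        for (int i = 0; i < 300; i++) {
            longName.append('x');
        }
        System.out.println(failsWithMessage(() -> startUp(longName.toString()), "too long"));
    }
}
```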

}
}
if ( indexName.getBytes().length > 255 ) {
throw log.indexNameTooLong( indexName, queryEntity.getTableName() );
Member

Would it be possible to collect all the indexes with an invalid name and then throw an exception at the end?
It's not a critical problem, but when there is more than one error the user can see all the problems in one go, instead of
fixing and rerunning the application every time an index fails the validation.
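One way this could look, as a sketch only: validate every candidate name first, then raise a single exception listing all offenders. The class and method names are hypothetical, `IllegalStateException` stands in for the exception produced by `log.indexNameTooLong`, and the message wording is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class IndexNameCollector {

    static final int MAX_BYTES = 255;

    // Returns every name whose UTF-8 encoding exceeds the limit, in input order.
    static List<String> findTooLong(List<String> indexNames) {
        List<String> invalid = new ArrayList<>();
        for (String name : indexNames) {
            if (name.getBytes(StandardCharsets.UTF_8).length > MAX_BYTES) {
                invalid.add(name);
            }
        }
        return invalid;
    }

    // Throws once, after all names have been checked, so the user sees every problem in one run.
    static void validateAll(List<String> indexNames) {
        List<String> invalid = findTooLong(indexNames);
        if (!invalid.isEmpty()) {
            throw new IllegalStateException("Index names longer than " + MAX_BYTES
                    + " bytes: " + invalid
                    + ". Shorten the table/column names or name the index via @Table(indexes = ...).");
        }
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("MyTable_column23_column34");
        validateAll(names); // passes silently: the name is well under the limit
        System.out.println("all index names valid");
    }
}
```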

Member

@DavideD DavideD left a comment

Thanks a lot!

There are still some problems with this PR.

I'm not convinced about the generated name because it seems more complicated than we need (but I'm open to discuss it).

I hope the test I linked might be a good example to solve some of the problems in the test you added.

Cheers

@zcourts
Author

zcourts commented Aug 23, 2018

That's great feedback, thank you and thanks for the pointer to that file. I'm out of the office until Tuesday, I'll take a look at all of the points you've raised and address them next week.

Base automatically changed from master to main March 19, 2021 14:42
5 participants