Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update PolarisEclipseLinkStore to work with more databases #247

Merged
merged 33 commits into from
Sep 16, 2024

Conversation

eric-maynard
Copy link
Contributor

@eric-maynard eric-maynard commented Sep 3, 2024

Description

Some of the code used by EclipseLink was not compatible with MySQL (or engines other than Postgres). This PR introduces several changes to the persistence layer while maintaining backwards compatibility.

Long VARCHARs

Many databases have more restrictive limits on column sizes than Postgres.

To support them, long VARCHARs in the persistence layer (properties and internalProperties) have been changed to TEXT.

Sequence Support

POLARIS_SEQ relies on a NEXTVAL native query that is not supported by most databases. It's removed, and backwards compatibility is preserved.

The sequence POLARIS_SEQ is deprecated and will get renamed to POLARIS_SEQ_UNUSED the first time it's used.

After POLARIS_SEQ is used that last time, POLARIS_SEQUENCE will be used as a sequence table and will track all IDs that have been assigned. From that point forward, the POLARIS_SEQUENCE table will be used to obtain new IDs.

If Polaris starts up and POLARIS_SEQ was never created or it was already renamed, then it should just be ignored.

Fixes #241

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • Documentation update
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

With this change, I am able to bootstrap Polaris backed by MySQL (view previous commits for e.g. persistence.xml changes)

Checklist:

Please delete options that are not relevant.

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings

@eric-maynard eric-maynard marked this pull request as draft September 3, 2024 22:29
@eric-maynard eric-maynard marked this pull request as ready for review September 4, 2024 00:40
@eric-maynard
Copy link
Contributor Author

I've refactored and will change the description to reflect the new approach.

@@ -280,8 +283,8 @@ public void runActionInTransaction(
LOGGER.debug("transaction committed");
}
} catch (Exception e) {
LOGGER.debug("Rolled back transaction due to an error", e);
tr.rollback();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this rollback fails, we never saw any logging previously.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we retry in certain case(e.g., database isn't available temporily) of rollback failure? It's not a blocker for this PR though. We can fix it in a follow-up PR if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed this is not in scope for this PR, but maybe we could consider it. As long as we maintain transactionality, anything is ok.

However my guess is that most (99+%) of failures will be non-recoverable. Before changing this logic, I would advocate for us collecting stats on recoverable failures and perhaps trying to tailor the retry to only recoverable failures.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, we would retry if OptimisticLockException is hit - concurrent access. Yeah, we can collect what can be retried here. I remember the caller will do retry. Is that right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my tests, no the caller did not retry if this query failed.

Copy link
Contributor

@flyrain flyrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @eric-maynard for working on this. LGTM overall. Left some minor comments.

@@ -280,8 +283,8 @@ public void runActionInTransaction(
LOGGER.debug("transaction committed");
}
} catch (Exception e) {
LOGGER.debug("Rolled back transaction due to an error", e);
tr.rollback();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we retry in certain case(e.g., database isn't available temporily) of rollback failure? It's not a blocker for this PR though. We can fix it in a follow-up PR if needed.

* Generates a new ID from `POLARIS_SEQUENCE` or `POLARIS_SEQ` depending on availability. If
* `POLARIS_SEQ` exists, it will be renamed to `POLARIS_SEQ_UNUSED`.
*/
public static Long getNewId(EntityManager session) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renaming suggestion: getNewId() -> nextSequence()? Id is OK to use, but here we may use sequence for consistency.

Copy link
Contributor Author

@eric-maynard eric-maynard Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sequence is kind of an overloaded term. The table itself can be considered like a sequence. Previously, we used the postgres object of type "SEQUENCE" which is what caused issues.

Really, the user wants a single ID (BIGINT type) and they do not want a SEQUENCE object. So the PolarisSequenceManager, which manages the lifecycle of the sequence object/table, uses that sequence to generate and return an ID here.

Comment on lines 53 to 88
private static synchronized Optional<Long> getSequenceId(EntityManager session) {
DatabasePlatform databasePlatform = getDatabasePlatform(session);
if (databasePlatform instanceof PostgreSQLPlatform) {
Optional<Long> result = Optional.empty();
if (!sequenceCleaned.get()) {
try {
LOGGER.info("Checking if the sequence POLARIS_SEQ exists");
String checkSequenceQuery =
"SELECT COUNT(*) FROM information_schema.sequences WHERE sequence_name IN ('polaris_seq', 'POLARIS_SEQ')";
int sequenceExists =
((Number) session.createNativeQuery(checkSequenceQuery).getSingleResult()).intValue();

if (sequenceExists > 0) {
LOGGER.info("POLARIS_SEQ exists, calling NEXTVAL");
long queryResult =
(long) session.createNativeQuery("SELECT NEXTVAL('POLARIS_SEQ')").getSingleResult();
result = Optional.of(queryResult);
} else {
LOGGER.info("POLARIS_SEQ does not exist, skipping NEXTVAL");
}
} catch (Exception e) {
LOGGER.info(
"Encountered an exception when checking sequence or calling `NEXTVAL('POLARIS_SEQ')`",
e);
}
if (result.isPresent()) {
removeSequence(session);
}
sequenceCleaned.set(true);
}
return result;
} else {
LOGGER.info("Skipping POLARIS_SEQ / NEXTVAL check for platform " + databasePlatform);
return Optional.empty();
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may extract a new method for Postgres like this, so that we confine the database specific code into to a separated method.

  private static synchronized Optional<Long> getSequenceId(EntityManager session) {
    DatabasePlatform databasePlatform = getDatabasePlatform(session);
    if (databasePlatform instanceof PostgreSQLPlatform) {
      return getSequenceFromPostgres(session);
    } else {
      return Optional.empty();
    }
  }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not strictly Postgres that supports SEQUENCE -- H2 also supports it, and other DBs support similar logic with different syntax.

Really the goal here is to patch this ASAP and then migrate all new usage to the table. Eventually, we can yank all this postgres-specific code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I wasn't clear in my first comment. The code is mainly for migration in case anyone is already using Postgres in production. We have two approaches:

  1. Ignore it, assuming no one is actually using it in production yet—especially since Apache Polaris hasn't had its first release.
  2. Wrap the migration logic into a method so we can easily remove it in the future.

I prefer the second option. With that, we could make the following changes:

  1. Call the migration code only in the initialization method, instead of reusing it in getNewId(), since the migration happens only once. In case of migration failure, we should just throw.
  2. Simplify getNewId(), as we don't need to distinguish if it's migrated or not. Specifically, we can remove the line getSequenceId(session).ifPresent(modelSequenceId::setId);.
  3. Eliminate PolarisSequenceUtil::sequenceCleaned and PolarisEclipseLinkStore::initialized, as we throwed in case of migration failure, there is no need to check if migration actually happened.

Copy link
Contributor Author

@eric-maynard eric-maynard Sep 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The migration code is getSequenceId itself. Eventually, it can be replaced with return Optional.empty() or the method can be ripped out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved most of this into initialize and shifted some things around -- let me know if this more closely matches your expectations

Comment on lines 106 to 108
// If a legacy sequence ID is present, use that as an override:
getSequenceId(session).ifPresent(modelSequenceId::setId);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it is easier to understand if we do this

    ModelSequenceId modelSequenceId = new ModelSequenceId();

    DatabasePlatform databasePlatform = getDatabasePlatform(session);
    if (databasePlatform instanceof PostgreSQLPlatform) {
      getSequenceFromPostgres(session).ifPresent(modelSequenceId::setId);
    }
    
    // Persist the new ID:
    session.persist(modelSequenceId);
    session.flush();

Copy link
Contributor Author

@eric-maynard eric-maynard Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to encapsulate the implementation of getSequenceId in that private method.

The responsibility of getNewId is simply to create a ModelSequenceId and persist it. The implementation of how the ModelSequenceId gets its ID is not that relevant to readers of getNewId. I would be opposed to leaking getSequenceId implementation details (e.g. databases X, Y support method A, database Z supports method B) into this getNewId.

@@ -280,8 +283,8 @@ public void runActionInTransaction(
LOGGER.debug("transaction committed");
}
} catch (Exception e) {
LOGGER.debug("Rolled back transaction due to an error", e);
tr.rollback();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, we would retry if OptimisticLockException is hit - concurrent access. Yeah, we can collect what can be retried here. I remember the caller will do retry. Is that right?

* available it will be used then cleaned up. In all other cases the `POLARIS_SEQUENCE` table is
* used directly.
*/
class PolarisSequenceUtil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand the purpose of this change? Is this to handle different syntax for different databases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing code seems to only work with Postgres or a very limited set of databases because of the native query. We need to move away from that and choose a construction that works with all/most databases that EclipseLink supports. At the same time, we need to preserve the IDs already generated by any existing Postgres deployments.

Copy link
Contributor

@flyrain flyrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@flyrain
Copy link
Contributor

flyrain commented Sep 16, 2024

Thanks @eric-maynard for the PR. As a followup, we may figure out a deprecating strategy for the one-time migration. Hopefully we can have a release soon, so that we have a solid deprecating way based on the versions.

Thanks @aihuaxu for the review.

@sfc-gh-aixu
Copy link
Contributor

@eric-maynard We have testCreateTasksInParallel() to test the id generation to make sure they are unique. Can you help run PolarisEclipseLinkMetaStoreManagerTest.testCreateTasksInParallel() against mysql database if we haven't done that?

@eric-maynard
Copy link
Contributor Author

@aihuaxu after this change, Postgres should go through the new code path as well. So the existing test actually tests the new code path. What is not necessarily covered is the migration path, where you start using the old code and then upgrade. Big +1 to @flyrain's comment that we need a versioning & deprecation strategy. I think that will also help test things like this (cross-version compatibility)

@sfc-gh-aixu
Copy link
Contributor

+1.

@flyrain flyrain merged commit 537eee6 into apache:main Sep 16, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG]Bootstrapping errors for MySQL
3 participants