HBASE-28996: Implement Custom ReplicationEndpoint to Enable WAL Backup to External Storage (Option 3) #6633
base: HBASE-28957
Conversation
💔 -1 overall
This message was automatically generated.
@vinayakphegde Please make this patch ready for review, if you're done with it.
Generally looks good to me, though I'm not sure if I would split this patch into two by separating the framework-related changes.
I would add a few trace-level log lines for debugging backup events. Usually I just leave the custom logging I used while developing the patch in the code; it tends to be useful for future debugging.
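A hedged sketch of what such trace logging could look like — the helper class, method, and variable names here are illustrative assumptions, not code from the patch:

import java.util.List;
import org.apache.hadoop.hbase.wal.WAL;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative helper only; in practice this would likely live in the endpoint itself.
final class BackupTraceLogging {
  private static final Logger LOG = LoggerFactory.getLogger(BackupTraceLogging.class);

  // Trace the shape of a batch of WAL entries before they are backed up.
  static void traceBatch(String peerId, List<WAL.Entry> entries) {
    if (LOG.isTraceEnabled() && !entries.isEmpty()) {
      WAL.Entry first = entries.get(0);
      WAL.Entry last = entries.get(entries.size() - 1);
      LOG.trace("Backing up {} WAL entries for peer {}: first table={}, seqId range=[{}, {}]",
        entries.size(), peerId, first.getKey().getTableName(), first.getKey().getSequenceId(),
        last.getKey().getSequenceId());
    }
  }
}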
import org.slf4j.LoggerFactory;

@InterfaceAudience.Private
public class BackupFileSystemManager {
Please add a javadoc comment to this class.
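Purely as a sketch of shape — the class's actual responsibilities aren't visible in this diff — the javadoc could look something like:

/**
 * Manages the FileSystem layout used for continuous WAL backup: resolving the
 * backup root and creating/looking up the directories that backed-up WAL files
 * are written into. (Wording here is an assumption based on the class name.)
 */
@InterfaceAudience.Private
public class BackupFileSystemManager {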
public class ContinuousBackupReplicationEndpoint extends BaseReplicationEndpoint {
  private static final Logger LOG =
    LoggerFactory.getLogger(ContinuousBackupReplicationEndpoint.class);
  public static final String CONF_PEER_UUID = "hbase.backup.wal.replication.peerUUID";
Do we need this to be configurable? Shouldn't we just generate it randomly?
No, I think for a particular target it should be unique and stay the same all the time.
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationEndpoint.java#L140-#L146
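If it only needs to be stable per target rather than user-configurable, one option (an assumption on my side, not what the patch does) would be to derive it deterministically from the peer id or target location, e.g.:

import java.nio.charset.StandardCharsets;
import java.util.UUID;

final class PeerUuidSketch {
  // Deterministic, stable UUID for a given peer/target identifier.
  // "peerId" is a placeholder; any string that uniquely identifies the
  // backup target (peer id, backup root path, ...) would work.
  static UUID peerUuidFor(String peerId) {
    return UUID.nameUUIDFromBytes(peerId.getBytes(StandardCharsets.UTF_8));
  }
}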
if (fsDataOutputstream == null) {
  return; // Presume closed
}
fsDataOutputstream.flush();
Does this have any effect on S3?
Don't remove, just asking.
As far as I understand, when multipart upload is enabled and the buffer type is set to disk:
- Data is written to the active block using the write() API.
- When flush() is called, it flushes the current block, which in this case writes to the underlying disk.
- When the current block fills up during a write() operation, the upload for that block is initiated asynchronously.
- Finally, when close() is called, it uploads the last block, waits for all block uploads to complete, and then finalizes the upload process.
So flush() persists the data of the current block to disk, but it doesn't actually sync the data to S3. I therefore think we don't need it, and I don't think we need to call the sync method either.
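A rough, self-contained sketch of that lifecycle against an S3A path — the bucket, path, and buffer setting below are made up for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class S3AFlushSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Buffer blocks on local disk before uploading them as multipart parts.
    conf.set("fs.s3a.fast.upload.buffer", "disk");

    Path walCopy = new Path("s3a://example-backup-bucket/wals/example.wal"); // made-up path
    FileSystem fs = walCopy.getFileSystem(conf);
    try (FSDataOutputStream out = fs.create(walCopy)) {
      out.write(new byte[] { 1, 2, 3 }); // goes into the active block buffer
      // flush() only pushes the current block to the local disk buffer;
      // nothing is durable in S3 at this point.
      out.flush();
      // close() (via try-with-resources) uploads the remaining block, waits for
      // outstanding part uploads, and completes the multipart upload -- only
      // then does the object become visible in S3.
    }
  }
}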