-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crosscluster/logical: hydrate table descriptors with types for event decoding #131130
Conversation
It looks like your PR touches SQL parser code but doesn't add or edit parser tests. Please make sure you add or edit parser tests if you edit the parser. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
8ba5e8b
to
3c8a2b1
Compare
Epic: none Release note: none
313ec88
to
afcc0fc
Compare
@@ -52,7 +52,7 @@ func makeImportTypeResolver(typeDescs []*descpb.TypeDescriptor) importTypeResolv | |||
// Note that if a table happens to have multiple types with the same name (but | |||
// different schemas), this implementation will return a "feature unsupported" | |||
// error. | |||
func (i importTypeResolver) ResolveType( | |||
func (i ImportTypeResolver) ResolveType( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if there's a better TypeResolver
is should use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This type resolver is insufficient for the sql write path. Ideally the type resolver used by LDR can take in foreign types with mismatched OIDs and allow the sql path to just work. Currently the sql path writes panic because we hydrate the datums ingested using the source side UDTs, which have a different OID than on the destination side. I can write up a separate issue to address this
} | ||
typeIDs, _, err := td.GetAllReferencedTypeIDs(dbDesc, | ||
func(id descpb.ID) (catalog.TypeDescriptor, error) { | ||
if id > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the deal with this function passed to GetAllREferencedTypeIDs
? as the panic suggests, it doesn't seem to get called.
@@ -341,10 +342,17 @@ func (p *partitionedStreamClient) PlanLogicalReplication( | |||
descMap[int32(desc.ID)] = desc | |||
} | |||
|
|||
// TODO(msbutler): unclear if we should persist the types as pointers in the protos. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ignore this comment. i like how i stored the types in stream.proto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1, 3 of 3 files at r2, 4 of 4 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @msbutler, and @rafiss)
pkg/ccl/crosscluster/logical/logical_replication_job.go
line 363 at r3 (raw file):
// TODO(msbutler): is this import type resolver kosher? Should put in a new package. importResolver := importer.MakeImportTypeResolver(plan.SourceTypes)
Probably want to in a separate PR move this out / rename this later.
pkg/ccl/crosscluster/producer/replication_manager.go
line 163 at r2 (raw file):
Previously, msbutler (Michael Butler) wrote…
what's the deal with this function passed to
GetAllREferencedTypeIDs
? as the panic suggests, it doesn't seem to get called.
This code is only focused on default / update expressions? Have your tests included that. i.e. Have a cast in your default expression.
pkg/ccl/crosscluster/producer/replication_manager.go
line 170 at r2 (raw file):
} foundTypeDescriptors[id] = struct{}{} return descriptors.ByIDWithoutLeased(r.txn.KV()).WithoutNonPublic().Get().Type(ctx, id)
Why do we avoid leasing here?
pkg/sql/importer/import_type_resolver.go
line 55 at r1 (raw file):
Previously, msbutler (Michael Butler) wrote…
This type resolver is insufficient for the sql write path. Ideally the type resolver used by LDR can take in foreign types with mismatched OIDs and allow the sql path to just work. Currently the sql path writes panic because we hydrate the datums ingested using the source side UDTs, which have a different OID than on the destination side. I can write up a separate issue to address this
I think this is probably the better option. We can probably make this more generic implementation more generic. i.e. Extend it so that it can rewrite OIDs upon hydration as a feature, which only LDR needs.
Or implement a LDR specific version on top with that logic.
pkg/ccl/crosscluster/logical/logical_replication_job_test.go
line 1974 at r3 (raw file):
dbB.Exec(t, "CREATE TYPE my_enum AS ENUM ('one', 'two', 'three')") dbA.Exec(t, "CREATE TABLE data (pk INT PRIMARY KEY, val my_enum)")
To hit some of the code paths above we can have fun and add default expressions which use the type.
…ning Epic: none Release note: none
afcc0fc
to
56483c6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @fqazi, and @rafiss)
pkg/ccl/crosscluster/logical/logical_replication_job.go
line 363 at r3 (raw file):
Previously, fqazi (Faizan Qazi) wrote…
Probably want to in a separate PR move this out / rename this later.
hrm, are you ok that I make the ImportTypeResolver public temporarily in this PR. Once we have the LDR specific type resolver, we can make it private again.
pkg/ccl/crosscluster/producer/replication_manager.go
line 163 at r2 (raw file):
Previously, fqazi (Faizan Qazi) wrote…
This code is only focused on default / update expressions? Have your tests included that. i.e. Have a cast in your default expression.
ah yes, i hit this panic after i added a default expression in my unit test. I feel like this GetAllReferencedTypeIDs
function could be better documented to explain that the getType
function only gets called when a column with the UDT in that table has a default expression.
pkg/ccl/crosscluster/producer/replication_manager.go
line 170 at r2 (raw file):
Previously, fqazi (Faizan Qazi) wrote…
Why do we avoid leasing here?
This was copy pasta. We grab the descriptor with the lease now.
pkg/ccl/crosscluster/logical/logical_replication_job_test.go
line 1974 at r3 (raw file):
Previously, fqazi (Faizan Qazi) wrote…
To hit some of the code paths above we can have fun and add default expressions which use the type.
Done.
…decoding This hydrates the source side table descriptors with their referencing type descriptors if they have any. This allows ldr ingestion to properly decode tables with user defined types. Note that LDR will still fast fail on replicating a table with user defined types (unless the SKIP SCHEMA CHECK OPTION is used) because we still need to teach LDR to: - validate that the source and destination enum's are _physically_ identical (contain the same map from int to enum values) - support UDT replication on sql path, which currently fails because the passed in datum points to the source side udt instead of the destination udt. As another step down the line, we should relax the requirement that UDTs must physically equivalent. Instead, they only need to be _logically_ equivalent (i.e. have the same user facing enum values). Supporting this requires remapping the ints from two logically equivalent enums. Epic: none Release note: none
56483c6
to
849d013
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @fqazi, and @rafiss)
pkg/sql/importer/import_type_resolver.go
line 55 at r1 (raw file):
Previously, fqazi (Faizan Qazi) wrote…
I think this is probably the better option. We can probably make this more generic implementation more generic. i.e. Extend it so that it can rewrite OIDs upon hydration as a feature, which only LDR needs.
Or implement a LDR specific version on top with that logic.
wrote up this issue and assigned it to you #132164
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r4, 3 of 3 files at r5, 3 of 4 files at r6, 1 of 1 files at r7, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @dt and @rafiss)
pkg/ccl/crosscluster/logical/logical_replication_job.go
line 363 at r3 (raw file):
Previously, msbutler (Michael Butler) wrote…
hrm, are you ok that I make the ImportTypeResolver public temporarily in this PR. Once we have the LDR specific type resolver, we can make it private again.
Done.
Yeah, I'm good with making this public. And then come up with a clean interface later.
This will allow us to do type descriptor validation when we check for replicating table equivalency. Epic: none Release note: none
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r8, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dt, @msbutler, and @rafiss)
pkg/ccl/crosscluster/producer/replication_manager.go
line 147 at r8 (raw file):
) ([]descpb.TypeDescriptor, map[descpb.ID]struct{}, error) { descriptors := txn.Descriptors() dbDesc, err := descriptors.MutableByID(txn.KV()).Database(ctx, td.GetParentID())
We can just get a leased one for the database descriptor, since I don't think these get modified.
pkg/ccl/crosscluster/producer/replication_manager.go
line 165 at r8 (raw file):
foundTypeDescriptors[typeID] = struct{}{} typeDesc, err := descriptors.MutableByID(txn.KV()).Type(ctx, typeID)
If we don't intend to write these back out and just want a modifiable copy, we can use a leased descriptor and the builder API. Just something to consider later, since it will avoid an unnecessary KV roundtrip to get the descriptor.
TFTR! bors=fqazi |
bors r+ |
dbB.Exec(t, "INSERT INTO data VALUES (2)") | ||
|
||
var jobAID jobspb.JobID | ||
dbA.QueryRow(t, "CREATE LOGICAL REPLICATION STREAM FROM TABLE data ON $1 INTO TABLE data with skip schema check", dbBURL.String()).Scan(&jobAID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we remove the check from here:
cockroach/pkg/sql/catalog/tabledesc/logical_replication_helpers.go
Lines 163 to 168 in 9f61a38
if dstCol.Type.UserDefined() { | |
return errors.Newf( | |
"destination table %s column %s has user-defined type %s", | |
dst.Name, dstCol.Name, dstCol.Type.SQLStringForError(), | |
) | |
} |
that way we don't need the skip schema check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we should remove it once we fix this issue #132167
This hydrates the source side table descriptors with their referencing type
descriptors if they have any. This allows ldr ingestion to properly decode
tables with user defined types.
Note that LDR will still fast fail on replicating a table with user defined
types (unless the SKIP SCHEMA CHECK OPTION is used) because we still need to
teach LDR to:
(contain the same map from int to enum values)
in datum points to the source side udt instead of the destination udt.
As another step down the line, we should relax the requirement that UDTs must
physically equivalent. Instead, they only need to be logically equivalent
(i.e. have the same user facing enum values). Supporting this requires
remapping the ints from two logically equivalent enums.
Epic: none
Release note: none