Adding SPLAC under PLAC #527

Draft
wants to merge 4 commits into
base: v7.1

Conversation

dthaler
Collaborator

@dthaler dthaler commented Aug 6, 2024

Conversation draft of adding place records to 7.1

This puts SPLAC as a substructure of PLAC, like PERSONAL_NAME_PIECES are under PERSONAL_NAME_STRUCTURE. See PR #520 for another alternative, and diffs between the two proposals can be seen here.

Putting SPLAC under (rather than alongside/instead of) PLAC is proposed in this PR for two reasons:

  1. For consistency with actual usage of GEDCOM-L's _LOC extension.
  2. So that a FamilySearch GEDCOM 7.0 (or even 5.5.1) application can read a 7.1 GEDCOM file using SPLAC without losing the place names. This is analogous to how name parts are optional: applications that don't support them still use the NAME payload.
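
To illustrate the second reason, here is a minimal Python sketch of a reader that knows only 7.0-era tags. The sample fragment, the pointer-valued `SPLAC` substructure, the `NAME` line under the `SPLAC` record, and the helper names are all assumptions made for illustration; the text of this PR does not fix that syntax. The reader skips any structure it does not recognize, yet still recovers the place name from the `PLAC` payload.

```python
# Hypothetical 7.1-style fragment: SPLAC as a pointer-valued substructure of PLAC.
# The exact syntax is an assumption, not something this PR specifies.
SAMPLE = """\
0 @I1@ INDI
1 BIRT
2 PLAC Danzig, Preußen
3 SPLAC @P1@
0 @P1@ SPLAC
1 NAME Danzig
"""

# Tags that our imaginary older reader understands.
KNOWN_TAGS = {"INDI", "BIRT", "PLAC"}


def parse_line(line):
    """Split a GEDCOM line into (level, xref, tag, payload)."""
    parts = line.split(" ", 2)
    level = int(parts[0])
    if parts[1].startswith("@"):
        # Record line: level, cross-reference id, tag, optional payload.
        xref = parts[1]
        rest = parts[2].split(" ", 1)
        tag = rest[0]
        payload = rest[1] if len(rest) > 1 else ""
    else:
        xref = None
        tag = parts[1]
        payload = parts[2] if len(parts) > 2 else ""
    return level, xref, tag, payload


def legacy_place_names(text):
    """Collect PLAC payloads, skipping any unknown structure and its subtree."""
    names = []
    skip_below = None  # level of the unknown structure currently being skipped
    for raw in text.splitlines():
        if not raw.strip():
            continue
        level, _xref, tag, payload = parse_line(raw)
        if skip_below is not None:
            if level > skip_below:
                continue  # still inside the skipped subtree
            skip_below = None
        if tag not in KNOWN_TAGS:
            skip_below = level
            continue
        if tag == "PLAC":
            names.append(payload)
    return names


print(legacy_place_names(SAMPLE))
```

In this sketch the unknown `SPLAC` pointer and the `SPLAC` record are ignored, and the `PLAC` payload survives, which is the behavior the second reason relies on.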

This PR only addresses the record-vs-substructure topic. The additional substructures that have also been considered for location records (for example, see GEDCOM-L's _LOC extension) would presumably be added to the new <<PLACE_DETAILS>> production in a future PR if we decide that this organization is the right approach.

tychonievich and others added 4 commits July 24, 2024 13:03
This puts PLAC and SPLAC side-by-side, like NOTE and SNOTE. That is not the only possibility, and alternative PRs with other designs are anticipated to allow better comparison of the options.

This only addresses the record-vs-substructure topic. The additional substructures that have also been considered for PLAC/_LOC/etc are prepared for by creating a `<<PLACE_DETAILS>>` production to be the home for such additions in a future PR.
Conversation draft of adding place records to 7.1

This puts SPLAC as a substructure of PLAC, like PERSONAL_NAME_PIECES are
under PERSONAL_NAME_STRUCTURE.

Signed-off-by: Dave Thaler <[email protected]>
@dthaler dthaler marked this pull request as draft August 6, 2024 16:59
@dthaler dthaler closed this Aug 6, 2024
@dthaler dthaler deleted the splac-under-plac branch August 6, 2024 18:08
@dthaler dthaler restored the splac-under-plac branch August 6, 2024 18:09
@dthaler dthaler reopened this Aug 6, 2024
@tychonievich
Collaborator

Discussed in steering committee 6 AUG 2024

  • Those present were happy with both this and Adding SPLAC beside PLAC #520, but would want input from the GEDCOM-L community that uses the SPLAC-like _LOC extension now before committing to either one.
  • This proposal has the benefit of being more like how _LOC works (a substructure of PLAC).
  • The other proposal has the benefit of a clearer path towards deprecating PLAC.
  • Both Adding SPLAC beside PLAC #520 and Adding SPLAC under PLAC #527 would also need more substructures added before being released.

@tychonievich
Collaborator

tychonievich commented Aug 13, 2024

The GEDCOM-L group discussed this and reported back to the steering committee. If SPLAC is added in 7.1, they prefer this solution to #520. But they'd rather minimize the number of times things change, so if we plan to have something unlike this in 8.0 (i.e. remove PLAC entirely), they'd prefer not to do anything in 7.1 and wait until 8.0. This might change once we pick the PLACE_DETAILS and have actual 7.1 and 8.0 drafts to compare.

@dthaler
Collaborator Author

dthaler commented Aug 13, 2024

Discussion in GEDCOM Steering Committee 13 AUG 2024:

  • GEDCOM-L discussed problems with merging a file that uses only SPLAC with one that uses only PLAC. The other proposal has this problem because it defines no specific association between PLAC and SPLAC values, whereas this proposal avoids it because SPLAC requires that PLAC still exist.
  • Legacy, TNG, and MagiCensus are other apps that store per-place (not per instance of usage) data in GEDCOM files, so additional input from them would be useful. (MagiCensus developers have no strong opinion here and are ok with either option discussed by the committee.)
  • It would be good to see other PRs that add the other capabilities from the _LOC extension (date, etc.) to complete the proposals.
  • It would be good to see a v8 proposal draft that removes PLAC, to allow getting Legacy, TNG, etc. to weigh in across the set.

@Norwegian-Sardines

Norwegian-Sardines commented Aug 13, 2024

Luther, can you please explain the discussion better?
First, I was hoping that I would be party to the discussion on PLAC vs SPLAC. I was asked to attend but have been camping for the last 6 weeks and have been away from regular internet access during that time.

I had hoped that SPLAC would stand alongside PLAC in v7.1 so that either option could be used, and later the PLAC tag could be deprecated.

One of the reasons I've disliked SPLAC (and _LOC) as a substructure of PLAC is the case where the PLAC value does not match the date parameters of the _LOC structure. Which is correct? What happens when the _LOC hierarchy is corrected because a location changes its parent, but the PLAC does not change? Do we then have the data in two places, possibly used for different reasons: (1) historical accuracy and (2) linking to modern mapping software?

@dthaler
Collaborator Author

dthaler commented Aug 20, 2024

Discussion in GEDCOM Steering Committee 20 AUG 2024:

  • The committee discussed examples like @Norwegian-Sardines mentioned above where PLAC and SPLAC could get out of sync in terms of the hierarchy.
  • If you just put the lowest level in the PLAC payload then this problem doesn't occur, but it is not backwards compatible without data loss.
  • If information is in both places for backwards compatibility, there is always the danger of being out of sync with each other. One could perhaps declare a precedence between the two as to which one wins, but it's unclear if that's helpful.
  • If information is not in both places, one loses backwards compatibility with apps/websites that don't yet support the new structure.
  • We want this to be a topic of discussion in a public meeting that will be scheduled before making any decisions. Watch the discussion board for details. The purpose of these draft PRs is to generate such discussion.
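
The out-of-sync risk discussed above can be made concrete with a small Python sketch. It assumes a hypothetical in-memory model in which each place record has a name and an optional parent pointer (similar in spirit to the hierarchical `_LOC` extension); neither the field names nor the `in_sync` helper come from this PR. It relies only on the existing rule that a PLAC payload is a comma-separated list of jurisdictions from lowest to highest.

```python
# Hypothetical place records: each has a name and an optional parent id.
# This data model is an assumption for illustration, not part of the PR.
RECORDS = {
    "@P1@": {"name": "Gdansk", "parent": "@P2@"},
    "@P2@": {"name": "Poland", "parent": None},
}


def hierarchy(records, xref):
    """Walk parent pointers and return names from lowest to highest level."""
    names = []
    seen = set()  # guards against accidental cycles in the parent chain
    while xref is not None and xref not in seen:
        seen.add(xref)
        names.append(records[xref]["name"])
        xref = records[xref]["parent"]
    return names


def in_sync(plac_payload, records, xref):
    """True if the PLAC payload lists the same jurisdictions as the record chain."""
    listed = [part.strip() for part in plac_payload.split(",")]
    return listed == hierarchy(records, xref)


print(in_sync("Gdansk, Poland", RECORDS, "@P1@"))
print(in_sync("Gdansk, Prussia", RECORDS, "@P1@"))
```

A check like this can only detect a mismatch; it cannot say which side is right, which is why a declared precedence between the two was raised as a possible, if uncertain, remedy.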

@fisharebest
Contributor

A supposed "advantage" of the hierarchical _LOC structure is the ability to add dates to the relationships. Thus you can represent situations such as Gdansk being variously in Poland, Germany, Russia, Prussia, etc.

I'm going to argue that this is a disadvantage, not an advantage.

By allowing this "historical geographic database" to be included in a GEDCOM file, you are effectively requiring it. If I want to record events in Gdansk, I need to know/include the exact dates for all of its history.

Now every GEDCOM file will need to contain the same historic geographic database.

This feels completely wrong. The historic geographic database ought to be part of a genealogy application - not included in every genealogy file.

IMHO, PLAC (and SPLAC, etc.) should contain the current/contemporary place names. Historic place names would be used in source-citations etc.

@Norwegian-Sardines

Norwegian-Sardines commented Aug 27, 2024

> A supposed "advantage" of the hierarchical _LOC structure is the ability to add dates to the relationships. Thus you can represent situations such as Gdansk being variously in Poland, Germany, Russia, Prussia, etc.
>
> I'm going to argue that this is a disadvantage, not an advantage.
>
> By allowing this "historical geographic database" to be included in a GEDCOM file, you are effectively requiring it. If I want to record events in Gdansk, I need to know/include the exact dates for all of its history.
>
> Now every GEDCOM file will need to contain the same historic geographic database.
>
> This feels completely wrong. The historic geographic database ought to be part of a genealogy application - not included in every genealogy file.
>
> IMHO, PLAC (and SPLAC, etc.) should contain the current/contemporary place names. Historic place names would be used in source-citations etc.

I understand your concern, and it is valid to a point. We are trained to record the exact "place" where an individual was born or died. So if a person was born in Gdansk, Poland and an ancestor was born in Gdansk, Prussia, you are saying that both individuals would have Gdansk, Poland as their birthplace, giving a wrong impression of the facts to a reader 10 years down the road who only has access to the report or book produced from the GEDCOM.

So maybe a better concept is to not have a hierarchical concept at all, but a concept where the PLAC tag holds the recorded place and an optional link to a "Historical_Place_Record" (HPLAC) that is a single record which contains detail about the place listing all of its names, dates and history. This would allow the PLAC to record the proper birth place used in archived reports/books, and if present the HPLAC would tell readers the coordinates, alternate name list (with dates?) and a short history with pictures of the place for people who want to know about the place and could be included in a report as supplemental information.

Cities, towns, countries, and other places can have varied names over time (the town where my grandfather was born has changed names 3 times since he was born there and moved location once), and that information needs to be recorded (the birth town) and explained for readers down the road (a history section), particularly when the birth certificate says one thing and the book says another!

@dthaler
Collaborator Author

dthaler commented Aug 27, 2024

> A supposed "advantage" of the hierarchical _LOC structure is the ability to add dates to the relationships. Thus you can represent situations such as Gdansk being variously in Poland, Germany, Russia, Prussia, etc.
>
> I'm going to argue that this is a disadvantage, not an advantage.
>
> By allowing this "historical geographic database" to be included in a GEDCOM file, you are effectively requiring it. If I want to record events in Gdansk, I need to know/include the exact dates for all of its history.
>
> Now every GEDCOM file will need to contain the same historic geographic database.
>
> This feels completely wrong. The historic geographic database ought to be part of a genealogy application - not included in every genealogy file.

Interesting... I understand the above argument (which seems valid to me) to be arguing against an SPLAC record and instead just using the mechanism we already have in 7.0 of using PLAC.EXID with (say) a FamilySearch Place ID as registered at https://gedcom.io/_pages/exid-type/ as the "historic geographic database". That is, the database itself need not be "part of" the genealogy application per se, as long as the application can access the database online, which the existing EXID does enable. Of course there could be other historic geographic databases besides the FamilySearch Place one, and one would then want an EXID type registered for whatever else too. But I understand you're arguing to do that and not add that database into each GEDCOM file itself. That seems like a valid point to me, as an implementer. (FWIW, my app stores the familysearch ID of the place in GEDCOM 5.5.1 files using an extension, so your suggestion in a sense meets the "Used" criteria just as the _LOC extension used by other apps also does.)
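
As a rough illustration of the EXID mechanism mentioned above, the following Python sketch pulls `EXID` values and their `TYPE` substructures out from under `PLAC` lines. The sample identifier `12345` and the `https://example.com/places` type URI are placeholders, not a real FamilySearch Place ID or a registered EXID type, and `place_exids` is a hypothetical helper name.

```python
# Placeholder data: the id and the TYPE URI are made up for illustration.
SAMPLE = """\
0 @I1@ INDI
1 BIRT
2 PLAC Gdansk, Poland
3 EXID 12345
4 TYPE https://example.com/places
"""


def place_exids(text):
    """Return one dict per EXID found directly under a PLAC structure."""
    results = []
    plac_level = plac_name = None  # the PLAC whose subtree we are inside
    exid_level = None              # the EXID whose TYPE we may still see
    for raw in text.splitlines():
        if not raw.strip():
            continue
        level_str, _, rest = raw.partition(" ")
        tag, _, payload = rest.partition(" ")
        level = int(level_str)
        if plac_level is not None and level <= plac_level:
            # We have left the subtree of the previous PLAC.
            plac_level = plac_name = exid_level = None
        if tag == "PLAC":
            plac_level, plac_name = level, payload
            exid_level = None
        elif plac_level is not None and tag == "EXID" and level == plac_level + 1:
            results.append({"place": plac_name, "id": payload, "type": None})
            exid_level = level
        elif exid_level is not None and tag == "TYPE" and level == exid_level + 1:
            results[-1]["type"] = payload
        elif exid_level is not None and level <= exid_level:
            exid_level = None  # a sibling structure closes the EXID
    return results


print(place_exids(SAMPLE))
```

An application could then hand the extracted identifier to whatever online place database the `TYPE` value names, rather than carrying that database inside the file.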

> IMHO, PLAC (and SPLAC, etc.) should contain the current/contemporary place names. Historic place names would be used in source-citations etc.

I think this point is orthogonal to the rest of the argument, and this one I disagree with: some users would enter the place as it was in the record (i.e., the historic name) and other users would enter the place using the current place name. Since the GEDCOM 7.0 (and earlier) specs did not say either way, I would argue we must permit both uses until 8.0 (the earliest we could make a breaking change), and hence you can never tell from the PLAC payload which name it is - current vs historic. Once you use the EXID to point to a historic geographic database entry, though, you can look up the name as of any given date and solve the problem.

@Norwegian-Sardines

Norwegian-Sardines commented Aug 27, 2024

> I think this point is orthogonal to the rest of the argument, and this one I disagree with: some users would enter the place as it was in the record (i.e., the historic name) and other users would enter the place using the current place name. Since the GEDCOM 7.0 (and earlier) specs did not say either way, I would argue we must permit both uses until 8.0 (the earliest we could make a breaking change), and hence you can never tell from the PLAC payload which name it is - current vs historic. Once you use the EXID to point to a historic geographic database entry, though, you can look up the name as of any given date and solve the problem.

See my comment above about HPLAC linked record!

@albertemmerich
Collaborator

The possibility of entering historical names and administrative levels does not imply that you must do so. There are a lot of other structures in GEDCOM which are only found in specialised applications. You can do it as you have so far: trying to get today's name (and administrative levels) of a place.

However, consider an example: you have a village of maybe 20 farms. You will not find a historical gazetteer showing those details. However, the place records we already implemented with GEDCOM-L location records give you the possibility to document the names of the farms, their exact locations, their type and so on - even varying over time.

@fisharebest: You will write Калининград in Russia, if the source tells you Königsberg in Preußen? And sources do not say: he was born in Gdansk, Prussia. The city was called Danzig, which still is its German name. So it was Danzig in Preußen... Place records can manage the names of places varying in time and by language. Munich in Bavaria is the same city as München in Bayern, to have another example only based on different languages.
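
A minimal Python sketch of what "names varying in time and by language" could look like in memory follows. The data model, the `names_at` helper, and the farm with its date ranges are invented for illustration; only the München/Munich pair comes from the comment above, and nothing here reflects the actual GEDCOM-L `_LOC` syntax.

```python
from datetime import date


def names_at(names, when, lang=None):
    """Return the names valid on `when`, optionally filtered by language tag."""
    result = []
    for entry in names:
        if entry["start"] is not None and when < entry["start"]:
            continue  # name not yet in use
        if entry["end"] is not None and when > entry["end"]:
            continue  # name no longer in use
        if lang is not None and entry["lang"] != lang:
            continue
        result.append(entry["name"])
    return result


# Made-up farm whose name changed in 1900; the dates are placeholders.
FARM = [
    {"name": "Old Farm", "lang": "en", "start": None, "end": date(1899, 12, 31)},
    {"name": "New Farm", "lang": "en", "start": date(1900, 1, 1), "end": None},
]

# Same city, two languages, no date limits.
CITY = [
    {"name": "München", "lang": "de", "start": None, "end": None},
    {"name": "Munich", "lang": "en", "start": None, "end": None},
]

print(names_at(FARM, date(1850, 6, 1)))
print(names_at(CITY, date(2000, 1, 1), lang="en"))
```

The point of the sketch is that such entries are optional: a record with a single undated name works exactly like a plain place name.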

@Norwegian-Sardines

The PLAC.EXID link to sites like "FamilySearch" is not good enough as a "place history" when we include places that are not in any external database. Such places are not limited to farms, estates, hospitals, and small cemeteries; they could also be places with an expanded history that is important to a researcher and their readers, such as a small rural community, or even larger cities that have expanded to include neighboring communities. Researchers sharing or merging data may want these expanded histories of little-known places included in the larger context. Many of the most important places in my family history are not found in any external database, or the data is very limited!

6 participants