(Attempt to) stop jumping markers for duplicate wiki pages #1922

Closed

20 changes: 4 additions & 16 deletions analysers/analyser_osmosis_wikipedia.py
@@ -26,44 +26,34 @@
sql10 = """
SELECT
(array_agg(tid))[1:10],
- ST_AsText(any_locate((array_agg(type))[1], (array_agg(id))[1])),
+ ST_AsText(any_locate(substring(MIN(tid), 1, 1), substring(MIN(tid), 2)::bigint)),
Member

It is not the minimum as expected, because the comparison is done on strings.
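To illustrate the string-comparison point, here is a minimal Python sketch. The two tids are hypothetical sample values in the `'N' || id` form used by the query, and Python's character-by-character string ordering is assumed to match what a text `MIN` does for these ASCII values.

```python
# Hypothetical tids in the 'N' || id form built by the query.
tids = ["N56789", "N100000"]

# Text comparison goes character by character: "1" sorts before "5",
# so the tid with the larger numeric id is considered the minimum.
min_as_text = min(tids)

# Comparing the numeric part instead yields the smaller id.
min_as_number = min(tids, key=lambda tid: int(tid[1:]))

print(min_as_text)    # N100000
print(min_as_number)  # N56789
```

Either choice is deterministic for a fixed set of elements; they only differ in which element gets picked.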

Collaborator Author

@Famlam Famlam Jun 26, 2023

Ah, true, n56789 would be after n100000. I've updated the initial post.
However, my goal was just to have a stable output (as long as no object is added/removed) to get rid of the jumping markers, so whether it's alphabetically the first or numerically the first should not make a difference, right?

Member

I think the older element is more stable.
It is just a matter of concatenating the type afterwards.

Collaborator Author

@Famlam Famlam Jun 26, 2023

Not sure how to get the oldest element.
If I add a new way today, its ID will be around 1185000000. A node with the same ID would be 12 years old, and a relation with that ID doesn't exist yet. So a comparison by ID will not retrieve the oldest element.

Member

Just ordering by type and id will do the job. The oldest per type is sufficient, isn't it?
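One reading of this suggestion, sketched in Python rather than SQL: compare the type first and the numeric id second. The element list and the helper name `pick_representative` are hypothetical. Note that this picks the lowest id within the first type present, not necessarily the globally oldest element, since ids of different types are not comparable by age.

```python
# Hypothetical (type, id) pairs that share one wikipedia tag value.
elements = [("W", 1185000000), ("N", 100000), ("R", 42), ("N", 56789)]


def pick_representative(items):
    """Return the element that sorts first by type, then by numeric id."""
    return min(items, key=lambda item: (item[0], item[1]))


# The result does not depend on the order in which the rows arrive.
first = pick_representative(elements)
second = pick_representative(list(reversed(elements)))
print(first, second)
```

Because the key is a total order over the elements, the chosen representative only changes when the set of elements itself changes.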

Collaborator Author

I'll have another look later.

Just calling ORDER BY won't work, since I can't add id or type to the GROUP BY or the rows will be filtered out. I probably have to modify the COUNT.

Collaborator Author

Nope, it doesn't work with COUNT(w) either. I probably misunderstood what you meant, @frodrigo: where would I have to call the ORDER BY type, id?
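One possible answer to where the ORDER BY could go, not confirmed anywhere in this thread: PostgreSQL accepts an ORDER BY clause inside an aggregate call, so the ordering can be attached to each `array_agg` without adding columns to GROUP BY. The sketch below keeps the SQL in a Python string, as the analyser does. The table name `t` stands in for the UNION ALL subquery, the subquery is assumed to still expose the `type` and `id` columns, and the query has not been run against an Osmose database.

```python
# Sketch only: `t` is a stand-in for the UNION ALL subquery, assumed to
# expose the columns w, tid, type and id.
sql10_sketch = """
SELECT
    (array_agg(tid ORDER BY type, id))[1:10],
    ST_AsText(any_locate(
        (array_agg(type ORDER BY type, id))[1],
        (array_agg(id ORDER BY type, id))[1]
    )),
    w
FROM
    t
GROUP BY
    w
"""

print(sql10_sketch)
```

With the same ordering in all three aggregates, the first array element refers to the same row each time, so the marker location would follow the first tid.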

Collaborator Author

Actually, the issue ID changes anyway if elements are added or deleted, right? So does it really need to be the same element every time, even when elements are added? (Of course, the jumping when there are no changes must be fixed.)

w
FROM ((
SELECT
tags->'wikipedia' AS w,
- 'N' || id AS tid,
- 'N' AS type,
- id
+ 'N' || id AS tid
FROM
nodes
WHERE
tags != ''::hstore AND
tags?'wikipedia' AND
NOT tags->'wikipedia' LIKE '%#%' AND
NOT tags?| ARRAY['highway', 'railway', 'waterway', 'power', 'place', 'shop', 'network', 'operator']
- ORDER BY
- id
) UNION ALL (
SELECT
tags->'wikipedia' AS w,
- 'W' || id AS tid,
- 'W' AS type,
- id
+ 'W' || id AS tid
FROM
ways
WHERE
tags != ''::hstore AND
tags?'wikipedia' AND
NOT tags->'wikipedia' LIKE '%#%' AND
NOT tags?| ARRAY['highway', 'railway', 'waterway', 'power', 'place', 'shop', 'network', 'operator']
- ORDER BY
- id
) UNION ALL (
SELECT
tags->'wikipedia' AS w,
- 'R' || id AS tid,
- 'R' AS type,
- id
+ 'R' || id AS tid
FROM
relations
WHERE
@@ -72,8 +62,6 @@
NOT tags->'wikipedia' LIKE '%#%' AND
NOT tags->'type' IN ('route', 'boundary') AND
NOT tags?| ARRAY['highway', 'railway', 'waterway', 'power', 'place', 'shop', 'network', 'operator']
- ORDER BY
- id
)) AS t
GROUP BY
w