Structured Data 2024 #3594

Closed
10 tasks done
nrllh opened this issue Mar 2, 2024 · 27 comments · Fixed by #3811
Labels
2024 chapter Tracking issue for a 2024 chapter

Comments

@nrllh
Collaborator

nrllh commented Mar 2, 2024

Structured Data 2024

If you're interested in contributing to the Structured Data chapter of the 2024 Web Almanac, please reply to this issue and indicate which role or roles best fit your interest and availability: author, reviewer, analyst, and/or editor. You might be interested in exploring the changes to this year's version here.

Content team

| Lead | Authors | Reviewers | Analysts | Editors | Coordinator |
| --- | --- | --- | --- | --- | --- |
| @cyberandy | @cyberandy | @jvandriel, @rrlevering | @nrllh | @capjamesg | - |
More information about each role:
  • The content team lead is the chapter owner and responsible for setting the scope of the chapter and managing contributors' day-to-day progress.
  • Authors are subject matter experts and lead the content direction for each chapter. Chapters typically have one or two authors. Authors are responsible for planning the outline of the chapter, analyzing stats and trends, and writing the annual report.
  • Reviewers are also subject matter experts and assist authors with technical reviews during the planning, analyzing, and writing phases.
  • Analysts are responsible for researching the stats and trends used throughout the Almanac. Analysts work closely with authors and reviewers during the planning phase to give direction on the types of stats that are possible from the dataset, and during the analyzing/writing phases to ensure that the stats are used correctly.
  • Editors are technical writers who have a penchant for both technical and non-technical content correctness. Editors have a mastery of the English language and work closely with authors to help wordsmith content and ensure that everything fits together as a cohesive unit.
  • The section coordinator is the overall owner for all chapters within a section like "User Experience" or "Page Content" and helps to keep each chapter on schedule.

Note: The time commitment for each role varies by the chapter's scope and complexity as well as the number of contributors.

For an overview of how the roles work together at each phase of the project, see the Chapter Lifecycle doc.

Milestone checklist

0. Form the content team

  • 📆 April 15 Complete program and content committee - 🔑 Organizing committee
    • The content team has at least one author, reviewer, and analyst.

1. Plan content

  • 📆 May 1 First meeting to outline the chapter contents - 🔑 Content team
    • The content team has completed the chapter outline.

2. Gather data

  • 📆 June 1 Custom metrics completed - 🔑 Analysts
  • 📆 June 1 HTTP Archive Crawl - 🔑 HA Team
    • HTTP Archive runs the June crawl.

3. Validate results

  • 📆 August 15 Query Metrics & Save Results - 🔑 Analysts
    • Analysts have queried all metrics and saved the output.

4. Draft content

  • 📆 September 15 First Draft of Chapter - 🔑 Authors
    • Authors have written the chapter.
  • 📆 October 10 Review & Edit Chapter - 🔑 Reviewers & Editors
    • Reviewers and Editors have processed the chapter.

5. Publication

  • 📆 October 15 Chapter Publication (Markdown & PR) - 🔑 Authors
    • Authors have converted the chapter to markdown and drafted a PR.
  • 📆 November 1 Launch of 2024 Web Almanac 🚀 - 🔑 Organizing committee

6. Virtual conference

  • 📆 November 20 Virtual Conference - 🔑 Content Team

Chapter resources

Refer to these 2024 Structured Data resources throughout the content creation process:
📄 Google Docs for outlining and drafting content
🔍 SQL files for committing the queries used during analysis
📊 Google Sheets for saving the results of queries
📝 Markdown file for publishing content and managing public metadata
💻 Colab notebook for collaborative coding in Python - if needed
💬 #web-almanac-structured-data on Slack for team coordination

@nrllh added the help wanted: reviewers, help wanted: analysts, help wanted: coauthors, and 2024 chapter labels on Mar 2, 2024
@cyberandy
Contributor

Happy to join as I did in the past.

@cyberandy
Contributor

cyberandy commented Apr 4, 2024 via email

@nrllh
Collaborator Author

nrllh commented Apr 4, 2024

Thank you, @cyberandy!

@nrllh
Collaborator Author

nrllh commented Apr 9, 2024

Hey @JohnBarrettWDW @SeoRobt @jasonbellwebdataworks @jonoalderson @JasmineDW - awesome contributors from previous years 🙂 Are you interested in joining us again this year?

@SeoRobt
Contributor

SeoRobt commented Apr 10, 2024 via email

@jonoalderson
Contributor

Unfortunately I'm likely to be tied up with other commitments! :(

@cyberandy
Contributor

@jvandriel would be a great contributor!! Sorry @jonoalderson and @SeoRobt not to have you in the crew this year.

@jvandriel

Sounds good to me @cyberandy.

@nrllh
Collaborator Author

nrllh commented Apr 10, 2024

> Sounds good to me @cyberandy.

Great! Would you like to contribute as an analyst, @jvandriel?

@jvandriel

I fear the analyst role is beyond my skillset @nrllh (I never learned SQL). Reviewer probably suits me best.

@nrllh
Collaborator Author

nrllh commented Apr 11, 2024

All right, thank you!

@capjamesg

I am happy to contribute as an editor. I am a professional technical writer (4+ years) and community contributor to various open standards initiatives (microformats2, W3C Social Web Community Group).

@rrlevering

I am interested in being involved as a reviewer as well.

@cyberandy
Contributor

Dear @jvandriel @rrlevering and @capjamesg, would you be up for a quick call to review the outline of this year's edition of the SD Chapter? I created a Doodle for this if you like the idea: https://doodle.com/meeting/participate/id/b8OMl9la

@capjamesg

@cyberandy Thank you for the link! I am unavailable next week, but I can meet any week after that. If there is a document with the outline that I can review, please send it over and I can provide async feedback.

@cyberandy
Contributor

cyberandy commented Aug 27, 2024

Thanks @capjamesg I have added some additional slots for the week after (same link > https://doodle.com/meeting/participate/id/b8OMl9la) and of course, don't worry if you can't make it. I will share here the link of the outline once ready. On another note, @nrllh when will the data be available?

@nrllh
Collaborator Author

nrllh commented Aug 27, 2024

> Thanks @capjamesg I have added some additional slots for the week after (same link > https://doodle.com/meeting/participate/id/b8OMl9la) and of course, don't worry if you can't make it. I will share here the link of the outline once ready. On another note, @nrllh when will the data be available?

It's cool to see the progress! The data will be available by this Friday.

@nrllh
Collaborator Author

nrllh commented Aug 31, 2024

@cyberandy the results are already in the sheet. Please check. The JSON-LD relationships and time series comparisons still need to be completed; the other results are already there.

@cyberandy
Contributor

thanks @nrllh, when do you think the time series comparisons can be completed?

@nrllh
Collaborator Author

nrllh commented Sep 4, 2024

Today ;) I'll ping you in the next few hours

@nrllh
Collaborator Author

nrllh commented Sep 5, 2024

> thanks @nrllh, when do you think the time series comparisons can be completed?

The data is there, except for the last two figures. I'm still working on them.

@danbri

danbri commented Sep 9, 2024

Hello! I am interested in helping, e.g. as a reviewer...

@cyberandy
Contributor

I sent the memo of yesterday's meeting and also added @danbri to the loop. Here is a new Doodle for the next checkpoint: https://doodle.com/meeting/organize/id/eXnnLJAb/preview 🙌

@nrllh a few questions that came up yesterday:

  • just a confirmation that the data focuses, also this year, on the top-level page (home page) only
  • will it be possible to extract structured data from the initial HTML (before JS execution) and compare it with structured data extracted from the final DOM (after JS execution)? This way we could see what the common practices are on this front.

Many thanks in advance.

@nrllh
Collaborator Author

nrllh commented Sep 12, 2024

> just a confirmation that the data focuses, also this year, on the top-level page (home page) only

We analyze both home pages and inner pages, but when reporting, we do so at the site level. This means we do not count a site more than once.
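
To illustrate what site-level reporting means, here is a minimal Python sketch. It is not the actual Web Almanac query (those live in the chapter's SQL files); the row layout, field names, and sample values are assumptions made up for this example. Each site is counted at most once per format, no matter how many of its pages use that format.

```python
from collections import defaultdict

# Hypothetical crawl rows: (site, page_url, structured_data_format).
# The layout and the sample values are assumptions for illustration only.
rows = [
    ("example.com", "https://example.com/", "JSON-LD"),
    ("example.com", "https://example.com/about", "JSON-LD"),
    ("example.com", "https://example.com/about", "Microdata"),
    ("example.org", "https://example.org/", "Microdata"),
]


def sites_per_format(rows):
    """Count each site at most once per format, across home and inner pages."""
    sites_by_format = defaultdict(set)
    for site, _page, fmt in rows:
        # A set keeps one entry per site, however many pages matched.
        sites_by_format[fmt].add(site)
    return {fmt: len(sites) for fmt, sites in sites_by_format.items()}


counts = sites_per_format(rows)
print(counts)
```

In this sample, example.com uses JSON-LD on two pages but contributes only one site to the JSON-LD count.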

> will it be possible to extract structured data from Initial HTML (before JS execution) and compare it with structured data extracted from the Final DOM (after JS execution)? This way we could see what are the common practices on this front.

We have access to the response of all requests, including the root page's response. Could you please provide more details on what you exactly want to compare?

@cyberandy
Contributor

Thanks @nrllh for the clarifications. If my understanding is correct, structured data from inner pages is aggregated with the top-level page.

Regarding the second point, the key aspect we’d like to explore is the percentage of websites that rely on client-side JavaScript for structured data injection versus those that serve it directly from the server. Additionally, it would be insightful to analyze the correlation between the types of entities represented in JSON-LD and whether they are injected via JavaScript or delivered server-side. Many thanks in advance!
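
A rough sketch of the comparison described above, in Python using only the standard library. It assumes two HTML strings are available per page, one for the initial server response and one for the rendered DOM; the function names, the sample markup, and the type-based comparison are illustrative assumptions, not the pipeline HTTP Archive actually uses.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def _collect_types(node, types):
    """Walk parsed JSON-LD and gather every @type value, including nested nodes."""
    if isinstance(node, list):
        for item in node:
            _collect_types(item, types)
    elif isinstance(node, dict):
        value = node.get("@type")
        if isinstance(value, str):
            types.add(value)
        elif isinstance(value, list):
            types.update(v for v in value if isinstance(v, str))
        for child in node.values():
            _collect_types(child, types)


def jsonld_types(html):
    """Return the set of @type values found in a document's JSON-LD blocks."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    types = set()
    for block in parser.blocks:
        try:
            _collect_types(json.loads(block), types)
        except json.JSONDecodeError:
            continue  # skip malformed JSON-LD rather than fail the page
    return types


def classify(initial_html, rendered_html):
    """Split entity types into server-delivered and JavaScript-injected."""
    server = jsonld_types(initial_html)
    rendered = jsonld_types(rendered_html)
    return {
        "server_side": sorted(server),
        "js_injected": sorted(rendered - server),
    }


# Made-up sample pages: the rendered DOM gains a WebSite entity via JavaScript.
ORG = '<script type="application/ld+json">{"@type": "Organization", "name": "Example"}</script>'
SITE = '<script type="application/ld+json">{"@type": "WebSite", "name": "Example"}</script>'
initial_html = f"<html><head>{ORG}</head><body></body></html>"
rendered_html = f"<html><head>{ORG}{SITE}</head><body></body></html>"

result = classify(initial_html, rendered_html)
print(result)
```

Comparing by `@type` alone is a simplification: a type that appears in the initial HTML and is also added again by JavaScript would be counted as server-side only. Aggregating the `js_injected` lists per site would give the share of sites relying on client-side injection and the entity types most often delivered that way.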

@cyberandy
Contributor

cyberandy commented Oct 23, 2024

Dear @danbri @rrlevering @jvandriel @capjamesg, please update the Google Doc in the next few days; I'll then open the PR for the markdown.

I hope everyone can have a chance to review / contribute to the final document.

@capjamesg

I have left comments in the linked PR.

@nrllh removed the help wanted: reviewers, help wanted: analysts, and help wanted: coauthors labels on Nov 11, 2024

8 participants