diff --git a/content/blog/entries/2019-07-24-cc-search-wp-plugin/contents.lr b/content/blog/entries/2019-07-24-cc-search-wp-plugin/contents.lr index 3f0fe191f..39173756f 100644 --- a/content/blog/entries/2019-07-24-cc-search-wp-plugin/contents.lr +++ b/content/blog/entries/2019-07-24-cc-search-wp-plugin/contents.lr @@ -25,7 +25,7 @@ After a while, our plugin was ready. Its temporary name is “WP CC Search plugi ![Plugin screenshot 1](01.png) -The plugin’s features are: +The plugin’s features are: - Works in WordPress editor and add a button above the content text area and into the “Add Media” pop-up window. - Via a pop-up window, allows searching through millions of images using Creative Commons Catalog API power. - Allows filtering by a provider @@ -47,11 +47,11 @@ After the plugin’s activation, when the user writes a post, an **“Image with ![Plugin screenshot 2](02.png) -By pressing the button, a pop-up window allows the user to search using Latin characters for an image. +By pressing the button, a pop-up window allows the user to search using Latin characters for an image. ![Plugin screenshot 3](03.png) -The user can browse the returned images, preview an image and its license and adjust the image settings: +The user can browse the returned images, preview an image and its license and adjust the image settings: - use of thumbnail or original image - set the image link (if any) - insert the image into the post or as the featured image @@ -66,7 +66,7 @@ After the user selects an image and insert it, into the post, the image code alo The plugin uses AJAX requests to communicate with https://api.creativecommons.engineering and get responses in JSON format. -Only 2 AJAX requests are sent to . +Only 2 AJAX requests are sent to . 1. , for getting the providers list. Then the JSON response is used as select options, for the provider’s dropdown menu. 2. A call to with the necessary parameters for searching for images based on the given words and the selected provided. 
@@ -74,9 +74,9 @@ The JSON results, are then transformed via jQuery to images and show up into the #### Still to do -There are still some things that we want to add/change into the plugin and any help on building them is welcome. +There are still some things that we want to add/change into the plugin and any help on building them is welcome. - Find a different plugin name :-). We believe that the plugin’s name should change since it is not an “official” CC commons plugin. -- Currently, the plugin needs the Classic Editor plugin in order to work for WP 5+. A main goal is to make it Gutenberg compatible. +- Currently, the plugin needs the Classic Editor plugin in order to work for WP 5+. A main goal is to make it Gutenberg compatible. - Add Multiple images select support - Add Multi-select options for filtering: providers - Add select options for filtering: licenses, creator @@ -97,11 +97,11 @@ There are still some things that we want to add/change into the plugin and any h #### More about us -**[1] CTI - Greek School Network and Networking Technologies Directorate** +**[1] CTI - Greek School Network and Networking Technologies Directorate** Our main activities in Greek School Network and Networking Technologies Directorate (GSN-NTS) () of Computer Technology Institute and Press “Diophantus” (CTI) are the design, implementation, and support of network infrastructures and Internet services. Through its involvement in pioneer National and European research and development projects, GSN-NTS has a major role in the development of network infrastructures and services, and Internet services as well in Greece, especially those concerning school networks and ICT infrastructures at schools. 
-**[2] Greek School Network ()** +**[2] Greek School Network ()** is the national network of the Ministry of Education and Religious Affairs which safely interconnects all schools of Primary and Secondary education, including educational units abroad, services and entities supervised by the Ministry of Education and Religious Affairs at central and regional level, service providers of lifelong learning, students, teaching staff, other educators and other entities of Ministry of Education and Religious Affairs ([www.minedu.gov.gr](https://www.minedu.gov.gr/)). -**[3] ** +**[3] ** The plugin was originally developed for , which is the free blogging platform for all Greek teachers, students, and school units. The blogs.sch.gr is a service of Greek School Network**[2]** (https://www.sch.gr/en). It hosts more than 50.000 blogs and users. It is built and maintained by the Greek School Network and Networking Technologies Directorate of CTI. diff --git a/content/blog/entries/2019-09-11-google-docs-plugin/contents.lr b/content/blog/entries/2019-09-11-google-docs-plugin/contents.lr index fd115d71d..056c7bcf8 100644 --- a/content/blog/entries/2019-09-11-google-docs-plugin/contents.lr +++ b/content/blog/entries/2019-09-11-google-docs-plugin/contents.lr @@ -17,7 +17,7 @@ A few years ago I published a [Google Docs Add-on that allowed users to insert a ### Why -I created the add-on after inspiration from students in one of the classes I teach at [Fresno Pacific University’s Graduate School of Education](http://www.fresno.edu/). I was teaching a course on digital literacy, and open licenses and fair use was a large portion of the course. +I created the add-on after inspiration from students in one of the classes I teach at [Fresno Pacific University’s Graduate School of Education](http://www.fresno.edu/). I was teaching a course on digital literacy, and open licenses and fair use was a large portion of the course. 
To get started, I used the [templates Google provided about building an addon](https://developers.google.com/apps-script/overview), took a look at the html etc on the Creative Commons License Chooser page itself, and eventually got something working! It took a couple of days off and on and mostly was me remembering javascriptI admit I had to pay a guy on Upwork to help clean up the code before I published it because sometimes the chooser would keep selecting hundreds of license images… of course, this was also before I was a bona fide contributor to the open source project I Product Manage, [OpenSALT](http://www.github.com/opensalt/opensalt). @@ -27,13 +27,13 @@ My nine year teaching career in Fresno Unified School District plus my career in ### What -This month I’ve finally updated it from the old Google Docs add-on store to the Gsuite Marketplace and updated some links. I’ve also updated the [Github sit](https://github.com/brandonopened/creativecommons_gdocs)e as the main website for the app and hope to implement some changes based on the work in[ this Github repo](https://github.com/creativecommons/cc-license-chooser) with an updated license chooser process etc. The add-on has been installed thousands of times and usually has couple of hundred uses a month based on statistics. +This month I’ve finally updated it from the old Google Docs add-on store to the Gsuite Marketplace and updated some links. I’ve also updated the [Github sit](https://github.com/brandonopened/creativecommons_gdocs)e as the main website for the app and hope to implement some changes based on the work in[ this Github repo](https://github.com/creativecommons/cc-license-chooser) with an updated license chooser process etc. The add-on has been installed thousands of times and usually has couple of hundred uses a month based on statistics. 
-I hope in the future to use an API call to support different languages, and perhaps embed RDF into the Google doc if that is possible to make the license machine-searchable. This is a fun project that is useful and helping teach me more about coding and best practices for open source software. +I hope in the future to use an API call to support different languages, and perhaps embed RDF into the Google doc if that is possible to make the license machine-searchable. This is a fun project that is useful and helping teach me more about coding and best practices for open source software. ### How -1. In Google Docs, select “Get Add-ons” +1. In Google Docs, select “Get Add-ons” ![How-to screenshot 1](image2.png) @@ -51,9 +51,9 @@ I hope in the future to use an API call to support different languages, and perh [A video tutorial is available here](https://youtu.be/sQZFlNXEVZ4) or by clicking on the image below. Video tutorial ---- +--- [No rights reserved](http://creativecommons.org/publicdomain/zero/1.0/) for the content of this blog post by the author. 
diff --git a/content/blog/entries/2019-11-25-empowering-collaboration/contents.lr b/content/blog/entries/2019-11-25-empowering-collaboration/contents.lr index ba293dadc..56cc85570 100644 --- a/content/blog/entries/2019-11-25-empowering-collaboration/contents.lr +++ b/content/blog/entries/2019-11-25-empowering-collaboration/contents.lr @@ -42,7 +42,7 @@ the following strategies: - Participate in social media networks to actively recruit groups who are underrepresented in FOSS (free and open source software) -- Collect anonymous demographic data +- Collect anonymous demographic data A couple of organization examples: diff --git a/content/blog/entries/2020-03-05-involucrate-gsoc-outreachy-es/contents.lr b/content/blog/entries/2020-03-05-involucrate-gsoc-outreachy-es/contents.lr index 85ac46bda..a3e4009ed 100644 --- a/content/blog/entries/2020-03-05-involucrate-gsoc-outreachy-es/contents.lr +++ b/content/blog/entries/2020-03-05-involucrate-gsoc-outreachy-es/contents.lr @@ -14,7 +14,7 @@ body: En Creative Commons creemos firmemente en que el código abierto es una gran herramienta para fomentar y desarrollar productos con un enfoque comunitario y, a su vez, la consolidación de una comunidad activa de contribuyentes al patrimonio común (o, en inglés, *Commons*). -Con el fin de fomentar la participación de estudiantes en nuestros proyectos de código abierto, CC es parte de los programas que ofrece Google ([Google Summer of Code](https://summerofcode.withgoogle.com/)) y también [Outreachy](https://www.outreachy.org/). En ambos casos, el objetivo es involucrar a estudiantes en el código abierto. Para ello, hacemos un llamado abierto a todos y todas quienes tengan interés en colaborar con nuestro equipo, de postular a los llamados cuanto antes. 
+Con el fin de fomentar la participación de estudiantes en nuestros proyectos de código abierto, CC es parte de los programas que ofrece Google ([Google Summer of Code](https://summerofcode.withgoogle.com/)) y también [Outreachy](https://www.outreachy.org/). En ambos casos, el objetivo es involucrar a estudiantes en el código abierto. Para ello, hacemos un llamado abierto a todos y todas quienes tengan interés en colaborar con nuestro equipo, de postular a los llamados cuanto antes. #### Google Summer of Code Programa impulsado por Google el cual existe desde el año 2005 el cual ha impulsado a mas de 15.000 estudiantes de mas de 118 paises a involucrarse con diversas organizaciones que abogan por el código abierto. diff --git a/content/blog/entries/2022-12-16-new-to-working-in-open/contents.lr b/content/blog/entries/2022-12-16-new-to-working-in-open/contents.lr index 373c3b196..6f542d2e5 100644 --- a/content/blog/entries/2022-12-16-new-to-working-in-open/contents.lr +++ b/content/blog/entries/2022-12-16-new-to-working-in-open/contents.lr @@ -13,7 +13,7 @@ body: I began working at Creative Commons (CC) as the Full Stack Engineer this year and it’s been amazing to get to work in the open at CC. But as someone who has been working in closed, internal source environments for a very long -time it’s definitely been a learning experience and a perspective shift. +time it’s definitely been a learning experience and a perspective shift. For years I benefited from, observed, and offered up personal work into the world of open source, but I was never deeply involved in other projects in @@ -27,7 +27,7 @@ larger community of contributors around the world. It's been refreshing and rewarding, but it's also been enlightening. There's so much that's different now. Working in the open doesn't just shift the terms under which your code is licensed or how many people can contribute, it -requires a significant shift in both approach and process. 
+requires a significant shift in both approach and process. For example, working in the open means that while there may be community members eager to contribute they may lack contextual understanding that someone more @@ -44,7 +44,7 @@ documentation about the codebase, as well as detailed known issues, roadmaps, etc. All of it needs to be documented and written out, which not only benefits the community contributors, but also benefits the project as a whole. It means key information has to live in the open alongside the code -it informs. It's truly a win-win all around. +it informs. It's truly a win-win all around. The process also has to shift, you can't just make a list of things you want to tackle and get to work, you have to consider how each item can be smoothly @@ -58,11 +58,11 @@ view on the overall roadmap and goals the project hopes to meet. If you are the steward of a codebase any task list you create or *issues* you identify are ultimately not just for you alone. Putting an item on your list when you're working alone isn't enough, you've also got to find time to work -on that item, and work your way through completing it. +on that item, and work your way through completing it. In the open source context, working with a community of contributors, creating an *issue* is just as important and meaningful as writing code, in many cases -it might actually be MORE important. Because *issues* are often the way in +it might actually be MORE important. Because *issues* are often the way in which contributors first offer up help and insight, they're the first contact they have with your project. 
Furthermore, any *issue* you create may end up getting completed by one or more people that are not you, which means it diff --git a/content/blog/entries/2023-02-01-outreachy-mid-point/contents.lr b/content/blog/entries/2023-02-01-outreachy-mid-point/contents.lr index 7fedd8c88..204af72c9 100644 --- a/content/blog/entries/2023-02-01-outreachy-mid-point/contents.lr +++ b/content/blog/entries/2023-02-01-outreachy-mid-point/contents.lr @@ -24,6 +24,6 @@ However, there were some project goals that took longer than expected to complet Additionally, I had to prioritize certain tasks over others and make adjustments to my plan as necessary. -The new CSS I have written so far already makes the website's layout responsive. I have also created a new script.js file and started working on the neccessary functionalities of the website. I plan to implement all feedback gotten from my mentors and debug any remaining issues. Additionally, I will be working on improving the website's overall performance by implementing several optimization techniques as necessary. +The new CSS I have written so far already makes the website's layout responsive. I have also created a new script.js file and started working on the neccessary functionalities of the website. I plan to implement all feedback gotten from my mentors and debug any remaining issues. Additionally, I will be working on improving the website's overall performance by implementing several optimization techniques as necessary. Overall, My aim is to ensure that the website is fully functional and user-friendly for all users. diff --git a/content/blog/entries/2023-08-25-machine-layer/contents.lr b/content/blog/entries/2023-08-25-machine-layer/contents.lr index 8f112524d..26f251cdb 100644 --- a/content/blog/entries/2023-08-25-machine-layer/contents.lr +++ b/content/blog/entries/2023-08-25-machine-layer/contents.lr @@ -83,7 +83,7 @@ systems. 
- **Updated License Information**: - License information has been updated to reflect the latest permissions and restrictions. This ensures users and systems are informed accurately. -- **Alignment with RDF Best Practices**: +- **Alignment with RDF Best Practices**: - Changes align the representation with RDF best practices. This boosts interoperability and compatibility, thanks to standardized namespaces, consistent naming, and proper relationship definitions. diff --git a/content/blog/entries/add-new-sections-descriptions-help-texts-code-examples-schemas-and-serializers/contents.lr b/content/blog/entries/add-new-sections-descriptions-help-texts-code-examples-schemas-and-serializers/contents.lr index 6ae1be564..048827370 100644 --- a/content/blog/entries/add-new-sections-descriptions-help-texts-code-examples-schemas-and-serializers/contents.lr +++ b/content/blog/entries/add-new-sections-descriptions-help-texts-code-examples-schemas-and-serializers/contents.lr @@ -20,17 +20,17 @@ Welcome to my third blog entry! For week 5 and 6, I added new sections, descript ### Week 5 -For this week, I managed to add a lot of stuff into the documentation. -I figured out how to add help texts to classes and how to create serializers. -I also managed to move all code examples under response samples. -In order to do this, I created a new class called CustomAutoSchema to add [x-code-samples](https://github.com/Redocly/redoc/blob/master/docs/redoc-vendor-extensions.md#x-codesamples). -Other stuff that I did include creating new sections such as “Register and Authenticate” and “Glossary”. +For this week, I managed to add a lot of stuff into the documentation. +I figured out how to add help texts to classes and how to create serializers. +I also managed to move all code examples under response samples. +In order to do this, I created a new class called CustomAutoSchema to add [x-code-samples](https://github.com/Redocly/redoc/blob/master/docs/redoc-vendor-extensions.md#x-codesamples). 
+Other stuff that I did include creating new sections such as “Register and Authenticate” and “Glossary”. The hardest part of this week is probably trying to figure out how to add request body examples and move code examples. ### Week 6 -For week 6, I added another section called Contribute that provides a todolist to start contributing on Github. +For week 6, I added another section called Contribute that provides a todolist to start contributing on Github. I also wrote and published this blog post. ---- diff --git a/content/blog/entries/add-query-using-curl-command-and-provide-response-samples/contents.lr b/content/blog/entries/add-query-using-curl-command-and-provide-response-samples/contents.lr index ffcb0b69e..ba0b1fad9 100644 --- a/content/blog/entries/add-query-using-curl-command-and-provide-response-samples/contents.lr +++ b/content/blog/entries/add-query-using-curl-command-and-provide-response-samples/contents.lr @@ -23,7 +23,7 @@ So, the first two weeks of Google Season of Docs have passed. For the first week ### Week 2 -For the second week, I started to write response samples. It was tough as I have a hard time understanding [drf-yasg](https://github.com/axnsan12/drf-yasg), which is an automatic Swagger generator. It can produce Swagger / OpenAPI 2.0 specifications from a Django Rest Framework API. I tried to find as many examples as I could to increase my understanding. Funny, but it took me awhile to realise that drf-yasg is not made up of random letters. The DRF part stands for Django Rest Framework while YASG stands for Yet Another Swagger Generator. +For the second week, I started to write response samples. It was tough as I have a hard time understanding [drf-yasg](https://github.com/axnsan12/drf-yasg), which is an automatic Swagger generator. It can produce Swagger / OpenAPI 2.0 specifications from a Django Rest Framework API. I tried to find as many examples as I could to increase my understanding. 
Funny, but it took me awhile to realise that drf-yasg is not made up of random letters. The DRF part stands for Django Rest Framework while YASG stands for Yet Another Swagger Generator. ---- diff --git a/content/blog/entries/add-response-samples-and-descriptions-for-api-endpoints/contents.lr b/content/blog/entries/add-response-samples-and-descriptions-for-api-endpoints/contents.lr index 903d78148..2528a5119 100644 --- a/content/blog/entries/add-response-samples-and-descriptions-for-api-endpoints/contents.lr +++ b/content/blog/entries/add-response-samples-and-descriptions-for-api-endpoints/contents.lr @@ -19,17 +19,17 @@ Well, hello again 👋! For week 3 and week 4, I added response samples and desc ### Week 3 -Week 3 was quite hectic. I moved back to my hometown during week 3. -Took 3 days off to settle my stuff and set up a workspace. -I worked on my GSoD project for only 2 days, Monday and Tuesday. -I managed to create response samples for most API endpoints. +Week 3 was quite hectic. I moved back to my hometown during week 3. +Took 3 days off to settle my stuff and set up a workspace. +I worked on my GSoD project for only 2 days, Monday and Tuesday. +I managed to create response samples for most API endpoints. Had a monthly video call with Kriti this week. ### Week 4 -For this week, I reviewed what I’ve done and what I haven’t to estimate new completion time. -Thank god, I have a buffer week in my GSoD timeline and deliverables. -So yeah, all is good in terms of completion time. +For this week, I reviewed what I’ve done and what I haven’t to estimate new completion time. +Thank god, I have a buffer week in my GSoD timeline and deliverables. +So yeah, all is good in terms of completion time. I started to write descriptions for API endpoints. Submitted first PR and published blog entry. 
diff --git a/content/blog/entries/building-the-cc-global-components-library/contents.lr b/content/blog/entries/building-the-cc-global-components-library/contents.lr index 83576927a..734b1ad4e 100644 --- a/content/blog/entries/building-the-cc-global-components-library/contents.lr +++ b/content/blog/entries/building-the-cc-global-components-library/contents.lr @@ -16,7 +16,7 @@ body: During the course of my Outreachy internship with the Creative Commons, I got to work on some cool projects, one of which is the CC Global Components library supervised by my mentor [Brylie Christopher Oxley](/blog/authors/brylie/). -Having a unified design theme/look or experience accross the different CC websites has always been an important factor while developing these +Having a unified design theme/look or experience accross the different CC websites has always been an important factor while developing these websites. With this in mind, there are several components which are part of most CC web properties. The three components in particular are:- @@ -24,16 +24,16 @@ With this in mind, there are several components which are part of most CC web pr - ** The Global footer ** : displayed on most Creative Commons properties - ** The Explore CC component ** : displayed on all CC web properties, such as Global Summit etc. -Instead of having each project implement these components leading to code duplication accross projects and maintenance issues, we decided it was -preferable +Instead of having each project implement these components leading to code duplication accross projects and maintenance issues, we decided it was +preferable to have a seperate library of these components which finally led to the CC Global Components project. ### Choosing a technology -The goal of the Global components library was to build a custom web component that can be served via CDN. 
While planning, we needed to decide on +The goal of the Global components library was to build a custom web component that can be served via CDN. While planning, we needed to decide on the technology to use. Agreeably, most web frameworks like React and Vue can be used to develop this but we wanted -a simple implementation with fewer dependencies. Some ideal characteristics of what we were looking for was a technology that meets the following +a simple implementation with fewer dependencies. Some ideal characteristics of what we were looking for was a technology that meets the following criteria: - Web Standards oriented @@ -41,24 +41,24 @@ criteria: - Lightweight / small bundle size - Loosely coupled (no tight or unrelated dependencies) -The two primary technologies we were considering were [Vue JS](https://v3.vuejs.org) and [Lightning Web Components](https://lwc.dev) but finally +The two primary technologies we were considering were [Vue JS](https://v3.vuejs.org) and [Lightning Web Components](https://lwc.dev) but finally decided to use Vue JS since we already had other projects developed in Vue (such as the Chooser project). ### Building the components -To scaffold the project, we used [Vue SFC rollup](https://www.npmjs.com/package/vue-sfc-rollup), which is a CLI templating utility that scaffolds -a minimal setup for compiling a library of multiple Vue SFCs (Single File Components) - into a form ready to share via npm. With this, -we could just focus on building the templates. We used [Vocabulary CSS](https://cc-vocabulary.netlify.app/), our own CC design package to style +To scaffold the project, we used [Vue SFC rollup](https://www.npmjs.com/package/vue-sfc-rollup), which is a CLI templating utility that scaffolds +a minimal setup for compiling a library of multiple Vue SFCs (Single File Components) - into a form ready to share via npm. With this, +we could just focus on building the templates. 
We used [Vocabulary CSS](https://cc-vocabulary.netlify.app/), our own CC design package to style the components. #### 1) CC Global Footer The CC Global Footer component was the easiest given that it's mostly static HTML. This component takes two attributes: -- `logo-url`: which should point to the logo of the website it is used on. +- `logo-url`: which should point to the logo of the website it is used on. - `donation-url`: which is used for the donation button. -After importing the CDN script for the CC Global components, we can then use the CC Global footer in any page as such: +After importing the CDN script for the CC Global components, we can then use the CC Global footer in any page as such: ```HTML ``` -and this renders as shown below: +and this renders as shown below: ![CC Explore](cc_explore.gif) @@ -94,8 +94,8 @@ The CC Global Header was an important component given that we had to make API ca such as the [Licenses and Tools](https://github.com/creativecommons/cc-legal-tools-app). We used the Axios library for the API calls to the Wordpress backend of the parent project [Projec_creativecommons.org](https://github.com/creativecommons/project_creativecommons.org). -The CC Global Header has three required attributes, `base-url`, `donation-url` and `logo-url`, which are the URLs used for the API call, -Donation button and Logo respectively. There is one additional attribute `use-menu-placeholders` you can set which renders placeholder Menu Items +The CC Global Header has three required attributes, `base-url`, `donation-url` and `logo-url`, which are the URLs used for the API call, +Donation button and Logo respectively. There is one additional attribute `use-menu-placeholders` you can set which renders placeholder Menu Items if you are in a development environment. However, for a stagin/production setup we do not use this attribute. ```HTML @@ -107,20 +107,20 @@ if you are in a development environment. 
However, for a stagin/production setup /> ``` -and this renders as shown: +and this renders as shown: ![CC Global Header](cc_global_header.png) ### Conclusion The first version of this library (0.1.1) was released and published to NPM on Dec 10, 2021. Till date [the time of this writing] we have had several -changes and optimizations to the code and are currently on version `0.5.0`. This was a really enriching experience for me as it was my first time +changes and optimizations to the code and are currently on version `0.5.0`. This was a really enriching experience for me as it was my first time working with Vue JS. We've also had additional code review and optimizations from [Timid Robot](/blog/authors/TimidRobot/). -The CC Global Components with all 3 components used renders as: +The CC Global Components with all 3 components used renders as: ![CC global components](cc_global_components.gif) -You can find the CC Global Components project at: +You can find the CC Global Components project at: - GitHub: [CC Global Components](https://github.com/creativecommons/cc-global-components) - NPM: [cc-global-components](https://www.npmjs.com/package/@creativecommons/cc-global-components) diff --git a/content/blog/entries/cc-browser-extension-a-gsoc-project/contents.lr b/content/blog/entries/cc-browser-extension-a-gsoc-project/contents.lr index c61b74201..7418745ac 100644 --- a/content/blog/entries/cc-browser-extension-a-gsoc-project/contents.lr +++ b/content/blog/entries/cc-browser-extension-a-gsoc-project/contents.lr @@ -37,7 +37,7 @@ Most people who will download the extension would want it to act as their own. T One good example of solving this problem would be to make an options page, that would open in a new tab, where they can set the filters they use more often as default. -Similarly, if we add a feature, say dark-mode, then the users who prefer it over the default can set their preferences. 
+Similarly, if we add a feature, say dark-mode, then the users who prefer it over the default can set their preferences. ### Work Done till Now: @@ -63,4 +63,4 @@ The development is still in the initial phase but you can check out **a working Tell us what you expect the extension to do, or a feature that you wish would be implemented. At this early stage of development, it might help us improve our goals. You can join the discussion on `#gsoc-browser-ext` channel on [slack](http://creativecommons.github.io/community/#slack). -I would like to thank [Alden](https://creativecommons.org/author/aldencreativecommons-org/), [Timid](https://creativecommons.org/author/timidcreativecommons-org/) and [Kriti](https://creativecommons.org/author/kriticreativecommons-org/) to for their mentorship and providing an experienced perspective and solutions to the problems faced by this naive developer. +I would like to thank [Alden](https://creativecommons.org/author/aldencreativecommons-org/), [Timid](https://creativecommons.org/author/timidcreativecommons-org/) and [Kriti](https://creativecommons.org/author/kriticreativecommons-org/) to for their mentorship and providing an experienced perspective and solutions to the problems faced by this naive developer. diff --git a/content/blog/entries/cc-browser-extension-upcoming-improvements/contents.lr b/content/blog/entries/cc-browser-extension-upcoming-improvements/contents.lr index bca5fc24c..7be011fb2 100644 --- a/content/blog/entries/cc-browser-extension-upcoming-improvements/contents.lr +++ b/content/blog/entries/cc-browser-extension-upcoming-improvements/contents.lr @@ -15,7 +15,7 @@ pub_date: 2020-06-01 --- body: -[CC Search Extension](https://opensource.creativecommons.org/ccsearch-browser-extension/) is a cross-browser extension, which lets you search for and filter content that is under Creative Commons licenses. It was developed as one of the CC projects during Google Summer of Code program of 2019. 
+[CC Search Extension](https://opensource.creativecommons.org/ccsearch-browser-extension/) is a cross-browser extension, which lets you search for and filter content that is under Creative Commons licenses. It was developed as one of the CC projects during Google Summer of Code program of 2019. ### The story so far It's release on the extension stores of [Chrome](https://chrome.google.com/webstore/detail/cc-search-browser-extensi/agohkbfananbebiaphblgcfhcclklfnh), [Firefox](https://addons.mozilla.org/en-US/firefox/addon/cc-search-extension/), and [Opera](https://addons.opera.com/en/extensions/details/cc-search/) was accompanied by several announcements on twitter, which were well received by the community and thus the number of weekly users (i.e the number of users that have used the extension at least once during last week) drastically increased. After several weeks, the extension reached 22,000+ weekly users. But, now this number fluctuates between 9,000 - 10,000. So, what happened? Well, the majority of initial influx were the "curious" folks who just wanted to check out this new tool and later thought that this was not something that they might find useful. But, there are also users that are looking for a similar tool and the lack of sufficient features and capabilities of the extension made them stop using it and look for better alternatives. @@ -32,7 +32,7 @@ The extension, currently supports only 3 filters: `license`, `sources`, and `use Showing related Images will help users find a variety of images that fit their requirements and also explore the images that would not usally show up on the initial pages of the search result. Whereas, Image tags will let the users incrementally make their queries better and more specific. #### Adding "Browse by Sources" section -Even though the users can see the number of sources supported by the extension from the drop-down filter, they might not be familiar with many of them and what kind of content they provide. 
Adding a "Browse by Source" section would make users appreciate the sources available for them to choose from. +Even though the users can see the number of sources supported by the extension from the drop-down filter, they might not be familiar with many of them and what kind of content they provide. Adding a "Browse by Source" section would make users appreciate the sources available for them to choose from. #### Improving the bookmarks section Significant improvements will be made to the bookmarks section. First of all, the bookmarks data will be cached so that unnecessary requests to the Catalog API can be avoided. The bookmarks will be organized by the dates and this will also facilitate adding some type of filter mechanism. In addition to this, pagination in the bookmarks section and support for named bookmark file exports will be added diff --git a/content/blog/entries/cc-browser-extension-week5-6/contents.lr b/content/blog/entries/cc-browser-extension-week5-6/contents.lr index 5141913c5..858599820 100644 --- a/content/blog/entries/cc-browser-extension-week5-6/contents.lr +++ b/content/blog/entries/cc-browser-extension-week5-6/contents.lr @@ -17,7 +17,7 @@ body: For the context, I am working on my GSoC project that is to make a browser-extension to search CC Licensed content in the public domain by interacting with CC Catalog API. **Previous Blogs**: -- [CC Browser Extension - A GSoC Project](http://creativecommons.github.io/blog/entries/cc-browser-extension-a-gsoc-project/) +- [CC Browser Extension - A GSoC Project](http://creativecommons.github.io/blog/entries/cc-browser-extension-a-gsoc-project/) ### Work Done These couple weeks were spent on finishing the filter section, setting up infinite-scroll and fixing issues and bugs after some days of testing. 
diff --git a/content/blog/entries/cc-browser-extension-week7-8/contents.lr b/content/blog/entries/cc-browser-extension-week7-8/contents.lr index d08d20e53..29fcd4e1d 100644 --- a/content/blog/entries/cc-browser-extension-week7-8/contents.lr +++ b/content/blog/entries/cc-browser-extension-week7-8/contents.lr @@ -1,64 +1,64 @@ -title: CC Browser Extension Week 7, 8 ---- -categories: -gsoc -gsoc-2019 -cc-browser-extension -open-source ---- -author: makkoncept ---- -series: gsoc-2019-browser-extension ---- -pub_date: 2019-07-19 ---- -body: - -For the context, I am working on my GSoC project that is to make a browser-extension to search CC Licensed content in the public domain by interacting with CC Catalog API. - -**Previous Blogs**: -- [CC Browser Extension - A GSoC Project](https://opensource.creativecommons.org/blog/entries/cc-browser-extension-a-gsoc-project/) -- [CC Browser Extension Week 5, 6](https://opensource.creativecommons.org/blog/entries/cc-browser-extension-week5-6/) - -### Work Done -A couple of major features were added to the browser extension these weeks like image Info and attribution preview and options UI. Also, the extension now has dark mode :) -
-
-CC Browser Extension Welcome Screen (light) -
-
-
-
-CC Browser Extension Welcome Screen (dark) -
-
- -Now, when the user clicks on the image thumbnail, a popup with image information (like title, creator link, license, and provider links), attribution (both rich text and HTML version) and social media share links opens. - -We also tried to figure out a way to let users download the images and attribute the creator with the least possible clicks. Right now there are two user-flows: -1. If the users need only one image, they can press the `Download Image` button and copy the desired attribution to the clipboard. -2. If they need to download multiple images in a single session, they can press `Download Image and Attribution`. This will download both versions (rich text and HTML) of attribution in a text file of the same name as the image file. - -
-
-CC Browser Extension Image Info/Attribution popup(light) -
-
-
-
-CC Browser Extension Image Info/Attribution popup(dark) -
-
-These workflows work perfectly on chrome. Whereas, Firefox prompts the user to confirm the download with a popup (if they have not already set a default option) and this results in termination of the browser-extension session. So, the purpose of the second workflow of letting the user continue the session while also downloading the images fails here. I also had a discussion about this with the mentors and we brainstormed some possible solutions. I have written about a possible solution in the 'Coming Up' section below. - -I also added an options page to the extension which will open in a new tab and the user can set default filters here. So, now they don't have to apply the filters that they often use again and again for every session. - - -### Coming Up -- Add a bookmark feature. A lot of the users may want to save multiple images for later to check out. I think this might also provide a better solution to the problem discussed above. -- Test and fix bugs. -- Improve documentation. - -You can check out the project on [Github](https://github.com/creativecommons/ccsearch-browser-extension) and join the discussion on `#gsoc-browser-ext` channel on [slack](https://opensource.creativecommons.org/community/#slack). - +title: CC Browser Extension Week 7, 8 +--- +categories: +gsoc +gsoc-2019 +cc-browser-extension +open-source +--- +author: makkoncept +--- +series: gsoc-2019-browser-extension +--- +pub_date: 2019-07-19 +--- +body: + +For the context, I am working on my GSoC project that is to make a browser-extension to search CC Licensed content in the public domain by interacting with CC Catalog API. 
+ +**Previous Blogs**: +- [CC Browser Extension - A GSoC Project](https://opensource.creativecommons.org/blog/entries/cc-browser-extension-a-gsoc-project/) +- [CC Browser Extension Week 5, 6](https://opensource.creativecommons.org/blog/entries/cc-browser-extension-week5-6/) + +### Work Done +A couple of major features were added to the browser extension these weeks like image Info and attribution preview and options UI. Also, the extension now has dark mode :) +
+
+CC Browser Extension Welcome Screen (light) +
+
+
+
+CC Browser Extension Welcome Screen (dark) +
+
+ +Now, when the user clicks on the image thumbnail, a popup with image information (like title, creator link, license, and provider links), attribution (both rich text and HTML version) and social media share links opens. + +We also tried to figure out a way to let users download the images and attribute the creator with the least possible clicks. Right now there are two user-flows: +1. If the users need only one image, they can press the `Download Image` button and copy the desired attribution to the clipboard. +2. If they need to download multiple images in a single session, they can press `Download Image and Attribution`. This will download both versions (rich text and HTML) of attribution in a text file of the same name as the image file. + +
+
+CC Browser Extension Image Info/Attribution popup(light) +
+
+
+
+CC Browser Extension Image Info/Attribution popup(dark) +
+
+These workflows work perfectly on chrome. Whereas, Firefox prompts the user to confirm the download with a popup (if they have not already set a default option) and this results in termination of the browser-extension session. So, the purpose of the second workflow of letting the user continue the session while also downloading the images fails here. I also had a discussion about this with the mentors and we brainstormed some possible solutions. I have written about a possible solution in the 'Coming Up' section below. + +I also added an options page to the extension which will open in a new tab and the user can set default filters here. So, now they don't have to apply the filters that they often use again and again for every session. + + +### Coming Up +- Add a bookmark feature. A lot of the users may want to save multiple images for later to check out. I think this might also provide a better solution to the problem discussed above. +- Test and fix bugs. +- Improve documentation. + +You can check out the project on [Github](https://github.com/creativecommons/ccsearch-browser-extension) and join the discussion on `#gsoc-browser-ext` channel on [slack](https://opensource.creativecommons.org/community/#slack). 
+ *Special Thanks*: [Alden](https://creativecommons.org/author/aldencreativecommons-org/), [Timid](https://creativecommons.org/author/timidcreativecommons-org/) and [Kriti](https://creativecommons.org/author/kriticreativecommons-org/) \ No newline at end of file diff --git a/content/blog/entries/cc-browser-extension-week9-10/contents.lr b/content/blog/entries/cc-browser-extension-week9-10/contents.lr index fd8c07401..8bb77003f 100644 --- a/content/blog/entries/cc-browser-extension-week9-10/contents.lr +++ b/content/blog/entries/cc-browser-extension-week9-10/contents.lr @@ -27,7 +27,7 @@ These weeks were spent on adding bookmarking feature to the extension and writin On clicking the bookmark icon, that appears when the image thumbnail is hovered on, the image will be bookmarked. Under the hood, the unique image identifier is saved in the local storage of the extension. The images ids are enough to get all the required image and attribution data needed using the `/image/{identifier}` endpoint of [CC Catalog API]([https://github.com/creativecommons/cccatalog-api](https://github.com/creativecommons/cccatalog-api). -Bookmarked images persist even when the extension session terminates. User can view, inspect, delete the bookmarked images in the _'Bookmarks'_ section. There is also a button to delete all the bookmarks at once. +Bookmarked images persist even when the extension session terminates. User can view, inspect, delete the bookmarked images in the _'Bookmarks'_ section. There is also a button to delete all the bookmarks at once. To let the users organize and share bookmarks, importing and exporting feature is also added. The buttons to export and import the bookmarks are in the options page. Simple `json` files are used for this feature. 
diff --git a/content/blog/entries/cc-catalog-wrapping-gsoc20/contents.lr b/content/blog/entries/cc-catalog-wrapping-gsoc20/contents.lr index 51b5f3788..81314bba3 100644 --- a/content/blog/entries/cc-catalog-wrapping-gsoc20/contents.lr +++ b/content/blog/entries/cc-catalog-wrapping-gsoc20/contents.lr @@ -1,6 +1,6 @@ title: CC Catalog: wrapping up GSoC20 --- -categories: +categories: cc-catalog gsoc @@ -11,15 +11,15 @@ author: srinidhi series: gsoc-2020-cccatalog --- pub_date: 2020-08-25 ---- +--- body: With the summer of code coming to an end, this blog post summarises the work done during the last three months. The project I have been working on is to add more provider API scripts to the CC Catalog. The CC Catalog project is responsible for collecting CC licensed images hosted across the web. -The internship journey has been great , and I was glad to get the opportunity to understand more about the working of the data pipeline. My work during the internship mainly involved researching new API providers and checking if they meet the necessary conditions, then we decided on a strategy to crawl the API. The strategy varies according to different APIs: some can be partitioned based on date, others have to be paginated . Script is written for the API according to the strategy. -During the later phase of the internship, I had worked on the reingestion strategy for europeana and a script to merge Common Crawl tags and metadata to the corresponding image in the image table. +The internship journey has been great , and I was glad to get the opportunity to understand more about the working of the data pipeline. My work during the internship mainly involved researching new API providers and checking if they meet the necessary conditions, then we decided on a strategy to crawl the API. The strategy varies according to different APIs: some can be partitioned based on date, others have to be paginated . Script is written for the API according to the strategy. 
+During the later phase of the internship, I had worked on the reingestion strategy for europeana and a script to merge Common Crawl tags and metadata to the corresponding image in the image table. -Provider API implemented : -- Science Museum : Science Museum collection has around 60,000 images and was initially crawled through Common Crawl and shifted to API based crawl. +Provider API implemented : +- Science Museum : Science Museum collection has around 60,000 images and was initially crawled through Common Crawl and shifted to API based crawl. - Issue: [Science Museum ticket][science_museum_issue] - Related PRs: [Science Museum script][science_museum_script], [Science Museum workflow][science_museum_workflow] @@ -62,8 +62,8 @@ Iconfinder is a provider of icons that could not be integrated as the current st ## Europeana reingestion strategy -Data collected from europeana was collected on a daily basis and there was a need to refresh it. The idea is that new data should be refreshed more frequently and as the data gets old, refreshing should become less frequent. While developing the strategy the API key limit and maximum collection expected is to be kept in mind. Considering these factors, a workflow was set up such that each day it crawls 59 days of data. -The 59 days were split up into layers. The DAG crawls daily up to 1 week old data then it crawls monthly for data more than 1 week old and less than a year old data, anything older than a year is crawled every 3 months. +Data collected from europeana was collected on a daily basis and there was a need to refresh it. The idea is that new data should be refreshed more frequently and as the data gets old, refreshing should become less frequent. While developing the strategy the API key limit and maximum collection expected is to be kept in mind. Considering these factors, a workflow was set up such that each day it crawls 59 days of data. +The 59 days were split up into layers. 
The DAG crawls daily up to 1 week old data then it crawls monthly for data more than 1 week old and less than a year old data, anything older than a year is crawled every 3 months. - Issue: [Europeana reingestion ticket][europeana_reingestion_issue] - Related PR: [Europeana reingestion strategy][europeana_reingestion_strategy] @@ -84,7 +84,7 @@ More details regarding the math of reingestion: [Data reingestion][data_reingest ## Merging Common Crawl tags When a provider is shifted from Common Crawl to API based crawl, the new data from API doesn’t have tags and metadata that were generated using clarifai and hence there is need to associate the new data with the tags corresponding to that image from the Common Crawl data. A direct url match is not possible as the Common Crawl urls and API image url are different, so we try to match it on the number or identifier that is associated with the url. -Currently the merging logic is applied to Science Museum, Museums Victoria and Met Museum . +Currently the merging logic is applied to Science Museum, Museums Victoria and Met Museum . In Science Museum, API url in image table is like https://coimages.sciencemuseumgroup.org.uk/images/240/862/large_BAB_S_1_02_0017.jpg and CC url is like https://s3-eu-west-1.amazonaws.com/smgco-images/images/369/541/medium_SMG00096855.jpg . So the idea is to reduce the url to the last identifier like number , so after the modification of the url by modify_urls function it looks like ```gpj.1700_20_1_S_BAB_``` (API url) and ```gpj.55869000GMS_``` (CC url) . Similar logic has been applied to met museum and museum victoria. @@ -93,7 +93,7 @@ Similar logic has been applied to met museum and museum victoria. ## Acknowledgement -I would like to thank my mentors Brent and Anna for their guidance throughout the internship. +I would like to thank my mentors Brent and Anna for their guidance throughout the internship. 
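The URL reduction described under "Merging Common Crawl tags" can be sketched roughly as follows. This is a minimal illustration and not the project's `modify_urls` implementation: the helper names `reduce_url` and `build_tag_lookup`, and the rule of dropping the size prefix in front of the first underscore before reversing the file name, are assumptions inferred from the example URLs in the post.

```python
from urllib.parse import urlparse


def reduce_url(url: str) -> str:
    """Reduce an image URL to a reversed, identifier-like key (hypothetical helper)."""
    # Keep only the file name, e.g. "medium_SMG00096855.jpg".
    filename = urlparse(url).path.rsplit("/", 1)[-1]
    # Assumption: the text before the first underscore is a size prefix
    # ("large", "medium", ...) that differs between the two crawls, so drop it.
    _, sep, rest = filename.partition("_")
    identifier = sep + rest if sep else filename
    # Reverse the remaining identifier, as in the post's examples.
    return identifier[::-1]


def build_tag_lookup(cc_rows):
    """Map reduced Common Crawl URLs to their tags (hypothetical helper).

    `cc_rows` is an iterable of (url, tags) pairs.
    """
    return {reduce_url(url): tags for url, tags in cc_rows}


cc_url = (
    "https://s3-eu-west-1.amazonaws.com/smgco-images/images/369/541/"
    "medium_SMG00096855.jpg"
)
print(reduce_url(cc_url))  # gpj.55869000GMS_
```

An API row would then be matched by calling `reduce_url` on its image URL and looking the result up in the dictionary returned by `build_tag_lookup`; rows without a match simply keep no tags.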
diff --git a/content/blog/entries/cc-chooser-lastweek/contents.lr b/content/blog/entries/cc-chooser-lastweek/contents.lr index 482fe2da4..69b3d5ad8 100644 --- a/content/blog/entries/cc-chooser-lastweek/contents.lr +++ b/content/blog/entries/cc-chooser-lastweek/contents.lr @@ -15,7 +15,7 @@ series: gsoc-2019-chooser Summer of Code is drawing to a close, and so is my work on the chooser project (well, sort of, I'll continue to help build and support the project for some time). These have easily been some of the best months of my life for a number of reasons. I remember vividly getting my first Slack message from my mentor, Breno, and being in utter disbelief that I was chosen. I have been fortunate enough to have this project be my own, complete with a whole Git repo just for this project! My work started with an empty GitHub repository, and bunch of planning: layout, wireframes, usability testing, etc. It wasn't until week 3 when I could start on actually building the site. To reflect on this time, I thought it would be cool to take a look back at the most significant PRs of the last three months. - + - My first PR with anything of substance was [#9 - Refactor Modal System](https://github.com/creativecommons/cc-chooser/pull/9). This changed the site's modal system from using a cooler approach to one not quite as flashy, but the old method was lacking some critical features. - Then came [#10 - Add Chooser Functionality](https://github.com/creativecommons/cc-chooser/pull/10). This was a pretty big deal, as this brought the functionality behind the main point of the page: the License Chooser. - Next up, [#11](https://github.com/creativecommons/cc-chooser/pull/11): a big change to page layout. This changed the layout of the help section itself, and the layout of the page as a whole. 
diff --git a/content/blog/entries/cc-chooser-week2/contents.lr b/content/blog/entries/cc-chooser-week2/contents.lr index 81f0fd8bf..48f9bbc23 100644 --- a/content/blog/entries/cc-chooser-week2/contents.lr +++ b/content/blog/entries/cc-chooser-week2/contents.lr @@ -18,7 +18,7 @@ series: gsoc-2019-chooser This week, I worked on the site layout and usability testing of the existing chooser. I enjoyed usability testing more than I expected. During my mid-week standup meeting with Breno, he had some suggestions about the layout that came up in usability testing I was conducting literally the same day. - I thought this was kinda funny, it was like my tester and Breno had talked about the layout between the meeting and testing. + I thought this was kinda funny, it was like my tester and Breno had talked about the layout between the meeting and testing. I really enjoyed working on the layout. Experimenting with different layouts and visual styles, seeing the changes and the progression from iteration to iteration was a really fun process, and I really enjoyed seeing the new site take shape right before my eyes. diff --git a/content/blog/entries/cc-chooser-week4/contents.lr b/content/blog/entries/cc-chooser-week4/contents.lr index d926a58d9..33ed710fe 100644 --- a/content/blog/entries/cc-chooser-week4/contents.lr +++ b/content/blog/entries/cc-chooser-week4/contents.lr @@ -16,17 +16,17 @@ series: gsoc-2019-chooser I am working with my mentor, [Breno Ferreira](https://creativecommons.org/author/brenoferreira/), to visually overhaul the tool, as well as making it more educational, and more usable. This week, I worked on some of the Chooser's JS functionality, and the educational section's layout. - + Getting the chooser working was lots of fun. It's the first real JS I've written for the site (other than the modal system, which is about 6 lines total). The thing I like most is seeing the site take shape. Previously, the chooser was all dummy controls and placeholder text. 
Now, it has a fully working chooser, complete with the license icons being updated, the link to the license page being set, etc. The thing that took the longest was getting the code to return the correct license based on the user's inputs. There was some - temperamental code, and the starting state variables were also a little off. - - Next up was the educational section. I decided to go from a three-column layout to a two-row layout. I moved the question buttons - to the top row (now alone), and the license icon descriptions (and example) to the bottom row. This allows the icon descriptions + temperamental code, and the starting state variables were also a little off. + + Next up was the educational section. I decided to go from a three-column layout to a two-row layout. I moved the question buttons + to the top row (now alone), and the license icon descriptions (and example) to the bottom row. This allows the icon descriptions to be longer without looking really weird, and gives a space for an example of how the license name, icons, and permissions are related. - I also decided to swap the positions of the educational section and the chooser section on the page. This means that the chooser is + I also decided to swap the positions of the educational section and the chooser section on the page. This means that the chooser is now at the top of the page, with the educational section (re-labelled as "Confused? Need Help?") below it. - + Next week, I'll be working on getting a new version of the license chooser controls working.
diff --git a/content/blog/entries/cc-datacatalog-data-processing-2/contents.lr b/content/blog/entries/cc-datacatalog-data-processing-2/contents.lr index 826690002..f927e232f 100644 --- a/content/blog/entries/cc-datacatalog-data-processing-2/contents.lr +++ b/content/blog/entries/cc-datacatalog-data-processing-2/contents.lr @@ -12,7 +12,7 @@ author: soccerdroid --- series: gsoc-2019-dataviz --- -pub_date: 2019-07-26 +pub_date: 2019-07-26 --- body: @@ -70,7 +70,7 @@ The thresholds for the quantity of images and links are my intuitions from havin ### Coming soon - Extraction of unique nodes, and links. -- Visualization with the data. +- Visualization with the data. - Development or modification of pruning/filtering rules. You can follow the project development in the [Github repo](https://github.com/cc-archive/cccatalog-dataviz). diff --git a/content/blog/entries/cc-datacatalog-data-processing-3/contents.lr b/content/blog/entries/cc-datacatalog-data-processing-3/contents.lr index 87e93261c..979572c4d 100644 --- a/content/blog/entries/cc-datacatalog-data-processing-3/contents.lr +++ b/content/blog/entries/cc-datacatalog-data-processing-3/contents.lr @@ -12,7 +12,7 @@ author: soccerdroid --- series: gsoc-2019-dataviz --- -pub_date: 2019-08-12 +pub_date: 2019-08-12 --- body: @@ -23,7 +23,7 @@ project. This is a continuation of my last blog post about the data processing part 2 of the CC-data catalog visualization project. I recommend reading that [last post](https://opensource.creativecommons.org/blog/entries/cc-datacatalog-data-processing-2/) for a better understanding of what I'll explain here. -Hello! In this post I am going to talk to you about the extraction of unique nodes, and links, and the visualization of the force-directed graph with the processed data. +Hello! In this post I am going to talk to you about the extraction of unique nodes, and links, and the visualization of the force-directed graph with the processed data.
### Nodes and links generation @@ -35,9 +35,9 @@ The nodes and links will be visualized using [force-graph](https://github.com/va - A domain can be repeated not only within a chunk, but in different chunks too. - Source and target must have licensed content -So as you can see, dealing with duplications is not that trivial when you have a lot of data. Next what I tried was to analyze smaller files, in order to be able to keep the data in memory in a single DataFrame. So for each TSV file I had before, now I have several small TSV files. This may extend the data analysis, but it can smooth the coding complexity. +So as you can see, dealing with duplications is not that trivial when you have a lot of data. Next what I tried was to analyze smaller files, in order to be able to keep the data in memory in a single DataFrame. So for each TSV file I had before, now I have several small TSV files. This may extend the data analysis, but it can smooth the coding complexity. -I first started by formatting the data into source and target columns to generate the unique nodes for the graph. I iterate through each row of the current DataFrame I have (the one with provider_domain, cc_licences,links column, etc), and by reading the _links_ column, I load the json of each row. For each key in the json, I create a new row with provider_domain as source, they key as target, and the value of the key as a _value_ feature. I append that new row to a new DataFrame. I build a new row each time I read a line, so I have a DataFrame with all the links of a single provider_domain. When I finish iterating over the rows, I convert the DataFrames to list and save the output. That is how I get a new DataFrame containing all the existing links of the graph, with source, target and value columns. Yeah! +I first started by formatting the data into source and target columns to generate the unique nodes for the graph. 
I iterate through each row of the current DataFrame I have (the one with provider_domain, cc_licences,links column, etc), and by reading the _links_ column, I load the json of each row. For each key in the json, I create a new row with provider_domain as source, they key as target, and the value of the key as a _value_ feature. I append that new row to a new DataFrame. I build a new row each time I read a line, so I have a DataFrame with all the links of a single provider_domain. When I finish iterating over the rows, I convert the DataFrames to list and save the output. That is how I get a new DataFrame containing all the existing links of the graph, with source, target and value columns. Yeah! diff --git a/content/blog/entries/cc-datacatalog-data-processing/contents.lr b/content/blog/entries/cc-datacatalog-data-processing/contents.lr index 52d1d50f9..967524b2f 100644 --- a/content/blog/entries/cc-datacatalog-data-processing/contents.lr +++ b/content/blog/entries/cc-datacatalog-data-processing/contents.lr @@ -21,7 +21,7 @@ Search (now [Openverse](https://openverse.org/)). Please also see the [Quantifying the Commons](https://github.com/creativecommons/quantifying) project. -Welcome to the data processing part of the GSoC project! In this blog post, I am going to tell you about my first thoughts with the real data, and give you some details of the implementation developed so far. +Welcome to the data processing part of the GSoC project! In this blog post, I am going to tell you about my first thoughts with the real data, and give you some details of the implementation developed so far. ### Data Extraction @@ -36,7 +36,7 @@ Spark is used again in this project to extract the data, in the form of parquet -Each file can easily contain dozens of millions of rows. My first aproach is to load the information in a Pandas Dataframe, but this can become very slow. Therefore, I will test the scripts for the data processing with a portion of the real data. 
Afterwards, I will use [Dask](https://dask.org/) with the entire dataset. Dask provides advanced parallelism for analytics, and has a behaviour similar to Pandas. +Each file can easily contain dozens of millions of rows. My first aproach is to load the information in a Pandas Dataframe, but this can become very slow. Therefore, I will test the scripts for the data processing with a portion of the real data. Afterwards, I will use [Dask](https://dask.org/) with the entire dataset. Dask provides advanced parallelism for analytics, and has a behaviour similar to Pandas. ### Cleansing and Filtering @@ -59,7 +59,7 @@ In the dataset, we have domain names in the form of URLs. But we want to make th The main part is the extraction of the domain name. This will be applied to the _provider\_domain_ and _links_ fields in order to build the graph. The domain names will be the ones displayed over the nodes, as depicted in [my first blog post](https://creativecommons.github.io/blog/entries/cc-datacatalog-visualization/). ### License Validation -Another important aspect is the licenses types. In the dataset, we do not have the exact license name; rather, we have a URL that directs to the license definition on [creativecommons.org](https://creativecommons.org). We have developed a [function](https://github.com/creativecommons/cccatalog/blob/master/src/providers/api/modules/etlMods.py#L75) with some regular expressions to validate the correct format of these URls, and extracts from them the license name and version. This information will be shown in the pie chart that appears after the user clicks on a node. +Another important aspect is the licenses types. In the dataset, we do not have the exact license name; rather, we have a URL that directs to the license definition on [creativecommons.org](https://creativecommons.org). 
We have developed a [function](https://github.com/creativecommons/cccatalog/blob/master/src/providers/api/modules/etlMods.py#L75) with some regular expressions to validate the correct format of these URls, and extracts from them the license name and version. This information will be shown in the pie chart that appears after the user clicks on a node. ``` 'https://creativecommons.org/licenses/by/4.0/' #valid license diff --git a/content/blog/entries/cc-datacatalog-data-thelinkedcommons/contents.lr b/content/blog/entries/cc-datacatalog-data-thelinkedcommons/contents.lr index 9f932907d..3f70465a2 100644 --- a/content/blog/entries/cc-datacatalog-data-thelinkedcommons/contents.lr +++ b/content/blog/entries/cc-datacatalog-data-thelinkedcommons/contents.lr @@ -12,7 +12,7 @@ author: soccerdroid --- series: gsoc-2019-dataviz --- -pub_date: 2019-09-03 +pub_date: 2019-09-03 --- body: @@ -53,7 +53,7 @@ Moreover, any visualization library starts to render the elements slower, and at Insights like this are valuable for Creative Commons, because they can help with outreach efforts, targeted communications and for the CC Search team to choose which domains to include in the CC Search tool. -The final graph is interactive. Users can pan, zoom in and out, hover over a node to see its neighbors and neighbors of neighbors, and click on a node to display a pie chart. +The final graph is interactive. Users can pan, zoom in and out, hover over a node to see its neighbors and neighbors of neighbors, and click on a node to display a pie chart. ### Pie chart visualization @@ -73,10 +73,10 @@ There are some nodes that we do not have information about their CC licenses (an I implemented the following: -- The node size is proportional to the number of CC licensed content in each domain. -- When the user hovers over a node, a label with the domain name is displayed. This might sound redundant when you can see the node perfectly. 
But the graph is very big, and you will like to see it in a low zoom level in order to have a picture of the shape of the entire graph. This is when this functionality is useful, because you don't have to zoom in in order to see the name of a node. +- The node size is proportional to the number of CC licensed content in each domain. +- When the user hovers over a node, a label with the domain name is displayed. This might sound redundant when you can see the node perfectly. But the graph is very big, and you will like to see it in a low zoom level in order to have a picture of the shape of the entire graph. This is when this functionality is useful, because you don't have to zoom in in order to see the name of a node. - The force of a link between two nodes (_node A_ and _node B_) is given by the number of links _node A_ has that references _node B_. -- When you hover over a node, you can also see the links to its neighbors highlighted, as well as the links to the neighbors of the neighbors. This feature make it pretty easy for you to find communities, and see how strongly connected a node is in the graph. +- When you hover over a node, you can also see the links to its neighbors highlighted, as well as the links to the neighbors of the neighbors. This feature make it pretty easy for you to find communities, and see how strongly connected a node is in the graph. Here is the final visualization, using a sample data from one month of the Common Crawl data: @@ -92,7 +92,7 @@ Here is the final visualization, using a sample data from one month of the Commo
-You can check the whole project source code in the [Github repo](https://github.com/cc-archive/cccatalog-dataviz/tree/master/GSoC2019).
+You can check the whole project source code in the [Github repo](https://github.com/cc-archive/cccatalog-dataviz/tree/master/GSoC2019).
 ### Final comments and future work
 This was my first experience with big data visualization, and I really enjoyed it!
@@ -106,7 +106,7 @@ There are features that could be implemented in the future in order to reduce th
 [2D version] (http://ec2-3-80-82-250.compute-1.amazonaws.com/)
-[3D version] (http://ec2-3-80-82-250.compute-1.amazonaws.com/visualization_3d.html)
+[3D version] (http://ec2-3-80-82-250.compute-1.amazonaws.com/visualization_3d.html)
 *Yes, with my mentor Sophine, we thought it could be a great idea to try with a 3D version of the graph :) . You can interact with the graph in the same way as with the 2d version. CC Data Catalog Visualization is my GSoC 2019 project under the guidance of [Sophine
diff --git a/content/blog/entries/cc-datacatalog-visualization/contents.lr b/content/blog/entries/cc-datacatalog-visualization/contents.lr
index 9c4b6d814..2233fc78b 100644
--- a/content/blog/entries/cc-datacatalog-visualization/contents.lr
+++ b/content/blog/entries/cc-datacatalog-visualization/contents.lr
@@ -28,7 +28,7 @@ The landscape of licensed content is wide and varied. We have domains linking to
 ### Hands to work!
-Currently, there are tons of graphs and visualization concepts that have proven to work better with certain data. In addition, because of the huge amount of data we possess, a critical point to keep in mind is that the graph must keep being meaningful (for example, with the classic node-link approach, you can end up having a confusing hairball). Hence, the visualization must be scalable. Finding which visualization would work the best with the CC catalog dataset was my first task.
I reviewed the state-of-art in graphs visualization, but in this post I'm not going to go deep with it. The highlights and conclusions of this review are the following:
+Currently, there are tons of graphs and visualization concepts that have proven to work better with certain data. In addition, because of the huge amount of data we possess, a critical point to keep in mind is that the graph must keep being meaningful (for example, with the classic node-link approach, you can end up having a confusing hairball). Hence, the visualization must be scalable. Finding which visualization would work the best with the CC catalog dataset was my first task. I reviewed the state-of-art in graphs visualization, but in this post I'm not going to go deep with it. The highlights and conclusions of this review are the following:
 - Two case studies were reviewed, named [Papilio](https://onlinelibrary.wiley.com/doi/abs/10.1111/cgf.12395) and [GeneaQuilts](https://ieeexplore.ieee.org/document/5613445).
 - In both cases, data needs to be hierarchical (or have an attribute to use so to order or group the data) .
@@ -39,17 +39,17 @@ Currently, there are tons of graphs and visualization concepts that have proven
 **Conclusion:** neither of the solutions reviewed seemed viable for the CC-Catalog visualization project, as they do not show huge improvements in scalability. In addition, they do not fit for our case study.
 ### The graph
-Now that I knew what to draw, the next step was to find HOW to draw the graph. There is a wide range of visualization libraries where to choose from, you might feel overwhelmed at first. **Tip:** define priorities/ key aspects according to what you want to visualize. Performance with big data, community support and a smooth learning curve were the aspects I checked for making a decision. We finally chose to work with [force-d3](https://github.com/vasturiano/force-graph), an open-source library that uses the d3-force module as the underlying physics engine.
It has a friendly and simple API, as well as documentation and examples.
+Now that I knew what to draw, the next step was to find HOW to draw the graph. There is a wide range of visualization libraries where to choose from, you might feel overwhelmed at first. **Tip:** define priorities/ key aspects according to what you want to visualize. Performance with big data, community support and a smooth learning curve were the aspects I checked for making a decision. We finally chose to work with [force-d3](https://github.com/vasturiano/force-graph), an open-source library that uses the d3-force module as the underlying physics engine. It has a friendly and simple API, as well as documentation and examples.
-One challenging task was to draw the nodes. The idea here is to show the domains names inside the nodes, and let the nodes size to be data driven. Funny fact: I spent hours before finding out that I received nodes as canvas objects with the API. Once you know how to draw nodes, the styling of the edges is a ride, because in force-d3 they are handled in the same way as the nodes objects.
The edges are highlighted and their width increases when the user hovers over them. Another not-so-easy feature I implemented was to highlight a node and its neighbors. Here I have to thank to [Mr. Vasturiano](https://github.com/vasturiano), the author of the library, who directed me to an example he developed with a very similar functionality. He is continuously checking and fixing open issues in his repo. Great library! I recommend you to check it out.
+One challenging task was to draw the nodes. The idea here is to show the domains names inside the nodes, and let the nodes size to be data driven. Funny fact: I spent hours before finding out that I received nodes as canvas objects with the API. Once you know how to draw nodes, the styling of the edges is a ride, because in force-d3 they are handled in the same way as the nodes objects. The edges are highlighted and their width increases when the user hovers over them. Another not-so-easy feature I implemented was to highlight a node and its neighbors. Here I have to thank to [Mr. Vasturiano](https://github.com/vasturiano), the author of the library, who directed me to an example he developed with a very similar functionality. He is continuously checking and fixing open issues in his repo. Great library! I recommend you to check it out.
 ### Licensed content by types of CC licenses
-Creative Commons has [6 license types](https://creativecommons.org/licenses/). We know which licenses each domain uses for their content, so it would be great if we can show to the public, for example, the most popular license. The idea is then, to display in a simple graph like a pie chart, the licensed content of each domain, classified by their type. The pie chart will be placed inside a modal. The modal will be triggered and showed after the user clicks on a node in the graph. For this visualization, we are using [Highcharts](https://www.highcharts.com/).
+Creative Commons has [6 license types](https://creativecommons.org/licenses/). We know which licenses each domain uses for their content, so it would be great if we can show to the public, for example, the most popular license. The idea is then, to display in a simple graph like a pie chart, the licensed content of each domain, classified by their type. The pie chart will be placed inside a modal. The modal will be triggered and showed after the user clicks on a node in the graph. For this visualization, we are using [Highcharts](https://www.highcharts.com/).
-The final look of the front-end is:
+The final look of the front-end is:
CC Data Catalog Visualization
diff --git a/content/blog/entries/cc-gsoc-accepted-students/contents.lr b/content/blog/entries/cc-gsoc-accepted-students/contents.lr
index a4a7295dc..7a1b4790c 100644
--- a/content/blog/entries/cc-gsoc-accepted-students/contents.lr
+++ b/content/blog/entries/cc-gsoc-accepted-students/contents.lr
@@ -26,7 +26,7 @@ We reshuffled some of the mentors for the projects around (from our original [Pr
 Read more [about the accepted proposals here][cc-gsoc-projects]. Please don’t lose heart if you were not accepted! We were amazed at the quality of proposals we got and we had to make some really tough decisions. We appreciate the hard work you put into your proposals and contributing to our code and we wish we could accept more proposals. We hope that you’ll stick around and [be part of the CC developer community][contributing-code].
-
+
 If you’re interested in working on one of our remaining [project ideas][project-ideas] as a volunteer outside of Google Summer of Code, please reach out to me. These are all things that CC is invested in building and we can provide guidance. We are especially looking for help with [supercharging our search indexer][supercharging-elasticsearch], improving our tooling around license translations (including [automated license link checking][automated-license-link-checking]), and [plugins for integrating CC Search with external creation platforms][cc-search-plugins]. If we have multiple volunteers for the same idea, we will ask them to work together. *A note about availability*: all CC staff have very limited availability over the next week because of[ CC Global Summit][cc-summit] and CC is closed for most of the following week so we won’t be around at all then. Please be patient, we’ll respond to you as soon as we can.
diff --git a/content/blog/entries/cc-link-checker/contents.lr b/content/blog/entries/cc-link-checker/contents.lr
index b50694c72..a0db1abb0 100644
--- a/content/blog/entries/cc-link-checker/contents.lr
+++ b/content/blog/entries/cc-link-checker/contents.lr
@@ -67,5 +67,5 @@ You can follow the project on Github: [creativecommons/cc-link-checker](https://
 The project is approaching its completion. Can't wait to see it in production.
-*Signing off
+*Signing off
 Bhumij Gupta*
diff --git a/content/blog/entries/cc-platform-toolkit-revamp-2/contents.lr b/content/blog/entries/cc-platform-toolkit-revamp-2/contents.lr
index 4eeb3f3ec..8c89e455b 100644
--- a/content/blog/entries/cc-platform-toolkit-revamp-2/contents.lr
+++ b/content/blog/entries/cc-platform-toolkit-revamp-2/contents.lr
@@ -17,6 +17,6 @@ Time is really flying! I can hardly believe this is already week 5 as of my inte
 Since the holidays got a bit in the way of user testing existing platform implementations, my mentors and I slightly altered the schedule. These past weeks have been dedicated to reworking the content that is currently online (revisiting writing and format) and suggesting an entirely new structure. I've been documenting that process [here](https://drive.google.com/open?id=1g7_76zFmgtqq7khb_aBS-l2Wx8tScMgR1NPBO0vOdkM) as I tackle each section.
-What I really loved about this process is the fact that I'm able to see the platform toolkit through a holistic lens: I'm thinking about how the user is currently interacting with this content, how the information could be made more palatable, how that can be achieved visually, and finally, how to do it in a way that won't be impossible to implement.
+What I really loved about this process is the fact that I'm able to see the platform toolkit through a holistic lens: I'm thinking about how the user is currently interacting with this content, how the information could be made more palatable, how that can be achieved visually, and finally, how to do it in a way that won't be impossible to implement.
 I'm excited about the questions and answers that have been coming up during this process. Even though I'm far from delivering a final version, I believe this first rough draft already brings important improvements. My focus has been on diminishing cognitive load. That means taking very dense content and delivering the same amount of information with an approach that steps away from full copy and comes closer to a UI format. I've been learning a lot with each step of this project and I'm eager to make a positive contribution by the end of the internship. Soon enough I'll be checking in again soon to give a new follow up!
diff --git a/content/blog/entries/cc-platform-toolkit-revamp-3/contents.lr b/content/blog/entries/cc-platform-toolkit-revamp-3/contents.lr
index 22556fe01..7dc61a908 100644
--- a/content/blog/entries/cc-platform-toolkit-revamp-3/contents.lr
+++ b/content/blog/entries/cc-platform-toolkit-revamp-3/contents.lr
@@ -17,6 +17,6 @@ Last time I checked-in, I was working on revisiting the current [Platform Toolki
 It was a lot of work, but I'm finally happy with the wire-frame that came out after all the research and experimentation. The original material went through quite a few modifications, with text rewrites and changes in the order and organization of the content. But now comes the important part: making sure that these changes make sense to the users. My vision is already a little skewed, since I've been immersed in this project for the past 7 weeks. From now on, the process of validating this material needs fresh eyes.
That way, improvements can be made based on user feedback, and reflect the best possible version when it the time comes to implement.
-For the next two weeks my schedule is focusing on two different activities: I'll be going over a round of user interviews where I intend to show my wireframe and present a few tasks. The idea is to see how both content and usability perform in this new format. In parallel, I've also began taking these wire-framed components and sketching them out in a more refined UI format by experimenting with color, type, and so on. I really wanted to get an early start on this task—even if it's subject to change as the research gives me further insights—because I feel making the visual part come together will be the hardest part for me.
+For the next two weeks my schedule is focusing on two different activities: I'll be going over a round of user interviews where I intend to show my wireframe and present a few tasks. The idea is to see how both content and usability perform in this new format. In parallel, I've also began taking these wire-framed components and sketching them out in a more refined UI format by experimenting with color, type, and so on. I really wanted to get an early start on this task—even if it's subject to change as the research gives me further insights—because I feel making the visual part come together will be the hardest part for me.
 Thankfully, I have very supportive mentors and help all-around from the CC staff and community. I'm really happy with how this project is coming together and I hope in two weeks I can come back here and report on great progress!
diff --git a/content/blog/entries/cc-platform-toolkit-revamp-4/contents.lr b/content/blog/entries/cc-platform-toolkit-revamp-4/contents.lr
index eac249e2e..29f7db173 100644
--- a/content/blog/entries/cc-platform-toolkit-revamp-4/contents.lr
+++ b/content/blog/entries/cc-platform-toolkit-revamp-4/contents.lr
@@ -17,7 +17,7 @@ Phew, how is it possible that we're already in March? Between my last check-in a
 There were two rounds of user interviews: the first one was done looking only at the wireframe, focusing exclusively on usability and content. This provided a lot of important feedback to adjust the structure and move forward with the UI. I tried my best to apply the design standards already mapped out by the [Vocabulary](https://github.com/creativecommons/vocabulary/) project, so that in the future this wouldn't stand out from other CC materials.
-The second round of user interviews was done with a UI prototype, and was meant to take away some of my doubts regarding a few choices in this project. For instance, would it be best for the user to have the information in a single page, or would that be too much, and best to split in separate pages? There were pros and cons to both scenarios, but in the end we decided to stick with the single-page format because of a particular behavior: one of the ways users search for content is to ctrl+F and look for keywords. That would be drastically less effective with multiple pages.
+The second round of user interviews was done with a UI prototype, and was meant to take away some of my doubts regarding a few choices in this project. For instance, would it be best for the user to have the information in a single page, or would that be too much, and best to split in separate pages? There were pros and cons to both scenarios, but in the end we decided to stick with the single-page format because of a particular behavior: one of the ways users search for content is to ctrl+F and look for keywords.
That would be drastically less effective with multiple pages.
 After all the feedback (and adjustments) were done, it was time to make things come to life. I built the page from scratch with HTML/CSS/JS, making use of the Vocabulary library - it should be live soon, so yay!
diff --git a/content/blog/entries/cc-platform-toolkit-revamp/contents.lr b/content/blog/entries/cc-platform-toolkit-revamp/contents.lr
index 3214fab32..30ebb5a24 100644
--- a/content/blog/entries/cc-platform-toolkit-revamp/contents.lr
+++ b/content/blog/entries/cc-platform-toolkit-revamp/contents.lr
@@ -13,7 +13,7 @@ series: outreachy-dec-2019-platform-toolkit
 pub_date: 2019-12-16 --- body:
-I've just finished my second week as an intern for CC (part of the Outreachy program 2019-2020 round), helping revamp the [Platform Toolkit](https://creativecommons.org/platform/toolkit/) and it's been a great learning experience so far.
+I've just finished my second week as an intern for CC (part of the Outreachy program 2019-2020 round), helping revamp the [Platform Toolkit](https://creativecommons.org/platform/toolkit/) and it's been a great learning experience so far.
 Before we get into what I've been doing, I wanted to quickly explain why I think this is such a cool project - maybe you'll think it's cool too and want to collaborate as well at some point :) See, the better this toolkit is, the more we increase our chances of having really great, user-friendly implementations in partner platforms. Providing a helpful toolkit is one of the ways we can guide users towards finding better information, understanding licenses, and making better use of content -- not to mention that getting this right ensures we reach a ton of people.
diff --git a/content/blog/entries/cc-search-accessibility-week3-4/contents.lr b/content/blog/entries/cc-search-accessibility-week3-4/contents.lr
index 16a43be65..c2a80d31a 100644
--- a/content/blog/entries/cc-search-accessibility-week3-4/contents.lr
+++ b/content/blog/entries/cc-search-accessibility-week3-4/contents.lr
@@ -23,7 +23,7 @@ During this period I removed all the hardcoded strings from the static pages whi
 1. About Page
 2. Collections Page
 3. Search Guide Page
-4. Feedback Page
+4. Feedback Page
 All of the above pages were then internationalized following the same procedure as detailed in the previous post. While internationalizing the homepage we ran into an interesting problem:
diff --git a/content/blog/entries/cc-search-accessibility-week7-8/contents.lr b/content/blog/entries/cc-search-accessibility-week7-8/contents.lr
index 37dfae1b8..81ee7fb3f 100644
--- a/content/blog/entries/cc-search-accessibility-week7-8/contents.lr
+++ b/content/blog/entries/cc-search-accessibility-week7-8/contents.lr
@@ -77,8 +77,8 @@ The corrected code is: ```