Port to Python3, GTK3, port mwlib to *-python alts #1
mwlib is from https://github.com/godiard/mwlib
mwlib is likely upstream at https://github.com/pediapress/mwlib
acb8ff1 adds files at the top level of the activity, which would be misleading to new developers.
@quozl, I will remove the unwanted files, but it's not working though :(
@quozl, mwlib porting is currently not possible according to pediapress; the only option is to:
Alright then, you've explained why our older mwlib has to be ported. Ensure that explanation is in the commit message of the port of our older mwlib.
@quozl the creators of mwlib have commented that it's currently not possible to port mwlib to Python3. For now they are using https://github.com/earwig/mwparserfromhell.
Where did they comment? Why not possible? Everything is possible with code, eventually.
If it were me, I'd offer to help. If you don't feel you can do that in the time available to you, then set it aside.
So far, the commit e7b0e8e is the last functional version. The
acb8ff1 to c3dd688
How is this pull request going? 538fde2 seems to add redundant parentheses, as if 2to3 was run twice.
I have probably run 2to3 twice on some files, but I will fix it. The good news is that mwlib is fixed by removing the unneeded functions. Please review the .re files, because I don't know if what I have done is right.
You'd have to explain 5be2eea, as I don't understand it. Please confirm you have tested creating a new activity bundle from a Wikipedia download, and that the bundle does work on Sugar with Python3?
@quozl, when I try to compile mwlib after partially porting it, it asks for a constructor or a destructor. PS:
@walterbender @Hrishi1999 @chimosky Please review. Thanks
Thanks. Reviewed, not tested.
Overall I'm concerned that the removal of mwlib makes the creation of a new activity bundle a very expensive CPU operation. If it can be ported instead, it should be better. On the other hand, we've failed to keep the activity bundle up to date anyway, and a reasonable update rate may be once or twice every year.
@quozl I can remove all the mwlib resources. Should I proceed with that, then? Regarding CPU usage: to get the best performance at the lowest CPU usage, it is important to port the
Sure. Keep mwlib. Learn C. It's not hard. Python is based on C. Give yourself about ten hours to get started, and about a week to build the rest of the knowledge required. It is common in software engineering to learn a language only enough to fix something. In this situation, although I know C, and I know Python, what I don't know yet is how to write an extension for Python in C, nor how to do it for both Python 2 and Python 3. That's the skill set needed for mwlib porting. It is all empirical though; nothing needs to be a mystery. For myself I don't need to do this yet, because I've added Python 2 support to OLPC OS next release.
Chapter 8 of Supporting Python 3 covers Migrating C Extensions.
Yes, I read that during GCI, and I tried those steps too, but maybe I didn't do them well. I had learnt C briefly, just the syntax, but pointers were way beyond my understanding. I guess I need some real-life program to work on in C with Python libs so that I can understand them properly. I may try in the future, maybe in a few months,
Port to Python3 - Wikipedia Activity
This activity is in BETA testing. All testers are welcome.
Known Issues:
Installation
Installing the Wikipedia activity is easy for the end user, as they only need to install the `.xo` file. The minimal steps to build that file are given below.
Wikipedia Activity uses compressed `.bz2` XML dumps to create articles from Wikipedia.
Go to dumps.wikipedia.org and download the latest dump in your preferred language.
For my test purposes, I am going to download the "simplewiki dump progress on 20200101", which is the latest complete dump as of 2020-01-11.
Download the `*.bz2` link:
Periodically pull new changes, as it is updated quite frequently.
Locate the directory that holds the downloaded bz2 file.
Change your directory to the language you wish to develop for. For example, I am developing `en_simple`, so:
You might need to install bzip2 if it is not found; search online for instructions for your system.
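If installing bzip2 is inconvenient, the dump can also be decompressed with Python's standard `bz2` module. The following is only a minimal sketch, not part of the activity's own tooling: the function name `decompress_bz2` and the chunk size are assumptions made for illustration.

```python
import bz2
import shutil
from pathlib import Path


def decompress_bz2(archive: Path, chunk_size: int = 1024 * 1024) -> Path:
    """Decompress `archive` next to itself and return the path of the new file.

    Hypothetical helper for illustration; the name is not from the activity.
    """
    if archive.suffix != ".bz2":
        raise ValueError(f"expected a .bz2 file, got {archive.name}")
    target = archive.with_suffix("")  # drop the trailing ".bz2"
    # Stream in chunks so that multi-gigabyte dumps never have to fit in memory.
    with bz2.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
    return target
```

Calling `decompress_bz2(Path("dump.xml.bz2"))` on a file of that (made-up) name would write `dump.xml` beside it and leave the archive in place.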
If there are trailing numbers after the `.xml`, consider removing them for better presentation and to prevent errors while creating a bundle.
For example, change `ensimple....wiki.xml3116641313` to `ensimple....wiki.xml`.
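The rename above can also be scripted. This is a hedged sketch: `clean_dump_name` and `rename_dump` are hypothetical helper names, and it assumes the stray characters are always plain digits directly after `.xml`.

```python
import re
from pathlib import Path

# Matches ".xml" followed by one or more stray digits at the very end of the name.
_TRAILING_DIGITS = re.compile(r"(\.xml)\d+$")


def clean_dump_name(name: str) -> str:
    """Return `name` without digits trailing the .xml extension (hypothetical helper)."""
    return _TRAILING_DIGITS.sub(r"\1", name)


def rename_dump(path: Path) -> Path:
    """Rename the file on disk if its name needs cleaning; return the final path."""
    cleaned = path.with_name(clean_dump_name(path.name))
    if cleaned != path:
        path.rename(cleaned)
    return cleaned
```

Names that already end in `.xml` are returned unchanged, so running the helper twice is harmless.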
Now verify that you have two files, `search.db` and `ensimple....*.xml`, in your folder.
Process the dump file. Execute within the language folder:
It might take minutes (for 70 MB files), hours (for 300 MB files), or weeks (for 16.5 GB database files), depending entirely on your CPU speed. At this point, `pages_parser` crawls through the files to create an index, links, etc. to speed up access.
Quoting Gonzalo Odiard:
You can now confirm that `*.links`, `*.templates` and `redirects` are found. If the above process crashes with any error, these files will still be created, but they are likely to be empty. Re-running the above command will not help; you will first have to clean up the unwanted empty files that were generated.
You can now add or remove entries from the blacklist or favorite txt files; their use is assumed.
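The empty output files left behind by a crashed run, as described above, can be found with a short script. This is a sketch under assumptions: the helper names are hypothetical, and the glob patterns (in particular `*redirects*`) are guesses at how the generated files are named, so adjust them to what you actually see in the language folder.

```python
from pathlib import Path

# Assumed patterns for the generated files; "*redirects*" in particular is a guess.
DEFAULT_PATTERNS = ("*.links", "*.templates", "*redirects*")


def find_empty_outputs(folder: Path, patterns=DEFAULT_PATTERNS) -> list:
    """Return the zero-byte files in `folder` that match any of `patterns`."""
    empty, seen = [], set()
    for pattern in patterns:
        for path in sorted(folder.glob(pattern)):
            if path in seen:
                continue  # a file may match more than one pattern
            seen.add(path)
            if path.is_file() and path.stat().st_size == 0:
                empty.append(path)
    return empty


def remove_empty_outputs(folder: Path, patterns=DEFAULT_PATTERNS) -> list:
    """Delete the empty files found above and return the paths that were removed."""
    removed = find_empty_outputs(folder, patterns)
    for path in removed:
        path.unlink()
    return removed
```

Running `find_empty_outputs(Path("."))` inside the language folder first, and only then `remove_empty_outputs`, lets you review what would be deleted before re-running the parser.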
Perform the command:
Alternatively, you can ignore the blacklist and favorites with:
The above command checks the `en_simple` configuration and extracts the wikitext of Sun.
`activity/activity.info`. Make the necessary changes.
Dependencies
Checklist
`search-toolbar`
dependencies on browse-activity: search-toolbar
Credits
@Hrishi1999 for testing this on a new system and helping me find a lot of bugs
@quozl for global info
Disclaimer
This activity has been one of the hardest tasks I have ever taken up. See the time taken to see how long this PR has been worked on.