Python 3 todo list #5

Open
13 of 35 tasks
hayd opened this issue Oct 28, 2014 · 16 comments

Comments

@hayd
Member

hayd commented Oct 28, 2014

I've broken up python 3 migration (#1) into the following independent tasks.

Fix the skipped (Python 2) tests, see #3:

  • mysql (Travis just needs to install it; I think this is no big deal, i.e. a line in .travis.yml)
  • test command line (uses PIPE atm, which doesn't work on py2.6)
  • test _keywords
  • skip MediaWikiArticle
  • I think there are a couple more from "travis passing python 2 tests" #3 (search for FIXME and TODO)

Python 3 stuff

test files passing on python 3

e.g. via nosetests test/test_xx.py

General stuff

  • use a single README (probably an rst is best)
  • add a MANIFEST.in - burn with fire the current os.walk stuff (other files?)
  • grab version number from the init file (rather than the pattern installation)
  • style for importing py3 stuff is a bit random atm (my fault); it should be more consistent
  • remove testing suite functions (just use test_main).
  • some tests are pretty flaky (i.e. numbers changing)
  • remove the utf-8 print hack added to the examples (needed because print under __future__ print_function is sensitive to unicode), i.e. make all the things unicode. See "Unicode all the things" #12
  • some tests are obscured by being so class-based (although some are still not very DRY); atm I prefixed the base classes with Abstract..., but they should probably be ABCs (the key is that their names can't start with Test, otherwise nose etc. tries to run them and fails).
  • javascript tests (?) currently not run - unclear whether they do anything?!
  • add the examples to the tests (e.g. just run all the example py files, no assertions, just running... though we could potentially add some asserts?) - I have nearly done this.
  • work out which dependencies can stop being vendorized (see below)
  • deprecation messages from uses of dependencies (which have been updated), see travis/nose output
  • add travis, coveralls, landscape (sign up with pattern & pattern3 gh accounts and add this repo)
  • add banner for the above into README
  • pep8/docformatter all the things (this should be simultaneously merged with clips/pattern, otherwise merging in work to clips will be very very difficult ???)
  • decide on when this can be merged back upstream (IMO this should be asap, we don't require py3 to be ready just that py2 still works... and a quick pep8 storm :) )
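
On the Abstract... test classes mentioned in the list above, one possible shape is a plain mixin that holds the shared test methods, with only the concrete subclasses inheriting from unittest.TestCase. This is a minimal sketch, not code from the repo: the class names and the `language` attribute are hypothetical, and it assumes the runner only collects TestCase subclasses or classes whose names look like tests.

```python
import unittest


class AbstractParserTests(object):
    """Shared tests (hypothetical). Not a TestCase and not named Test*,
    so a test runner has no reason to collect it on its own."""

    language = None  # concrete subclasses are expected to set this

    def test_language_is_set(self):
        self.assertIsNotNone(self.language)


class TestEnglishParser(AbstractParserTests, unittest.TestCase):
    """Concrete test class: this is the one the runner should pick up."""

    language = "en"
```

The mixin comes first in the base list so that its methods are found, while unittest.TestCase supplies the assert helpers.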

Performance

  • come up with some benchmarks to compare python 2 and 3 (and potentially the old code base)
  • profile and see what can be improved...
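
A rough sketch of what such a benchmark could look like, using only the standard library so the same script runs under Python 2 and 3. The `tokenize` function is a stand-in workload, not a Pattern function, and `benchmark` is a hypothetical helper; real benchmarks would call into the library instead.

```python
import sys
import timeit


def tokenize(text):
    # Stand-in workload for illustration only (NOT part of Pattern).
    return text.lower().split()


def benchmark(func, arg, repeat=3, number=1000):
    """Return the best observed time per call, in seconds."""
    timer = timeit.Timer(lambda: func(arg))
    # min() over the repeats is the usual way to reduce noise from other processes.
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    per_call = benchmark(tokenize, "The quick brown fox " * 50)
    print("Python %d.%d: %.2e s per call" % (sys.version_info[0], sys.version_info[1], per_call))
```

Running the script once under each interpreter gives two numbers to compare; for the profiling step, the standard-library cProfile module (`python -m cProfile script.py`) is one option.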
@hayd
Member Author

hayd commented Oct 29, 2014

I would also like to do a big (edit: automated) pep8 cleanup, but really that requires downing tools on the main pattern repo (until this effort is ready to merge back in), since merge conflicts after a pep8 storm can be expensive (so you want to do it quickly). I need confirmation that this is the game plan.

@hayd
Member Author

hayd commented Oct 29, 2014

One big problem IMO is vendoring all these modules, which are clearly dependencies, e.g. beautiful soup, simplejson, feedparser?!!? Waaat. These should all be removed and declared as dependencies.

We don't want to port those or manage them... at all.

  • SVM (not looked into this at all, I missed it on first pass)
  • PyWordNet (should use the updated version from nltk?); need to check that the patches are already added. See "wordnet issues" clips/pattern#64
  • PdfMiner (needs python3 fixes, but they have tests + travis!). The author claims they will never release a python 3 version, but there is a fork/PR, "Python 3.4 compatibility, and nosetests" euske/pdfminer#71 ... it may be easier to use pdfminer if py2, pdfminer3k if py3 (at least for now) :s
  • docx, not sure if this is python 3 ready (?) Edit: it is (I was looking at the old repo); the new one is https://github.com/python-openxml/python-docx
  • ["beautifulsoup4", "cherrypy", "feedparser", "simplejson"] I think are all py3 ready already
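
The "pdfminer if py2, pdfminer3k if py3" idea from the list above could be expressed in setup.py roughly like this. It is only a sketch under assumptions: `pdf_requirement` and `install_requires` are hypothetical helper names, and the base list is just the four packages named in the last bullet.

```python
import sys

# Packages the list above names as (probably) Python 3 ready already.
BASE_REQUIREMENTS = ["beautifulsoup4", "cherrypy", "feedparser", "simplejson"]


def pdf_requirement(major_version=None):
    """Pick the PDF dependency for a Python major version (hypothetical helper)."""
    if major_version is None:
        major_version = sys.version_info[0]
    return "pdfminer3k" if major_version >= 3 else "pdfminer"


def install_requires(major_version=None):
    """Build the full requirement list for the running (or given) interpreter."""
    return BASE_REQUIREMENTS + [pdf_requirement(major_version)]
```

In setup.py the result would be passed along as `setup(..., install_requires=install_requires())`.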

@hayd
Member Author

hayd commented Oct 30, 2014

@tom-de-smedt I have a branch with this running all 485 tests in python 3 (and they pass in python 2 as well; obviously the tests still fail on python 3). It would be great if you could look at / merge the pending PRs.

I ripped out all the dependencies (as discussed above); it was less painful than I thought it would be (as well as being "the right thing to do"). As such, the upcoming PR is somewhat larger (and potentially more controversial)...

@waylonflinn

Thanks @hayd! You've done an impressive amount of stuff in the past couple of days! I'll take a look at those pull requests tomorrow morning and get them merged in.

@hayd
Member Author

hayd commented Nov 1, 2014

Fab, thanks!

I just want to clarify my thoughts for moving forward: since it's still unclear how large a job a full migration will be, we want to minimize git-pain later on...

  1. merge all the PRs (above) ;)
  2. pep8 clean up (happy to do this, it's just a script)
  3. point release of clips/pattern (before including the dependency changes etc., which are a big change)
  4. merge pattern3 into clips/pattern (and turn on travis etc there too)
  5. continue/finish python 3 migration on pattern3... "at leisure"
  6. merge back into clips/pattern and have a major release version (probably 3.0).

This way any bug fixes merged/PRd on clips/pattern during python 3 development can be merged cleanly (after the 1000s of lines of py3 syntax and pep8 changes).

Update: I feel less strongly about this now.

@EnricoGiampieri

Hi, I'm interested in helping with the port. I routinely work in python3 and I'm really interested in a "modern" version of this library. Any suggestions on where to start?

@hayd
Member Author

hayd commented Apr 9, 2015

@EnricoGiampieri I think getting test_db.py to run successfully is next up. Run the test and fix the errors/failures :)

@wannaphong

fix sgmllib is deprecated in python 3 #21

@maxschaf

hi, there was no action in this repo during the last year. anyone still working on this?

@souravsingh

Hi, I would like to contribute to the Project by fixing some skipped tests. How do I start?

@hayd
Member Author

hayd commented Sep 11, 2016

@souravsingh see the comments in #3. larger projects are test_web and test_db.

#17 was waiting on @tom-de-smedt, so this port has stalled.

@james-see

@hayd for mysql, I would just add mysqlclient as a requirement: in .travis.yml add install: pip3 install mysqlclient (or add it to requirements.txt). mysqlclient is a drop-in replacement for MySQLdb and is even imported the same way, as import MySQLdb.
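
On the test side, the suggestion above could be paired with a guard so the MySQL tests are skipped rather than failing when the driver is missing. A minimal sketch, assuming mysqlclient is what provides the MySQLdb module; the test class and method names are hypothetical and not taken from the repo.

```python
import unittest

try:
    import MySQLdb  # the mysqlclient package installs a module with this name
except ImportError:
    MySQLdb = None  # driver not installed; the tests below get skipped


@unittest.skipIf(MySQLdb is None, "MySQLdb (mysqlclient) is not installed")
class TestMySQLDriver(unittest.TestCase):
    """Hypothetical smoke test: only checks that the driver is importable."""

    def test_driver_exposes_connect(self):
        self.assertTrue(callable(MySQLdb.connect))
```

With the driver installed on Travis the test runs; on a machine without it, it is reported as skipped.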

@james-see

@hayd how best for me to do this part myself? fork and pull request it for that specific issue?

@JimSEOW

JimSEOW commented Nov 22, 2016

As part of the process of running Gensim in py3.5 Win64 Windows10, I made a few changes to pattern3
piskvorky/gensim#1025

Update [1st Dec 2016]

@i git clone https://github.com/pattern3/pattern; python setup.py develop in MiniConda py3.5 Win64
=> First problem: pypiwin32
=> I downloaded pypiwin32-219-cp35-none-win_amd64.whl
=> pip install pypiwin32-219-cp35-none-win_amd64.whl
=> repeat python setup.py develop with pattern3
=> Install works ok

FYI: I have not run all the tests yet!


@utkarshrai

Is the project still on? I'd love to join in.

@tom-de-smedt
Member

Update: As part of Google Summer of Code 2017, Markus Beuckelmann (@markus-beuckelmann) will be working on the future of Pattern (porting it to Python 3 is first on our list). Markus is admin of the main repo now and can handle pull requests and invite collaborators. Be sure to reach out to him and include him in discussions about the port. We'll focus on the main repo, and reintegrate what we can from the work done by @hayd here. Hopefully we can make some progress over the summer.
