Python 3 support #62
Comments
Hi Tom, I'm a graduate in computational linguistics and would like to contribute to Pattern. Can you be more explicit about how Pattern should support Python 3? That is, do you want to maintain two different branches in parallel, one for Python 2 and one for 3? Or do you want to have a single code base that works both with 2 and 3? In the latter case, a library such as six would be useful. Let me know what you think. Cheers, |
Hi Peter, My goal would be to have a single code base that works with 2 and 3, but I have little experience with Python 3 so I don't know how feasible it is. In any case, the task is becoming more urgent so I will start looking into it more. I took a look at Any help is appreciated! Let me know what you think. Best, |
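For readers unfamiliar with the single-code-base approach discussed above, here is a minimal sketch of two common techniques: `__future__` imports and guarded imports for standard-library modules that were renamed in Python 3. It is illustrative only and is not taken from Pattern's source; the helper name `host_of` and the example URL are hypothetical.

```python
# Intended to run unchanged on Python 2.6+ and Python 3.
from __future__ import division, print_function, unicode_literals

try:
    # Python 3 location of urlparse
    from urllib.parse import urlparse
except ImportError:
    # Python 2 location of urlparse
    from urlparse import urlparse


def host_of(url):
    """Return the network location of a URL (hypothetical helper)."""
    return urlparse(url).netloc


print(host_of("http://example.com/path"))
print(3 / 2)  # true division on both versions thanks to the __future__ import
```

Libraries such as six wrap guards like this behind a single import, which is why they are often suggested for a shared code base.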
👍 on a single codebase. I think the first stage is to add Travis for testing (it looks like you're missing a
Happy to help if you can pass a |
Got through the first two steps outlined by @hayd in this fork (repo has a Some of the tests need to be excluded. from test.py
Travis is just running |
Ran futurize on the codebase. Here are some preliminary findings:
I'm a bit new to python, so any feedback is appreciated. This is a beautiful library and I'd love to see it get the unicode love from python 3. |
My 2cents:
Not sure what to do about API keys, was wondering what other modules e.g. pandas did for those parts... IIRC there may be keys you can use for testing of clipped results... Perhaps it makes sense to create a PR for this and comment there, then you can comment on specific bits of code :) ... first pass tests then make pretty |
There's an "official" fork of Pattern with the specific aim of making it compatible with Python 3: The wiki has some more information: The compatibility update is supported by a grant from the Python Software Foundation. This money is to be divided among contributors. You can read the grant proposal here: The fork is initiated by myself, Waylon Flinn and David Branner. Everyone (Peter & hayd?) is welcome to join as admin of the project. As admin, you'll be able to edit anything so feel free to take initiative! (we do encourage pull requests, so we can keep track of who did what) |
Happy to help with this, however when I tried (and trying again just now) running the tests I get a load of exceptions (python 2.7). I suspect this is just initial set up on my machine... What do I need installed / setup to run the test suite (locally)? Assuming fresh python install (or env) the following is failing:
See the Travis run in the above fork: https://travis-ci.org/pinleague/pattern/builds/32799385 (this is the kind of thing that's failing, though that run is a couple of months old). |
Hi Andy, My knowledge of Travis is zero, but different people including yourself have suggested it as a first step so I will examine it more closely. Looking at the output of the link you provided, these look like typical Python 2 vs 3 errors, e.g., using Best, |
@tom-de-smedt There is lots of stuff to migrate to python 3, but this can really only be done with confidence once the tests pass (and at the moment I can't get them passing either locally or on travis on python 2.7!!!). At the moment they (the python 2.7 tests) fail with the errors at the bottom of this page: https://travis-ci.org/pinleague/pattern/jobs/32799386. Any ideas why? |
Hi Tom, as I wrote at the beginning of this year, I'm still interested in contributing to pattern. However, I have not started yet because I didn't really know where to start. But now there exists a concrete plan and I would like to be part of it. I haven't written Python code for more than a year now but it should be easy for me to get into it again (I wrote a lot of Python code during my studies and I like the language very much). Last but not least, I have been out of the computational linguistics area since I started my current job a year ago, but it would be great to deal with that stuff again. Some things are not yet clear to me:
I cannot tell you yet which module I would prefer to work on. First, I need to take a look at the code again. I'm not sure though whether it's a good idea to have a lot of admins for the fork. Working with pull requests is much better anyway due to the reasons you mentioned. |
This was partially my misunderstanding (!), just running I had to capture a few actual test failures and some HTTP403Forbidden and HTTP404NotFounds. There are also a couple of proper errors (in python 2); I'm skipping those tests for now, but they really need looking at, so I've labelled them FIXME in my branch (should I PR to pattern3 or here once passing?)... As I said above, it See hayd/pattern@c5d9c23...ce1fe81 (and on travis https://travis-ci.org/hayd/pattern/builds/39245044, unfortunately not quite passing python 2.6 and 2.7, I may have to skip/fix a couple more? Some tests seem flaky - especially those that compare e.g. to 0.771!). Note1: This allows the test suite to be run by simply calling Note2: I'm skipping the mysql tests atm, but that's no biggie to fix, just an install in the yml (our objective is for no tests to be skipped on travis); the others are more important, but I'm afraid I need a Pattern expert to look at the FIXMEs!! Just to clarify the objectives here:
|
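The two testing tactics mentioned in the comment above (skipping known failures with a FIXME label, and dealing with tests that compare against an exact value such as 0.771) can be sketched with the standard `unittest` module. This is only an illustration under assumptions: the `accuracy` function and the test names are hypothetical stand-ins, not code from Pattern's test suite, and `unittest.skip` and the `delta` argument require Python 2.7 or later.

```python
import unittest


def accuracy():
    """Stand-in for a score that varies slightly between runs (hypothetical)."""
    return 0.7689


class TestExample(unittest.TestCase):

    def test_accuracy(self):
        # Compare within a tolerance instead of requiring exactly 0.771.
        self.assertAlmostEqual(accuracy(), 0.771, delta=0.01)

    @unittest.skip("FIXME: known failure, needs someone who knows the module")
    def test_known_failure(self):
        self.fail("not reached while the test is skipped")


def run_example():
    """Run the two example tests and return the unittest result object."""
    suite = unittest.TestLoader().loadTestsFromTestCase(TestExample)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_example()` reports one passing test and one skipped test; the skip reason keeps the FIXME visible in the test output until someone fixes it.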
To answer @pemistahl: I don't think going fully py3 (and dropping support for py27) is a good option for a library... for the next decade! I would like to see a shared code base and drop support for python <= 2.5 (nearly every library is dropping python 2.5 support). I'd really like to see pattern3 (once ready) merge upstream into pattern. |
@hayd OK, I get your point. I'm okay with that. It just reminds me again of how unhappy I am about the Python 3.* transition in general across the Python community. Another question @tom-de-smedt : If working with pull requests is the preferred way for contribution, then why did you create the pattern3 fork? Anyone who wants to contribute would create their own fork anyway. Wouldn't it be sufficient to simply create a branch here in the main repo for this purpose? |
I've submitted a couple of PRs to the pattern3 branch; I think it makes sense to fix that up and then merge back here (it's going to be easier to keep track of things if they are in separate repos, with separate issues/PRs etc). I would strongly recommend downing tools for a short while (here on clips/pattern) - hopefully only for a few weeks - and concentrating on the pattern3 branch/repo. I'm "somewhat hopeful" it's not a massive job (famous last words). Once the python3 imports are working it should be clearer where the hit list is going to be (I suspect the toughest are the str/bytes handling). |
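Since str/bytes handling is singled out above as the likely hard part, here is a minimal sketch of the usual approach: decode bytes to text at the input boundary, work with text internally, and encode again at the output boundary. The helper names `to_text` and `to_bytes` are hypothetical and are not claimed to match any function in Pattern.

```python
import sys

PY2 = sys.version_info[0] == 2
# On Python 2 the text type is `unicode`; on Python 3 it is `str`.
text_type = unicode if PY2 else str  # noqa: F821 (unicode only exists on Python 2)


def to_text(value, encoding="utf-8"):
    """Decode bytes to text; return any other value unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding="utf-8"):
    """Encode text to bytes; return any other value unchanged."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value


raw = b"caf\xc3\xa9"      # UTF-8 bytes as read from a file or socket
text = to_text(raw)       # four characters of text on both Python versions
assert to_bytes(text) == raw
```

Keeping the conversion in two small functions means the rest of the code never has to ask which Python version it is running on.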
Just to update those following at home: last night I got python 3 running all tests without syntax or import errors (of course, half those tests are failing); python 2 is still passing all the tests (except those tests which failed before migration, which are skipped). (It did require ripping out the bundled (vendorized) packages and making them dependencies - I think this is a good idea anyway... so, more "home-testing" in python 2 may be a good idea before this update is merged back into clips/pattern? esp. where there is poor coverage.) This means there is a more obvious hitlist of things to do. For those who want to help I recommend (once this is merged) attempting to make all the tests pass on specific testing files you're interested in (e.g. for database):
$ nosetests test/test_db.py
$ nosetests test/test_db.py:TestClass
$ nosetests test/test_db.py:TestClass.test_method
$ nosetests test/test_db.py --pdb --pdb-fail # drop in when there's a failure/exception
A more complete todo list issue: pattern3#5
I haven't really thought about how six fits here, IMO if it makes fixing a test easier then use it? |
Hello, I'm looking forward to using Pattern with Python 3, because my work is written in it. I'm kind of confused by the current state of Python 3 support. This package is not installable (at least, not through By the way, Python 3 is getting more and more focus today and it's a very good idea to follow this trend. You use a lot of packages that are somehow embedded, which is definitely not a good idea for the future (e.g. BeautifulSoup_v3.2.1 has not been supported for years). |
@kotrfa pattern3/pattern isn't on pip yet (so not installable), the tests aren't passing for python 3 either so it's not ready for release yet - though quite a bit of work has been done. I think the plan is for this fork to become the pattern on pip (at least that's my understanding), and it'll support both python 2 and 3. In pattern3/pattern I've ripped out a load of the vendorised deps (which is perhaps why it looks like the code base is so different), for example beautiful soup. The tests from clips/pattern are still all there and all pass (in python 2), so nothing was removed in this process (I claim). If you'd like to help out, which would be fantastic, please clone pattern3/pattern and see if you can help with anything in the todo list (maybe pick a test file and get it passing in both python 2 and 3, perhaps the section you need in your work?). I have a few of the areas of the codebase passing already (in both python 2 and 3), IMO it's not a huge amount of work to go :) mostly fiddly unicode stuff, then we can get it out on pip... |
Hello, yeah - I was speaking about installing this fork, not Pattern3, which is, as you said, not available on pip. I don't really need any part of pattern currently - my work is almost done and I've found Pattern too late, unfortunately. Nevertheless, maybe I could replace some parts of my current code using Pattern and simplify it. In that case, I would definitely like to help. But it doesn't seem likely I'll do it in the following weeks, since the end of the semester is coming. You have done an amazing amount of work by the way, thank you! |
FYI all, I did a little the last couple of days, now
Surprisingly these are py3 only failures (the py2 still passes)... That said, there are some hacks - especially the unicode workflow - which could be cleaned up. Edit: Too hasty in victory, I've nearly got vector working https://travis-ci.org/hayd/pattern/jobs/43751620 |
Thanks for the information! It is really promising. 👍 |
@tom-de-smedt actually the vector thing is a little weird: it looks like the vector tests fail about 50% of the time on python 3 although they pass all the time on python 2 (from running the test 10 times on both). In a way it's good that I think we're into a place where expertise is needed! :) see pattern3#17 |
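Measuring an intermittent failure by rerunning a test, as described above, can be sketched in a few lines. The `failure_rate` helper and the `sometimes_fails` stand-in are hypothetical; the stand-in fails deterministically on every other call so that the example is reproducible, unlike the real vector tests.

```python
def failure_rate(test, runs=10):
    """Call `test` repeatedly and return the fraction of calls that raise AssertionError."""
    failures = 0
    for _ in range(runs):
        try:
            test()
        except AssertionError:
            failures += 1
    return failures / float(runs)


calls = {"n": 0}


def sometimes_fails():
    """Stand-in for a flaky test: fails on odd-numbered calls."""
    calls["n"] += 1
    assert calls["n"] % 2 == 0, "fails on odd-numbered calls"


print(failure_rate(sometimes_fails))  # 0.5
```

Running the same helper under both interpreters gives a rough number to attach to a bug report, though ten runs is a small sample.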
+1 for Python 3 support. I realize the need to support a mature, powerful, and loyal community of legacy Python users, but Python 3 is only going to get more relevant with time, not less. More importantly, Python 3 is just better. Its standard library organization is much cleaner, its syntax is more readable, and in many common cases it performs significantly better than Python 2 (speed and/or memory footprint). That said, it’s often trickier to port to Python 3 than it “feels” like it should be. For a while, To make the transition as painless as possible, I strongly recommend the Python-Future package. It is way more powerful than I believe it was mentioned earlier in this thread, but I just wanted to reiterate its awesomeness for anyone that might have missed it. Seriously, just browsing its documentation can evoke the inspiration to transition to a 2-3 compatible codebase. I haven’t used Pattern yet, but it also has excellent documentation (great job!). Unfortunately, my current research is in Python 3. That’s how I found my way to this page. I hope Pattern gets to Python 3 soon! Keep up the excellent work, and May The Source™ Be With You! |
@Zearin I used future to do the majority of the heavy lifting in the python 3 port, see the pattern3 repo. Please do try it out. |
How would you define the "state" of the project for porting Pattern to Python 3? I used it two years ago with Python 2.7 and it was awesome; now I'm going to work with Python 3 and I would love to use it (Pattern) again! Thanks! |
Greetings, we came across this from here, and I just noticed that while a lot of the build looks stable, support for Python 3.3 seems not to be working? At least that is how I would interpret the Travis CI page. Thanks. |
I realize I'm a bit behind on keeping people following this issue up to date with the latest progress! Google Summer of Code has been over for a couple of weeks now, and it has brought substantial progress (see full list of commits). We are now in a position where we have a version on the So now the plan is to smooth out the rough edges and release a new major version, Pattern 3.0, within the next months. There is really only one known bug at the moment that is solely related to Python 3 and it only affects the information gain tree classifier In the meantime, everybody feel free to check out the |
not sure how long this is gonna take. but as soon as the page is up again maybe you can add some info about that dev works with py3, what the restrictions are (what does not work yet) and how to install it. e.g.
might make a lot of people happy |
Sure, I will update the |
@markus-beuckelmann Is there any update on this? Is the current advice to build the development branch if we need python 3 support? |
@jpfairbanks, yes, if you need Python 3 support right now you can check out the |
On debian 9 |
I have installed
Any idea or help will be appreciated :) |
@masaguaro I have the same issue while trying to use gensim lemmatization. Maybe a recent push went wrong because of a mix of tabs and spaces? I'll open a new issue about this. Update: the last commit to pattern3.text.tree.py seems to be 3 years ago. |
Thank you @JanmajaySingh. I posted the same question on stackoverflow but got no answer. It seems that there is not much Python 3 support. Perhaps you are right, and it's just a mix of tabs and spaces, which shouldn't be difficult to fix (using Sublime, for example). Right now, I am doing some work with NLPTK, but I will keep your idea for future use. |
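The tabs-and-spaces hypothesis raised above is easy to check mechanically. The sketch below is a hypothetical helper, not part of Pattern, and it only catches the simplest case: a single line whose leading whitespace contains both tabs and spaces.

```python
def mixed_indent_lines(source):
    """Return 1-based numbers of lines whose indentation mixes tabs and spaces."""
    bad = []
    for number, line in enumerate(source.splitlines(), 1):
        # Leading whitespace is whatever lstrip() would remove.
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if " " in indent and "\t" in indent:
            bad.append(number)
    return bad


sample = "def f():\n\tif True:\n\t    return 1\n"
print(mixed_indent_lines(sample))  # [3]
```

For a more thorough check, the standard library's `tabnanny` module (`python -m tabnanny file.py`) also flags indentation that is inconsistent between lines.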
@masaguaro @JanmajaySingh (#217), you are using the deprecated |
@markus-beuckelmann Thanks! Issue #217 was closed. |
Hello @markus-beuckelmann @JanmajaySingh
Any idea ? Thank you in advance. |
Hey, Is there any time frame on an official release of the python3 support? Or an idea of how close it is to being ready? Thanks, |
If enough people are interested, especially if it helps them in their work, maybe we can consider putting a bounty on this task? It would be nice if such work were paid :) |
@zedrem @tuxayo Considering that the development branch was last updated 9 months ago, I guess the primary contributors have been busy. The dev branch in its current form works without issues (at least for me). You can refer to @markus-beuckelmann 's comment (March 6). |
@masaguaro I dunno if you could find a workaround to your issue, but I guess modifying the C source code to something like long long int might help. But it may break other modules in unexpected ways. I don't know any other details about your project though, you're better off asking on S/O. |
Yea thanks, I've read the thread and understand that you can use the tool from the development branch. For what I needed, I was happy to even use the pattern3 side-project that was set up initially and then discontinued. However, I really like Pattern's article extraction tool and want to incorporate it into another tool that I'm building. As far as I'm aware (?) there is no way to do that with pattern in its current condition. I'm guessing that you can't specify specific git branches in your requirements.txt, for example? If anyone is aware of any similar article extraction tools, please let me know (but I'm conscious that it is off topic for this thread). |
@zedrem For me, I can install it by running this
|
@zedrem You can certainly specify commits in requirements.txt such as As @septiangilang says on ubuntu you will need |
A lot of work was done by @markus-beuckelmann during last year's GSoC. During this year's GSoC, @Xsardas1000 (Maksim Filim) is doing great work (Markus and I are mentoring). Check Max's progress here: https://github.com/clips/pattern/tree/devmodified We should be able to get out an "overall stable" official release by the end of the month, if everything goes well. If you notice things that don't work yet, please report them here. Better yet, if you want to help out, please let us know, we can give you some editing privileges and author credits to move things forward more quickly. As a side note, the documentation needs to move to a new location too (e.g., www.pattern3.net). Let us know if you'd want to contribute some web development skills to this end. Thanks for your patience, we're nearing a stable release of Pattern 3. |
Hi, I'm an undergrad student at Unicamp (Brazil) and got interested in helping this repo. I thought about starting with code coverage and found that there's some duplicated code at |
Pattern should start supporting Python 3. Looking at the amount of code, it is a non-trivial task and any help is much appreciated.