Plans for a future rewrite #111
Comments
A slight amendment to the plan described above: I will not be planning the type algebra out on paper before writing any code. This is because I found it impossible to specify the type signature of every helper function without writing implementations. It simply wasn't clear that the return types I was picking were even possible. I have not given up on working from the algebra backwards (or perhaps we should say forwards). However, the plan is now to write a minimal implementation, following my nose, then generate a spec via QuickSpec, and then use that spec to do a rewrite, starting with the type algebra. In the meantime, I have also found that the property-based testing ecosystem in Haskell is both deeper and wider than I initially imagined. |
A meta-review of the friends of Haskell's QuickCheck
Our purpose here is to be complete and concise. Below are lists of links to all tools for property-based testing and its applications in Haskell, ordered roughly from popular to obscure.
Property-based testing libraries
Equational specification generators
Property refining libraries
Function synthesis
Companion libraries
General recommendations
There may not be many reasons to prefer random implementations over enumerative ones. With enumerative frameworks, the problem of shrinking just doesn't exist, your tests are deterministic, custom instances are easier to write, and in all likelihood, you find bugs faster. I'd tentatively say to avoid Hedgehog, as it appears to have all the worst design decisions rolled into one package (random, no free generators/instances, integrated shrinking), despite its popularity/SEO ranking. You can't really practice Elliott-Maguire denotational design without a spec generator. Best practice is probably to live in the Matela-verse and use LeanCheck, Speculate, and FitSpec, or, if you want to stick to more tried-and-true stuff, go with QuickCheck or SmallCheck, in combination with QuickSpec. |
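To make the point about shrinking concrete, here is a minimal, language-agnostic sketch of enumerative testing, written in Python only because that is what the current codebase uses. It is not the API of LeanCheck, SmallCheck, or any other library named above; the helper names (`lists_up_to`, `first_counterexample`) and the two example properties are made up for illustration. Inputs are enumerated from smallest to largest, so the first failure found is already minimal and the run is fully deterministic.

```python
from itertools import product


def lists_up_to(max_len, alphabet=(0, 1)):
    """Yield every list over `alphabet`, shortest first."""
    for n in range(max_len + 1):
        for items in product(alphabet, repeat=n):
            yield list(items)


def first_counterexample(prop, max_len=3):
    """Return the first (xs, ys) pair falsifying `prop`, or None.

    Pairs are visited in order of combined length, so any counterexample
    returned is as small as possible and needs no shrinking step.
    """
    for total in range(2 * max_len + 1):
        for xs in lists_up_to(max_len):
            for ys in lists_up_to(max_len):
                if len(xs) + len(ys) == total and not prop(xs, ys):
                    return xs, ys
    return None


def bad_property(xs, ys):
    # Deliberately false: reversal does not distribute over (+) in this order.
    return list(reversed(xs + ys)) == list(reversed(xs)) + list(reversed(ys))


def good_property(xs, ys):
    # The correct law: reverse (xs ++ ys) == reverse ys ++ reverse xs.
    return list(reversed(xs + ys)) == list(reversed(ys)) + list(reversed(xs))
```

Calling `first_counterexample(bad_property)` walks the same inputs in the same order on every run, which is the determinism referred to above; a random generator would instead need a seed and a separate shrinking pass to arrive at a comparably small failing case.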
I have been using As such, I'm switching to |
It sounds awesome! |
@dobefore Thanks very much for your comment. I am implementing a very small subset of the Rust backend. The Rust backend of Anki, if you've read much of it, is much too complex. In particular, I need not deal with any information related to reviews. It is a dreadful experience trying to trace a function call through that codebase to figure out what it's actually doing. More generally, Rust makes code significantly safer, easier to read, and smaller than it would be if written in C. It does this via some nice abstractions and features/constructs that C simply does not have. Similarly, Haskell makes code significantly safer, easier to read, and smaller than it would be if written in Rust. And this is for the same reason: it has abstractions and language features that Rust does not have. In particular, I would like to see someone try and write something like |
Effect systems
There is brief mention of effect systems above. For a while now, the main contenders for the Haskell ecosystem have been However, there are two lesser-known options taking a completely different approach that seems awfully promising. Both Without having used either in a major project, I cannot make any informed recommendations. However, if I move this tool to an effect system, |
Micro task queues
This will be an experiment testing a goofy little trick to keep me on task. The basic idea is that I will break down each hour of development into roughly 5-minute tasks. Hopefully this will give me a better idea of the superstructure of the stuff I need to do next, and also make things go faster. Perhaps I can also log the time I started and the time I stopped.
|
Linking C libraries in Haskell
The following resources were useful: |
Hi there, just stumbled across this project and will give it a try over the weekend - seems awesome! What does the Haskell rewrite mean for the project, the current issues raised and ways to contribute? I'd love to contribute to the project in a way, but I'm not (yet) comfortable in Haskell. |
@DonCiervo I'm looking forward to hearing how it works for you! Let me know if there are issues you run into that I might be able to fix. (Feel free to open more issues!)
The Haskell rewrite means that eventually, the Python version will be deprecated. At that point, the command-line version of the tool will be distributed either as signed binaries, right here on GitHub, or alternatively, I may consider distributing the binaries via As for the current issues raised, they will likely not be fixed in the Python codebase in the meantime. A lot of the open issues are simply features I thought might be necessary or improvements to the development workflow. There is only one real bug that could impact users, which is #109, and it only matters if you have notes whose cards are split between decks. This is at least somewhat uncommon, and only rears its head if you make edits to those notes on the markdown side.
I've found solutions to nearly all of these things on the Haskell side. The biggest problem I was running into was that it was simply too onerous and too slow to perform effective property-based testing on the Python codebase. Hypothesis is the best thing folks have in the Python ecosystem, but it is quite unpleasant to use, in my opinion.
Ways to contribute
I'm extremely happy to hear that you're interested in contributing, and I will bend over backwards to make it as easy and enjoyable as possible for you if you are committed to helping out. There are a number of ways you can do this without even writing any code. Here's a list for you to consider:
On contributing code
There are a couple of ways you could do this.
As an individual. If you're comfortable, I could simply pick some very simple tasks that would get you warmed up to the language and the codebase (which is still minuscule), and you could just run with it.
As a side-along dev. If you like pair programming and have some free time for it, I could also spend some time each week bringing you up to speed on some sort of call using a collaborative environment or screenshare. It would be easier to get a handle on the repository and the implicit knowledge that goes along with the code in this way, for obvious reasons.
Whatever you choose to do, I encourage you not to take on too much too fast. I've been working on this for about a year now, and it would be much more useful to have a very small amount of someone's time over a long timescale rather than a couple of weeks of intense attention. |
@langfield Cheers, appreciate the extensive answer. Maybe you should consider copying this information to a separate markdown file entirely so newcomers can get a quick overview. Personally I am also very partial to pinned issues for any "Wanna help? Here's how" type stuff. Regarding the move away from Python, what is the long-term goal for Now, I am aware this is by no means as easy as I'm making it out to be, but neither is a rewrite in Haskell. I get that Haskell is one of the more beautiful languages to use, but if maintainability and stability are the goal, then the Haskell environment doesn't have the best reputation either, if I'm being honest. |
Long-term goals
An addon is planned, but it's rather low priority for me. This tool is first and foremost for my own use, and I have no desire for a GUI. It is also quite a tricky problem to figure out how to allow users unfamiliar with I am more interested in rolling things out to deck maintainers first, and the next big user-facing feature I have planned is rendering collections via Jekyll in a GitHub Actions workflow. See #41. Deck maintainers are a bit more tolerant of bugs and sharp edges, and make excellent alpha testers. They also better appreciate the usefulness of
Language choice
As I mentioned earlier, the problem with Python was that I was unable to write the sort of robust test suite that is possible in some other languages. I was finding critical bugs too often, and the code was simply unsafe.
Re: maintainability and stability: package distribution via Stack/Stackage is extraordinarily stable. The resolver system means that ordinarily, the developer never even has to think about compatibility of dependencies. All dependencies packaged for a given resolver are compatible by definition. It is very common, more the rule than the exception, for projects to go unmaintained for 5-10 years simply because there are no changes required to the code to keep it working for end users.
Nowhere is this more obvious than in the commit count. In the Python ecosystem, one great heuristic I used for determining whether a project was stable was the number of commits in the GitHub repository. All else equal, it is usually the case that more commits indicate more active maintainers who fix bugs and keep things working as the language and dependencies change. They avoid bitrot, in other words. It was quite a surprise to learn that in the Haskell ecosystem, this heuristic works poorly. Relatively "famous" libraries often have under 100 commits, because that's simply all that's needed to get the code into a stable and complete state.
Take QuickCheck, on the other hand, has only ~1100 commits, was started in 1999, and has a Git history that goes back 16 years. It has taken an order of magnitude fewer commits to keep it stable and working, and it works dramatically better than its Python counterpart.
It is also notable that Haskell is extraordinarily fast compared to Python, as it compiles down to native machine code, and the compiler is state-of-the-art. It is able to hold its own against systems languages like Rust, Go, and C in benchmarks. You may note that tasks that take Haskell roughly 10 seconds require almost 10 minutes in Python. That's a ludicrous slowdown. You get programs that run massively slower and are also roughly an order of magnitude more likely to break at runtime.
If you trawl through the Anki forums, you will notice that one of the first complaints I got about Ki was that it was simply too slow. I did a lot of work on optimizing the Python version, and got quite far, but it would all be unnecessary in Haskell. This problem is really not computationally intensive enough to require heavy optimization in a compiled language in order for the user experience to be quick and snappy.
Additionally, there are things you can do in Haskell that are just impossible in Python. The type system is so strong that you can write something like Conjure and have it actually work. This is a library for function synthesis. You write a few example inputs and outputs of a function, and the library uses the type system to literally write the implementation for you. |
Having an interactive GitHub Pages front-end would come very close to appealing to even a more basic user, so that would make a great feature, methinks. I should probably get a bit more familiar with the program as it is and look into the issues one by one before I ask any more questions that have already been answered somewhere. The problem of a slowdown is obviously the big letdown in Python, and also the reason why the Anki backend has moved away from it. I was curious why you were dead set on using Haskell instead of Rust, for instance. But with the direction you're taking the project in, it only makes sense. |
Don't worry about this too much, I'm happy to answer any questions you might have. Discussing existing issues and problems will probably serve to clarify them anyway.
Yes, after writing collection ops in pure Python, I have a greater appreciation for just how fast the Rust backend is. However, I'm not sure I'd say the slowdown is the biggest letdown. Correctness guarantees, IMO, are more important. If Haskell were slow and Python were fast, I'd still strongly consider switching. |
The first pseudo-correct output from
(base) user@computer:~/proving-grounds$ cd ~/pkgs/ki && stack install && cd - && rm -rf dd && Ki-exe multifield/collection.anki2 dd && tree -a -I .git --filelimit 100 dd
Copying from /home/user/pkgs/ki/.stack-work/install/x86_64-linux/e79dab73246224887016704ede3d4c53affc0117456ea68c975b50a5c094bcee/8.10.7/bin/Ki-exe to /home/user/.local/bin/Ki-exe
Copied executables to /home/user/.local/bin:
- Ki-exe
/home/user/proving-grounds
Cloning media from Anki media directory '/home/user/proving-grounds/multifield/collection.media/'...
parts: ["Default"]
Committing contents to repository...
Done!
dd
├── Default
│   └── abcd
├── .gitignore
├── .gitmodules
├── .ki
│   ├── config
│   └── hashes
├── _media
└── _models
    ├── Mid 1673577708734.yaml
    ├── Mid 1673577708735.yaml
    ├── Mid 1673577708736.yaml
    ├── Mid 1673577708737.yaml
    ├── Mid 1673577708738.yaml
    ├── Mid 1673577710620.yaml
    ├── Mid 1673577743038.yaml
    └── Mid 1673577758568.yaml
4 directories, 13 files
The things that are not yet quite right:
|
Even closer-to-correct
(base) user@computer:~/proving-grounds$ tree -I .git -a --filelimit 25 bb
bb
├── [1] Main Course
│   ├── [a] Option 1: Parisian French Audio
│   │   ├── I) French to English (Start here)
│   │   │   └── 5000 Most Common French Words [500 entries exceeds filelimit, not opening dir]
│   │   └── II) English to French
│   │       └── 5000 Most Common French Words [500 entries exceeds filelimit, not opening dir]
│   └── [b] Option 2: Canadian French Audio
│       ├── I) French to English
│       │   └── 5000 Most Common French Words [500 entries exceeds filelimit, not opening dir]
│       └── II) English to French
│           └── 5000 Most Common French Words [500 entries exceeds filelimit, not opening dir]
├── [A. 1] Irregular Verbs Training
│   ├── [a] Option 1: Most Frequent Conjs. Come First
│   │   └── 5000 Most Common French Words [47 entries exceeds filelimit, not opening dir]
│   ├── [a] Option 2: One Verb at a Time
│   │   └── 5000 Most Common French Words [47 entries exceeds filelimit, not opening dir]
│   ├── [b] Past Participle
│   │   └── 5000 Most Common French Words [46 entries exceeds filelimit, not opening dir]
│   ├── [c] Present Participle
│   │   └── Gerund
│   │       └── 5000 Most Common French Words [47 entries exceeds filelimit, not opening dir]
│   ├── [d] Imperative
│   │   └── 5000 Most Common French Words [47 entries exceeds filelimit, not opening dir]
│   └── [e] Literary
│       └── Poetic Verb Tenses
│           ├── Past Historic
│           │   └── 5000 Most Common French Words [47 entries exceeds filelimit, not opening dir]
│           └── Subjunctive Imperfect
│               └── 5000 Most Common French Words [47 entries exceeds filelimit, not opening dir]
├── [A. 2] The Study of Sounds (Phonology)
│   ├── I) Basic IPA & Phonetics
│   │   ├── [1] IPA Letters
│   │   │   └── 5000 Most Common French Words
│   │   │       ├── baie.md
│   │   │       ├── boue.md
│   │   │       ├── chou.md
│   │   │       ├── clé.md
│   │   │       ├── cou.md
│   │   │       ├── doux.md
│   │   │       ├── fou.md
│   │   │       ├── gnouf.md
│   │   │       ├── goût.md
│   │   │       ├── jeune.md
│   │   │       ├── jeûne.md
│   │   │       ├── joue.md
│   │   │       ├── là.md
│   │   │       ├── loup.md
│   │   │       ├── mou.md
│   │   │       ├── nous.md
│   │   │       ├── peau.md
│   │   │       ├── pou.md
│   │   │       ├── roue.md
│   │   │       ├── si.md
│   │   │       ├── sort.md
│   │   │       ├── sous.md
│   │   │       ├── tout.md
│   │   │       ├── vous.md
│   │   │       └── zou.md
│   │   ├── [2a] Comparing French & English phonemes
│   │   │   └── 5000 Most Common French Words
│   │   │       ├── an-unreleased-stop-is-a-hard-consonant-that-does-not-relea.md
│   │   │       ├── compared-to-the-english-a-the-french-a-is-spoke.md
│   │   │       ├── compared-to-the-english-e-the-french-e-is-spoke.md
│   │   │       ├── compared-to-the-english-ɛ-the-french-ɛ-is-spoke.md
│   │   │       ├── compared-to-the-english-i-the-french-i-is-spoke.md
│   │   │       ├── compared-to-the-english-o-the-french-o-is-spoke.md
│   │   │       ├── compared-to-the-english-ɔ-the-french-ɔ-is-spoke.md
│   │   │       ├── compared-to-the-english-u-the-french-u-is-spoke.md
│   │   │       ├── how-is-the-french-b-d-g-different-from-english-b-d-g.md
│   │   │       ├── how-is-the-french-p-t-k-different-from-english-p-t-k.md
│   │   │       ├── place-of-articulation-is-where-a-consonant-is-formed-in-the.md
│   │   │       └── to-illustrate-the-difference-between-aspirated-and-unaspirat.md
│   │   ├── [2b] An illustration of French & English vowels
│   │   │   └── 5000 Most Common French Words
│   │   │       └── vowels-uncompressed02png.md
│   │   ├── [2c] Audio comparison of English & French Vowels
│   │   │   └── 5000 Most Common French Words
│   │   │       ├── comparing-french-n-and-english-nfrench-vowels-in-sequenc.md
│   │   │       ├── comparing-french-s-and-english-sin-french-s-the-tip-o.md
│   │   │       ├── comparing-french-z-and-english-zin-french-z-the-tip-o.md
│   │   │       ├── french-b-p-and-english-b-p-notice-how-french-release.md
│   │   │       ├── french-t-d-and-english-t-dnotice-how-french-releases.md
│   │   │       └── using-real-and-made-up-words-this-card-will-compare-french.md
│   │   └── [3] Notes on the IPA used in this deck
│   │       └── 5000 Most Common French Words
│   │           └── about-the-ipa-in-this-deck-let-a-be-any-vowel-let-l-be.md
│   ├── III) Aspirated h
│   │   ├── [1] Intro
│   │   │   └── 5000 Most Common French Words
│   │   │       └── this-deck-contains-words-that-are-aspirated-h-and-words-t.md
│   │   └── [2] Practice
│   │       └── 5000 Most Common French Words [59 entries exceeds filelimit, not opening dir]
│   ├── II) Words With Irregular Pronunciation
│   │   ├── [1] Intro
│   │   │   └── 5000 Most Common French Words
│   │   │       └── in-a-way-or-another-the-words-of-this-deck-have-irregular-p.md
│   │   ├── [2a] Sorted by frequency
│   │   │   └── 5000 Most Common French Words [96 entries exceeds filelimit, not opening dir]
│   │   └── [2b] Sorted by group
│   │       └── 5000 Most Common French Words [96 entries exceeds filelimit, not opening dir]
│   ├── IV) Museum of Sounds
│   │   ├── [1] Intro
│   │   │   └── 5000 Most Common French Words
│   │   │       └── the-purpose-of-this-deck-is-to-showcase-the-sounds-of-the-fr.md
│   │   └── [2] Showcase (Formal Accent)
│   │       ├── a. Parisian French
│   │       │   └── 5000 Most Common French Words [39 entries exceeds filelimit, not opening dir]
│   │       └── b. Canadian French
│   │           └── 5000 Most Common French Words [39 entries exceeds filelimit, not opening dir]
│   └── V) The Skill of Listening
│       ├── [1] Intro
│       │   └── 5000 Most Common French Words
│       │       └── this-deck-has-the-purpose-of-training-your-listeningthere-a.md
│       ├── [2] Formal Accent
│       │   ├── a. Parisian
│       │   │   └── 5000 Most Common French Words [1000 entries exceeds filelimit, not opening dir]
│       │   └── b. Canadian
│       │       └── 5000 Most Common French Words [1000 entries exceeds filelimit, not opening dir]
│       └── [3] Informal Accent
│           └── a. Mixed Accents (98% Parisian)
│               ├── ii. Longer Sentences
│               │   └── 5000 Most Common French Words [1323 entries exceeds filelimit, not opening dir]
│               └── i. Short Sentences
│                   └── 5000 Most Common French Words [1492 entries exceeds filelimit, not opening dir]
├── [A. 3] Read & Speak Training (Where the Real Learning Happens)
│   ├── I) Basic
│   │   ├── [0] Note about this deck
│   │   │   └── 5000 Most Common French Words
│   │   │       └── 1-all-sentences-in-this-deck-are-present-in-the-main-deck.md
│   │   ├── [1] Read
│   │   │   └── 5000 Most Common French Words [2305 entries exceeds filelimit, not opening dir]
│   │   └── [2] Speak
│   │       ├── a. Short sentences
│   │       │   └── 5000 Most Common French Words [1000 entries exceeds filelimit, not opening dir]
│   │       ├── b. Medium-sized sentences
│   │       │   └── 5000 Most Common French Words [676 entries exceeds filelimit, not opening dir]
│   │       └── c. Long sentences
│   │           └── 5000 Most Common French Words [629 entries exceeds filelimit, not opening dir]
│   └── II) Intermediate
│       ├── [0] Note about this deck
│       │   └── 5000 Most Common French Words
│       │       └── 1-i-took-the-wiktionary-articles-for-each-word-in-the-5000.md
│       ├── [1] Read
│       │   └── 5000 Most Common French Words [1154 entries exceeds filelimit, not opening dir]
│       └── [2] Speak
│           └── 5000 Most Common French Words [1154 entries exceeds filelimit, not opening dir]
├── [B] Download this deck in e-book form
│   └── 5000 Most Common French Words
│       └── the-main-course-of-this-deck-5000-most-common-french-words.md
├── .gitignore
├── .gitmodules
├── .ki
│   ├── config
│   └── hashes
├── _media [6755 entries exceeds filelimit, not opening dir]
└── _models
    ├── 01 BASIC.yaml
    ├── 5000 French Words 2.0 (E to F) C.yaml
    ├── 5000 French Words 2.0 (E to F).yaml
    ├── 5000 French Words 2.0 (F to E) C.yaml
    ├── 5000 French Words 2.0 (F to E).yaml
    ├── Basic (and reversed card).yaml
    ├── Basic (optional reversed card).yaml
    ├── Basic (type in the answer).yaml
    ├── Basic.yaml
    ├── Cloze (overlapping).yaml
    ├── Cloze.yaml
    ├── French aspirated h.yaml
    ├── French Ear Training.yaml
    ├── French IPA deck.yaml
    ├── French irregular pronunciation.yaml
    ├── French phonology.yaml
    ├── French sentences Read Training.yaml
    ├── French sentences Speak Training.yaml
    ├── French Verbs.yaml
    ├── French vowels comparison.yaml
    └── Intro card.yaml
100 directories, 77 files |
See #39 for schemas. |
Considering a rewrite, possibly in another language.
As such, I'll be pausing development on the sql-parser branch temporarily, reserving the right to walk back on this decision, and starting on a new haskell branch. The structure of the repository will change dramatically. There will be a single (1) source file called Main.hs containing all type definitions, all ORM stuff, all helper functions, and all business logic. There will be a single (1) test file called Test.hs containing QuickCheck tests of everything in Main.hs. Both of these files will live at the top level of the repository, and I see really no need for there to be any subdirectories at all, perhaps barring generated API documentation, if I can't figure out how to build that automatically.
In order to handle future distribution in the form of an addon, I will simply build binaries for all relevant architectures, and include all of them in the addon package. Barring this, people will have to build themselves with stack, but I will endeavor to ensure that is never necessary. There will be a micro-wrapper Python program that only exists to detect the platform architecture and call the relevant Haskell executable.
It is very possible to build portable binaries for macOS, Linux, and Windows via stack in CI, and indeed there have been lengthy discussions on the topic.
The work on sql-parser will not have gone to waste, because we have proven that we don't need the anki Python package for nearly as much as I thought. In particular, the clone and pull commands can absolutely be written using only sqlite-simple and a custom parser analogous to the one we wrote in Lark to parse the output of sqldiff. I think it may even be possible to do the entirety of the push operation without relying on calling functions from anki. We'll be reimplementing a tiny portion of the Rust backend, but I believe it will be fairly simple.
Starting from scratch, I'll follow very closely the model given in Algebra-Driven Design by Sandy Maguire, and define the algebra entirely on paper before writing any code. I also plan to make heavy use of QuickSpec to check that things work out equationally as I expect.
I have not yet decided whether to work entirely in a combination of IO and Either, to handle warnings and errors, or to simply try out Polysemy. Probably the former to start with, because it is simpler.
To be continued...