`hledger-flow` is a command-line program that gives you a guided Hledger workflow. It is important to note that most of the heavy lifting is done by the upstream `hledger` project. For example, `hledger-flow` cares about where you put your files for long-term maintainability, but the actual conversion to classified accounting journals is done by `hledger`.
`hledger-flow` focuses on automated processing of electronic statements as much as possible, as opposed to manually adding your own hledger journal entries. Manual entries are still possible, but it saves time in the long run to automatically process a statement whenever one is available.
Within `hledger-flow` you will keep your original bank statements around permanently as input, and generate classified Hledger journals each time you run the program. The classification is done with hledger’s rules files, and/or your own script hooks.
Keeping the original statements means that you never have to worry too much about “am I doing this accounting thing right?” or “what happens if I make a mistake?”. If you want to change your mind about some classification, or if you made a mistake, you just change your classification rules, and run the program again.
It started when I realized that the scripts I wrote while playing around with adept’s Full-fledged Hledger aren’t really specific to my own finances, and can be shared.
- Save an input transaction file (typically CSV) to a specific directory.
- Add an hledger rules file. Include some classification rules if you want.
- Run `hledger-flow import`
Add all your files to your favourite version control system.
The generated journal that you most likely want to use as your `LEDGER_FILE` is called `all-years.journal`. This has include directives to all the automatically imported journals, as well as includes for your own manually managed journal entries.
In a typical software project we don’t add generated files to version control, but in this case I think it is a good idea to add all the generated files to version control as well - when you inevitably change something, e.g. how you classify transactions in your rules file, then you can easily see if your change had the desired effect by looking at a diff.
`hledger-flow` is intended for you if:
- You are interested in getting started with hledger or ledger-cli and you wouldn’t mind pointers to the right docs along the way.
- You want a way to organise your finances into a structure that will be maintainable over the long term.
- You want to automate as much as possible when dealing with your financial life.
- You don’t mind writing some scripts when needed, as long as it saves you time over the long term.
- You want the ability to model your entire financial life in one tool, as opposed to just the parts that some online tool currently supports.
- You appreciate the fact that all your financial information stays within your control.
The rest of this file documents how to use `hledger-flow`, and it has probably outgrown what should be in a README file.
If you can spare some time to contribute to this project, please consider converting these docs to something more suitable, such as Read the Docs.
The easiest way to get it running is to download the latest release for your OS (Linux or Mac OS X), and copy the `hledger-flow` executable to a directory in your `PATH`.
Then just run it and see what it tells you to do.
You can also compile it yourself by following the build instructions.
Currently `hledger-flow` does not work on Windows.
This list of issues describes some of the details of what doesn’t work.
I believe it wouldn’t take too much effort to fix those issues, but I’m going to leave Windows support for other contributors.
Please send me some pull requests if you would like `hledger-flow` to work on Windows.
We’re not close to a 1.0 release yet, which means that we can still make changes if needed.
As an example, the command-line switches we use will probably change over time. Some switches change the behaviour of the program, and the default behaviour will probably change between releases. The names of these command-line options can change, or the options can be removed when they are no longer needed.
That being said, some parts have been used and tested extensively and are likely to remain stable. Have a look at the “Stability of this Feature” sections in the feature reference below.
I add future work, ideas and thoughts as GitHub issues and in TODO.org, so have a look there for more clues as to what is likely to change.
Let me know if you can think of some improvements.
Have a look at the detailed step-by-step instructions.
You can see the example imported financial transactions, as generated by the step-by-step instructions, here:
https://github.com/apauley/hledger-flow-example
Your input files will probably be CSV files with a line for each transaction, although other file types will work fine if you use a `preprocess` or a `construct` script that can read them. These scripts are explained later.
We mostly use conventions based on a predefined directory structure for your input statements.
For example, assuming you have a `savings` account at `mybank`, you’ll put your first CSV statement here: `import/john/mybank/savings/1-in/2018/123456789_2018-06-30.csv`.
Some people may want to include accounts belonging to their spouse as part of the household finances: `import/spouse/otherbank/checking/1-in/2018/987654321_2018-06-30.csv`.
All files and directories under the `import` directory are related to the automatic importing and classification of transactions.
The directory directly under `import` is meant to indicate the owner or custodian of the accounts below it. It mostly has an impact on reporting. You may want to have separate reports for `import/mycompany` and `import/personal`.
Below the directory for the owner we can indicate where an account is held. For a bank account you may choose to name it `import/john/mybank`.
If your underground bunker filled with gold has CSV statements linked to it, then you can absolutely create `import/john/secret-treasure-room`.
Under the directory for the financial institution, you’ll have a directory for each account at that institution, e.g. `import/mycompany/bigbankinc/customer-deposits` and `import/mycompany/bigbankinc/expense-account`.
Next you’ll create a directory named `1-in`. This is to distinguish it from `2-preprocessed` and `3-journal` which will be auto-generated later.
Under `1-in` you’ll create a directory for the year, e.g. `2018`, and within that you can copy the statements for that year: `import/john/mybank/savings/1-in/2018/123456789_2018-06-30.csv`.
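
To make the convention concrete, here is a minimal Python sketch that splits an input statement path into its owner, bank, account and year parts. It is only an illustration of the directory layout described above, not code from hledger-flow; the names `StatementPath` and `parse_statement_path` are made up for this example.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class StatementPath(NamedTuple):
    """The parts of import/<owner>/<bank>/<account>/1-in/<year>/<file>."""
    owner: str
    bank: str
    account: str
    year: str
    filename: str


def parse_statement_path(path: str) -> StatementPath:
    """Split an input statement path into its conventional parts.

    Hypothetical helper: raises ValueError if the path does not follow
    the import/<owner>/<bank>/<account>/1-in/<year>/<file> layout.
    """
    parts = PurePosixPath(path).parts
    if len(parts) != 7 or parts[0] != "import" or parts[4] != "1-in":
        raise ValueError(f"not an input statement path: {path}")
    _, owner, bank, account, _, year, filename = parts
    return StatementPath(owner, bank, account, year, filename)


example = parse_statement_path(
    "import/john/mybank/savings/1-in/2018/123456789_2018-06-30.csv"
)
print(example.owner, example.bank, example.account, example.year)
```
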
The basic owner/bank/account/year structure has been used and tested fairly extensively, and I don’t expect a need for it to change.
I’m open to suggestions for improvement though.
If your input file is in CSV format, or converted to CSV by your `preprocess` script, then you’ll need an hledger rules file.
`hledger-flow` will try to find a rules file for each statement in a few places. The same rules file is typically used for all statements of a specific account, or even for all accounts of the same specific bank.
- A global rules file for any `mybank` statement can be saved here: `import/mybank.rules`
- A rules file for all statements of a specific account: `import/spouse/bigbankinc/savings/bigbankinc-savings.rules`
What happens if some of the statements for an account have a different format than the others?
This can happen if you normally get your statements directly from your bank, but some statements you had to download from somewhere else, like Mint, because your bank is being daft with older statements.
In order to tell `hledger-flow` that you want to override the rules file for a specific statement, you need to add a suffix, separated by an underscore (`_`) and starting with the letters `rfo` (rules file override) to the filename of that statement.
For example, assuming you’ve named your statement `99966633_20171223_1844_rfo-mint.csv`, `hledger-flow` will look for a rules file named `rfo-mint.rules` in the following places:
- in the import directory, e.g. `import/rfo-mint.rules`
- in the bank directory, e.g. `import/john/mybank/rfo-mint.rules`
- in the account directory, e.g. `import/john/mybank/savings/rfo-mint.rules`
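
The override naming rule can be sketched in Python as follows. This is a rough model of the convention described above, not hledger-flow's own lookup code: the function name `override_rules_candidates` is hypothetical, the `2017` year directory in the example is assumed, and the order of the returned list simply mirrors the list above rather than any documented precedence.

```python
from pathlib import PurePosixPath


def override_rules_candidates(statement_path: str) -> list[str]:
    """List the places where an rfo rules file may live for one statement.

    Returns an empty list when the file name carries no rfo suffix.
    """
    path = PurePosixPath(statement_path)
    # The suffix is whatever follows the last underscore in the file name.
    suffix = path.stem.rsplit("_", 1)[-1]
    if not suffix.startswith("rfo"):
        return []
    import_dir, owner, bank, account = path.parts[:4]
    rules_name = suffix + ".rules"
    return [
        str(PurePosixPath(import_dir, rules_name)),
        str(PurePosixPath(import_dir, owner, bank, rules_name)),
        str(PurePosixPath(import_dir, owner, bank, account, rules_name)),
    ]


for candidate in override_rules_candidates(
    "import/john/mybank/savings/1-in/2017/99966633_20171223_1844_rfo-mint.csv"
):
    print(candidate)
```
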
A common scenario is multiple accounts that share the same file format, but have different `account1` directives.
One possible approach would be to include a shared rules file in your account-specific rules file.
If you are lucky enough that all statements at `mybank` share a common format across all accounts, then you can `include` a rules file that just defines the parts that are shared across accounts.
Two accounts at `mybank` may have rules files similar to these.
A checking account at mybank:

    # Saved as: import/john/mybank/checking/mybank-checking.rules
    include ../../../mybank-shared.rules
    account1 Assets:Current:John:MyBank:Checking

Another account at mybank:

    # Saved as: import/alice/mybank/savings/mybank-savings.rules
    include ../../../mybank-shared.rules
    account1 Assets:Current:Alice:MyBank:Savings

Where `import/mybank-shared.rules` may define some shared attributes:

    skip 1
    fields date, description, amount, balance
    date-format %Y-%m-%d
    currency $
Another possible approach could be to use your `preprocess` script to write out a CSV file that has extra fields for `account1` and `account2`.
You could then create the above mentioned global `import/mybank.rules` with the fields defined more or less like this:

    fields date, description, amount, balance, account1, account2
Rules files are a stable feature within hledger, and we’re just using the normal hledger rules files. The account, bank and statement-specific rules files have been used and tested fairly extensively, and I don’t expect this to change.
Let me know if you think it should change.
`hledger-flow` looks for a file named `YEAR-opening.journal` in each account directory, where `YEAR` corresponds to an actual year directory, e.g. 1983 (if you have electronic statements dating back to 1983). Example: `import/john/mybank/savings/1983-opening.journal`
If it exists the file will automatically be included at the beginning of the generated journal include file for that year.
You need to edit this file for each account to specify the opening balance at the date of the first available transaction.
An opening balance may look something like this:

    2018-06-01 Savings Account Opening Balance
        assets:Current:MyBank:Savings               $102.01
        equity:Opening Balances:MyBank:Savings
Closing your balances may cause some `hledger` queries to show zero values, or it may cause issues with balance assertions.
Please have a look at the upstream `hledger` documentation on closing balances, e.g. here:
https://hledger.org/hledger.html#close-usage
Some of the gotchas you may run into are also described in this hledger-flow issue.
Similar to opening balances, `hledger-flow` looks for an optional file named `YEAR-closing.journal` in each account directory. Example: `import/john/mybank/savings/1983-closing.journal`
If it exists the file will automatically be included at the end of the generated journal include file for that year.
A closing balance may look something like this:

    2018-06-01 Savings Account Closing Balance
        assets:Current:MyBank:Savings               $-234.56 = $0.00
        equity:Closing Balances:MyBank:Savings
As an example, assuming that the relevant year is `2019` and `hledger-flow` is about to generate `import/john/mybank/savings/2019-include.journal`, then one or both of the following files will be added to the include file if they exist:

    import/john/mybank/savings/2019-opening.journal
    import/john/mybank/savings/2019-closing.journal

The `opening.journal` will be included just before the other included entries, while the `closing.journal` will be included just after the other entries in that include file.
An include file may look like this:

    cat import/john/mybank/savings/2019-include.journal
    ### Generated by hledger-flow - DO NOT EDIT ###
    include 2019-opening.journal
    include 3-journal/2019/123456789_2019-01-30
    include 2019-closing.journal
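
The ordering described above (opening journal first, imported journals in between, closing journal last) can be modelled with a short Python sketch. This is only an illustration of the documented behaviour, not hledger-flow's implementation; `yearly_include` and its parameters are names invented for this example.

```python
def yearly_include(year, journals, existing_files):
    """Build the lines of <year>-include.journal for one account directory.

    journals: relative paths of the generated journals for that year.
    existing_files: names of optional files present in the account directory.
    """
    opening = f"{year}-opening.journal"
    closing = f"{year}-closing.journal"
    lines = ["### Generated by hledger-flow - DO NOT EDIT ###"]
    if opening in existing_files:
        lines.append(f"include {opening}")
    # Imported journals are listed in filename order.
    lines += [f"include {journal}" for journal in sorted(journals)]
    if closing in existing_files:
        lines.append(f"include {closing}")
    return lines


print("\n".join(yearly_include(
    "2019",
    ["3-journal/2019/123456789_2019-01-30"],
    {"2019-opening.journal", "2019-closing.journal"},
)))
```
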
Closing balances sometimes result in unexpected query results. In future we may change how/where the generated files include the closing journal.
We may also need to suggest some naming conventions for opening and closing balances so that reports can exclude some of these transactions.
It is also possible that we might want to change the name/location of the closing journal, but we’ll try to avoid this if possible, because that would require users to rename their existing files.
`hledger-flow` looks for price files to include in each yearly include file.
For example, the presence of a file named `${BASE}/prices/2020/prices.journal` will result in some extra include file magic.
The rest of this section assumes you’ll have a file named `prices/2020/prices.journal` which contains price data for the year 2020.
The `prices` directory should be right at the top of your `hledger-flow` base directory, next to the `import` directory.
`hledger-flow` does not care how the price files got there. It only cares that you have a separate file per year, and that each file follows the above naming convention.
Here is an example script which downloads prices and follows the naming convention: https://gist.github.com/apauley/398fa031c202733959af76b3b8ce8197
After running an import with available price files you’ll see that a line has been added to `import/2020-include.journal`:

    include ../prices/2020/prices.journal
Hledger allows you to specify some useful directives which affect things such as number formatting.
A convenient place to put these directives within `hledger-flow` is a file named `directives.journal` (in your hledger-flow base directory).
If it exists `hledger-flow` will include it within the `all-years.journal`:

    cat all-years.journal
    ### Generated by hledger-flow - DO NOT EDIT ###
    include directives.journal
    include import/all-years.journal
Sometimes the statements you get from your bank are less than suitable for automatic processing. Or maybe you just want to make it easier for the hledger rules file to do its thing by adding some useful columns.
If you put a script called `preprocess` in the account directory, e.g. `import/john/mybank/savings/preprocess`, then `hledger-flow` will call that script for each input statement.
The `preprocess` script will be called with 5 positional parameters:
- The path to the input statement, e.g. `import/john/mybank/savings/1-in/2018/123456789_2018-06-30.csv`
- The path to an output file that can be sent to `hledger`, e.g. `import/john/mybank/savings/2-preprocessed/2018/123456789_2018-06-30.csv`
- The name of the bank, e.g. `mybank`
- The name of the account, e.g. `savings`
- The name of the owner, e.g. `john`
Your `preprocess` script is expected to:
- read the input file
- write a new output file at the supplied path that works with your rules file
- be idempotent. Running `preprocess` multiple times on the same files will produce the same result.
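
As a rough idea of what such a script could look like, here is a minimal Python sketch of a `preprocess` script. It assumes the input is already a CSV file and simply appends the bank, account and owner as extra columns on every row (including any header row, which a `skip 1` rule would ignore). The column choice is an assumption for illustration; your own script would do whatever your statements need.

```python
#!/usr/bin/env python3
import csv
import sys


def preprocess(in_path, out_path, bank, account, owner):
    """Copy the statement to out_path, appending bank, account and owner columns.

    The output file is rewritten from scratch on every run, so running the
    script repeatedly on the same input gives the same result.
    """
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if row:  # skip blank lines
                writer.writerow(row + [bank, account, owner])


# Only run when called with the five positional parameters listed above.
if __name__ == "__main__" and len(sys.argv) == 6:
    preprocess(*sys.argv[1:6])
```

Saved as an executable file named `preprocess` in the account directory, a script along these lines would receive the input path, output path, bank, account and owner in that order.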
Stable and tested.
If you need even more power and flexibility than what you can get from the `preprocess` script and `hledger`’s CSV import functionality, then you can create your own custom script to `construct` transactions exactly as you need them.
At the expense of more construction work for you, of course.
The `construct` script can be used in addition to the `preprocess` script, or on its own. But since the `construct` script is more powerful than the `preprocess` script, you could tell your `construct` script to do anything that the `preprocess` script would have done.
Save your `construct` script in the account directory, e.g. `import/john/mybank/savings/construct`.
`hledger-flow` will call your `construct` script with 5 positional parameters:
- The path to the input statement, e.g. `import/john/mybank/savings/1-in/2018/123456789_2018-06-30.csv`
- A “-” (indicating that output should be sent to `stdout`)
- The name of the bank, e.g. `mybank`
- The name of the account, e.g. `savings`
- The name of the owner, e.g. `john`
Your `construct` script is expected to:
- read the input file
- generate your own `hledger` journal transactions
- be idempotent. Running `construct` multiple times on the same files should produce the same result.
- send all journals to `stdout`. `hledger-flow` will pipe your standard output into `hledger` which will format it and save it to an output file.

You can still use `stderr` in your construct script for any other output that you may want to see.
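
A `construct` script could be sketched in Python along these lines. The input layout (three CSV columns: date, description, amount) and the account names `Assets:Current:...` and `Expenses:Unclassified` are assumptions made up for this example, so adapt them to your own statements and chart of accounts.

```python
#!/usr/bin/env python3
import csv
import sys


def construct(in_path, bank, account, owner, out=sys.stdout):
    """Write one hledger transaction per CSV row (date, description, amount)."""
    asset = f"Assets:Current:{owner}:{bank}:{account}"
    with open(in_path, newline="") as src:
        for date, description, amount in csv.reader(src):
            out.write(f"{date} {description}\n")
            # Two spaces separate the account name from the amount.
            out.write(f"    {asset}  {amount}\n")
            out.write("    Expenses:Unclassified\n\n")


# Only run when called with the five positional parameters listed above.
if __name__ == "__main__" and len(sys.argv) == 6:
    in_path, _dash, bank, account, owner = sys.argv[1:6]
    # Diagnostics go to stderr so that stdout contains only journal entries.
    print(f"constructing journal from {in_path}", file=sys.stderr)
    construct(in_path, bank, account, owner)
```
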
Stable and tested.
Not every transaction in your life comes with CSV statements.
Sometimes you just need to add a transaction for that time you loaned a friend some money.
`hledger-flow` looks for `pre-import` and `post-import` files related to each generated include file as part of the import.
You can enter your own transactions manually into these files.
You can run `hledger-flow import --verbose` to see exactly which files are being looked for.
As an example, assuming that the relevant year is `2019` and `hledger-flow` is about to generate `import/john/2019-include.journal`, then one or both of the following files will be added to the include file if they exist:

    import/john/_manual_/2019/pre-import.journal
    import/john/_manual_/2019/post-import.journal

The `pre-import.journal` will be included just before the other included entries, while the `post-import.journal` will be included just after the other entries in that include file.
An include file may look like this:

    cat import/john/2019-include.journal
    ### Generated by hledger-flow - DO NOT EDIT ###
    include _manual_/2019/pre-import.journal
    include mybank/2019-include.journal
    include otherbank/2019-include.journal
    include _manual_/2019/post-import.journal
It works, but the naming of `_manual_` looks a bit weird. Should it be changed?
The following example was contributed by Amitai Burstein:

    # .github/workflows/hledger-flow.yml
    name: Validate hledger-flow
    on: [push]
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
        - uses: actions/checkout@v1
        - name: Install hledger
          run: docker pull dastapov/hledger
        - name: Install hledger-flow
          run: curl -L https://github.com/apauley/hledger-flow/releases/download/v0.12.4.0/hledger-flow_Linux_x86_64_v0.12.4.0_4b9b027.tar.gz | tar xvz && mv hledger-flow_Linux_x86_64_v0.12.4.0_4b9b027/hledger-flow .
        - name: Grant permissions to create files
          run: chmod 777 -R ./my-finances
        - name: Test hledger file
          run: docker run --name="ledger" -v $(pwd):/data dastapov/hledger ./hledger-flow import ./my-finances
When writing out the journal include files, `hledger-flow` sorts the include statements by filename.
Ledger fails any balance assertions when the transactions aren’t included in chronological order.
An easy way around this is to name your input files so that March’s statement is listed before December’s statement.
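
To see why the file name matters, the Python sketch below compares two naming schemes under plain alphabetical sorting. The file names are hypothetical, and the sketch assumes that sorting by filename behaves like ordinary string ordering.

```python
statements = [
    "123456789_2018-12-31.csv",
    "123456789_2018-03-31.csv",
    "123456789_2018-06-30.csv",
]

# ISO-style dates (YYYY-MM-DD) sort alphabetically in chronological order.
iso_sorted = sorted(statements)

# Month names do not: "dec" sorts before "mar".
month_sorted = sorted(["statement-mar-2018.csv", "statement-dec-2018.csv"])

print(iso_sorted)
print(month_sorted)
```
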
Another option is to add `--permissive` to any ledger command.
So you should easily be able to use both `ledger` and `hledger` on these journals if you take care to avoid the few incompatibilities which exist (e.g. in your rules files or manual journals).
My `hledger` files started to collect a bunch of supporting code that wasn’t really specific to my financial situation.
I want to extract and share as much as possible of that supporting code.
Adept’s goals also resonated with me:
- Tracking expenses should take as little time, effort and manual work as possible
- Eventual consistency should be achievable: even if I can’t record something precisely right now, maybe I would be able to do it later, so I should be able to leave things half-done and pick them up later
- Ability to refactor is a must. I want to be able to go back and change the way I am doing things, with as little effort as possible and without fear of irrevocably breaking things.
I’ve given a talk at Lambda Luminaries Johannesburg featuring hledger and hledger-flow.
Have a look at the contribution guidelines.
In your primary bank account you’ve happily been classifying transfers to a secondary account as just `Expenses:OtherAccount`.
But you’ve recently started processing the statements from the second account as well so that you can classify those expenses more accurately.
And now the balances of these two accounts are all wrong whenever the statements of each account deal with money transferred between these two accounts.
In `bank1.journal`, imported from `bank1.csv`:

    2018/11/09 Transfer from primary account to secondary account
        Assets:Bank1:Primary                  $-200
        Assets:Bank2:Secondary

In `bank2.journal`, imported from `bank2.csv`:

    2018/11/09 Transfer from primary account to secondary account
        Assets:Bank2:Secondary                $200
        Assets:Bank1:Primary
As soon as you start importing statements for both accounts you will have to introduce an intermediate account for classification between these two accounts.
I use `Assets:Transfers:*`.
We may at some point have reports that look at these transfer accounts, so you should consider using the same names.
The above example then becomes as follows.
In `bank1.journal`, imported from `bank1.csv`:

    2019-05-18 Transfer from primary account to secondary account
        Assets:Bank1:Primary                  $-200
        Assets:Transfers:Bank1Bank2

In `bank2.journal`, imported from `bank2.csv`:

    2019-05-18 Transfer from primary account to secondary account
        Assets:Bank2:Secondary                $200
        Assets:Transfers:Bank1Bank2
Any posting to `Assets:Transfers:*` indicates an “in-flight” amount.
You would expect the balance of `Assets:Transfers` to be zero most of the time.
Whenever it isn’t zero it means that you either don’t yet have the other side of the transfer, or that something is wrong in your rules.
You could theoretically just use `Assets:Transfers` without any subaccounts, but I found it useful to use subaccounts, because then the subaccounts can show me where I should look for any missing transfer transaction.
I typically use sorted names as the subaccount (Python code sample):

    "Assets:Transfers:" + "".join(sorted(["Bank2", "Bank1"]))
This approach is based on what is described in Full-fledged hledger: https://github.com/adept/full-fledged-hledger/wiki/Adding-more-accounts#lets-make-sure-that-transfers-are-not-double-counted
The question was first asked in issue #51.
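
The transfer-account idea above can be sketched a little further in Python: one helper names the in-flight account identically from either side of a transfer, and another reports transfer accounts that do not net to zero. Both function names are made up for this example, and the postings are given as plain (account, amount) pairs rather than read from a real journal.

```python
from collections import defaultdict
from decimal import Decimal


def transfer_account(bank_a, bank_b):
    """Name the in-flight account the same way from either side of a transfer."""
    return "Assets:Transfers:" + "".join(sorted([bank_a, bank_b]))


def unmatched_transfers(postings):
    """Return the transfer accounts whose postings do not sum to zero."""
    balances = defaultdict(Decimal)
    for account, amount in postings:
        balances[account] += Decimal(amount)
    return {acct: bal for acct, bal in balances.items() if bal != 0}


in_flight = transfer_account("Bank2", "Bank1")

# Both statements imported: the two postings cancel out.
print(unmatched_transfers([(in_flight, "200"), (in_flight, "-200")]))

# Only one side imported so far: the amount is still "in flight".
print(unmatched_transfers([(in_flight, "200")]))
```
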
Full-fledged Hledger is a brilliant system, and hledger-flow continues to learn much from it.
It has great documentation that does an excellent job of not only showing how things can be done, but also why it is such a great idea.
hledger-flow can be seen as a specific implementation of the Full-fledged Hledger system, with a few implementation details that are different.
Full-fledged Hledger | Hledger Flow |
---|---|
FFH is a tutorial with helper scripts that you can start using and adapt to your needs. | I started with FFH, and changed bits and pieces over time to suit my needs. The “owner/bank/account” structure for example. |
FFH is more open-ended: you can start with the basic scripts and over time turn it into something that solves your needs exactly. But you’ll also end up with more code that you need to maintain yourself. | Hledger Flow is more opinionated and less open-ended. For example, you have to adopt the “owner/bank/account” structure precisely as specified. But this allows Hledger Flow to do more work for you. |
FFH uses scripts and Haskell/Shake build files that you can easily modify as you go along, but this requires a Haskell runtime to be installed everywhere it needs to run. The included docker image helps to make it less of an issue. | Hledger Flow distributes a compiled binary. This means users or deployment targets don’t need extra dependencies installed, they can just run a CLI program. This also provides a clearer distinction between what is provided, and what users need to do. |