Kamscan implements a camera-based scan station for documents. It does so as a command line tool, making the process very convenient and efficient.
It captures pictures with a photo camera and converts them to a PDF document. Additionally, it leverages OCR software to embed the text as text in the PDF, making the PDF searchable and indexable.
You need
- Unix-like OS
- Python 3.7
- Lensfun
- GCC
- Tesseract 4
- pdftk
- ImageMagick 6.8
- Argyll CMS (in particular, cctiff)
- dcraw
- Kivy
- pytz
- click (Python package)
- ruamel.yaml
- Clone this repo.
make undistort
Note that you cannot simply make a link in your PATH
pointing to
kamscan.py
. Instead, you have to write a trivial shell script named
kamscan
that calls it explicitly, like:
#!/bin/sh PYTHONPATH=~/src python -m kamscan.kamscan "$@"
(In my case, kamscan installed in ~/src/kamscan
.) This is because
kamscan.py
uses relative imports, and it looks for its utilities in the
directory it was called in.
Then, you can call kamscan --help
as a starting point.
The configuration file must be placed at
~/.config/kamscan/configuration.yaml
. It may look like this:
sources: sony_nex_7: make: Sony model: NEX-7 icc_profile: path: /home/bronger/.config/darktable/color/in/nex7_matrix.icc color_space: Lab lenses: ["1", "2"] reuse_dir_prefix: /tmp/kamscan_nex_7 sony_a6000: make: Sony model: A6000 icc_profile: path: /home/bronger/.config/darktable/color/in/nex7_matrix.icc color_space: Lab lenses: ["1", "2"] reuse_dir_prefix: /tmp/kamscan_a6000 default_source: sony_a6000 lenses: "1": make: Sony model: E 50mm f/1.8 OSS (kamscan) "2": make: Sony model: E 50mm f/1.8 OSS
This configuration file contains two possible camera sources that you can
select with the --source
argument. The Sony α6000 is the default.
Furthermore, two lenses are selectable. “Model” and “make” refer to the
respective values in the Lensfun database.
The camera setup must be in a way that you take pictures similar to this:
,--image border----------. | | | ,-paper border----. | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | x-----------------´ | | | `-----------------------´
The code that feeds the images into the core processing units rotates these images by 90° clockwise. This way, any text printed on the pages is readable without turning your head.
The setup need not be perfect. Any inhomogeneous illumination or tilting of the paper will be corrected.
,--image border----------. | | | ,-paper border----. | | | | | | | | ----- | ----- | | | | ----- | ----- | | | | ----- | ----- | | | | | | | | x--------+--------´ | | | `------------------------´
Some hints going beyond the output of kamscan.py --help
:
Profiles are stored only for one day. After that, Kamscan assumes that they are inaccurate.
Switching between different page sizes, colour modes, or two-side mode does not need different profiles. Instead, create a new profile for a different setup, e.g.:
- different illumination
- a glass panel on top of the sheets
- sheets in bigger distance because they are too large for the small one
Of course, if you return to the former profile, you must be able to reproduce its setup accurately.
Bear in mind that the height of the sheet used for creating the profile is assumed to be A4 by default. Use
--full-height
with other heights.The alignment point for everything you scan is always the top left corner (in reading orientation) of the rectangle used for calibration, marked with “
x
” in the above diagrams. Accordingly, the two edges of the calibration rectangle originating in this point are the alignment lines for everything you scan.
Any bent pages are a problem. They should be as flat as possible because no correction of any curvature takes place.
Place a new script in sources/
. The name of the script (without the
.py
) must match the name of the key in the configuration file under
“sources”.
The API that the class must fulfil is simple:
- name it “Source”
- accept the arguments “configuration” and “params” in the constructor
- define the methods “images” and “raw_to_pnm”
“configuration” is a nested dictionary with the part of the configuration file that belongs to the source.
“params” is the value that was passed with the --params
argument on the
command line. If no such argument was given, it is None
. If it was only a
single value, it is that value. If it was a comma-separated list of key=value
pairs, it is a dictionary with those pairs.
This iterator yields the raw images from the camera in the order that they were taken. It yields a tuple of image index (starting with 0), whether it is the last page, and the path to the image. There must be at least one image in the iterator.
This method converts a camera raw file into a PNM. It does so as raw as possible, i.e. no corrections. In particular, the colour space must be linear. It takes the following parameters:
path
- The path to the raw file. The output file must have the same path but with
the extension
.ppm
(colour) or.pgm
(grey). for_preview
- (Default:
False
.) IfTrue
, generate a pnm that looks decently instead of being raw. In particular, it should be white-balanced and gamma-corrected. gray
- (Default:
False
.) Whether a PGM should be produced. b
- Exposure correction; all intensities are multiplied by this value. Default is no exposure correction.
asynchronous
- (Default:
False
.) IfTrue
, the external process that does the conversion is called asynchronously.
The return type depends on the last parameter asynchronous
. If it is
False
, the path to the PNM path is returned. Otherwise, a tuple is
returned with the output path and the external process (of the type
subprocess.Popen
).