Commit 882f7d5 (1 parent: 13f82bf)
1 file changed: +31 -5
@@ -38,18 +38,44 @@ $ python3.11 -mpipenv install
$ python3.11 -mpipenv shell
```

- Adapt `env.sample` to your needs and copy it to `.env`.
+ For s3-based file processing, the following environment variables need to be set:
+
+ ```sh
+ SE_ACCESS_KEY=
+ SE_SECRET_KEY=
+ SE_HOST_URL=
+ ```
+
+ If your global environment does not contain these variables, you can set them in a local
+ `.env` file. The `python-dotenv` package is used to read these variables.
+
+ ```sh
+ cp env.sample .env
+ edit .env
+ ```

# Running the pipeline

- Adapt the local paths for the input and output directories according in the
- `config.local.mk` (see `config.local.mk.sample` for an example).
- and run the following command:
+ ## Local configuration
+
+ Adapt the local paths for the input and output directories in the
+ `config.local.mk` (see `config.local.mk.sample` for default settings.)

```sh
+ cp config.local.mk.sample config.local.mk
+ edit config.local.mk
+ ```
+
+ ## Running the pipeline
+
+ The build process is controlled by the `Makefile`.
+
+ ```sh
+ make help # show available targets
+
make newspaper -j N # process specific newspaper/year pairs in parallel typically for testing

- make each -j N # process all newspapers using parallel processing within newspaper/year pairs
+ make collection MAKE_PARALLEL_OPTION=16 # process all newspapers using parallel processing within newspaper/year pairs
```

## Command-Line Options for `spacy_linguistic_processing.py`
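
Note on the environment hunk above: the added README text names three S3 variables and says `python-dotenv` reads them from a local `.env` file. The following is a minimal sketch of a pre-flight check for those variables, not code from this repository. The names `REQUIRED_S3_VARS` and `missing_s3_vars` are hypothetical, and treating an empty value as missing is an assumption. The `python-dotenv` import is optional here so that the sketch also runs without the package.

```python
import os

# Variable names come from the README hunk; everything else is illustrative.
REQUIRED_S3_VARS = ("SE_ACCESS_KEY", "SE_SECRET_KEY", "SE_HOST_URL")

try:
    # python-dotenv is the package the README names; optional for this sketch.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None


def missing_s3_vars(env=None):
    """Return the required variable names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_S3_VARS if not env.get(name)]


if __name__ == "__main__":
    if load_dotenv is not None:
        # Loads a .env file if one is found; by default it does not
        # override variables already set in the environment.
        load_dotenv()
    missing = missing_s3_vars()
    if missing:
        print("Missing S3 settings:", ", ".join(missing))
    else:
        print("All S3 settings are present.")
```

Running a check like this before `make collection` would surface a missing `.env` entry early instead of partway through a long parallel build.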