OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.
Please read the wiki to learn more.
services:
  openrefine:
    image: easypi/openrefine:3.9.1
    ports:
      - "3333:3333"
    volumes:
      - ./data:/data
    environment:
      - REFINE_INTERFACE=0.0.0.0
      - REFINE_PORT=3333
      - REFINE_MIN_MEMORY=1024M
      - REFINE_MEMORY=1024M
      - REFINE_DATA_DIR=/data
      - REFINE_EXTRA_OPTS=refine.headless=true
    restart: unless-stopped
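
The compose file above publishes OpenRefine on port 3333. The following is a minimal sketch of a readiness check that polls that port until it accepts TCP connections. The function name `wait_for_port` and its defaults (host `localhost`, a 60-second timeout) are assumptions made for illustration, not part of the image or of OpenRefine.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 3333, timeout: float = 60.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it only checks that the port accepts connections,
    not that OpenRefine has finished loading its projects.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(0.2)
    return False
```

Calling `wait_for_port()` after `docker compose up -d` should return `True` once the container is listening; a `False` result suggests checking the container logs.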
To install an extension:

- Locate your workspace directory: ./data (the host folder mounted at /data in the container).
- Create a new folder called extensions inside the workspace if it does not exist.
- Download the extension (usually as a zip file from GitHub, e.g., openrefine-llm-extension).
- Extract the zip contents into the extensions directory, making sure all the contents go into one folder with the name of the extension.
- Start (or restart) OpenRefine.
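
The extraction steps above can be sketched as a small script. This is an illustration under stated assumptions, not an official installer: `install_extension` is a hypothetical helper, the folder name defaults to the zip's file name, and stripping a single top-level folder from the archive is an assumption about how GitHub-style zips are usually packaged.

```python
import zipfile
from pathlib import Path
from typing import Optional


def install_extension(zip_path: str, workspace: str = "./data",
                      name: Optional[str] = None) -> Path:
    """Extract an extension zip into <workspace>/extensions/<name>/.

    Hypothetical helper; `name` defaults to the zip file's stem.
    """
    archive = Path(zip_path)
    target = Path(workspace) / "extensions" / (name or archive.stem)
    # Creates the extensions folder too if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    root = target.resolve()

    with zipfile.ZipFile(archive) as zf:
        files = [n for n in zf.namelist() if not n.endswith("/")]
        # Assumption: if every file sits under one top-level folder
        # (e.g. "repo-main/"), drop that folder to avoid double nesting.
        tops = {n.split("/", 1)[0] for n in files}
        strip = len(tops) == 1 and all("/" in n for n in files)
        for member in files:
            rel = member.split("/", 1)[1] if strip else member
            dest = (target / rel).resolve()
            # Refuse entries that would escape the target folder.
            if root not in dest.parents:
                raise ValueError(f"unsafe path in archive: {member}")
            dest.parent.mkdir(parents=True, exist_ok=True)
            dest.write_bytes(zf.read(member))
    return target
```

For example, `install_extension("openrefine-llm-extension.zip")` would place the files under `./data/extensions/openrefine-llm-extension/`; the zip file name here is illustrative. OpenRefine still needs a restart afterwards to pick up the extension.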