This project contains tools and examples for parsing data dumps from WeRelate.org. The data dumps are available under a Creative Commons Attribution-ShareAlike license.
- Download this project. Make sure you have Java and Maven installed.
- Download the latest data dump from here.
- Look at the examples in src/main/java/org/folg/werelatedata/examples.
- Extend one of the examples or add your own.
- Build using Maven: `mvn install`
- Run using Maven: `mvn exec:java -Dexec.mainClass= -Dexec.args=""`
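
If you want a feel for what a parser does before opening the bundled examples, here is a minimal standalone sketch that uses only the JDK's StAX reader. It is not one of the classes in this project. It assumes the dump is a MediaWiki-style export in which each `<page>` element contains a `<title>` element; the class name `PageTitleLister` and the sample titles are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class, not part of this project.
public class PageTitleLister {

    // Streams through the input and collects the text of every <title>
    // that appears inside a <page>. Assumes a MediaWiki-style layout.
    public static List<String> readTitles(InputStream in) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // The dump is plain data, so DTDs and external entities are switched off.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        XMLStreamReader reader = factory.createXMLStreamReader(in);
        List<String> titles = new ArrayList<>();
        boolean inPage = false;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("page".equals(name)) {
                        inPage = true;
                    } else if (inPage && "title".equals(name)) {
                        titles.add(reader.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "page".equals(reader.getLocalName())) {
                    inPage = false;
                }
            }
        } finally {
            reader.close();
        }
        return titles;
    }

    // Convenience wrapper for trying the parser on a small in-memory string.
    public static List<String> readTitlesFromString(String xml) {
        try {
            return readTitles(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length > 0) {
            // Pass the path to an uncompressed dump file as the first argument.
            try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
                readTitles(in).forEach(System.out::println);
            }
        } else {
            // Invented sample data, only here so the sketch runs without a dump.
            String sample = "<mediawiki>"
                    + "<page><title>Person:John Doe (1)</title></page>"
                    + "<page><title>Place:Utah, United States</title></page>"
                    + "</mediawiki>";
            readTitlesFromString(sample).forEach(System.out::println);
        }
    }
}
```

Streaming with StAX keeps memory use flat, which matters because a full dump is usually too large to load as a DOM tree. The examples in `src/main/java/org/folg/werelatedata/examples` remain the better starting point for real work.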
I haven't documented the XML structure of each page. You can usually figure this out by doing a diff between revisions in the page history, but I know it's a pain. I'll document it eventually, but if there's a particular namespace that you'd like me to document sooner rather than later, please let me know.
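
Until that documentation exists, one way to explore an unfamiliar page is to tally which element and attribute names occur in it. The sketch below does that for a single XML fragment using the JDK's StAX reader. It assumes you have already extracted a well-formed XML fragment from a page; the class name `ElementNameSurvey` and the sample fragment in `main` are invented for illustration and do not describe the real WeRelate schema.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of this project.
public class ElementNameSurvey {

    // Counts how often each element name and each element/@attribute pair
    // occurs in the given fragment. Keys are sorted alphabetically.
    public static Map<String, Integer> countElements(String xml) {
        Map<String, Integer> counts = new TreeMap<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    String element = reader.getLocalName();
                    counts.merge(element, 1, Integer::sum);
                    for (int i = 0; i < reader.getAttributeCount(); i++) {
                        String key = element + "/@" + reader.getAttributeLocalName(i);
                        counts.merge(key, 1, Integer::sum);
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Fragment is not well-formed XML", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Invented fragment; real pages will have different element names.
        String fragment = "<record><item kind=\"a\"/><item kind=\"b\" note=\"x\"/></record>";
        countElements(fragment).forEach((name, n) -> System.out.println(name + " : " + n));
    }
}
```

Running this over a handful of pages from the same namespace gives a rough inventory of the names in use, which you can then compare against the revision diffs.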
If you want to write a bot to update pages on WeRelate, take a look at AddFhlcIdToPlaces or DeletePages for examples.
Once you've written your script and tested it on a few pages while logged in with your own username and password,
submit a pull request so I can take a look at it. If it looks good, I'll give you a "bot" account that you
can use for mass updates.
Check out other genealogy projects