-
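I really appreciate how much effort and thought you're putting into this! Hopefully I can lend some clarity to what happens inside BookWyrm. Here's how import works:
1. The headers in the csv are matched up with a set list of headers that represent fields like ISBN, title, author, and your reading metadata. This allows it to understand a small variety of variations on common headers; for example, it can use "isbn", "isbn13", or "isbn 13" as the ISBN column.
2. For each row in the file, it tries to find an entry in the book database that matches the book's identifiers provided in the csv file. First it will look up by ISBN, and if there's no match found or no ISBN in the csv, it will search by title and author. It looks first in the instance's book database, and if it doesn't find it there, it will search external databases like other BookWyrm instances, OpenLibrary, and Inventaire.
3. When a book is matched, it links the csv row to the book database entry and creates any reading metadata, like dates read or ratings and reviews, and associates them with the book.
4. When a book has ambiguous results from a title/author search, it will ask you to manually approve or reject the best-guess book that it found.
5. When no match is found, you have the option to re-try those items. Sometimes this is because none of the databases had a match for the book, sometimes it's because of a transient error (like OpenLibrary was too slow to respond), and sometimes it's because of a bug in BookWyrm.
It sounds like 3 of your books were matched, and the reason they have metadata that wasn't present in the csv is that the book isn't being created from the csv, but rather looked up in a database based on the ISBN in the csv. The book with a dummy ISBN would have fallen back on a title/author search, which is why it got a match. Did the match have the correct title and author that corresponded to the csv data?
And the book that didn't match should show you a message about what went wrong -- either "Could not find a match for book" or "Error loading book" -- which will hint at what happened.
Can you tell me more specifics about what outcome you expected from the CSV vs the outcome you got? I'm not able to view the link to the import on BookWyrm, since it's only visible to you, so if you can post screenshots where it's relevant, that would be helpful.
(The blog post you found appears to be out of date or possibly a homebrew system of storing book data, as those csv headers don't reflect the columns that Goodreads uses in any export I've encountered recently.)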
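For readers who find code easier to follow, here is a rough sketch of the lookup order described in this thread: ISBN first, then a title/author query, with the local database consulted before external ones. This is not BookWyrm's actual code. `Row`, `Source`, and `find_book` are hypothetical names made up for illustration, and each "source" is reduced to a plain function that returns a match or `None`.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Row:
    """One CSV row, reduced to the fields used for matching (hypothetical)."""
    title: str
    author: str
    isbn: Optional[str] = None


# A source is any callable that takes a query string and returns a match or None.
Source = Callable[[str], Optional[str]]


def find_book(row: Row, sources: List[Source]) -> Optional[str]:
    """Try the ISBN across all sources, then fall back to a title/author query."""
    queries = []
    if row.isbn:
        queries.append(row.isbn)
    queries.append(f"{row.title} {row.author}")
    for query in queries:
        # Sources are assumed to be ordered: local database first, then external ones.
        for source in sources:
            match = source(query)
            if match is not None:
                return match
    # No source matched either query.
    return None
```

Note that nothing from the row is used to create a book here; the row only supplies the queries, which is why a matched book can carry metadata that was never in the csv.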
-
I recently exported a TSV from LibraryThing and tried importing to BookWyrm on https://bookwyrm.social.
Is it possible to kill those imports, and can you advise on LibraryThing import gotchas? I don't see any related bugs or problems under /issues.
-
>>>> Mouse Reeve ***@***.***>:
> I really appreciate how much effort and thought you're putting into this! Hopefully I can lend some clarity to what happens inside BookWyrm.
> Here's how import works:
> 1. The headers in the csv are matched up with a set list of headers that represent fields like ISBN, title, author, and your reading metadata. This allows it to understand a small variety of variations on common headers; for example, it can use "isbn", "isbn13", or "isbn 13" as the ISBN column.
Yup, I found the mapping in the Python code, and ended up with a minimal CSV containing only the fields that are mapped.
That's the one I was partially successful in importing.
> 2. For each row in the file, it tries to find an entry in the book database that matches the book's identifiers provided in the csv file. First it will look up by ISBN, and if there's no match found or no ISBN in the csv, it will search by title and author. It looks first in the instance's book database, and if it doesn't find it there, it will search external databases like other BookWyrm instances, OpenLibrary, and Inventaire.
Is there any detailed info to be found about this book database?
A way to browse it?
A REST API that can be used to query it?
> 3. When a book is matched, it links the csv row to the book database entry and creates any reading metadata, like dates read or ratings and reviews, and associates them with the book.
Right!
> 4. When a book has ambiguous results from a title/author search, it will ask you to manually approve or reject the best-guess book that it found.
> 5. When no match is found, you have the option to re-try those items. Sometimes this is because none of the databases had a match for the book, sometimes it's because of a transient error (like OpenLibrary was too slow to respond), and sometimes it's because of a bug in BookWyrm.
Ok.
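The header matching in step 1 could look roughly like the sketch below. It is only an illustration under stated assumptions: the three ISBN aliases are the ones named in this thread, while the `title` and `author` aliases, the `HEADER_ALIASES` table, and the function names are hypothetical and not taken from BookWyrm's code.

```python
import csv
import io

# Hypothetical alias table. Only the ISBN aliases come from the thread;
# the title and author entries are illustrative guesses.
HEADER_ALIASES = {
    "isbn": ["isbn", "isbn13", "isbn 13"],
    "title": ["title"],
    "author": ["author", "authors"],
}


def map_headers(headers):
    """Return {canonical field: actual csv header} for every recognised header."""
    # Compare case-insensitively and ignore surrounding whitespace.
    normalized = {h.strip().lower(): h for h in headers}
    mapping = {}
    for field, aliases in HEADER_ALIASES.items():
        for alias in aliases:
            if alias in normalized:
                mapping[field] = normalized[alias]
                break
    return mapping


def read_rows(text):
    """Parse csv text and keep only the columns that map to a known field."""
    reader = csv.DictReader(io.StringIO(text))
    mapping = map_headers(reader.fieldnames or [])
    return [{field: row[header] for field, header in mapping.items()} for row in reader]
```

Columns that match no alias are simply dropped, which fits the experience above of ending up with a minimal CSV that contains only mapped fields.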
> It sounds like 3 of your books were matched, and the reason they have metadata that wasn't present in the csv is that the book isn't being created from the csv, but rather looked up in a database based on the ISBN in the csv. The book with a dummy ISBN would have fallen back on a title/author search, which is why it got a match. Did the match have the correct title and author that corresponded to the csv data?
Nope! A completely different title and author. :-)
> And the book that didn't match should show you a message about what went wrong -- either "Could not find a match for book" or "Error loading book" -- which will hint at what happened.
It was "Could not find a match for book".
> Can you tell me more specifics about what outcome you expected from the CSV vs the outcome you got? I'm not able to view the link to the import on BookWyrm, since it's only visible to you, so if you can post screenshots where it's relevant, that would be helpful.
I expected to be able to fill my bookshelves with books from the CSV.
I expected to have to provide all of the metadata for the books.
I expected to be able to add new books to BookWyrm.
I have attached a screenshot of the import to this email message. If it is stripped off, I'll upload the screenshot to the thread using the web GUI.
> (The blog post you found appears to be out of date or possibly a homebrew system of storing book data, as those csv headers don't reflect the columns that Goodreads uses in any export I've encountered recently.)
Ok! Won't look at that anymore! :-)
Thanks!
-
>>>> Mouse Reeve ***@***.***>:
> Is it still hanging?
I don't think the import is hanging, exactly...? It just failed on one book (and mis-imported another).
> After a certain period of time it should give you the option to re-try an import item. That said, imports are designed to be a very low-priority task, so it's normal that they can take quite a while.
Ok.
-
>>>> Mouse Reeve ***@***.***>:
> Are there any detailed info to be found about this book database? A way to browse it? A REST API that can be used to query it?
http://openlibrary.org/ -- JSON search: `https://openlibrary.org/search.json?q=<query>`
https://inventaire.io/ -- JSON search: `https://inventaire.io/api/search?types=works&types=works&search=<query>`
http://bookwyrm.social/ -- JSON search: `http://bookwyrm.social/search.json?q=<query>`
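A small sketch of how the `<query>` placeholder in the three URLs above could be filled in. The base URLs are copied as given in this thread; `SEARCH_URLS` and `search_url` are hypothetical names, and nothing is assumed here about the shape of the JSON that comes back.

```python
from urllib.parse import quote_plus

# Base URLs copied from the thread; the URL-encoded query is appended at the end.
SEARCH_URLS = {
    "openlibrary": "https://openlibrary.org/search.json?q=",
    "inventaire": "https://inventaire.io/api/search?types=works&types=works&search=",
    "bookwyrm": "http://bookwyrm.social/search.json?q=",
}


def search_url(service: str, query: str) -> str:
    """Build the search URL for one service, encoding spaces and special characters."""
    return SEARCH_URLS[service] + quote_plus(query)
```

For example, `search_url("openlibrary", "the hobbit")` yields `https://openlibrary.org/search.json?q=the+hobbit`, which can be opened in a browser or fetched with any HTTP client.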
Thanks! I will investigate.
> Nope! A completely different title and author. :-)
The search endpoint checks if a query looks like an ISBN, and if it doesn't (like 1234), it searches it as a free text query. This is a classic case of "garbage in, garbage out" -- for the purpose of import, it would help if it enforced ISBN search, but it's not a high-priority issue.
Ok. :-)
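To make the "looks like an ISBN" idea concrete, here is a sketch of one possible check. How BookWyrm actually decides this is not shown in the thread, so this is an assumption: it uses the standard ISBN-10 and ISBN-13 checksum rules, and all function names are hypothetical.

```python
import re


def normalize_isbn(value: str) -> str:
    """Drop spaces and hyphens, and upper-case a trailing ISBN-10 'x'."""
    return re.sub(r"[\s-]", "", value).upper()


def is_valid_isbn10(isbn: str) -> bool:
    """ISBN-10: weights 10 down to 1, sum divisible by 11; 'X' stands for 10."""
    if not re.fullmatch(r"\d{9}[\dX]", isbn):
        return False
    digits = [10 if c == "X" else int(c) for c in isbn]
    return sum((10 - i) * d for i, d in enumerate(digits)) % 11 == 0


def is_valid_isbn13(isbn: str) -> bool:
    """ISBN-13: alternating weights 1 and 3, sum divisible by 10."""
    if not re.fullmatch(r"\d{13}", isbn):
        return False
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(isbn))
    return total % 10 == 0


def looks_like_isbn(value: str) -> bool:
    """True only for a well-formed ISBN-10 or ISBN-13 with a correct check digit."""
    isbn = normalize_isbn(value)
    return is_valid_isbn10(isbn) or is_valid_isbn13(isbn)
```

With a check like this, a dummy value such as `1234` is rejected as an ISBN, so an importer that enforced ISBN search could report it as unmatched instead of running it as a free text query.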
> I expected to be able to fill my bookshelves with books from the CSV. I expected having to provide all of the metadata for the books. I expected to be able to add new books to bookwyrm.
I see! It sounds like you understood the goal of the feature differently than was intended. It does populate your shelves and add books to BookWyrm, but it doesn't create them from the provided CSV metadata.
Yep!
So I'm not sure my book database has a mission anymore...? :-)
Unless you have a different way to contribute books to your database?
If so the application may still have a mission...:-)
In fact, it may have two:
1. Provide a way to edit and contribute book metadata
2. Provide a way to set up bookshelves to import into bookwyrm
So maybe not wasted work after all...?
-
>>>> Steinar Bang ***@***.***>:
>>>> Mouse Reeve ***@***.***>:
>> Are there any detailed info to be found about this book database? A way to browse it? A REST API that can be used to query it?
> http://openlibrary.org/ -- JSON search: `https://openlibrary.org/search.json?q=<query>`
> https://inventaire.io/ -- JSON search: `https://inventaire.io/api/search?types=works&types=works&search=<query>`
> http://bookwyrm.social/ -- JSON search: `http://bookwyrm.social/search.json?q=<query>`
Thanks! I will investigate.
You wouldn't happen to have a reference to the syntax of the queries as well...? :-)
(a link to the code in bookwyrm setting up such queries would suffice)
-
@mouse-reeve, is it possible to use worldcat.org to look up the imported books? If that isn't possible, it would be nice to have the option to add books manually from this import list, because most of the information is imported correctly (from a Calibre catalogue); the only problem seems to be that BookWyrm cannot find a match for the book in OpenLibrary, Inventaire, or BookWyrm.
-
>>>> Peter ***@***.***>:
> Thanks!
> Hopefully it is not too difficult to make it possible to create books from import items. Would be a great help.
Heh! :-)
If it becomes possible to create books from import items, then my https://github.com/steinarb/bokbase app will actually have acquired a mission...;-)
-
When I first looked at bookwyrm I couldn't figure out how to add my own books.
So I asked around and was told that I could do a CSV export from goodreads and import that into bookwyrm.
However, I didn't have any goodreads books to export (I have an account there, but I haven't ever used it for anything).
So I googled and found this page describing the goodreads CSV format: https://zief0002.github.io/epsy-8251/codebooks/goodreads.html
I figured "how hard can a book database be...?" and wrote a React web app with PostgreSQL storage and a database schema made to store the data needed to generate the above CSV format: https://github.com/steinarb/bokbase
However the first imports went badly: none of the books were imported.
So I started looking at the bookwyrm import tests and running them on my generated CSV files, first using the goodreads import test and later switching to the generic CSV import (since the columns in that format seem to be the ones actually used).
I rewrote the database schema a little (changed publication time from year to date, and added a "finished read date" field, and added an ISBN13 field) and used the generic.csv column names.
And then I tried importing (unfortunately using a CSV generated from my dummy test database instead of the actual PostgreSQL database) and I am confused by the results: https://bookwyrm.social/import/1078
Also, when I look at the successfully imported real books, they seem to contain lots of stuff not in the generic CSV import, and (as far as I could tell) not in the fields used in the goodreads CSV imports (but values found in the above URL describing the goodreads import, such as series and publisher).
So I am confused.
Could someone who understands what goes on perhaps explain why things turned out as they did?
And also what I should do to create CSV files that are more acceptable to bookwyrm? (Because that's what I want to do...)
Thanks in advance!
And happy new year!
generatedBy_react-csv.csv