Personal win: I created an automatic - manually add book script

Discussion: Hacking LibraryThing

Edited: Nov 8, 2018, 3:37am

Now, this isn't something anyone else can use (I think, if not I'll be happy to share), but it is a personal win because it saves me so much time and makes adding books to LT worth it again...

- My set up:
I have my own MySQL DB which holds all my book data. More info than LT, and structured differently. This is my master, and LT is (just one) online derivative. I add all my books on LT manually because I want my edition details to be exact. My covers are all personal scans.

- My problem/previous way of doing things
My personal front-end of the DB (written in PHP/Laravel) had a page that had the Manual add page on LT in an iFrame, and all relevant data structured correctly above it, so I could copy paste the whole lot. Manually. Which is fine for one or two books, but we go on book hauling sprees, so for example for the last two weeks I was 120 books behind in adding them...

- My solution/new way of doing things
I'm a software tester by trade, and do lots of automated testing. So, I set up a basic testing framework using TestNG and Selenium, and made an automated "test" (written in Java). This test takes the id of one of the books in my own system, and generates all data correctly formatted. It opens the manual add page and fills in all the data. It even selects the correct media type and languages. It submits, and voila, a book is added. All I need to do is run it from my command line (I use Maven to get this working). The only manual work left is setting the correct collection (I really couldn't get that working, it wouldn't select the select box), adding the cover (I don't know how to open this page just based on the info I have and get from the "Add Manual" page) and adding the "From Where" info (I can't be bothered to match my own acquisition info to LT's local stuff).
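For anyone wondering what the "generates all data correctly formatted" step might look like, here is a minimal sketch in Python rather than the Java the actual script uses. It is not the real code: the record layout, the field names and the media labels are all invented for illustration, and it only prepares the text that a browser-automation step would then type into the form.

```python
# Hypothetical lookup from the master DB's binding codes to the labels in
# the form's media-type drop-down. Both sides are invented for this sketch.
MEDIA_LABELS = {"hc": "Hardcover", "pb": "Paperback", "ebook": "Ebook"}


def build_form_values(book):
    """Turn one master-DB record (a dict) into text ready to type into a form."""
    authors = book.get("authors", [])
    return {
        "title": book["title"].strip(),
        "primary_author": authors[0] if authors else "",
        "other_authors": "; ".join(authors[1:]),
        "publication_year": str(book["year"]) if book.get("year") else "",
        "media": MEDIA_LABELS.get(book.get("binding"), ""),
        "languages": ", ".join(book.get("languages", [])),
    }


# Made-up sample record, standing in for one row from the master database.
sample = {
    "title": "  Sample Title ",
    "authors": ["First Author", "Second Author"],
    "year": 2018,
    "binding": "pb",
    "languages": ["English"],
}
form_values = build_form_values(sample)
```

Keeping the formatting in one plain function like this means it can be checked without opening a browser at all; only the form-filling part needs Selenium.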

Just wanted to share, I feel pretty proud and happy to finally (I'd been planning this for a long while) get this working.

Next project is harvesting Work IDs based on my barcode (the barcode is my own, from my DB)....

Nov 8, 2018, 3:49am

So, if I get this right, this only makes sense if LibraryThing is not your only book database and you want to avoid duplication of effort and want to be able to single-source? For the rest of us (mostly) who only use LT, this system has no benefit?

Nov 8, 2018, 4:01am

Yes, and if LT is not your main system (but a derivative/back up). So that it gets the data from somewhere else. My MySQL DB is always the master for any data I enter on any book website (LT or otherwise).

Nov 8, 2018, 4:04am

I can imagine it might help if you want to migrate from one user to another user, or merge accounts, or separate accounts.

Edited: Nov 8, 2018, 5:28am

>4 JerryMmm:

For sure. Bit of a heavy setup maybe, but it should be doable. Or scrape or export account 1 into a DB, and then use that as input for something like this.

Nov 8, 2018, 1:57pm

It might be useful for importing from Calibre - which is where I consolidate my ebooks (from various sources). I believe the database for that is MySQL. At the moment, I tend to use the ASIN to search the edition.

Nov 9, 2018, 4:13am

>6 Maddz:
It's an SQLite database, but you can also connect to those using JDBC (which I used to connect to my DB). Which gives me a good idea, because I want to link my Calibre with my own DB too....

Edited: Nov 9, 2018, 5:04am

>7 divinenanny: Bearing in mind we're a Mac shop, I would be interested to know if you can manage that.

At the moment, I'm using Calibre tags to log whether that work has been logged in LT; what I'd like to do is to semi-automate that process so when I go on a buying spree at Amazon, I can easily log all my purchases, insert the Calibre book ID into 'Other Call No' and insert the LT book ID into a custom Calibre column... It would reduce the numbers of tags I currently use.

{GRIN} Why am I thinking a Calibre plug-in is in the offing? (At least to scrape the LT book ID and match with the Calibre book ID...)

Nov 9, 2018, 1:56pm

>8 Maddz:
I have a Mac too, so no worries there.

Ha, I was thinking the same thing today about a Calibre plugin. I thought, if I have my own DB as a master source, I could write a Calibre plugin to take my edition ID and fill in the rest of the metadata....

I love fiddling around like this :D

Would you want to import from Calibre to LibraryThing or the other way around?

Nov 9, 2018, 2:08pm

Import from Calibre to LT, please... (Although I would be interested in scraping the LT work ID into Calibre and vice versa for tracking purposes.)

LT is my master database of mostly (there's some I'm not admitting to, hem, hem) all the books I own - Calibre is my master database of ebooks owned. So Calibre is the subset. All my ebooks get imported into Calibre (with the exception of huge pdf files), get their metadata tidied up, converted into epub if in a different format, and the covers sorted. They then get imported into LT and flagged as an ebook, and eventually the print edition (if there is one) gets culled unless there is a compelling reason to keep it.

Are you on MobileReads? Should we take this discussion over there if you are?

Nov 9, 2018, 3:17pm

>10 Maddz:
Nah, I just read Mobile Reads when needed.... This is a hack group so I guess we are fine here too, who knows who we help ;)

I can try next week to strip the RBB (my own database) part from the LT part of my script, so you are left with placeholders for the data that can be filled up if you can manage a connection to the Calibre DB.

I was reading about the reverse (changing the Calibre metadata), and learned I need to use the API for that (but hey, there is an API). Right now my (vague) plan is to find a good metadata plugin and change that over to fit my own system.

Nov 10, 2018, 6:58am

>11 divinenanny: You may want to check this thread on Mobilereads:

So getting data from LT to Calibre might be a no-no.

A useful plug-in would do the following as a one-off event per ebook:

- Select a list of newly added books (or ones with a specific tag)
- For each book, select the LT collection and one or more tags (for example, collection Science Fiction & Fantasy, tag Urban Fantasy)
- Using a suitable ISBN or ASIN, search that book in the add books screen
- User clicks add book when a suitable match is found (may have to add looping to cope with switching sources)
- Plug-in opens the edit book page and updates metadata including the Calibre book-id in Local Call No.
- If there is no ISBN or ASIN recorded (e.g. Project Gutenberg books), the plug-in opens the manual add book page, and adds metadata according to Calibre as before.
- User reviews the input data, and confirms by clicking 'Save' after making any changes necessary
- Clicking 'Save' scrapes your book ID from LT (perhaps using the URL?) and adds it to Calibre in the ID field as LTID:####

Any further editing would need to be done manually - switching covers, changing tags, adding series data.
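The last step in the list (pulling the book ID out of the URL and storing it as LTID:####) could be sketched roughly as below. This is a guess at the approach, not a tested plug-in: it assumes the book ID is the last run of digits in the URL, and the sample URL is a made-up placeholder, not a real LT address.

```python
import re


def lt_identifier_from_url(url):
    """Return 'LTID:<digits>' built from the last run of digits in the URL.

    Assumption: the book ID is the final number in the URL. Returns None
    when the URL contains no digits at all.
    """
    runs = re.findall(r"\d+", url)
    return f"LTID:{runs[-1]}" if runs else None


# Placeholder URL, invented purely for this example.
example = lt_identifier_from_url("https://example.invalid/edit/book/123456")
```

If the real URL carries other numbers after the book ID, the pattern would need tightening to match the exact path.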

The idea is to automate the add book process as much as possible - it can be a chore, especially if you've been taking advantage of a sale (like the Christmas Gateway SF sale - I think between us we got over 50 books, plus some daily deals, a Humble Book Bundle and a Storybundle, and the January deals). I had to manually add over 100 books when we got home after Christmas - seriously tedious.

Mind you, even more tedious is adding other authors to anthologies...

Nov 10, 2018, 11:54am

>12 Maddz:

I wouldn't go from LT to Calibre, but Calibre to LT (but that's because I trust my own data a whole lot more).

The things that are hard with the proposed steps are:
- I haven't managed to select the collection. The standard collections have standard IDs, but all custom collections have their own IDs. I planned to hard code them in my script, but then ran into the issue that I couldn't select the checkbox. Because nearly all my books are in the same five collections, I just gave up and adjust manually afterwards when needed.
- The plugin/script/test I made is a one run deal, with no room for user input after it starts. So no room in the way I do it (and the tools I used) to have the user select the best match.
- Doing it all in manual add is a whole lot easier
- Having the user manually click save is no issue (the script can just stop before pressing save and not close the browser).
- Scraping when saving would be hard too, for two reasons. Firstly, the scrape scripts I use are in PHP and I start those manually; secondly, I often need to combine stuff first, which might change the BookId. My plan was to make a collection view with at least barcode and bookid in the view, and scrape that.
- Adding series data and covers is something that is manual now.

And I know all about those sales. Even for us not in the US or UK they can have some amazing deals...

I gave up on other authors in anthologies. I see no benefit on LT of adding any anthology data (authors and stories) or collection data (stories). I have written a script that takes the dump from ISFDB, imports the relevant content data from anthologies/collections/magazines (publications on ISFDB) and adds all stories/authors etc. to my own DB.

Nov 10, 2018, 12:09pm

>13 divinenanny: Um, then best to automatically add to 'Your Library'? I, personally, don't keep works in there at all - everything is in another collection, and it would be easy enough to manage manually from there - changing cover, changing collection, updating the metadata.

Well, we'll see. I need to look at exporting book IDs anyway.

Nov 10, 2018, 1:12pm

>14 Maddz:

Exporting Book IDs... yeah, first easy step would be to see if they are included in any of the exports....

Edited: Nov 10, 2018, 2:32pm

>15 divinenanny: Yes, they are. I checked the Excel export - the ID there matches the ID that pops up in the URL when you click on 'Edit your book'.

Unfortunately, the idea of manually inserting nearly 2400 LT Book IDs into the Calibre ID field makes me blench...

I'll also have to check Calibre to see if the Calibre book ID is readily accessible in the metadata - I really don't want to scrape that from individual book folders in the Calibre directory.
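On the worry about typing in nearly 2400 IDs by hand: if the Calibre book ID has already been entered in a call-number field on LT, the export could in principle be turned into a Calibre-ID to LT-Book-ID lookup with a few lines of Python. A rough sketch, assuming the export is saved as tab-delimited text; the column names "Book Id" and "Other Call Number" are guesses, so check them against the header row of a real export.

```python
import csv
import io


def map_calibre_to_lt(export_file, calibre_col="Other Call Number", lt_col="Book Id"):
    """Build {calibre_book_id: lt_book_id} from a tab-delimited export.

    Both column names are assumptions. Rows without a numeric Calibre ID
    in the call-number column are skipped.
    """
    mapping = {}
    for row in csv.DictReader(export_file, delimiter="\t"):
        calibre_id = (row.get(calibre_col) or "").strip()
        if calibre_id.isdigit():
            mapping[int(calibre_id)] = int(row[lt_col])
    return mapping


# Made-up sample data standing in for an export file.
sample_export = io.StringIO(
    "Book Id\tTitle\tOther Call Number\n"
    "1001\tSample A\t12\n"
    "1002\tSample B\t\n"
    "1003\tSample C\t13\n"
)
calibre_to_lt = map_calibre_to_lt(sample_export)
```

With a real file, `open(path, newline="", encoding="utf-8")` would replace the `io.StringIO` stand-in.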

Nov 11, 2018, 8:45am

>16 Maddz:

Ah good to know... that means a very easy (periodic) way to add this info to my DB, thanks!

You can get the Calibre Book ID from the metadata.db ( ID's in the Ids field are stored in the identifiers table, linked via =

When I read the online docs, it is heavily discouraged to directly edit the metadata.db, but I don't see why you couldn't mass-add the LT ids into the identifiers table.

I have made a custom column for my own IDs, which works the same way.
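To make the mass-add idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. It runs against an in-memory stand-in table, not a real metadata.db: the column layout (book, type, val, with one value per book and type) is my understanding of Calibre's identifiers table and should be checked against a copy of your own database before trying anything like this. The "ltid" type name and the sample IDs are made up.

```python
import sqlite3


def add_lt_ids(conn, pairs, id_type="ltid"):
    """Store (calibre_book_id, lt_book_id) pairs in the identifiers table.

    INSERT OR REPLACE overwrites an existing value for the same book and
    type, which relies on the UNIQUE(book, type) constraint assumed below.
    """
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT OR REPLACE INTO identifiers (book, type, val) VALUES (?, ?, ?)",
            [(book_id, id_type, str(lt_id)) for book_id, lt_id in pairs],
        )


# In-memory stand-in with an ASSUMED, simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE identifiers ("
    " id INTEGER PRIMARY KEY,"
    " book INTEGER NOT NULL,"
    " type TEXT NOT NULL,"
    " val TEXT NOT NULL,"
    " UNIQUE(book, type))"
)

add_lt_ids(conn, [(12, 1001), (13, 1002)])
add_lt_ids(conn, [(12, 1003)])  # a second run replaces the value for book 12
rows = conn.execute(
    "SELECT book, type, val FROM identifiers ORDER BY book"
).fetchall()
```

Given the warning in the docs, the safer habit would be to close Calibre and work on a backup copy of metadata.db first.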

Edited: Nov 11, 2018, 9:04am

>17 divinenanny: Yeah, I've pulled the Calibre book id into the list view by creating a custom column. Unfortunately, there doesn't seem to be a way to make it visible in the Edit Metadata view (which is where I add the ltid into the ids field), although you can pull other custom columns into that view.

Luckily, my Calibre IDs are only 4 digits so are reasonably easy to manually input into LT (although I'd still prefer not to have to).