Black Hole work

Discussion: Bug Collectors

1AnnieMod
Mar 8, 2023, 5:44 pm

Unrelated books and titles have been automatically combined with https://www.librarything.com/work/17064 for at least the last 2 months. A lot of them.

See the history here: https://www.librarything.com/topic/347847 (as Tim insists on closing that one, here is the problem spelled out differently - I think both problems are actually related but as the latest catch was after the fix, either they are not or the fix is incomplete).

The latest one was caught today. I’ve been pulling multiple titles out since the other report brought the book to light.

2amanda4242
Mar 8, 2023, 6:14 pm

I had to separate my copy of Ethel's Love-Life and Other Writings from that work this morning.

3kristilabrie
Mar 9, 2023, 9:05 am

>2 amanda4242: Can I ask exactly how you added that title to your library? Add books page, Add Books Express (Quick Add)? LibraryThing App?

4kristilabrie
Mar 9, 2023, 9:11 am

Please leave the next bad combination there to look at and I can show timspalding.

5AnnieMod
Mar 9, 2023, 9:58 am

>4 kristilabrie: I will (and I will post here about it) but I think that there are other people keeping an eye on that one so someone else may separate it (and disambig note won’t help in a separation case).

6knerd.knitter
Mar 9, 2023, 10:10 am

Scanning through the editions:
A Geography of the Hutterites in North America/Evans, Simon M./ISBN 1496225082 (1 copy separate)

7shadrach_anki
Mar 9, 2023, 10:16 am

Okay, this is not anything I've added to my library, but I was looking at the editions page for this work and I spotted the following listed there:

A Geography of the Hutterites in North America/Evans, Simon M./ISBN 1496225082

Obviously not connected to the title in question, and doesn't seem to have a page of its own on LibraryThing. I'm leaving it alone because it seems like this is your latest live example of the behavior.

8AnnieMod
Mar 9, 2023, 11:00 am

It was not there when I looked earlier in the morning so... we have a live example :)

9amanda4242
Mar 9, 2023, 11:26 am

>3 kristilabrie: I used the add books page with Amazon as my source.

10kristilabrie
Edited: Mar 9, 2023, 11:29 am

>6 knerd.knitter: Wow, my eyes missed that one too many times when I was looking!

>8 AnnieMod: OK! Maybe I'm not as crazy as I feel.

11kristilabrie
Edited: Mar 9, 2023, 11:41 am

Yep, reproduced. Not sure how it's getting combined.

1. go to /addbooks
2. search Amazon.com all media for ISBN 1496225082
(see screenshot below of record data)
3. click the title to add it to your library
4. click the newly-added title to go to the book page (e.g. https://www.librarything.com/work/17064/book/236421101)

Bug: This is not the right work. (Women of the Bible)

NB: Just tested adding this from the iOS App and same behavior.

12AnnieMod
Mar 9, 2023, 11:36 am

>10 kristilabrie: Well, you deal with members of the public -- feeling crazy is permitted.

More seriously though - the work does seem to pick up a lot of these so you rarely need to wait more than a few hours before at least one gravitates too close and gets sucked in.

Just as a note: Most of the ones stuck are authors that LT does not have and titles which have nowhere to go - so they cannot combine elsewhere. The author thing is not definitive - sometimes it is known authors. And sometimes there is a better work to combine into -- but not an exact match.

However, 19 out of 20 are works which have nowhere to combine and remain singletons once I pulled them out.

13kristilabrie
Edited: Mar 9, 2023, 11:47 am

Okay, another weird thing: I don't see which author page A Geography of the Hutterites in North America exists on?? It's not on either of the authors listed on my book, nor on the author of the work that it's combined into Ann Spangler. Sorry, because it's combined with the Women of the Bible work... ignore me. (Lack of sleep brain.)

14DuncanHill
Mar 9, 2023, 11:53 am

>13 kristilabrie: It won't appear on either Evans or Evans as they haven't been confirmed as authors, and it appears on Spangler's page under the title "Women of the Bible".

15waltzmn
Mar 9, 2023, 11:55 am

>13 kristilabrie: Okay, another weird thing: I don't see which author page A Geography of the Hutterites in North America exists on?? It's not on either of the authors listed on my book, nor on the author of the work that it's combined into Ann Spangler.

Huh??

And neither of the listed authors at the top is listed as the author down below where it lists primary and secondary authors.

Also... what odds that that book is owned by 731 members? And the reviews are for a "Women of the Bible" book.

And it's not listed on the page of Ann Spangler, the listed Primary Author, though you can find it listed on the Work Combinations page. So somehow this book got combined with the Spangler book. Why? Who knows. Clearly should be split, but I didn't do it so that people can investigate. :-)

A "Search LibraryThing" search fails to find any work called "A Geography of the Hutterites in North America" that the book should belong with.

The book is available and is in print, though: https://www.barnesandnoble.com/w/a-geography-of-the-hutterites-in-north-america-....

So it looks to me like a very strange combination problem. The question is why and how.

16AnnieMod
Edited: Mar 9, 2023, 12:04 pm

>15 waltzmn: "And it's not listed on the page of Ann Spangler..."

That's because it got combined inside of "Women of the Bible" aka our Black Hole work the whole thread is about... So yes - the reviews belong to Women of the Bible - they are where they are supposed to be.

And that is not isolated - a LOT of books got caught into this one. Thus the thread.

17waltzmn
Mar 9, 2023, 12:25 pm

>16 AnnieMod: That's because it got combined inside of "Women of the Bible" aka our Black Hole work the whole thread is about... And that is not isolated - a LOT of books got caught into this one. Thus the thread.

I'm not arguing. I'm gathering data which might be useful. Or not. Basic scientific training: When in doubt, gather data.

18AnnieMod
Edited: Mar 9, 2023, 12:31 pm

>17 waltzmn: Well, yes but that is what every wrongly combined book looks like. So I was a bit confused about what you are trying to gather data on. :)

19waltzmn
Mar 9, 2023, 1:04 pm

>18 AnnieMod: Well, yes but that is how every wrongly combined book looks like. So I was a bit confused on what you are trying to gather data about. :)

Of course. (It's an endless problem for me, since I have a great many pre-ISBN works that are prone to strange combinations!) But if it's just a combination issue, why are we talking about it on a bug thread? :-) And I was doing my research before >13 kristilabrie: got updated. What I did made more sense then. :-)

20AnnieMod
Edited: Mar 9, 2023, 1:09 pm

>19 waltzmn: Because the same work catches a lot of unrelated works (I separate ~5 in most days and other people separate others as well).

If it is one or two per week/month (or usually less) going into one work or they all go into different works, we chalk it to "it happens, the auto-combiner misfires occasionally", we separate and we deal with it. When I pull 5 on average per day for weeks from the same work... something is going on.

21waltzmn
Mar 9, 2023, 1:13 pm

>20 AnnieMod: When I pull 5 on average per day for weeks from the same work... something is going on.

Yikes. Indeed. I missed or forgot that statistic. But, again, that's reason to be gathering data. What books are being linked to it, and why? Is there a list? If I were the developers, I'd want that.

(Never mind; I'll stop talking now. Last word is yours if you want it.)

22AnnieMod
Edited: Mar 9, 2023, 1:54 pm

>21 waltzmn: "I missed or forgot that statistic."

Ah, that's why you went in that direction. Now what you are doing makes sense to me:) All good :) I was trying to figure out what you are looking for and pointing out that once combined, it looks like every other mis-combination. It is really the scale... :)

I don't have a list of what I pulled out (and because of how some of the Separation log is displayed, it is not easy to find them). Maybe the developers have an easier way of finding them by using the ID of the work.

Meanwhile, the work caught one more since the last time someone looked:
What do Entrepreneurs Create?: Understanding Four Types of Ventures/Morris, Michael H./ISBN 1789900212

23AnnieMod
Mar 9, 2023, 5:23 pm

And #3 joined the party:
Unnatural Selection: A Memoir of Adoption and Wilderness / Andrea Ross / (ISBN 193388083X)

24AnnieMod
Mar 10, 2023, 6:24 pm

#4 joined the party some time today:
Birth of a State: The Anglo-Irish Treaty / Liam Weeks / (ISBN 1788551591)

25shadrach_anki
Mar 10, 2023, 11:19 pm

#5 showed up sometime in the last few hours:

Nostalgic Journeys: From the Orient Express to Ocean Liners/Bitterle, Stefan/ISBN 3961713812

26DuncanHill
Mar 11, 2023, 11:30 am

>24 AnnieMod: "Birth of a State" also appears correctly (under the other co-author's name) as https://www.librarything.com/work/28798820/summary

27waltzmn
Mar 11, 2023, 12:36 pm

>26 DuncanHill:

Note that the link you cite is the "correct" one for the book: It lists Mícheál Ó Fathartaigh as the primary author, not Liam Weeks, which is as it should be (although Weeks should be listed as the secondary author on that page). Potential data point here: The copy which went to the black hole work apparently used the wrong author.

We should probably list the secondary author at https://www.librarything.com/work/28798820/summary, but it might be wise to wait just in case author confusion plays into the problem.

28AnnieMod
Mar 22, 2023, 12:31 pm

Our work has acquired more friends, although at a much slower rate at the moment.

Should we rescue them out from the work or should we leave it for LT to be able to see what is happening?

29DuncanHill
Mar 22, 2023, 2:59 pm

>28 AnnieMod: "Should we rescue them out from the work or should we leave it for LT to be able to see what is happening?"

My instinct would be to separate them (and combine where appropriate) now. LT have had plenty of time to see what's happening, and if they haven't then I'm sure more instances will be along soon.

30AnnieMod
Mar 22, 2023, 3:36 pm

>29 DuncanHill: That is why I am asking -- if they want them to stay a few more days, that's fine but sooner or later we should get these books where they belong. Say at the end of this week.. ;)

31kristilabrie
Mar 23, 2023, 8:21 am

>30 AnnieMod: Make it 2 weeks ;)

In all seriousness, I'm sure new examples will come along and I haven't seen timspalding mention he's on it this week, so feel free to clean them up and we'll deal with new examples as they arise.

Thanks!

32Juanlul
Mar 31, 2023, 4:47 am

Greetings

I am writing about a query I have regarding this ‘black hole’ record (https://www.librarything.es/work/17064/editions).

As noted, there are a significant number of books listed as editions of this record, which is incorrect. I contacted the administration and they recommended that I post the erroneous matches in this thread in case someone could tell me how to separate them. The list of false matches corresponds to the following ISBNs:

978-84-1122-130-6
9788417946654
978-84-19444-90-5
978-8426724311
9788418944550
9788436276992
978-84-16343-85-0
978-84-16622-29-0
9788490406779
9788490406755
978-84-9040-746-2
9788412483154
978-84-18971-60-0
978-84-670-6462-9
9788426723734
9788418819452
9788411310062
978-84-368-4604-1
978-84-18316-21-0
978-84-87143-62-5
978-84-368-4613-3
9788481883909
978-84-18388-95-8
9789728396572
978-84-472-3076-1
978-84-1319-378-6
9788427728851
9788413510927
9788417289010
978-84-9159-464-2
978-84-1377-971-3
9789897128042
9788426724557
9788413251394
9788413810256
978-84-472-2221-6
9788413193687
9788498606997
9788413251387
978-84-18932-01-4
9788428337540
9788418330704
978-84-1377-107-6
978-84-454-4322-4
978-8426726230
9788413913032
979-8635379011
978-84-368-4494-8
978-84-9192-212-4
9788413193779
978-8434433700
978-84-290-2564-4
978-84-1728-973-7

Thanks in advance
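
For anyone comparing the list above against other reports in this thread: the ISBNs are written in mixed styles (hyphenated, partly hyphenated, bare digits), and elsewhere in the thread some books are cited by ISBN-10. A minimal Python sketch of how such identifiers could be brought to one canonical form before comparing them follows. The function names are made up for illustration and this is not LibraryThing's code; only the standard ISBN-10 and ISBN-13 check-digit rules are assumed.

```python
import re


def isbn13_check_digit(first12: str) -> str:
    """Compute the ISBN-13 check digit from the first 12 digits (weights 1, 3, 1, 3, ...)."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def normalize_isbn(raw: str) -> str:
    """Return the ISBN as 13 bare digits; raise ValueError if it is malformed."""
    s = re.sub(r"[\s-]", "", raw).upper()  # drop hyphens and spaces
    if re.fullmatch(r"\d{9}[\dX]", s):
        # ISBN-10: weighted sum (weights 10 down to 1) must be divisible by 11.
        total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(s))
        if total % 11 != 0:
            raise ValueError(f"bad ISBN-10 check digit: {raw!r}")
        core = "978" + s[:9]  # convert to ISBN-13 with the 978 prefix
        return core + isbn13_check_digit(core)
    if re.fullmatch(r"\d{13}", s):
        if isbn13_check_digit(s[:12]) != s[12]:
            raise ValueError(f"bad ISBN-13 check digit: {raw!r}")
        return s
    raise ValueError(f"not an ISBN: {raw!r}")


# Two spellings from this thread that turn out to be the same book:
print(normalize_isbn("978-8434433700"))  # from the list above
print(normalize_isbn("8434433702"))      # ISBN-10 form reported later in the thread
```

Normalizing first makes it easy to see that an ISBN-10 reported by one member and an ISBN-13 reported by another refer to the same edition.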

33kristilabrie
Edited: Mar 31, 2023, 8:30 am

To clarify >32 Juanlul:, the member is searching for these ISBNs on site search and finding Women of the Bible, an incorrect work match to these ISBNs. Likely cached data?

1. search for any of the ISBNs listed in >32 Juanlul:, in LT site search
2. search for the same ISBNs on Google
Bug: LT site search is returning the incorrect work Women of the Bible instead of the correct work for the ISBNs.

34DuncanHill
Mar 31, 2023, 11:23 am

Just removed two more:

https://www.librarything.com/work/30034318/editions "A Journey in Translation: Anne Hébert's Poetry in English" (Canadian Literature Collection)/Skallerup Bessette, Lee/ISBN 0776623761

and
https://www.librarything.com/work/30034323/editions "Historia del arsénico: Mineralogía, física, química e historia del elemento más mortal y literario de la tabla periódica" (Spanish Edition)/Calvo, Guiomar/ISBN 8417547355

35AnnieMod
Mar 31, 2023, 11:49 am

Yeah - I have been pulling out a few titles every time I look at it. Something is really wrong with this work ID.

36waltzmn
Mar 31, 2023, 11:57 am

>35 AnnieMod: (and all others on the thread): Something is really wrong with this work ID.

Wild suggestion: Is it possible that it has an ISBN or other identifying characteristic that starts with an invisible character, say unicode 00 = null, so that the field appears to contain something but is blank to the combination algorithm -- and so it sucks up other works which share... some other trait... with it? It might take someone who can look at the actual SQL to be able to tell, but this has gone on long enough. We need somebody to take drastic action. :-)

I'm probably wrong, but maybe it will get someone thinking....
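
To make the invisible-character idea concrete, here is a small Python sketch of how a field could be scanned for characters that display as nothing. It is only an illustration of the hypothesis: the function name and the sample strings are invented, and nothing here reflects how LibraryThing actually stores or checks its data.

```python
import unicodedata


def invisible_chars(value: str) -> list[tuple[int, str]]:
    """Return (index, code point) pairs for control and format characters in value."""
    hits = []
    for i, ch in enumerate(value):
        # Cc = control characters (includes U+0000, but also tab and newline);
        # Cf = format characters (zero-width space, byte-order mark, and similar).
        if unicodedata.category(ch) in {"Cc", "Cf"}:
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits


# Hypothetical field values: they look the same on screen but are not equal.
clean = "abc"
dirty = "\x00abc"
print(invisible_chars(clean))
print(invisible_chars(dirty))
```

A check like this would have to be run against the raw stored values, since a web page or a log viewer will usually hide exactly the characters being looked for.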

37AnnieMod
Mar 31, 2023, 12:31 pm

>36 waltzmn: Highly possible.

Or one of the ISBNs in there just happens to hit some magic calculation which then matches a lot of things - hash collisions can be funny like that for example.

We really need someone from LT to look and figure it out (and that may help prevent future black holes from developing...)

38DuncanHill
Mar 31, 2023, 7:41 pm

And another

https://www.librarything.com/work/30039350/editions "THE BOOK OF SPELLS"/Harrison, Ella/ISBN 0241548659

This one does have a correct entry already but with a different ISBN

https://www.librarything.com/work/28582976/summary

Am combining.

39DuncanHill
Edited: Apr 3, 2023, 12:40 pm

And another, just separated it.

https://www.librarything.com/work/30054691/editions ISBN 0648964035

40AnnieMod
Apr 3, 2023, 12:42 pm

>39 DuncanHill: I've stopped reporting them when I find them - I just pull them out. If it had not become clear by now that this is not the usual "sometimes the autocombiner misfires", a few more won't change that :) Which does not mean you need to stop posting, I am just mentioning so it is clear that there are more than what is reported here. :)

41kristilabrie
Edited: Apr 3, 2023, 12:47 pm

Please leave the bad work ( https://www.librarything.com/work/17064 ) alone, for now. timspalding needs live examples to look at. Thanks.

42AnnieMod
Apr 3, 2023, 12:58 pm

>41 kristilabrie: I've added a temporary disambiguation notice so if people just stumble upon a misplaced book there without finding this thread, they know not to pull it out. Some people ignore the notices but hopefully none of them will find the work :)

43DuncanHill
Apr 3, 2023, 1:45 pm

>41 kristilabrie: There's no need to shout!

44kristilabrie
Apr 3, 2023, 1:56 pm

>43 DuncanHill: I didn't? I bolded it to make sure it gets seen in the long thread, particularly for new members who come across this.

>42 AnnieMod: Thank you!

45Nicole_VanK
Apr 3, 2023, 2:08 pm

>41 kristilabrie: Usually I will try to rescue identifiable books from them. But I will totally respect your wishes.

46DuncanHill
Apr 3, 2023, 2:15 pm

>41 kristilabrie: Does Tim want us to continue reporting them here or not?

47kristilabrie
Apr 4, 2023, 8:05 am

>46 DuncanHill: You can certainly find and list new culprits, that would help! Otherwise I'll be checking in on this when Tim's ready to get back to it. (He's unfortunately out at the moment.)

48AnnieMod
Apr 4, 2023, 2:12 pm

>47 kristilabrie: We have company:

American Speakout, Elementary: Student Book with DVD/ROM and Audio CD / Frances Eales / (ISBN 6073240600)
Hidden Berlin: A Student Guide to Berlin's History and Memory Culture / Reinhard Zachau / (ISBN 1647930103)

You never need to wait too long for this work to get company...

49shadrach_anki
Apr 5, 2023, 12:11 pm

>47 kristilabrie: More company today:

Stressilient/ISBN 0008448043

50kristilabrie
Apr 5, 2023, 12:21 pm

51shadrach_anki
Apr 5, 2023, 5:51 pm

And a couple more have joined the party....

Dans mon village, il y a belle lurette... foc (French Edition)/Pellerin, Fred/ISBN 2897582138
The Art of Silence and Human Behaviour: Interdisciplinary Perspectives/Itten, Theodor/ISBN 0367504871

52timspalding
Apr 6, 2023, 12:34 am

Okay, I'm on the scent of it. A script that creates a helper table, mapping isbns to works, is wrongly attributing hundreds of isbns to that work. I can't find the bug without running the process slowly in steps again, so give me a day or two and don't remove the books above. Thanks!

T

53DuncanHill
Apr 6, 2023, 10:49 am

And another

Changing satire: Transformations and continuities in Europe, 1600-1830: 13 (Seventeenth- and Eighteenth-Century Studies) / Cecilia Rosengren / (ISBN 1526146118)

54shadrach_anki
Apr 6, 2023, 4:47 pm

Three more (though I'm honestly not sure what the difference between the second and third one is; visually they look identical):

The Land Without Promise: The Roots and Afterlife of One Biblical Allusion (The Library of Hebrew Bible/Old Testament Studies)/Koci, Katerina/ISBN 0567696294
Women in the Workforce: What Everyone Needs to Know® (What Everyone Needs To KnowRG)/Argys, Laura M./ISBN 0190093382
Women in the Workforce: What Everyone Needs to Know® (What Everyone Needs To KnowRG)/Argys, Laura M./ISBN 0190093382

55AnnieMod
Apr 6, 2023, 5:00 pm

>54 shadrach_anki: The encoding of ® - one of them is the symbol, the other one is the html code (page source confirms it) :)
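
A short Python sketch of that difference, for the curious. The exact entity in the page source is an assumption here (`&reg;` is used for illustration; a numeric form such as `&#174;` would behave the same way), and `normalize_title` is a hypothetical helper, not anything LibraryThing is known to use.

```python
import html


def normalize_title(title: str) -> str:
    """Decode HTML character references so both spellings compare equal."""
    return html.unescape(title)


with_symbol = "Women in the Workforce: What Everyone Needs to Know®"
with_entity = "Women in the Workforce: What Everyone Needs to Know&reg;"  # assumed entity form

# As raw strings the two titles differ, which is why they show up as separate editions.
print(with_symbol == with_entity)
# After decoding the entity they are the same string.
print(normalize_title(with_symbol) == normalize_title(with_entity))
```

In a browser both render identically, so the only way to tell them apart is the page source or a byte-level comparison like the one above.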

56DuncanHill
Apr 6, 2023, 7:07 pm

Razones públicas: Una introducción a la filosofía política / Íñigo González / (ISBN 8434433702)

57timspalding
Apr 7, 2023, 2:30 am

I'm running a process tonight that is working on this. It should be fixed mid-day tomorrow. Stay tuned.

58timspalding
Edited: Apr 7, 2023, 11:16 am

Okay, the fix is up. It won't fix the ones that are there, but I *think* no new ones will enter.

It's a long story. The problem was basically that a process that "guessed" what work a given new edition belongs in got gummed up and started to snowball. By the end, some 7k ISBNs had that work as their fallback, when the exact edition wasn't already pointing to another work. That problem started with some data that should have had empty titles and authors having blank titles and authors instead. (In this case empty and blank actually meant something different. Sigh.)
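
One way the empty-versus-blank distinction could produce this kind of snowball is sketched below in Python. Everything in it is an assumption made for illustration: "empty" is modeled as `None`, "blank" as an empty string, and the lookup functions and index are hypothetical. The actual process and data layout are not described in this thread.

```python
from typing import Optional

Index = dict[tuple[str, str], int]


def guess_work(title: Optional[str], author: Optional[str], index: Index) -> Optional[int]:
    """Hypothetical fallback guesser: look up a work ID by (title, author)."""
    if title is None or author is None:
        return None  # "empty": nothing to match on, so make no guess
    # "blank" values ("" or whitespace) are not None, so they fall through to the lookup.
    return index.get((title.strip().lower(), author.strip().lower()))


def guess_work_guarded(title: Optional[str], author: Optional[str], index: Index) -> Optional[int]:
    """Same lookup, but blank values are treated like empty ones."""
    if not (title and title.strip()) or not (author and author.strip()):
        return None
    return guess_work(title, author, index)


# Hypothetical index with one bad row: blank title and author pointing at work 17064.
index: Index = {
    ("women of the bible", "ann spangler"): 17064,
    ("", ""): 17064,
}

print(guess_work(None, None, index))      # no guess is made
print(guess_work("", "", index))          # blank record is guessed into the black hole
print(guess_work_guarded("", "", index))  # guard refuses to guess
```

Under these assumptions, every new edition that arrives with blank fields lands in the same work, and each one that lands there makes the wrong fallback look better supported.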

Anyway, black holes can exist whenever a fallback is wrong. By nature these are guesses, so it's hard to verify that every one is correct. But I see no clusters of ISBNs within an order of magnitude of the size of the previous one. So I think we are good, at least for the near and medium-term future.

That said, this stuff is complex. It's possible there's some other trace of the data somewhere. So let's keep an eye out.
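
The cluster check described above can be sketched roughly as follows in Python, assuming the helper table can be read as a simple mapping from ISBN to work ID. The function names, the median-based threshold, and the toy data are all illustrative assumptions, not the query that was actually run.

```python
import statistics
from collections import Counter


def largest_clusters(isbn_to_work: dict[str, int], top: int = 3) -> list[tuple[int, int]]:
    """Return the (work ID, ISBN count) pairs with the most ISBNs attached."""
    return Counter(isbn_to_work.values()).most_common(top)


def suspicious_works(isbn_to_work: dict[str, int], factor: float = 10.0) -> list[int]:
    """Flag works whose ISBN count is at least `factor` times the median count."""
    counts = Counter(isbn_to_work.values())
    median = statistics.median(counts.values())
    return sorted(work for work, n in counts.items() if n >= factor * median)


# Toy data: 50 placeholder ISBNs fall back to work 17064, a handful go elsewhere.
mapping = {f"isbn-a{i}": 17064 for i in range(50)}
mapping.update({"isbn-b0": 1, "isbn-b1": 1, "isbn-c0": 2,
                "isbn-d0": 3, "isbn-d1": 3, "isbn-d2": 3})

print(largest_clusters(mapping, top=1))
print(suspicious_works(mapping))
```

A real check would need a threshold tuned to the data, since popular works legitimately have many editions; the point is only that a black hole stands out by size.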

59AnnieMod
Apr 7, 2023, 11:26 am

>58 timspalding: Can we separate the ones that are already caught in or do you still need them inside for some reason?

I will leave this one on my workbench for now - which means I will be eyeballing it for new intruders at least daily.

60timspalding
Apr 7, 2023, 1:57 pm

Oh, yes, go ahead and separate them out.

61AnnieMod
Apr 7, 2023, 2:02 pm

>60 timspalding: Done. I've also removed the disambig notice to not separate editions.

I'll keep an eye on it and post if the work starts accumulating new friends again.