More than 2 million research papers have disappeared from the Internet

My apologies if this doesn’t belong here, but when I read this article I thought it really reinforces the case for using DEVONthink to capture papers (and other material) off the web if we want to reference them in the future.

A study identified more than two million articles that did not appear in a major digital archive, despite having an active DOI.

Original article:
More than 2 million research papers have disappeared from the Internet

Really interesting, thank you for sharing.

And timely, as I was explaining to a (non-academic) colleague last week that a URL is not a valid citation by itself. Links go dead!

  • The article isn’t about things disappearing as much as it’s about documents not being stored in large public collections. Those are two very different things. And having a DOI doesn’t mean it must be stored in such a public repository.

  • Why would DEVONthink disappear?

  • And if it did, so what? What do you think would happen?

Eventually Apple would introduce some hardware or OS change that breaks significant DEVONthink functionality.

That’s highly unlikely and it probably would affect far more apps than ours.

Also, that still doesn’t answer what the actual issue with the documents in DEVONthink would be.

This won’t break your local documents which could be exported in the worst case.

I think that the OP might have meant that it’s good practice to store local copies of research materials in an app like DT when you can, because we can’t rely on them still being available on tap in the future, rather than that DT itself might disappear. Perhaps the post was advocating for a LOCKSS (Lots of Copies Keep Stuff Safe) approach with a significant personal research management component? I may be wrong :-).

I speak as someone who isn’t yet super old but has already seen paper journals go the way of the dodo (paperbound journal issues that used to be hardbound into a year collection by your local academic library; then stacked onto the shelves for browsing; then moved off those shelves to offsite ‘research stores’; and then pulped because no one calls them up anymore and ‘everything is online these days’…).

Quite right! One of the major style guides in the Humanities (MHRA) was recently updated to its 4th edition and now mandates DOIs with every citation to a journal article, claiming that most if not all publications have one. Some sample checking of a major purveyor of online material in my field (JSTOR) suggested that this really isn’t true, though they offer ‘stable URLs’.

This is still a very unstable field with at least three classes of links (regular URLs, DOIs, and ‘stable URLs’), and the OP suggests none can quite be trusted.

Yes, my reasons for posting it were that I thought it would be of interest to this community and that it reinforces for me that if you have access to a digital version of a paper, then storing a copy in DEVONthink for your own future reference is definitely a wise move.

“Our entire epistemology of science and research relies on the chain of footnotes,” explains author Martin Eve, a researcher in literature, technology and publishing at Birkbeck, University of London. “If you can’t verify what someone else has said at some other point, you’re just trusting to blind faith for artefacts that you can no longer read yourself.”

For me, many years of curating a PDF library that serves my research interests also unlocks DT’s search power in ways that any one online database of millions of articles could never match, especially since so much material is still behind paywalls or is kept in walled gardens exclusive to publishers. The signal-to-noise ratio is much better in my own ‘Library’ group (9000+ items there at present).

In this context, kudos to the professional societies who have chosen to digitize their archives.

I actually referenced a paper from the 1800s in my master’s thesis, and it’s not all that unusual for me to see citations that old in other people’s work.

It’s interesting, and I don’t know enough about it to know how it should be working, but in the UK any time you publish a book or magazine (professionally), you are required to provide a copy or copies to the national collection (it’s the law, and they have “book police” who monitor and send you emails until you do). It seems that if academic writing isn’t within the purview of this law, we need a new institution that does cover it with the same zeal.

(I realise this is a UK specific institution, but I’m aware some other countries have similar setups, and maybe we should have a global one for all academic literature!)

I’d thank the lord for indexed directories in DEVONthink so I didn’t have to sift through the default storage structures!

I think the beauty of DEVONthink is that I can store my papers in any folder hierarchy that suits my needs and DEVONthink can index them. They are then instantly searchable and accessible. Brilliant! I also have a cloud backup of the papers.

Same in Germany. Interestingly, before reunification, there were two national libraries that wanted to receive copies. Depending on your location, it was easy to ignore one of them.

There are several in the UK: the British Library, the National Libraries of Wales and Scotland, TCD, and Oxford and Cambridge. The recent-ish ransomware attack on the BL has left many scholars without access to key resources, since it held a huge amount of digitised books and manuscripts and ran, for instance, the Short Title Catalogue. I for one am still waiting for their stuff to come back online so I can get on with my work. So really it’s another story confirming the OP’s note to save local resources (mind you, many BL resources can’t actually be saved in any way, and their online delivery is the only way to access e.g. manuscripts).

I was quite surprised by that, then when I thought about it, not so much. Indeed, it is a good idea to save something to DEVONthink 3 when you see it. There are quite a few other reasons that an article can become lost unless you have your own copy. I always had the instinct to keep a copy in DEVONthink 3. I do have a lot of bookmarks these days though… the rule for those is: nothing I couldn’t live without.

I have to say that now that papers take up so little space, you can save thousands just in case. I don’t actually, but I have important ones saved. On the other hand, I suspect those are the least likely to vanish?
Some tricky issues with paywalls and things I won’t pound sand on here. In general I tend towards open access, and more papers are becoming that way, especially if the subject is not directly commercially important, as it were. As MsLogica says, a URL is not a citation, though it is useful to include one.

Technically they are the physical locations, but it’s still one national collection. The deposit request (for adding your new publication to the collection) covers all six individual institutions and is under the umbrella of the one programme to record all published writing.

(I am being pedantic, because @chrillek’s point was that there were two national collections for Germany in the recent past, but that isn’t the case in the UK.)

About twenty-five years ago, one of the librarians at Cambridge University Library told me that they were receiving around one-and-a-half miles of books for legal deposit every year. Storing them was becoming quite a problem.

I didn’t know that – thanks for setting me straight on this!