:::Author webpages are neither repositories, preprint servers, nor research networks. <span style="font-variant:small-caps; whitespace:nowrap;">[[User:Headbomb|Headbomb]] {[[User talk:Headbomb|t]] · [[Special:Contributions/Headbomb|c]] · [[WP:PHYS|p]] · [[WP:WBOOKS|b]]}</span> 21:32, 28 October 2017 (UTC)
::::I am going to post at ANI to have this bot paused. Done here. [[User:Jytdog|Jytdog]] ([[User talk:Jytdog|talk]]) 22:17, 28 October 2017 (UTC)
:::::[[User:Jytdog|Jytdog]], author rights for sharing their paper (and which version) can be determined at this website, which we link to right in the tool on every page where you can add a link: [http://www.sherpa.ac.uk/romeo/index.php http://www.sherpa.ac.uk/romeo/index.php] [[User:Ocaasi (WMF)|Ocaasi (WMF)]] ([[User talk:Ocaasi (WMF)|talk]]) 23:04, 28 October 2017 (UTC)
==ANI==
Revision as of 23:04, 28 October 2017
Question
Naive question(s) from a non-wikipedia-person (JTW):
How much standardization is there, and how many edge cases are worth pursuing? I'm trying to figure out what tags to search for but it seems like there are layers of deprecated standards at this point.
A first pass is to just worry about references using the Cite Journal syntax. That's pretty standardized and easy to match. The simplest script that's worth writing is something like: find all cite journal tags, look for doi/pmid/pmc IDs, and look up an OA link to that paper if it's not present — Preceding unsigned comment added by Jamestwebber (talk • contribs) 18:27, 11 September 2015 (UTC)
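The first pass described above could be sketched roughly as follows. This is only an illustration in Python, not the bot's actual code: the function names and the sample wikitext are hypothetical, the regex ignores citations that contain nested templates, and splitting on `|` will misread parameter values that contain piped wikilinks. The actual OA lookup is left out.

```python
import re

# Matches {{cite journal |...}} templates that contain no nested templates.
CITE_JOURNAL = re.compile(r"\{\{\s*cite journal\s*\|([^{}]*)\}\}", re.IGNORECASE)


def parse_params(body):
    """Split a template body on '|' and return its named parameters."""
    params = {}
    for part in body.split("|"):
        if "=" in part:
            name, value = part.split("=", 1)
            params[name.strip().lower()] = value.strip()
    return params


def citations_needing_oa_link(wikitext):
    """Return the identifiers of citations that have an ID but no |url=."""
    results = []
    for match in CITE_JOURNAL.finditer(wikitext):
        params = parse_params(match.group(1))
        ids = {k: params[k] for k in ("doi", "pmid", "pmc") if params.get(k)}
        if ids and not params.get("url"):
            results.append(ids)
    return results


# Hypothetical sample: the first citation lacks a URL, the second has one.
sample = (
    "Text.<ref>{{cite journal |title=A |doi=10.1000/xyz123 |pmid=12345}}</ref> "
    "More.<ref>{{Cite journal |title=B |doi=10.1000/abc "
    "|url=http://example.org/b.pdf}}</ref>"
)
print(citations_needing_oa_link(sample))
```

Each returned dictionary would then be handed to whatever OA lookup service is chosen. A real implementation would probably want a proper wikitext parser rather than a regex.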
Actually, the more I look into this the more confused I get. Can we establish a set of test-cases that this bot should handle?
To get a free to read copy of articles by DOI, we could use the CORE search engine via its API. It accepts DOIs and other identifiers as search parameters. Note however that the indexing looks a bit faulty to me: for instance, this arXiv document is associated with a DOI, and CORE harvests arXiv, but searching for this DOI from the CORE interface does not return anything.
The metadata tools we have developed for the Dissemin project overcome this issue and it should not be too hard to provide a similar API to be used by this bot. Pintoch (talk) 20:17, 11 September 2015 (UTC)
Wikidata should be involved/merged with this
Hi. I was at the recent Wikipedia Science Conference. At this, Dario Taraborelli had a great suggestion that Wikidata could house ALL the mappings of useful literature: DOI <-> PMID <-> PMC <-> arXiv ID
Hi, the OA Signalling project is doing something very similar, just for openly licensed references. A short sketch of the workflow sits here, and it includes Wikidata's WikiProject Source Metadata (alluded to above) as well as a gadget to display information from the OA Button. It would be great if we could join forces on those aspects that are independent of paywalls and licensing. -- Daniel Mietchen (talk) 00:33, 24 November 2015 (UTC)
I've written a quick proof of concept here. Feedback welcome! An interesting discussion about how open access should be indicated in references is taking place here. Pintoch (talk) 18:05, 16 March 2016 (UTC)
Might "Free Version" be better in front?
There's a section where the free version link has "Free version" added at the end. I think it might be better to put "Free version" or whatever marker at the front of the link rather than at the end, especially if the links are going to look like the one in the example. Chuck Baggett (talk) 15:07, 28 May 2016 (UTC)
Hey, Chuck Baggett. Thanks for the feedback. This mockup is entirely hypothetical and would ultimately have to be refined and approved by the CS1 template editors and other reference buffs. I personally think it's problematic to put free version before basic identifiers like title and author and date. There may be many other ways to make the free version link more prominent and I'm open to modeling and demoing any/all of them. For now the immediate focus is on having the bot add a link (technically). How that link appears is important and not-yet-decided. We will raise that discussion in the next week or two. Cheers! Jake Ocaasi (WMF) (talk) 15:57, 31 May 2016 (UTC)
I think the |url= should be used as the place to put the most useful link for our readers. A free to read version is arguably more useful than a paywalled one. We should make sure we still link to the version from the publisher, but that is what |doi= is for. If the free to read link also corresponds to an identifier, then we should also add it as an identifier (so, it would appear both as |url= and |arxiv=, say). Adding a "Free version" link would generate too much clutter, I think. − Pintoch (talk) 19:18, 31 May 2016 (UTC)
I think the url should be the version the editor actually read when they cited the content, but I'm open to discussing all the options. Jake Ocaasi (WMF) (talk) 18:42, 2 June 2016 (UTC)
URL replacement
Re the "Edge cases for future development": it's always good to remove a URL to a paywalled version from the url parameter, as long as the DOI is provided (which can be used to easily reach the publisher's version). --Nemo 11:30, 15 August 2017 (UTC)
Another good example: in [1], the existing URL is broken and the CiteSeerX cache is probably an archived copy of that original URL. It would be very good to replace or remove the broken URL. --Nemo 11:33, 30 August 2017 (UTC)
CiteSeerX
I'm duly checking the CiteSeerX links before adding them, so I now got this (after about 20 downloads):
Download Limit Exceeded
You have exceeded your daily download allowance.
You can try downloading the uncached versions that they list instead (which are not hosted by them, so you should not have any rate-limit on that). They are not always listed, though. − Pintoch (talk) 12:40, 30 August 2017 (UTC)
On the bright side, now that oaDOI was added as a source the CiteSeerX links are much less common, so it will take more time to hit the limit in any given day. --Nemo 07:36, 26 October 2017 (UTC)
cds.cern.ch links
In [2], rather than linking to [3], it should link to [4].
This should be true in general. Link to the free document/PDF when possible, rather than simply to a page where the document can be found if you look hard enough. Headbomb {t · c · p · b} 12:27, 30 August 2017 (UTC)
Pinging @Nemo bis: on this as well, since you've unleashed the bot on a lot of physics articles, creating a lot of these links that need to be updated to point to the PDFs. Headbomb {t · c · p · b} 12:28, 30 August 2017 (UTC)
Personally I prefer links to the records because then the abstract is quickly accessible. I prefer the link to the PDF only when the interface makes the PDF hard to find. --Nemo 12:34, 30 August 2017 (UTC)
Repository managers also tend to prefer that, as it gives an opportunity to the reader to discover their platform. I have met multiple researchers who were explicitly told not to give direct links to the full texts but to the landing page instead (for various reasons). If a direct link to the PDF is really preferred (by a guideline somewhere on Wikipedia), then the CiteSeerX identifier should be updated to point directly to the cached PDF (and same for arXiv), as the PDF url can be obtained directly from the identifier. − Pintoch (talk) 12:44, 30 August 2017 (UTC)
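To illustrate the last point, here is a minimal Python sketch of deriving a PDF URL from an identifier. The function names are hypothetical, and the CiteSeerX URL pattern in particular is an assumption about how that site laid out its download links at the time, so it should be checked before being relied on.

```python
from urllib.parse import urlencode


def arxiv_pdf_url(arxiv_id):
    """Build the PDF URL for an arXiv identifier such as '1501.00001'."""
    return "https://arxiv.org/pdf/" + arxiv_id


def citeseerx_pdf_url(citeseerx_id):
    """Build a cached-PDF URL for a CiteSeerX identifier.

    Assumption: CiteSeerX serves cached copies from /viewdoc/download
    with doi, rep and type query parameters.
    """
    query = urlencode({"doi": citeseerx_id, "rep": "rep1", "type": "pdf"})
    return "http://citeseerx.ist.psu.edu/viewdoc/download?" + query


# Hypothetical identifiers, for illustration only.
print(arxiv_pdf_url("1501.00001"))
print(citeseerx_pdf_url("10.1.1.1.1234"))
```

Since the mapping is a pure string transformation, it could live in the citation template itself rather than in the bot.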
CERN links are hard to find. They're buried at the bottom of a page containing videos, and half a million other links. We should put readers first, not repository managers first. Go to [8], where is the relevant link? It will take you a while to find it. Headbomb {t · c · p · b} 12:56, 30 August 2017 (UTC)
For me it took maybe a couple seconds (without knowing the repository software). There is a clear "PDF" link text and icon, with good contrast, in a clearly delimited area, in a predictable position, without a need for JavaScript, localised in my language. This is not a case of hard to find PDF. Additionally, what if the user is interested in the video after all? From the PDF URL they'll almost never be able to go back to the record. Nemo 13:06, 30 August 2017 (UTC)
It took me about 2 minutes, because I thought it was the video, and that didn't make any sense. Clicking on download also didn't give me the paper I was looking for. Then I scrolled to the bottom of the box, and there was still no link, so I went back up and dug in "files" where I finally found the link. Headbomb {t · c · p · b} 13:20, 30 August 2017 (UTC)
I do realise that people react differently. For instance I tend to not click anything (I'm particularly video-blind) and to use page up/down or "end" abundantly. But still, you'll probably agree this repository is a masterpiece in usability compared to, say, Elsevier's websites. --Nemo 13:43, 30 August 2017 (UTC)
I'm not saying a link shouldn't be given, but it should be a link to the document, rather than making the reader hunt for it, otherwise they'll think it's a link added by mistake, or a link only containing superficial information about the document. Headbomb {t · c · p · b} 14:16, 30 August 2017 (UTC)
I think for me the safest way to reject this change is simply to say: I am happy to deploy this change if somebody takes the time to write the code for it... − Pintoch (talk) 17:12, 30 August 2017 (UTC)
Article size
Sometimes the tool seems to time out on some articles. Do we know what's the largest article size or number of links it can handle? For now the biggest I found in my testing is [9], I think. --Nemo 12:38, 30 August 2017 (UTC)
That is a problem indeed. No, I don't know what the maximum size would be. Note that there is some caching at reference-scale, so the request could potentially complete if you try a second time. − Pintoch (talk) 12:45, 30 August 2017 (UTC)
OAbot usage
Is there any way to see the OAbot edits a user has made? I found a link that seemed to be an error in a CiteSeerX link - namely, the link did not go to a full article but rather went to a notice that said
you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you may use content in the JSTOR archive only for your personal, non-commercial use. Please contact the publisher regarding any further use of this work. Publisher contact information may be obtained at
So, then I look up in the upper right hand corner and see a pdf link, which has the following info within the URL - www(dot)employees(dot)csbsju(dot)edu - and which, from the link, I can see is a link for a specific Economics class taught at the College of Saint Benedict/Saint John's University in Minnesota. I'm not sure this is all strictly legal-ish, to get at this article's content through a link that clearly isn't posted for public use. Shearonink (talk) 15:00, 2 September 2017 (UTC)
I'm not sure I understand your question or what case you're talking about, but if the link gets interrupted for copyright reasons that's one more reason to consider the CiteSeerX links ok: it means they handle copyright notices so we need not worry about what remains up. As for the legal implications of linking, it's generally ok to link resources which are already public on the web (see e.g. CJEU on "new public"). --Nemo 09:08, 3 September 2017 (UTC)
Adding links when doi is already free
According to § How does the bot work?, the bot should only be looking for links when the existing link isn't the full free article. That seems in keeping with general ideas of maximum compliance with WP:V and making as much source material as possible reachable to readers, while simultaneously not bloating refs with redundant external links. Things like DOI and PMID have the advantage of being stable and redirecting properly, whereas links to publishers' websites are susceptible to linkrot if the publisher changes their website. I noticed a series of edits today where following the doi link allows me to access the full article for free, but using OABot still added a direct link to it (example). Is that intended? DMacks (talk) 02:22, 24 October 2017 (UTC)
As an alternative, the CS1 citation templates have a field specifically to identify when a standard identifier does provide the full content for free (vs just free abstract and possible link to paywalled full). See Template:Cite/doc#Access level of identifiers for details. Seems like it would be preferable to note that an existing stable link is already free vs adding a less-stable additional link that goes to the same provider. DMacks (talk) 02:30, 24 October 2017 (UTC)
I don't see how adding an extra (accessible) link is a problem. It's a problem only to add paywalled links. :) --Nemo 14:23, 24 October 2017 (UTC)
Actually according to the docs (WP:OABOT#Examples), it already is supposed to know about and add |doi-access=free tags. Maybe do a regex or string comparison of the proposed link and the doi (or other identifier) and if they match to within some closeness (same hostname and maybe some later string details) assume that the publisher itself (target of doi link) hosts the free content (url link), so that presumably one can get to the free content via the doi. DMacks (talk) 07:48, 27 October 2017 (UTC)
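The hostname comparison suggested above might look something like the Python sketch below. It is only an illustration of the heuristic, not OAbot's behaviour: the function names and example URLs are hypothetical, and it assumes the DOI has already been resolved to the publisher's landing URL (no network request is made here).

```python
from urllib.parse import urlparse


def normalized_host(url):
    """Return the hostname of a URL without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host


def suggest_edit(resolved_doi_url, proposed_url):
    """Choose between marking the DOI as free and adding a separate |url=.

    If the proposed free link is hosted on the same site the DOI resolves
    to, assume the publisher itself hosts the free copy.
    """
    doi_host = normalized_host(resolved_doi_url)
    if doi_host and doi_host == normalized_host(proposed_url):
        return {"doi-access": "free"}
    return {"url": proposed_url}


# Hypothetical URLs, for illustration only.
print(suggest_edit("https://www.example-publisher.org/article/123",
                   "https://example-publisher.org/article/123/pdf"))
print(suggest_edit("https://publisher.example/a",
                   "https://arxiv.org/abs/1234.5678"))
```

Matching on hostname alone is deliberately coarse. A publisher can host both free and paywalled content on the same domain, so a stricter check on the path may be needed before trusting the result.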
Click "Start editing a random page", and then at the top you'll have an input field where you can type the name of the page you want to analyze. Analysis will take a long time, though. − Pintoch (talk) 17:16, 24 October 2017 (UTC)
Issues
Hello. I am having several issues:
The random pages button isn't working properly. It's giving me pages where other Wikipedians have already added links with the OAbot, e.g. Biotechnology, Ant and Economics. I've only just started today using this bot, so is it starting from the list from scratch for me?
Also, edits are not being made for me on Firefox/Microsoft Edge for Brain. I've checked the Wikipedia article to check if it's being used (which it isn't) but for some reason, I can't add the link. Thanks --MrLinkinPark333 (talk) 22:10, 24 October 2017 (UTC)
Rights checking
Hi,
I wanted to free some references as my birthday gift. It turned into checking the rights on the proposed link.
Alexander technique and Feldenkrais method: a critical overview
Sanjiv Jain, MD, Kristy Janssen, PA-C, Sharon DeCelle, MS, PT, CFT
So I wonder about the rights on publications available through CiteSeerX.
Using the bot, what is the interaction with the authors? In my opinion changing access to science should be done upstream, not downstream. I emailed Sanjiv. I'll form a deeper opinion on this with more such experiences.
[10] is obviously a bogus URL. Is the bot confusing url with title (trying to construct a wiki-formatted link with visible alternative text) but then inducing editors to paste that as the "url" itself or is it failing to urlencode the url to protect whitespace? DMacks (talk) 17:34, 26 October 2017 (UTC)
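If the cause turns out to be the second possibility (whitespace left unencoded), the fix could be as small as the Python sketch below. The function name and example URLs are hypothetical, and nothing here is taken from the bot's code. The `%` character is kept as safe so that URLs which are already percent-encoded are not encoded a second time.

```python
from urllib.parse import quote


def encode_url_for_citation(url):
    """Percent-encode spaces and other unsafe characters in a URL.

    Characters that structure a URL are left alone, as is '%', so that
    existing escapes survive. '|', '[' and ']' are encoded because they
    would otherwise break template parameters and external-link syntax.
    """
    return quote(url.strip(), safe=":/?#@!$&'()*+,;=%")


# Hypothetical URLs, for illustration only.
print(encode_url_for_citation("http://example.org/my paper.pdf"))
print(encode_url_for_citation("http://example.org/a%20b.pdf"))
```

This would not help with the first possibility, where a title ends up in the url field, because that is a data mix-up rather than an encoding problem.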
Whatever is going on with this bot, people are using it to add, and even re-add, ELNEVER violations. See this followed by this for example. I don't know if it is the bot or people not being careful enough using it, but either way COPYVIO additions are being made throughout WP. Jytdog (talk) 15:02, 28 October 2017 (UTC)
The paper that OAbot suggested a link for was published by Liebert and their policy is here and says authors can post preprints but says in bold: "The final published article (version of record) can never be archived in a repository, preprint server, or research network." The link there is to the final published article.