According to Special:Contributions/WebCiteBOT, the bot has not been working since 2009-11-26. Why isn't this bot active? I think it is a very much needed bot. Toshio Yamaguchi (talk) 07:44, 2 February 2011 (UTC)
-
See Wikipedia:WikiProject External links/Webcitebot2. -- Ϫ 09:44, 5 February 2011 (UTC)
-
-
User:Hydroxonium already invited me to participate in that task force and I have already expressed my thoughts on the project's importance there. Nevertheless, thank you for pointing me to that page. It's much appreciated. Toshio Yamaguchi (talk) 13:34, 5 February 2011 (UTC)
I have returned to Wikipedia after 2 years off. If someone could kindly update me as to whether this bot is still needed or has been replaced, I would appreciate it. --ThaddeusB (talk) 04:48, 14 February 2012 (UTC)
-
Should check into that SoC archiving extension; AFAIK it is not used anywhere (much less En) and so the bot is still needed. --Gwern (contribs) 04:49 14 February 2012 (GMT)
-
-
Thanks for the quick reply. It looks like mw:Extension:ArchiveLinks is not yet operational, and possibly not being worked on anymore. I found some discussion on enwiki from January of this year indicating that no progress had been made on either a WebCiteBOT2 or the ArchiveLinks extension. As such, I will try to get my bot relaunched within the next week. --ThaddeusB (talk) 05:44, 14 February 2012 (UTC)
-
-
-
There was also this BRFA, which expired, however. Toshio Yamaguchi (talk) 08:03, 14 February 2012 (UTC)
I've restarted the additions monitor (still works) and emptied out the pending database (no point trying to archive links that were added a year ago). It will be 48 hours before I can test the editing part of the bot, since there is a 48-hour delay between adding and archiving as per the original BRFA. I also updated the stats page. --ThaddeusB (talk) 03:51, 15 February 2012 (UTC)
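For readers following along, the 48-hour delay described above could be sketched roughly as follows. This is only an illustration, not the bot's actual code: the `pending` list of `(url, added_at)` pairs and the function name `links_due` are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Delay between a link being added and the archive attempt (per the BRFA).
ARCHIVE_DELAY = timedelta(hours=48)


def links_due(pending, now):
    """Return URLs whose addition time is at least ARCHIVE_DELAY before `now`.

    `pending` is a hypothetical list of (url, added_at) tuples standing in
    for the bot's pending database.
    """
    return [url for url, added_at in pending if now - added_at >= ARCHIVE_DELAY]


if __name__ == "__main__":
    now = datetime(2012, 2, 17, 4, 0, tzinfo=timezone.utc)
    pending = [
        ("http://example.com/old", datetime(2012, 2, 15, 3, 0, tzinfo=timezone.utc)),
        ("http://example.com/new", datetime(2012, 2, 16, 12, 0, tzinfo=timezone.utc)),
    ]
    print(links_due(pending, now))
```

Waiting before archiving presumably gives other editors time to revert spam or vandalism, so that short-lived links are never submitted.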
-
As promised, I have fixed up the code to deal with the (minor) changes in the Wikipedia API/WebCite that have taken place over the last 2 years. It made its first test edits today (limited to 10 links), which were successful (log). A couple of related links failed because of a problem on WebCite's end, which I will pass on to them. A more extensive test will take place tomorrow. --ThaddeusB (talk) 04:59, 17 February 2012 (UTC)
-
A larger test (30 links) was done this afternoon. It revealed a few minor bugs that I've now fixed. Next test will start shortly. --ThaddeusB (talk) 01:12, 18 February 2012 (UTC)
-
Test three (60 links) completed. One minor bug was found and corrected. Behavior changed so that pages with "season" in the title do not have links like [1] in the text changed to refs. Normally these links are poorly formatted refs, but in sports articles they are sometimes links to game recaps found in results tables. Adding an exclusion for "season" articles should make the bot avoid this (slightly) undesirable change. --ThaddeusB (talk) 23:52, 18 February 2012 (UTC)
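A minimal sketch of the "season" exclusion described above, assuming that a link "like [1]" means an unlabeled external link written as `[http://...]` in wikitext. The function name, the regex, and the plain `<ref>` output are illustrative assumptions; the bot's real conversion logic is not shown in this thread.

```python
import re

# Unlabeled external link: "[http://example.com/page]" renders as "[1]".
# Labeled links such as "[http://example.com/page Title]" do not match.
BARE_LINK = re.compile(r"\[(https?://[^\s\]]+)\]")


def convert_bare_links(title, wikitext):
    """Wrap unlabeled external links in <ref> tags, except on "season" pages."""
    if "season" in title.lower():
        # Sports season articles often use bare links for game recaps in
        # results tables, so leave them untouched.
        return wikitext
    return BARE_LINK.sub(r"<ref>\1</ref>", wikitext)


if __name__ == "__main__":
    print(convert_bare_links("Some article", "A claim.[http://example.com/a]"))
    print(convert_bare_links("2011 Example season", "Won 3-1 [http://example.com/recap]"))
```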
-
Two more tests (70 total links) completed. (A smaller one to make sure changes I made didn't break anything and then a larger one). No bugs found. --ThaddeusB (talk) 00:32, 20 February 2012 (UTC)
-
The biggest test to date (100 links) was performed. A small issue was found - the IRC link reporting bot decodes URL-encoded links for some reason, which means my bot reports them as removed when it checks the wikitext. A workaround would be difficult, so for now I'll just let the bot skip those pages. (Looks like ~0.2-0.5% of all links are affected by the bug.) No other problems. --ThaddeusB (talk) 03:39, 21 February 2012 (UTC)
I am curious as to how this bot will react to the presence of the {{Query web archive}} template in a citation. Can a test be performed soon? – Allen4names 09:34, 22 February 2012 (UTC)
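To illustrate the mismatch described above: if the feed reports a decoded URL while the wikitext still contains the percent-encoded form, a plain substring check fails. The sketch below compares decoded forms on both sides. It is not the bot's code; `link_still_present` is a hypothetical name, and this approach only detects presence. It does not recover the original encoded URL, which may be part of why a full workaround is hard.

```python
from urllib.parse import unquote


def link_still_present(reported_url, wikitext):
    """Return True if the reported URL appears in the wikitext.

    Falls back to comparing percent-decoded forms, so a decoded report
    ("a b") still matches an encoded link ("a%20b") in the page source.
    """
    if reported_url in wikitext:
        return True
    return unquote(reported_url) in unquote(wikitext)


if __name__ == "__main__":
    wikitext = "See [http://example.com/a%20b page] for details."
    reported = "http://example.com/a b"  # decoded form, as the feed reports it
    print(reported in wikitext)                    # naive check
    print(link_still_present(reported, wikitext))  # decoded comparison
```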
-
I was unaware of the existence of this template until just now, so thanks for bringing it to my attention... Whenever possible, the bot creates an archive specific to the date (or technically slightly after) when the reference was added to the article, whereas the template just allows people to search to see if one was made. As such, I see two reasonable alternatives: 1) do nothing if the template is present. 2) behave as normal & remove the template if a successful archive is made.
-
What are your thoughts on the best way to proceed? --ThaddeusB (talk) 18:45, 22 February 2012 (UTC)
-
I would prefer option 1. The template links to the Wayback Machine and Wikiwix archive services, and WebCite has been down a few times. If it is unnecessary to make a change, why do so? – Allen4names 20:47, 22 February 2012 (UTC)
-
Well, the two aren't quite the same thing. The {{Query web archive}} is saying "if this link is dead, try these" whereas the archive link inside the {{cite xxx}} is (in theory) saying "this is what the page looked like when used as reference for WP:V purposes." In any case, it is a very rare situation. I have made the change to simply skip these cases, as you suggested, and tested it locally on sample text. --ThaddeusB (talk) 22:24, 23 February 2012 (UTC)
-
Okay with me. Thank you for your time. – Allen4names 04:43, 24 February 2012 (UTC)
Latest trial (100 links) complete without significant issue. I made a few minor improvements to the regex logic, including the suggestion by Allen4names above, and will start the next test shortly. --ThaddeusB (talk) 22:24, 23 February 2012 (UTC)
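The skip rule agreed on above (leave a citation alone when {{Query web archive}} is present) might look something like the following. This is a sketch under assumptions: the regex and the name `should_skip_citation` are made up for illustration, and the bot's actual regex logic is not given in this discussion.

```python
import re

# Matches the opening of {{Query web archive ...}}, tolerating case
# differences, leading whitespace, and underscores in place of spaces.
QUERY_WEB_ARCHIVE = re.compile(r"\{\{\s*query[ _]web[ _]archive\b", re.IGNORECASE)


def should_skip_citation(citation_wikitext):
    """Return True if the citation already carries {{Query web archive}}."""
    return QUERY_WEB_ARCHIVE.search(citation_wikitext) is not None


if __name__ == "__main__":
    with_template = (
        "<ref>{{cite web|url=http://example.com|title=Example}} "
        "{{Query web archive|url=http://example.com}}</ref>"
    )
    without_template = "<ref>{{cite web|url=http://example.com|title=Example}}</ref>"
    print(should_skip_citation(with_template))
    print(should_skip_citation(without_template))
```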
-
Another 130 links complete. One minor issue that caused the bot to skip a few things it didn't need to was fixed. --ThaddeusB (talk) 02:11, 25 February 2012 (UTC)
-
Another 100 links complete. Only one very minor change to the bot was made. --ThaddeusB (talk) 18:27, 26 February 2012 (UTC)
-
200 more processed without any issues. Getting close to "going live" (unsupervised). --ThaddeusB (talk) 03:30, 27 February 2012 (UTC)
A new feature - support for varying date formats - has been added. The bot will now use "January 01, 1900" if {{use mdy dates}} is present and "01 January, 1900" if {{use dmy dates}} is present. If neither is present (>95% of all articles), the bot will continue to use 1900-01-01. These templates didn't even exist at the time the bot was originally programmed. --ThaddeusB (talk) 05:41, 10 March 2012 (UTC)
So... is it operational? How can I direct the bot to a specific article that needs a lot of web archiving? Thank you, this is a great project! (talk) user:Al83tito 22:55, 4 April 2013 (UTC)
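As a rough sketch of the date-format selection described above (not the bot's actual implementation): the function below picks one of the three formats quoted in the comment, based on which template appears in the article text. The name `format_archive_date` and the simple prefix matching are assumptions for the example; month names are hard-coded to avoid depending on the system locale.

```python
from datetime import date

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def format_archive_date(wikitext, d):
    """Format `d` to match the article's declared date style, if any."""
    lowered = wikitext.lower()
    month = MONTHS[d.month - 1]
    # Prefix match so parameters like "|date=March 2012" are tolerated.
    if "{{use mdy dates" in lowered:
        return f"{month} {d.day:02d}, {d.year}"
    if "{{use dmy dates" in lowered:
        return f"{d.day:02d} {month}, {d.year}"
    # Neither template present: fall back to ISO 8601 (YYYY-MM-DD).
    return d.isoformat()


if __name__ == "__main__":
    d = date(1900, 1, 1)
    print(format_archive_date("{{Use mdy dates|date=March 2012}} ...", d))
    print(format_archive_date("{{use dmy dates}} ...", d))
    print(format_archive_date("No date template here.", d))
```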
-
No contributions since 12 March 2012; not sure why not. LeadSongDog come howl! 04:15, 5 April 2013 (UTC)
-
-
I'm looking into restarting the bot this weekend. Barring anything unexpected happening, it should be editing again within the next few days. --ThaddeusB (talk) 05:13, 16 June 2013 (UTC)
-
Test runs have begun. No problems have been detected so far. --ThaddeusB (talk) 05:41, 20 June 2013 (UTC)
-
-
-
-
Great news Thaddeus! Question, though: given the instability that WebCite seems to be going through, have you considered trying to submit to both them *and* Archive.is going forward? Might be good in case WebCite gets shut down. — Huntster (t @ c) 10:08, 20 June 2013 (UTC)
-
WebCite claims the old archives will stay online even if they stop accepting new ones... This is the first time I've heard of archive.is. Do you know if anyone has contacted them about high volume archiving from Wikipedia? I wouldn't want to just start sending them a bunch of traffic w/o them approving of it first. --ThaddeusB-public (talk) 16:15, 20 June 2013 (UTC)
-
I don't know that anything has been said to Archive.is regarding en.wiki, but I do know it is in wide use (several thousands of pages) on each of the French, German and Italian wikis. Given that about half the time I go to manually archive something there, it already exists, I have to suspect there is some automation happening.
-
If you weren't aware, there was a discussion on Meta regarding the WMF allocating funds to support WebCite, but I don't believe it got anywhere. However, Archive.is was mentioned there (http://meta.wikimedia.org/wiki/WebCite#archive.is), as was another archiving service in heavy use with the French, wikiwix.com. Both sections are next to each other. Not much to read, but maybe of interest nonetheless. I've only just started using Archive.is and have never used Wikiwix, so I cannot say which seems to be better. — Huntster (t @ c) 21:40, 20 June 2013 (UTC)