::<code><nowiki>(?<!/)(?<!\\?url=)https?://eci[.]gov[.]in/[^\\s\\]|}{<]*[^\\s\\]|}{<]*</nowiki></code>
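As an illustration of what the pattern above guards against, here is a minimal Python sketch. It assumes the doubled backslashes in the quoted pattern are string escaping, so it uses the single-escaped form, and it adds two capture groups for the rewrite. The function name `rewrite_eci_links` is hypothetical and this is not the bot's actual code.

```python
import re

# Assumed single-escaped reading of the quoted pattern, with two capture
# groups added so the scheme and path can be reused in the replacement.
# (?<!/)      skips URLs embedded in another URL's path, e.g. archive links
# (?<!\?url=) skips URLs passed as a ?url= query parameter
ECI_URL = re.compile(
    r"(?<!/)(?<!\?url=)(https?://)eci[.]gov[.]in(/[^\s\]|}{<]*)"
)


def rewrite_eci_links(wikitext: str) -> str:
    """Point live eci.gov.in links at old.eci.gov.in, leaving embedded ones alone."""
    return ECI_URL.sub(r"\1old.eci.gov.in\2", wikitext)
```

A square link such as `[https://eci.gov.in/files/file/1-a Results]` is rewritten, while the same URL inside a `web.archive.org/web/.../https://eci.gov.in/...` snapshot link is left untouched because it is preceded by a slash.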
:Also verify the new URL is working before switching: do a header check, don't assume. Websites always have error rates, some higher than others. Other issues might arise; most problems will show up during the first 100 or so edits. Common trouble points are {{para|url-status}}, {{tlx|webarchive}} and {{tlx|dead link}}, as well as links that are square and bare. It might be too difficult to get all these exactly right; if you can change the main {{para|url}} and square URLs and verify the new URL works, that will go a long way! -- [[User:GreenC|<span style="color: #006A4E;">'''Green'''</span>]][[User talk:GreenC|<span style="color: #093;">'''C'''</span>]] 15:51, 8 June 2024 (UTC)
::I would definitely be cautious to avoid any potential mistakes. – [[User:DreamRimmer|<b style="color:black; font-family: Tahoma">DreamRimmer</b>]] ('''[[User talk:DreamRimmer|talk]]''') 16:57, 14 June 2024 (UTC)
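The header check suggested above could be sketched roughly as follows, using only the Python standard library. The function names, the `opener` parameter (there so the check can be exercised without network access) and the User-Agent string are all hypothetical; a real bot would also want retries and rate limiting.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=10.0, opener=urllib.request.urlopen):
    """Send a HEAD request and return the HTTP status code, or None if unreachable."""
    request = urllib.request.Request(
        url,
        method="HEAD",
        # Hypothetical User-Agent; a real bot should identify itself properly.
        headers={"User-Agent": "ExampleLinkFixBot/0.1"},
    )
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; report the code itself.
        return err.code
    except OSError:
        # DNS failures, refused connections and timeouts end up here.
        return None


def safe_to_switch(new_url, **kwargs):
    """Only treat the new URL as verified when the server answers 200."""
    return head_status(new_url, **kwargs) == 200
```

Under this sketch, a page's link would be changed only when `safe_to_switch` returns `True` for the rewritten URL; anything else would be skipped and logged for manual review.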
Operator: DreamRimmer
Time filed: 14:01, Monday, May 27, 2024 (UTC)
Automatic, Supervised, or Manual: automatic
Programming language(s): Python
Source code available:
Function overview: Fix the URLs for the ECI election database.
Links to relevant discussions (where appropriate):
Edit period(s): Every six months
Estimated number of pages affected: 5050
Exclusion compliant (Yes/No): No
Already has a bot flag (Yes/No): No
Function details: The Election Commission of India has moved all of its data (except for very recent elections) to a subdomain. As a result, URLs on more than 5000 pages are now invalid and return a 404 error. This bot will replace URLs like https://eci.gov.in/files/file/11699-maharashtra-legislative-assembly-election-2019 with the new URL https://old.eci.gov.in/files/file/11699-maharashtra-legislative-assembly-election-2019. In short, it replaces the prefix https://eci.gov.in/ with https://old.eci.gov.in/.
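For a single URL, the host swap described in the function details might look like the sketch below. The helper name `to_old_subdomain` and the two constants are hypothetical, and the sketch does not attempt to handle wikitext, templates or archive links.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "eci.gov.in"      # host that now returns 404 for older files
NEW_HOST = "old.eci.gov.in"  # subdomain the data was moved to


def to_old_subdomain(url: str) -> str:
    """Swap the host eci.gov.in for old.eci.gov.in; return other URLs unchanged."""
    parts = urlsplit(url)
    if parts.netloc.lower() != OLD_HOST:
        return url
    # SplitResult is a named tuple, so _replace gives a copy with the new host.
    return urlunsplit(parts._replace(netloc=NEW_HOST))
```

Comparing the parsed host rather than doing a plain string replacement means that URLs already on `old.eci.gov.in`, or on unrelated hosts that merely contain the string `eci.gov.in`, pass through unchanged.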
Why every six months? Primefac (talk) 18:28, 27 May 2024 (UTC)
https://eci.gov.in/ since it's a "recent election". At what point will that URL get archived to the https://old.eci.gov.in/ prefix? If it is archived after the subsequent election, why not just update the URL with the new election information along with the data it represents? Primefac (talk) 15:00, 6 June 2024 (UTC)