Wikipedia:Bots/Requests for approval/KadaneBot 3







  • Operator: Kadane (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

    Time filed: 16:10, Tuesday, March 19, 2019 (UTC)

    Automatic, Supervised, or Manual: automatic

    Programming language(s): Python

    Source code available: Not published yet

    Function overview: Tags redirects with {{R to disambiguation page}}, {{R from unnecessary disambiguation}}, and {{R from incomplete disambiguation}} if they meet the criteria described in the function details.

    Links to relevant discussions (where appropriate): Wikipedia:Bot_requests#Tag_with_Template:R_from_unnecessary_disambiguation

    Edit period(s): Monthly

    Estimated number of pages affected: ~56,417 first run

    Exclusion compliant (Yes/No): No

    Already has a bot flag (Yes/No): Yes

    Function details:
    Note: This BRFA only covers the functionality mentioned in Case 2. Case 1 and Case 3 have been stricken
    Case 1: If a redirect exists Foo (bar) -> Foo where bar does not equal disambiguation AND Foo is NOT a disambiguation page, then tag Foo (bar) with {{R from unnecessary disambiguation}}
    Currently 39,963 articles fit this case


    Case 2: If a redirect exists Foo (bar) -> Foo where bar does not equal disambiguation AND Foo IS a disambiguation page, then tag Foo (bar) with {{R from incomplete disambiguation}}.
    Currently 16,427 articles fit this case


    Case 3: If a redirect exists Foo (disambiguation) -> Foo AND Foo is a disambiguation page AND Foo (disambiguation) is NOT malformed, then tag Foo (disambiguation) with {{R to disambiguation page}}
    Currently 27 articles fit this case
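
The three cases above can be sketched as a single decision function. This is only an illustration of the stated rules, not the bot's unpublished source: the function name, the regular expression, and the `target_is_dab` flag (assumed to come from a separate lookup) are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical pattern: "Foo (bar)" with exactly one space before the
# parenthesised qualifier. Titles that do not fit are treated as malformed.
QUALIFIED = re.compile(r"^(?P<base>.+) \((?P<qualifier>[^()]+)\)$")


def pick_tag(redirect: str, target: str, target_is_dab: bool) -> Optional[str]:
    """Return the template name for a redirect, or None if no case applies."""
    match = QUALIFIED.match(redirect)
    # Only redirects of the form "Foo (bar)" -> "Foo" are considered.
    if match is None or match.group("base") != target:
        return None
    if match.group("qualifier") == "disambiguation":
        # Case 3: Foo (disambiguation) -> Foo, where Foo is a dab page.
        return "R to disambiguation page" if target_is_dab else None
    if target_is_dab:
        # Case 2: the qualifier still leaves the title ambiguous.
        return "R from incomplete disambiguation"
    # Case 1: the qualifier was not needed.
    return "R from unnecessary disambiguation"


print(pick_tag("Mercury (song)", "Mercury", True))
```

Only Case 2 is covered by this BRFA; the other two branches are kept here to show how the cases relate.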


    The following functionality/logic exists for all 3 cases:

    Discussion

    Comment @Kadane: The following should be tagged as {{R from incomplete disambiguation}} instead of {{R from unnecessary disambiguation}}

    Extended content

    Those can be identified by the landing page being a disambiguation page.

    This one should be skipped, or tagged with something else (investigating)

    These should be skipped as malformed DAB pages (missing space, capital D), but collecting them so they can be sent to RfD would be good.

    Headbomb {t · c · p · b} 17:11, 19 March 2019 (UTC)

    Okay, I have updated the function details of the bot to fix the cases you brought up. I will update the table of edits when I get home. Kadane (talk) 19:23, 19 March 2019 (UTC)
    @Headbomb: I have uploaded new edits to User:KadaneBot/Sandbox. It contains 100 edits for each of the cases, with the exception of {{R to disambiguation page}}, which has only 22 edits in total. I have also included all of the malformed disambiguation pages (these will not be modified by the bot, just included in the log). Kadane (talk) 05:48, 20 March 2019 (UTC)

    Better, although

    Should be tagged with {{R from incomplete disambiguation}} instead of {{R from unnecessary disambiguation}}. Headbomb {t · c · p · b} 09:31, 20 March 2019 (UTC)

    @Headbomb: There was an error in my CSV parsing of the database dump. I forgot to set the parameter quoting=csv.QUOTE_NONE, which resulted in some lines being skipped when the database query results were scanned. Because of this, some articles and disambiguation pages were being ignored. This is fixed now. I clicked through most of the cases and I can't find any errors. User:KadaneBot/Sandbox is updated. Kadane (talk) 15:17, 20 March 2019 (UTC)
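
The quoting problem described here can be reproduced with a short sketch. The sample rows are made up for illustration and are not taken from the actual dump; the tab delimiter is also an assumption. With the default quoting mode, a title that begins with a double quote makes `csv.reader` treat everything up to the next closing quote as one field, so the rows in between are swallowed.

```python
import csv
import io

# Hypothetical tab-separated rows: page id, page title. The title on the
# first row opens a double quote that is only closed on the third row.
SAMPLE = '1\t"Foo_(bar)\n2\tBaz_(song)\n3\tQux"\n'


def read_rows(text, **reader_options):
    """Parse tab-separated text into a list of rows."""
    return list(csv.reader(io.StringIO(text), delimiter="\t", **reader_options))


# Default quoting: the quote starts a multi-line field, merging three rows.
default_rows = read_rows(SAMPLE)

# QUOTE_NONE: quote characters are ordinary data, so each line is one row.
raw_rows = read_rows(SAMPLE, quoting=csv.QUOTE_NONE)

print(len(default_rows), len(raw_rows))
```

With `quoting=csv.QUOTE_NONE` the quote characters stay in the titles, which is what a scan of page titles needs.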
    Of all cases, the following aren't really disambiguation pages.

    Maybe a full list should be created so we can purge all cases that shouldn't be tagged. Everything else looks fine though. Headbomb {t · c · p · b} 18:03, 20 March 2019 (UTC)

    To save time, that full list to review could exclude things that end in \s\(.* (album|song|single|EP|soundtrack|network|channel|episode|series|film|journal|magazine|website|company|publisher|newspaper|company|station|decade|numeral|number|game|novel|book|gene)\) since those are safe. Headbomb {t · c · p · b} 21:02, 20 March 2019 (UTC)
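
A minimal sketch of that filter follows, using the pattern exactly as written plus a trailing `$` anchor, which is an assumption based on the phrase "end in". The function name and sample titles are hypothetical. Note that, as written, the pattern requires a space before the keyword, so a bare qualifier such as "(album)" is not excluded.

```python
import re

# Keywords from the suggested pattern (the repeated "company" is listed once).
SAFE_QUALIFIERS = (
    "album", "song", "single", "EP", "soundtrack", "network", "channel",
    "episode", "series", "film", "journal", "magazine", "website", "company",
    "publisher", "newspaper", "station", "decade", "numeral", "number",
    "game", "novel", "book", "gene",
)

# The "$" anchor is an assumption; the rest mirrors the pattern in the comment.
SAFE_ENDING = re.compile(r"\s\(.* (" + "|".join(SAFE_QUALIFIERS) + r")\)$")


def needs_review(title: str) -> bool:
    """Return True if the title does not end in one of the safe qualifiers."""
    return SAFE_ENDING.search(title) is None


titles = ["Thriller (Michael Jackson album)", "Mercury (planet)", "Thriller (album)"]
print([t for t in titles if needs_review(t)])
```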
    Alright, all edits have been saved, with the articles that end in what you listed above removed.

    Kadane (talk) 21:52, 20 March 2019 (UTC)

    Case 3 are all fine, I'll review Case 1 and 2. Headbomb {t · c · p · b} 22:09, 20 March 2019 (UTC)
    Actually, Always(song)) and a few others with )) are malformed. Headbomb {t · c · p · b} 22:12, 20 March 2019 (UTC)

    So are

    Extended content

    Headbomb {t · c · p · b} 22:19, 20 March 2019 (UTC)

    Ah, I was under the impression that we only checked for malformed disambiguations in case 3 (when the name ends with (disambiguation)). I have updated the logic to check for malformed disambiguations in all cases. Kadane (talk) 22:37, 20 March 2019 (UTC)
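
A check along these lines might look like the sketch below. It covers only the malformations named in this discussion (missing space before the parenthesis, a doubled closing parenthesis, and a capital D); the function name and the exact rules are assumptions, not the bot's actual logic.

```python
import re

# A qualifier glued to the base title, e.g. "Always(song)" or "Always(song))".
MISSING_SPACE = re.compile(r"\S\([^()]*\)+$")


def is_malformed(title: str) -> bool:
    """Return True if a qualified title shows one of the known malformations."""
    if MISSING_SPACE.search(title):
        return True  # no space before the opening parenthesis
    if title.endswith("))"):
        return True  # doubled closing parenthesis
    if title.endswith("(Disambiguation)"):
        return True  # capital D in the qualifier
    return False


for title in ["Always(song))", "Always (song)", "Foo (Disambiguation)"]:
    print(title, is_malformed(title))
```

Titles flagged this way would be logged for human review rather than edited, as described earlier in the thread.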

    There are actually a few more, which I've sent to RFD.

    Extended content

    Headbomb {t · c · p · b} 22:49, 20 March 2019 (UTC)

    @Kadane: Actually, could you break User:KadaneBot/Task3/Case 1 into sections of 100 KB tops? Those pages are pretty slow to load/edit (I have scripts that classify types of links, which slow down these pages considerably). Headbomb {t · c · p · b} 23:06, 20 March 2019 (UTC)

     Done @Headbomb: I am also catching misspellings of "disambiguation", as well as other words appearing next to "disambiguation" inside the parentheses. Any other misspellings should probably be excluded manually unless there is a pattern. Kadane (talk) 23:15, 20 March 2019 (UTC)

    Could you also break down redirects into 'species', e.g. all those ending with \s\(*album\) into a subpage (or section), all those ending with \s(*song\) into another, and so on (and everything else considered "Other")? At least for endings in

    All case insensitive. Headbomb {t · c · p · b} 23:18, 20 March 2019 (UTC)

    @Kadane: and could you also put the target page in those lists? Headbomb {t · c · p · b} 23:21, 20 March 2019 (UTC)
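
The grouping requested here could be sketched as follows. The species list is a hypothetical subset (only "album" and "song" are named in the request), and the function names and sample pairs are made up for illustration. Matching is case insensitive, and each entry keeps its target page.

```python
import re
from collections import defaultdict

# Hypothetical subset of the qualifier endings to group by.
SPECIES = ("album", "song")

# Captures the text inside the trailing parentheses of "Foo (bar)".
TRAILING_QUALIFIER = re.compile(r"\s\(([^()]*)\)$")


def species_of(title: str) -> str:
    """Return the species a redirect title belongs to, or "other"."""
    match = TRAILING_QUALIFIER.search(title)
    if match is None:
        return "other"
    qualifier = match.group(1).lower()  # case-insensitive comparison
    for species in SPECIES:
        if qualifier.endswith(species):
            return species
    return "other"


def group_by_species(pairs):
    """Group (redirect, target) pairs by the species of the redirect title."""
    groups = defaultdict(list)
    for redirect, target in pairs:
        groups[species_of(redirect)].append((redirect, target))
    return dict(groups)


pairs = [("Blue (Joni Mitchell album)", "Blue"), ("Hello (Song)", "Hello"),
         ("Mercury (planet)", "Mercury")]
print(group_by_species(pairs))
```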
    I am on my way to class but I can do that in a couple of hours. Kadane (talk) 23:23, 20 March 2019 (UTC)
    No rush. Enjoy class. Headbomb {t · c · p · b} 23:24, 20 March 2019 (UTC)
    @Kadane: any update? Headbomb {t · c · p · b} 20:41, 22 March 2019 (UTC)
    Headbomb, I got sick and fell behind. This is on my to-do list today. Kadane (talk) 21:31, 22 March 2019 (UTC)

    Okay, all edits have been sorted by 'species' and a list of all pages can be found here. @Headbomb: Kadane (talk) 00:09, 23 March 2019 (UTC)

    Approved for trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. Let's start with everything in User:KadaneBot/Task3/Edits/other/Case_3. This is something that could safely be automated. Make sure to run on the most recent version of the pages, since things may have been updated. Headbomb {t · c · p · b} 00:11, 23 March 2019 (UTC)

    Headbomb - It turns out that Case 3 is already taken care of by RussBot, which ran through and tagged every article in case 3 with {{R to disambiguation}}. I could run another database query to see if there are any cases that RussBot has missed, but a task for case 3 seems redundant. What do you think?
    Also, I made 1 trial edit[1], which resulted in an error because of a misplaced quotation mark in my code. Going forward, it will check (correctly) whether the category has been added since the last database scan. Kadane (talk) 01:20, 23 March 2019 (UTC)
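
A pre-edit check of this kind could be sketched as below: before saving, the current wikitext is searched for the template so that pages tagged since the last scan are skipped. The alias list is a placeholder (the second entry is made up), and the function names are hypothetical; the real alias list would have to come from the template's own redirects.

```python
import re

# Placeholder names: the first is the template from this BRFA, the second
# is a made-up alias standing in for whatever redirects to it.
TEMPLATE_NAMES = ("R from incomplete disambiguation", "R from incomplete dab")


def template_pattern(name: str) -> "re.Pattern[str]":
    """Build a regex matching {{name}} or {{name|...}} in wikitext.

    The first letter is case insensitive, and spaces and underscores are
    treated as interchangeable, as in MediaWiki template names.
    """
    words = name.split()
    first = words[0]
    head = "[" + re.escape(first[0].upper()) + re.escape(first[0].lower()) + "]"
    head += re.escape(first[1:])
    body = "[ _]+".join([head] + [re.escape(word) for word in words[1:]])
    return re.compile(r"\{\{\s*" + body + r"\s*(?:\||\}\})")


PATTERNS = [template_pattern(name) for name in TEMPLATE_NAMES]


def is_tagged(wikitext: str) -> bool:
    """Return True if the page already carries the template or an alias."""
    return any(pattern.search(wikitext) for pattern in PATTERNS)


print(is_tagged("#REDIRECT [[Foo]]\n\n{{R from incomplete disambiguation}}"))
```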
    If Case 3 is taken care of by RussBot, then let's leave it to RussBot. We can revisit this if RussBot goes dead. Let's trial case 2 on everything in User:KadaneBot/Task3/Edits/newspaper/Case 2 then. Headbomb {t · c · p · b} 01:23, 23 March 2019 (UTC)
    Okay @Headbomb:. I found another error in my code for case 2 that resulted in articles that were already tagged being reported in the edit cases. I have fixed that bug, which has resulted in a large reduction in edits for case 2. This error only affected the database scan and was caught during editing, when the algorithm double-checks whether it should edit.

    I have completed the trial edits [2] [3] [4]. The rest were false positives. I am hesitant to mark the trial as done with only 3 edits.

    May I suggest trialing either User:KadaneBot/Task3/Edits/cricketer/Case 2 (135 edits), User:KadaneBot/Task3/Edits/footballer/Case 2 (60 edits), or User:KadaneBot/Task3/Edits/politician/Case 2 (40 edits)? Kadane (talk) 01:47, 23 March 2019 (UTC)

    I picked that category on purpose to see how it would handle those cases and not blow everything up. Side note: [5]/[6]/[7] this is a much, much better format. And while you don't have to do this, when making edits, you might as well add [8] if you find a #Whatever in the redirect. Headbomb {t · c · p · b} 01:51, 23 March 2019 (UTC)
    For a follow-up trial, you can do 25 edits in User:KadaneBot/Task3/Edits/other/Case_2/1. Headbomb {t · c · p · b} 01:59, 23 March 2019 (UTC)
    You can do the rest of User:KadaneBot/Task3/Edits/other/Case_2/1 to see if all the kinks are worked out. Headbomb {t · c · p · b} 03:14, 23 March 2019 (UTC)

    Small whitespace issues: [17], [18]. Headbomb {t · c · p · b} 04:55, 23 March 2019 (UTC)

    Dupe disambiguation category: [19], [20]. Also [21]. Headbomb {t · c · p · b} 05:00, 23 March 2019 (UTC)
    Weird R catshell thing. [22] Headbomb {t · c · p · b} 05:02, 23 March 2019 (UTC)
    Missed an R catshell opportunity [23]. Headbomb {t · c · p · b} 05:07, 23 March 2019 (UTC)
    [24] Those with 'alternative' dabs should likely be skipped, or compiled in a separate list for human review. Headbomb {t · c · p · b} 05:11, 23 March 2019 (UTC)
    [25] should remove the dupe category for incomplete dabs. Headbomb {t · c · p · b} 05:17, 23 March 2019 (UTC)
    One more: [26] (see all aliases) Headbomb {t · c · p · b} 05:23, 23 March 2019 (UTC)

    For the whitespace issue, I think you can have something similar to \}\}\n+\{\{ -> }}\n{{ and \n\n+ -> \n\n. Headbomb {t · c · p · b} 05:29, 23 March 2019 (UTC)
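
The two substitutions suggested for the whitespace issue can be sketched with `re.sub`. The function name and the sample wikitext are hypothetical; the patterns are read as "collapse blank lines between adjacent templates to a single newline" and "collapse runs of blank lines to one blank line".

```python
import re


def tidy_whitespace(wikitext: str) -> str:
    """Normalise blank lines in a redirect page's wikitext."""
    # Put adjacent templates on consecutive lines: "}}\n\n\n{{" -> "}}\n{{".
    wikitext = re.sub(r"\}\}\n+\{\{", "}}\n{{", wikitext)
    # Collapse any run of two or more newlines to exactly two.
    wikitext = re.sub(r"\n\n+", "\n\n", wikitext)
    return wikitext


before = (
    "#REDIRECT [[Foo]]\n\n\n\n"
    "{{R from incomplete disambiguation}}\n\n\n"
    "{{R with possibilities}}\n"
)
print(tidy_whitespace(before))
```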

    @Kadane: if you're ready to continue the trial, you can tackle User:KadaneBot/Task3/Edits/other/Case_2/3. Headbomb {t · c · p · b} 23:43, 27 March 2019 (UTC)
    Okay, everything is ready. I have several deadlines in the coming days and will run the trial when real life permits. It should be no later than Saturday the 6th, and I am hoping that it's much earlier than that. Kadane (talk) 01:16, 28 March 2019 (UTC)
    @Kadane: Looks all good to me. Could you update the function overview section to reflect that the BRFA is for 'case 2' only? I'll approve after. Headbomb {t · c · p · b} 16:54, 15 April 2019 (UTC)
    @Headbomb: done. Kadane (talk) 17:02, 15 April 2019 (UTC)
     Approved. Headbomb {t · c · p · b} 19:27, 15 April 2019 (UTC)
    The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.

     



    This page was last edited on 15 April 2019, at 19:28 (UTC).
