Commons:Structured data/Computer-aided tagging



From Wikimedia Commons, the free media repository

    Notice: Computer-aided tagging is a new technology. Suggested tags will sometimes be wrong or inappropriate. This behavior is expected.

    The computer-aided tagging tool is a feature in development by the Structured Data on Commons team to assist community members in identifying and labeling depicts statements for Commons files. There are tens of millions of carefully curated files on Commons, but the structured data tool is new. With this feature, existing files can have their contents easily, quickly, and – if used with care – accurately described. To contribute, editors won’t need to know how Wikidata works or speak a particular language. This new feature prompts users with suggestions for "tags", using a computer vision model, for human review. Commons users will be able to visit a Special page on Commons and see suggested depicts tags, which can be selected to be confirmed or ignored. Tags will never be automatically added without human involvement.

    Computer-aided tagging helps populate files with structured data. In turn, these files can be found using general search terms in Special:MediaSearch in a manner that was previously not possible. This helps users find media that is hard to come across using the old search, which often relies on specific information in file descriptions or category placement. If that information is lacking, it can be difficult or even impossible to find a lot of media on Commons through the standard search. For example, Peter_iredale_sunset_edited1.jpg shows up in a search for "beach" using Special:MediaSearch thanks to the "beach" depicts statement added by computer-aided tagging; it does not show up at all in a search for "beach" using regular search.

    Computer-aided tagging is a stand-alone MediaWiki extension rather than a core part of Commons itself; it ties into Commons through Special:SuggestedTags. On the back end, the tool will use Google Cloud Vision for depicts suggestions. Wikimedia already uses the Google Cloud Vision service in Wikisource OCR, and this will work similarly. The tool is opt-in for registered, auto-confirmed users. It is not on by default for any user group, and is unavailable to new and unregistered users.

    Updates on CAT/SuggestedTags usage, September 2020

    To date (updated 14 February 2022):

  • 5,809 total users have made edits via the Computer-Aided Tagging tool
    • 962 of these users did so via mobile web
  • 341,957 total files have had edits made via Computer-Aided Tagging
    • 41,563 of those files have a Computer-Aided Tagging edit made on mobile web
  • 72% of files with CAT edits had those edits done by the same user who uploaded the file
  • Approximately 10,000 files edited by CAT so far were purely manual edits
  • We’re averaging about 20 new users a week currently
  • Charts for this data are updated every Monday on the CAT usage report analytics page
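
    To put the figures above in proportion, the following is a minimal sketch that derives the mobile web and manual-edit shares from the counts in the list. The variable and function names are illustrative, and the manual-edit count is the approximate value quoted above, so that share is only a rough estimate.

```python
# Counts copied from the usage list above (as updated 14 February 2022).
TOTAL_USERS = 5_809
MOBILE_WEB_USERS = 962
TOTAL_FILES = 341_957
MOBILE_WEB_FILES = 41_563
MANUAL_EDIT_FILES = 10_000  # "approximately 10,000" in the source


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


mobile_user_share = percent(MOBILE_WEB_USERS, TOTAL_USERS)
mobile_file_share = percent(MOBILE_WEB_FILES, TOTAL_FILES)
manual_file_share = percent(MANUAL_EDIT_FILES, TOTAL_FILES)

print(f"Users editing via mobile web: {mobile_user_share}%")
print(f"Files edited via mobile web:  {mobile_file_share}%")
print(f"Files with manual edits only: {manual_file_share}% (approximate)")
```

    On these counts, roughly one user in six and one file in eight involved mobile web.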

    CAT specificity

    We’re working on possible techniques for improving the tool’s ability to accurately identify specific elements of photos, but it’s important to keep in mind that the Google Vision algorithm already does fairly well in many topic spaces.

    Upcoming tweaks to the queue for general images

    Although most usage of the Computer Aided Tagging system comes from users editing their own uploads (72%), there is a separate queue for “popular” images. Based on recent feedback from the Commons community, we’re exploring ways to prioritize this queue differently. Particularly, we’re considering a system that would focus more on files that do not have curated categories yet.

    Google Cloud Vision

    All information that passes through Google Cloud Vision will also be public. Dumps of completely anonymous data will be available, listing each Commons file, its suggested tags, and which tags were accepted. Google Cloud Vision is completely isolated from Wikimedia Commons, and the feature is separate from the core Commons experience.
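
    As an illustration of how such a dump could be used, here is a minimal sketch that computes the share of suggested tags that were accepted. The dump layout is an assumption made for this example (a CSV with `file`, `suggested_tags`, and `accepted_tags` columns, tags separated by `|`); the source does not specify the actual format, and the file names are placeholders.

```python
import csv
import io

# Hypothetical dump layout: one row per file, tags separated by "|".
SAMPLE_DUMP = """file,suggested_tags,accepted_tags
File:Example_A.jpg,beach|sky|water,beach
File:Example_B.jpg,tree|plant,
"""


def split_tags(cell: str) -> list[str]:
    """Split a "|"-separated cell into tags, ignoring empty entries."""
    return [tag for tag in cell.split("|") if tag]


def acceptance_rate(dump_text: str) -> float:
    """Return accepted tags divided by suggested tags across the whole dump."""
    suggested = 0
    accepted = 0
    for row in csv.DictReader(io.StringIO(dump_text)):
        suggested += len(split_tags(row["suggested_tags"]))
        accepted += len(split_tags(row["accepted_tags"]))
    return accepted / suggested if suggested else 0.0


rate = acceptance_rate(SAMPLE_DUMP)
print(f"Accepted {rate:.0%} of suggested tags")
```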

    Although there are open source computer vision platforms available to start from, any such package would require resources or specialized expertise to provide an industry-standard computer vision experience, which the Wikimedia Foundation cannot itself provide at this time. The team recognizes that Google Cloud Vision is not open source software. There will not be any non-free or proprietary code written by the Foundation for this project; all contributions will remain open source.[clarification needed] Google will not have access to any private, non-public, personal information, and there will be no direct communication between users and Google's service.

    Architecture and workflow

    Design of information flow in computer-assisted image tagging. The "machine vision" provider on the far right requests and sends potential tags for images; there is no personal information exchanged and the provider is isolated from the rest of the system and Commons.

    Registered, auto-confirmed users will be able to opt in through their preferences or when uploading files. After some time has passed, the user will receive a notification that their uploads are ready for tagging at Special:SuggestedTags. Users who have opted in can visit Special:SuggestedTags at any time to view files ready for tag processing. Anonymous users, new users, and users who have not opted in will not be able to access Special:SuggestedTags.
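
    The access rule described above can be sketched as a simple check. This is an illustration only, not the extension's code: the `User` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical account state; field names are illustrative."""
    registered: bool
    autoconfirmed: bool
    opted_in: bool


def can_access_suggested_tags(user: User) -> bool:
    """Access requires a registered, auto-confirmed account that has opted in."""
    return user.registered and user.autoconfirmed and user.opted_in


anonymous = User(registered=False, autoconfirmed=False, opted_in=False)
new_user = User(registered=True, autoconfirmed=False, opted_in=True)
not_opted_in = User(registered=True, autoconfirmed=True, opted_in=False)
opted_in = User(registered=True, autoconfirmed=True, opted_in=True)

for name, user in [("anonymous", anonymous), ("new user", new_user),
                   ("not opted in", not_opted_in), ("opted in", opted_in)]:
    print(name, can_access_suggested_tags(user))
```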

    The concepts that are available for tagging are ones that translate from Google Knowledge Graph IDs to Wikidata IDs. At 2.1 million triplets, the list is too long to catalog here, but is available for download as freebase-wikidata mappings.
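
    A minimal sketch of the filtering this implies: only suggestions whose Knowledge Graph ID has a Wikidata counterpart can become depicts candidates. The `Suggestion` shape, the confidence score, and every identifier below are placeholders invented for illustration; they are not real mappings and not the provider's actual response format.

```python
from typing import NamedTuple


class Suggestion(NamedTuple):
    """One label from the vision provider (hypothetical shape)."""
    kg_id: str    # Knowledge Graph ID of the concept
    label: str    # human-readable label
    score: float  # assumed confidence value in [0, 1]


# Placeholder entries standing in for the freebase-wikidata mapping list.
KG_TO_WIKIDATA = {
    "/m/example-beach": "Q-EXAMPLE-1",
    "/m/example-ship": "Q-EXAMPLE-2",
}


def to_depicts_candidates(suggestions, mapping):
    """Keep suggestions that map to a Wikidata ID, highest score first."""
    candidates = [
        (mapping[s.kg_id], s.label, s.score)
        for s in suggestions
        if s.kg_id in mapping
    ]
    return sorted(candidates, key=lambda c: c[2], reverse=True)


suggestions = [
    Suggestion("/m/example-ship", "ship", 0.81),
    Suggestion("/m/example-unmapped", "unmapped concept", 0.95),
    Suggestion("/m/example-beach", "beach", 0.92),
]
candidates = to_depicts_candidates(suggestions, KG_TO_WIKIDATA)
for wikidata_id, label, score in candidates:
    print(wikidata_id, label, score)
```

    The unmapped concept is dropped even though it has the highest score, because there is no Wikidata item to attach as a depicts statement.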

    Development stage

    All originally planned features for the tool are now deployed and available for use. The development team will continue with tweaks, and possible new features in the future.

    Deployment and usage notes

    Userbox

    You can use this userbox on your page.

    {{User Computer-aided tagging}}

    This user uses Computer-aided tagging tool for tagging images.

    This was a failed project

    As early as 13 February 2020, experienced Commons users were complaining that the bulk of tags added using this tool were, as one put it, "way too vague, irrelevant or even detrimental". After numerous such complaints over the next several years, on 16 June 2023 the Sr. Director in the WMF Product department acknowledged that "We understand that the accuracy and utility of the tags generated by this tool have been called into question." After some study, on 14 September 2023 they announced, "we will be deactivating the tool on September 20, 2023, after completing the necessary code changes."


    Retrieved from "https://commons.wikimedia.org/w/index.php?title=Commons:Structured_data/Computer-aided_tagging/ru&oldid=838741873"
