Zero-shot learning






Zero-shot learning (ZSL) is a problem setup in deep learning where, at test time, a learner observes samples from classes which were not observed during training, and needs to predict the class that they belong to. The name is a play on words based on the earlier concept of one-shot learning, in which classification can be learned from only one, or a few, examples.

Zero-shot methods generally work by associating observed and non-observed classes through some form of auxiliary information, which encodes observable distinguishing properties of objects.[1] For example, given a set of images of animals to be classified, along with auxiliary textual descriptions of what animals look like, an artificial intelligence model which has been trained to recognize horses, but has never been shown a zebra, can still recognize a zebra when it also knows that zebras look like striped horses. This problem is widely studied in computer vision, natural language processing, and machine perception.[2]

Background and history


The first paper on zero-shot learning in natural language processing appeared in 2008 at AAAI'08, but the learning paradigm was referred to there as dataless classification.[3] The first paper on zero-shot learning in computer vision appeared at the same conference, under the name zero-data learning.[4] The term zero-shot learning itself first appeared in the literature in a 2009 paper by Palatucci, Hinton, Pomerleau, and Mitchell at NIPS'09.[5] The terminology was repeated in a later computer vision paper,[6] and zero-shot learning caught on as a take-off on one-shot learning, which had been introduced in computer vision years earlier.[7]

In computer vision, zero-shot learning models learn parameters for the seen classes together with their class representations, and rely on the representational similarity among class labels so that, during inference, instances can be classified into new classes.
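
As an illustration of this compatibility idea, the sketch below (a minimal toy example, not the specific method of any cited paper) learns a linear map from image features to a class-attribute space using seen classes only, then labels a test sample with the nearest class-attribute vector, which may belong to an unseen class. The feature dimensionality, attribute definitions, and data are placeholders.

```python
import numpy as np

# Each class, seen or unseen, is described by an attribute vector, e.g.
# [has_stripes, has_four_legs, lives_in_water] -- placeholder attributes.
class_attributes = {
    "horse": np.array([0.0, 1.0, 0.0]),
    "whale": np.array([0.0, 0.0, 1.0]),
    "zebra": np.array([1.0, 1.0, 0.0]),   # unseen: no training images exist
}
seen_classes = ["horse", "whale"]
unseen_classes = ["zebra"]

rng = np.random.default_rng(0)
d = 5                                      # toy image-feature dimensionality
X = rng.normal(size=(100, d))              # placeholder "image" features
y = rng.choice(seen_classes, size=100)     # labels drawn from seen classes only
A = np.stack([class_attributes[c] for c in y])

# Ridge regression for a linear map W from feature space to attribute space.
lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ A)

def predict(x, candidate_classes):
    """Classify x by the nearest class-attribute vector in the mapped space."""
    a_hat = x @ W
    return min(candidate_classes,
               key=lambda c: np.linalg.norm(a_hat - class_attributes[c]))

x_test = rng.normal(size=d)
print(predict(x_test, seen_classes + unseen_classes))
```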

In natural language processing, the key technical direction builds on the ability to "understand the labels", that is, to represent the labels in the same semantic space as the documents to be classified. This supports the classification of a single example without observing any annotated data, the purest form of zero-shot classification. The original paper[3] made use of the Explicit Semantic Analysis (ESA) representation, but later papers used other representations, including dense ones. The approach has also been extended to multilingual domains,[8][9] fine-grained entity typing,[10] and other problems. Moreover, beyond relying solely on representations, the computational approach has been extended to depend on transfer from other tasks, such as textual entailment[11] and question answering.[12]
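
For instance, the entailment-based transfer can be exercised with off-the-shelf tooling; the sketch below assumes the Hugging Face transformers library and the publicly available NLI model facebook/bart-large-mnli (both are choices of this example rather than of the cited work). Each candidate label is turned into a hypothesis such as "This text is about zoology." and scored by the NLI model against the input text.

```python
# Minimal sketch of entailment-based zero-shot text classification
# (assumes the `transformers` library and an NLI model are available).
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

text = "The striped animal grazed alongside the horses."
labels = ["zoology", "finance", "politics"]   # labels never seen during training

result = classifier(text, candidate_labels=labels)
print(result["labels"][0], round(result["scores"][0], 3))
```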

The original paper[3] also points out that, beyond the ability to classify a single example, when a collection of examples is given and can be assumed to come from the same distribution, it is possible to bootstrap performance in a semi-supervised-like (or transductive) manner.

Unlike standard generalization in machine learning, where classifiers are expected to correctly classify new samples into classes they have already observed during training, in ZSL no samples from the target classes have been observed while training the classifier. It can therefore be viewed as an extreme case of domain adaptation.

Prerequisite information for zero-shot classes


Naturally, some form of auxiliary information has to be given about the zero-shot classes, and this information can take several forms, such as attribute vectors describing the classes or textual descriptions of what their members look like.
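
For a single zero-shot class, such auxiliary information might look like the following (a schematic sketch; the attribute names and description text are illustrative placeholders).

```python
# Two common forms of auxiliary information for a zero-shot class
# (attribute names and the description are made up for illustration).
zebra_auxiliary = {
    "attributes": {"has_stripes": 1, "has_four_legs": 1, "lives_in_water": 0},
    "description": "A zebra looks like a horse with black-and-white stripes.",
}
```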

Generalized zero-shot learning


The above ZSL setup assumes that only zero-shot samples, namely samples from new, unseen classes, are given at test time. In generalized zero-shot learning, samples from both new and known classes may appear at test time. This poses new challenges for classifiers, because it is very difficult to estimate whether a given sample comes from a seen or an unseen class. Approaches to handle this include a gating module that first decides whether a sample belongs to a seen or an unseen class and then routes it to the corresponding classifier,[20] and generative modules that synthesize feature representations of the unseen classes so that a conventional classifier can be trained over all classes.[21]
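
The gating idea can be sketched as follows (an illustrative toy with made-up scores and an externally supplied seen/unseen estimate, rather than the exact procedure of the cited papers).

```python
def generalized_zsl_predict(seen_scores, unseen_scores, p_seen, threshold=0.5):
    """Gate between seen- and unseen-class predictions.

    seen_scores / unseen_scores: dicts mapping class name -> compatibility score.
    p_seen: estimated probability that the sample comes from a seen class,
            assumed here to come from a separately trained novelty detector.
    """
    if p_seen >= threshold:
        return max(seen_scores, key=seen_scores.get)
    return max(unseen_scores, key=unseen_scores.get)

# Made-up scores for a test image that actually shows a zebra (an unseen class).
seen_scores = {"horse": 0.71, "whale": 0.05}    # seen classes tend to score high
unseen_scores = {"zebra": 0.64}
print(generalized_zsl_predict(seen_scores, unseen_scores, p_seen=0.3))  # -> zebra
```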

Domains of application


Zero-shot learning has been applied to fields including computer vision, natural language processing, and computational biology.[22]

See also


References

  1. Xian, Yongqin; Lampert, Christoph H.; Schiele, Bernt; Akata, Zeynep (2020). "Zero-Shot Learning – A Comprehensive Evaluation of the Good, the Bad and the Ugly". arXiv:1707.00600 [cs.CV].
  2. Xian, Yongqin; Schiele, Bernt; Akata, Zeynep (2017). "Zero-Shot Learning – The Good, the Bad and the Ugly". Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition: 4582–4591. arXiv:1703.04394.
  3. Chang, M.W. (2008). "Importance of Semantic Representation: Dataless Classification". AAAI.
  4. Larochelle, Hugo (2008). "Zero-data Learning of New Tasks". AAAI.
  5. Palatucci, Mark (2009). "Zero-Shot Learning with Semantic Output Codes". NIPS.
  6. Lampert, C.H. (2009). "Learning to Detect Unseen Object Classes by Between-Class Attribute Transfer". IEEE Conference on Computer Vision and Pattern Recognition: 951–958.
  7. Miller, E. G. (2000). "Learning from One Example Through Shared Densities on Transforms". CVPR.
  8. Song, Yangqiu (2019). "Toward Any-Language Zero-Shot Topic Classification of Textual Documents". Artificial Intelligence. 274: 133–150. doi:10.1016/j.artint.2019.02.002.
  9. Song, Yangqiu (2016). "Cross-Lingual Dataless Classification for Many Languages". IJCAI.
  10. Zhou, Ben (2018). "Zero-Shot Open Entity Typing as Type-Compatible Grounding". EMNLP. arXiv:1907.03228.
  11. Yin, Wenpeng (2019). "Benchmarking Zero-shot Text Classification: Datasets, Evaluation and Entailment Approach". EMNLP. arXiv:1909.00161.
  12. Levy, Omer (2017). "Zero-Shot Relation Extraction via Reading Comprehension". CoNLL. arXiv:1706.04115.
  13. Romera-Paredes, Bernardino; Torr, Philip (2015). "An Embarrassingly Simple Approach to Zero-Shot Learning". International Conference on Machine Learning: 2152–2161.
  14. Atzmon, Yuval; Chechik, Gal (2018). "Probabilistic AND-OR Attribute Grouping for Zero-Shot Learning". Uncertainty in Artificial Intelligence. arXiv:1806.02664.
  15. Roth, Dan (2009). "Aspect Guided Text Categorization with Unobserved Labels". ICDM.
  16. Hu, R. Lily; Xiong, Caiming; Socher, Richard (2018). "Zero-Shot Image Classification Guided by Natural Language Descriptions of Classes: A Meta-Learning Approach". NeurIPS.
  17. Srivastava, Shashank; Labutov, Igor; Mitchell, Tom (2018). "Zero-shot Learning of Classifiers from Natural Language Quantification". Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers): 306–316. doi:10.18653/v1/P18-1029.
  18. Frome, Andrea; et al. (2013). "DeViSE: A Deep Visual-Semantic Embedding Model". Advances in Neural Information Processing Systems: 2121–2129.
  19. Socher, R.; Ganjoo, M.; Manning, C.D.; Ng, A. (2013). "Zero-Shot Learning Through Cross-Modal Transfer". Neural Information Processing Systems. arXiv:1301.3666.
  20. Atzmon, Yuval (2019). "Adaptive Confidence Smoothing for Generalized Zero-Shot Learning". IEEE Conference on Computer Vision and Pattern Recognition: 11671–11680. arXiv:1812.09903.
  21. Felix, R.; et al. (2018). "Multi-modal Cycle-Consistent Generalized Zero-Shot Learning". Proceedings of the European Conference on Computer Vision: 21–37. arXiv:1808.00136.
  22. Wittmann, Bruce J.; Yue, Yisong; Arnold, Frances H. (2020). "Machine Learning-Assisted Directed Evolution Navigates a Combinatorial Epistatic Fitness Landscape with Minimal Screening Burden". bioRxiv. doi:10.1101/2020.12.04.408955.
