Feature (machine learning)
From Wikipedia, the free encyclopedia
 


In machine learning and pattern recognition, a feature is an individual measurable property or characteristic of a phenomenon.[1] Choosing informative, discriminating and independent features is a crucial element of effective algorithms in pattern recognition, classification and regression. Features are usually numeric, but structural features such as strings and graphs are used in syntactic pattern recognition. The concept of "feature" is related to that of an explanatory variable used in statistical techniques such as linear regression.

Feature types

In feature engineering, two types of features are commonly used: numerical and categorical.

Numerical features are continuous values that can be measured on a scale. Examples of numerical features include age, height, weight, and income. Numerical features can be used in machine learning algorithms directly.[citation needed]

Categorical features are discrete values that can be grouped into categories. Examples of categorical features include gender, color, and zip code. Categorical features typically need to be converted to numerical features before they can be used in machine learning algorithms. This can be done using a variety of techniques, such as one-hot encoding, label encoding, and ordinal encoding.
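
For illustration, here is a minimal one-hot encoding sketch in Python with NumPy; the color values and the alphabetical category ordering are hypothetical choices, not a prescribed convention:

    import numpy as np

    # Hypothetical categorical feature: one color label per sample.
    colors = ["red", "green", "blue", "green", "red"]

    # Assign each distinct category a fixed column index.
    categories = sorted(set(colors))                  # ['blue', 'green', 'red']
    index = {c: i for i, c in enumerate(categories)}

    # One row per sample, one column per category; a single 1 marks the category.
    one_hot = np.zeros((len(colors), len(categories)), dtype=int)
    for row, c in enumerate(colors):
        one_hot[row, index[c]] = 1

    print(one_hot)
    # [[0 0 1]
    #  [0 1 0]
    #  [1 0 0]
    #  [0 1 0]
    #  [0 0 1]]

Label and ordinal encoding instead map each category to a single integer, which keeps one column but imposes an ordering that may be artificial for nominal data such as color.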

Which feature type is appropriate depends on the specific machine learning algorithm. Some algorithms, such as decision trees, can handle both numerical and categorical features; others, such as linear regression, can only handle numerical features.

Classification

A set of numeric features can be conveniently described by a feature vector. One way to achieve binary classification is to use a linear predictor function (related to the perceptron) with a feature vector as input: the method calculates the scalar product between the feature vector and a vector of weights, and assigns to the positive class those observations whose score exceeds a threshold.
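
A minimal sketch of this calculation in Python with NumPy; the feature values, weights, and threshold below are hypothetical:

    import numpy as np

    x = np.array([2.0, 0.5, 1.0])    # feature vector for one observation
    w = np.array([0.4, -1.2, 0.3])   # weight vector (e.g., learned by a perceptron)
    threshold = 0.0

    score = np.dot(w, x)             # scalar product of weights and features
    label = 1 if score > threshold else 0
    print(score, label)              # 0.5 1 -> assigned to the positive class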

Algorithms for classification from a feature vector include nearest neighbor classification, neural networks, and statistical techniques such as Bayesian approaches.

Examples

In character recognition, features may include histograms counting the number of black pixels along horizontal and vertical directions, the number of internal holes, stroke detection, and many others.

In speech recognition, features for recognizing phonemes can include noise ratios, length of sounds, relative power, filter matches, and many others.

In spam detection algorithms, features may include the presence or absence of certain email headers, the email structure, the language, the frequency of specific terms, and the grammatical correctness of the text.

In computer vision, there are a large number of possible features, such as edges and objects.

Feature vectors

In pattern recognition and machine learning, a feature vector is an n-dimensional vector of numerical features that represent some object. Many algorithms in machine learning require a numerical representation of objects, since such representations facilitate processing and statistical analysis. When representing images, the feature values might correspond to the pixels of an image, while when representing texts the features might be the frequencies of occurrence of textual terms. Feature vectors are equivalent to the vectors of explanatory variables used in statistical procedures such as linear regression. Feature vectors are often combined with weights using a dot product in order to construct a linear predictor function that is used to determine a score for making a prediction.
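
For instance, a minimal sketch of such a term-frequency feature vector for text, assuming a hypothetical four-word vocabulary and document:

    from collections import Counter

    vocabulary = ["feature", "vector", "machine", "learning"]
    document = "machine learning uses a feature vector for each machine"

    # Each feature is the occurrence count of one vocabulary term.
    counts = Counter(document.split())
    features = [counts[term] for term in vocabulary]
    print(features)                  # [1, 1, 2, 1]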

The vector space associated with these vectors is often called the feature space. In order to reduce the dimensionality of the feature space, a number of dimensionality reduction techniques can be employed.
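
One widely used such technique is principal component analysis (PCA); below is a minimal NumPy sketch via the singular value decomposition, with hypothetical random data standing in for real observations:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 10))    # 100 objects in a 10-dimensional feature space

    X_centered = X - X.mean(axis=0)   # PCA assumes zero-mean features
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

    k = 2                             # target dimensionality
    X_reduced = X_centered @ Vt[:k].T # project onto the top-k principal components
    print(X_reduced.shape)            # (100, 2)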

Higher-level features can be obtained from already available features and added to the feature vector; for example, for the study of diseases the feature 'Age' is useful and is defined as Age = 'Year of death' minus 'Year of birth'. This process is referred to as feature construction.[2][3] Feature construction is the application of a set of constructive operators to a set of existing features, resulting in the construction of new features. Examples of such constructive operators include checking for the equality conditions {=, ≠}, the arithmetic operators {+, −, ×, /}, the array operators {max(S), min(S), average(S)}, as well as other more sophisticated operators, for example count(S,C),[4] which counts the number of features in the feature vector S satisfying some condition C, or, for example, distances to other recognition classes generalized by some accepting device. Feature construction has long been considered a powerful tool for increasing both accuracy and understanding of structure, particularly in high-dimensional problems.[5] Applications include studies of disease and emotion recognition from speech.[6]
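
As a sketch of applying the constructive operators above, here is a hypothetical patient record in Python; the field names and values are illustrative only:

    record = {"year_of_birth": 1931, "year_of_death": 1992, "visits": [3, 7, 2, 5]}

    # Arithmetic operator: derive 'Age' from two existing features.
    record["age"] = record["year_of_death"] - record["year_of_birth"]   # 61

    # Array operators over a feature list S.
    S = record["visits"]
    record["max_visits"] = max(S)
    record["avg_visits"] = sum(S) / len(S)

    # count(S, C): the number of entries of S satisfying a condition C.
    record["busy_years"] = sum(1 for v in S if v > 4)                   # 2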

Selection and extraction

The initial set of raw features can be redundant and large enough that estimation and optimization are made difficult or ineffective. Therefore, a preliminary step in many applications of machine learning and pattern recognition consists of selecting a subset of features, or constructing a new and reduced set of features, to facilitate learning and to improve generalization and interpretability.[7]
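
As one simple illustration (a sketch only: the data and variance cutoff are hypothetical, and practical selection methods also weigh each feature's relevance to the prediction target), near-constant features can be dropped by thresholding their variance:

    import numpy as np

    X = np.array([[1.0, 5.2, 0.0],
                  [2.0, 5.2, 0.1],
                  [3.0, 5.2, 0.0],
                  [4.0, 5.2, 0.1]])

    variances = X.var(axis=0)          # per-feature variance
    keep = variances > 1e-3            # mask: [ True False  True]
    X_selected = X[:, keep]
    print(X_selected.shape)            # (4, 2) -- the constant column is removed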

Extracting or selecting features is a combination of art and science; developing systems to do so is known as feature engineering. It requires experimenting with multiple possibilities and combining automated techniques with the intuition and knowledge of the domain expert. Automating this process is feature learning, where a machine not only uses features for learning, but learns the features itself.

See also

  • Hashing trick
  • Statistical classification
  • Explainable artificial intelligence

References

  1. ^ Bishop, Christopher (2006). Pattern Recognition and Machine Learning. Berlin: Springer. ISBN 0-387-31073-8.
  2. ^ Liu, H.; Motoda, H. (1998). Feature Selection for Knowledge Discovery and Data Mining. Norwell, MA: Kluwer Academic Publishers.
  3. ^ Piramuthu, S.; Sikora, R. T. (2009). "Iterative feature construction for improving inductive learning algorithms". Expert Systems with Applications. 36 (2): 3401–3406.
  4. ^ Bloedorn, E.; Michalski, R. (1998). "Data-driven constructive induction: a methodology and its applications". IEEE Intelligent Systems, special issue on feature transformation and subset selection: 30–37, March/April.
  5. ^ Breiman, L.; Friedman, J.; Olshen, R.; Stone, C. (1984). Classification and Regression Trees. Wadsworth.
  6. ^ Sidorova, J.; Badia, T. (2009). "Syntactic learning for ESEDA.1, tool for enhanced speech emotion detection and analysis". Internet Technology and Secured Transactions Conference (ICITST-2009), London, November 9–12. IEEE.
  7. ^ Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer. ISBN 978-0-387-84884-6.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Feature_(machine_learning)&oldid=1228325586"

Categories: Data mining, Machine learning, Pattern recognition

This page was last edited on 10 June 2024, at 16:43 (UTC). Text is available under the Creative Commons Attribution-ShareAlike License 4.0.