Predictive failure analysis






From Wikipedia, the free encyclopedia
 


Predictive Failure Analysis (PFA) refers to methods intended to predict imminent failure of systems or components (software or hardware), and potentially enable mechanisms to avoid or counteract failure issues, or recommend maintenance of systems prior to failure.

For example, some computer mechanisms analyze trends in corrected errors to predict future failures of hardware or memory components and proactively enable measures to avoid them. Predictive Failure Analysis was originally the name of a proprietary IBM technology for monitoring the likelihood of hard disk drive failure, although the term is now used generically for a variety of technologies that judge the imminent failure of CPUs, memory and I/O devices.[1] See also first failure data capture.

Disks

IBM introduced the term PFA and the underlying technology in 1992 with its 0662-S1x drive, a 1052 MB Fast-Wide SCSI-2 disk that operated at 5400 rpm.

The technology relies on measuring several key (mainly mechanical) parameters of the drive unit, such as the flying height of the heads. The drive firmware compares the measured parameters against predefined thresholds and evaluates the health status of the drive. If the drive appears likely to fail soon, the system sends a notification to the disk controller.
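The threshold comparison described above can be sketched roughly as follows. This is a minimal illustration, not IBM's firmware logic: the parameter names, the numeric limits and the function names are all hypothetical, chosen only to show measured values being checked against predefined ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Acceptable range for one measured drive parameter (hypothetical values)."""
    low: float
    high: float


# Hypothetical parameters and limits, for illustration only.
THRESHOLDS = {
    "head_flying_height_nm": Threshold(low=10.0, high=30.0),
    "spin_up_time_ms": Threshold(low=0.0, high=8000.0),
}


def failing_parameters(measurements, thresholds=THRESHOLDS):
    """Return the sorted names of measured parameters outside their range."""
    failing = []
    for name, value in measurements.items():
        limit = thresholds.get(name)
        # Parameters without a configured threshold are ignored.
        if limit is not None and not (limit.low <= value <= limit.high):
            failing.append(name)
    return sorted(failing)


def should_notify_controller(measurements, thresholds=THRESHOLDS):
    """True if any parameter is out of range, i.e. a notification is warranted."""
    return bool(failing_parameters(measurements, thresholds))
```

In this sketch a single out-of-range parameter is enough to trigger the notification; a real implementation might weigh several parameters or track them over time.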

The major drawbacks of the technology included:

The technology merged with IntelliSafe to form the Self-Monitoring, Analysis, and Reporting Technology (SMART).

Processor and Memory

High counts of intermittent RAM errors corrected by ECC can be predictive of future DIMM failures,[2] so automatic offlining of memory and CPU caches can be used to avoid future errors.[3] For example, under the Linux operating system the mcelog daemon automatically removes from use memory pages that show excessive corrections, and removes from use processor cores that show excessive correctable cache errors.[4]
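The idea of retiring a memory page once its corrected-error count becomes excessive can be illustrated with a simple sliding-window counter. This is a sketch under stated assumptions and does not reproduce how mcelog itself works: the class name, the default limit of 10 errors per 24 hours, and the use of plain integers as page identifiers are all hypothetical.

```python
from collections import defaultdict, deque


class CorrectedErrorTracker:
    """Track corrected errors per page and mark pages with too many as offlined.

    Assumes timestamps (in seconds) arrive in non-decreasing order.
    """

    def __init__(self, max_errors=10, window_seconds=24 * 3600):
        self.max_errors = max_errors          # hypothetical limit
        self.window_seconds = window_seconds  # hypothetical window
        self._events = defaultdict(deque)     # page -> recent error timestamps
        self.offlined = set()

    def record(self, page, timestamp):
        """Record one corrected error; return True if the page is newly offlined."""
        if page in self.offlined:
            return False
        events = self._events[page]
        events.append(timestamp)
        # Drop errors that have fallen out of the observation window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        if len(events) > self.max_errors:
            self.offlined.add(page)
            del self._events[page]
            return True
        return False
```

A page that produces occasional, widely spaced corrected errors is never offlined in this sketch, while a burst that exceeds the limit inside one window retires the page.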

Optical media

On optical media (CD, DVD and Blu-ray), failures caused by degradation of the media can be predicted, and media of low manufacturing quality can be detected before data loss occurs, by measuring the rate of correctable data errors using software such as QpxTool or Nero DiscSpeed. However, not all vendors and models of optical drives allow error scanning.[5]
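As a rough illustration of judging a disc from its correctable-error rate, the sketch below summarizes a list of per-sample error counts and compares the result with two limits. It does not reflect the output format or the logic of QpxTool or Nero DiscSpeed; the function names and the limits (an average of 50 and a peak of 300 errors per sample) are assumptions made for the example.

```python
from statistics import mean


def summarize_scan(errors_per_sample):
    """Return the average and peak correctable-error counts of one scan."""
    if not errors_per_sample:
        raise ValueError("scan contains no samples")
    return {"average": mean(errors_per_sample), "peak": max(errors_per_sample)}


def disc_at_risk(errors_per_sample, max_average=50.0, max_peak=300):
    """True if the scan exceeds either hypothetical limit."""
    summary = summarize_scan(errors_per_sample)
    return summary["average"] > max_average or summary["peak"] > max_peak
```

Repeating such a scan periodically and comparing the summaries would show whether the error rate of a given disc is rising over time.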

References

  1. Intel Corp (2011). "Intel Xeon Processor E7 Family: supporting next generation RAS servers. White paper". Retrieved 9 May 2012.
  2. Bianca Schroeder; Eduardo Pinheiro; Wolf-Dietrich Weber (2009). "DRAM Errors in the Wild: A Large-Scale Field Study". Proceedings SIGMETRICS, 2009.
  3. Tang, Carruthers, Totari, Shapiro (2006). "Assessment of the Effect of Memory Page Retirement on Systems RAS against Hardware Faults". Proceedings of the 2006 International Conference on Dependable Systems and Networks.
  4. "mcelog - memory error handling in user space. Linux Kongress 2010" (PDF). 2010.
  5. List of supported devices by disc quality scanning software QPxTool

Retrieved from "https://en.wikipedia.org/w/index.php?title=Predictive_failure_analysis&oldid=1212658710"




    This page was last edited on 8 March 2024, at 23:46 (UTC).

    Text is available under the Creative Commons Attribution-ShareAlike License 4.0; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.


