Jump to content
 







Main menu
   


Navigation  



Main page
Contents
Current events
Random article
About Wikipedia
Contact us
Donate
 




Contribute  



Help
Learn to edit
Community portal
Recent changes
Upload file
 








Search  

































Create account

Log in
 









Create account
 Log in
 




Pages for logged out editors learn more  



Contributions
Talk
 

















Chipkill






Français

 

Edit links
 









Article
Talk
 

















Read
Edit
View history
 








Tools
   


Actions  



Read
Edit
View history
 




General  



What links here
Related changes
Upload file
Special pages
Permanent link
Page information
Cite this page
Get shortened URL
Download QR code
Wikidata item
 




Print/export  



Download as PDF
Printable version
 
















Appearance
   

 






From Wikipedia, the free encyclopedia
 

(Redirected from Chipspare)

ChipkillisIBM's trademark for a form of advanced error checking and correcting (ECC) computer memory technology that protects computer memory systems from any single memory chip failure as well as multi-bit errors from any portion of a single memory chip.[1][2] One simple scheme to perform this function scatters the bits of a Hamming code ECC word across multiple memory chips, such that the failure of any single memory chip will affect only one ECC bit per word. This allows memory contents to be reconstructed despite the complete failure of one chip. Typical implementations use more advanced codes, such as a BCH code, that can correct multiple bits with less overhead.

Chipkill is frequently combined with dynamic bit-steering, so that if a chip fails (or has exceeded a threshold of bit errors), another, spare, memory chip is used to replace the failed chip. The concept is similar to that of RAID, which protects against disk failure, except that now the concept is applied to individual memory chips. The technology was developed by the IBM Corporation in the early and middle 1990s. An important RAS feature, Chipkill technology is deployed primarily on SSDs, mainframes and midrange servers.

An equivalent system from Sun Microsystems is called Extended ECC, while equivalent systems from HP are called Advanced ECC[3] and Chipspare. A similar system from Intel, called Lockstep memory, provides double-device data correction (DDDC) functionality.[4] Similar systems from Micron, called redundant array of independent NAND (RAIN), and from SandForce, called RAISE level 2, protect data stored on SSDs from any single NAND flash chip going bad.[5][6]

A 2009 paper using data from Google's datacentres[7] provided evidence demonstrating that in observed Google systems, DRAM errors were recurrent at the same location, and that 8% of DIMMs were affected each year. Specifically, "In more than 85% of the cases a correctable error is followed by at least one more correctable error in the same month". DIMMs with chipkill error correction showed a lower fraction of DIMMs reporting uncorrectable errors compared to DIMMs with error correcting codes that can only correct single-bit errors. A 2010 paper from University of Rochester also showed that Chipkill memory gave substantially lower memory errors, using both real world memory traces and simulations.[8]

See also[edit]

References[edit]

  1. ^ Timothy J. Dell (1997-11-19). "A White Paper on the Benefits of Chipkill-Correct ECC for PC Server Main Memory" (PDF). IBM. Archived from the original (PDF) on 2015-09-23. Retrieved 2015-02-02.
  • ^ "Enhancing IBM Netfinity Server Reliability: IBM Chipkill Memory" (PDF). IBM. 2000. Archived from the original (PDF) on 2015-09-23. Retrieved 2015-02-02.
  • ^ "Best Practice Guidelines for ProLiant Servers with the Intel Xeon 5500 processor series Engineering Whitepaper, 1st Edition" (PDF). HP. May 2009. p. 8. Retrieved 2014-09-09.
  • ^ Thomas Willhalm (2014-07-11). "Independent Channel vs. Lockstep Mode – Drive your Memory Faster or Safer". Intel. Retrieved 2015-02-02.
  • ^ Lee Hutchinson. "Solid-state revolution: in-depth on how SSDs really work". 2012.
  • ^ Eric Slack. "How to Make Reliable SSDs - Reliable NAND Flash".
  • ^ Schroeder, Bianca; Pinheiro, Eduardo; Weber, Wolf-Dietrich (2009). "DRAM errors in the wild: A large-scale field study" (PDF). Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems. SIGMETRICS '09. ACM. pp. 193–204. doi:10.1145/1555349.1555372. ISBN 9781605585116. S2CID 6115552. Retrieved 7 September 2011.
  • ^ Li, Xin; Huang, Michael; Shen, Kai; Lingkun, Chu (2010). ""A Realistic Evaluation of Memory Hardware Errors and Software System Susceptibility". Usenix Annual Tech Conference 2010" (PDF).
  • External links[edit]


    Retrieved from "https://en.wikipedia.org/w/index.php?title=Chipkill&oldid=1212659149"

    Categories: 
    Computer memory
    Error detection and correction
    IBM computer hardware
     



    This page was last edited on 8 March 2024, at 23:49 (UTC).

    Text is available under the Creative Commons Attribution-ShareAlike License 4.0; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.



    Privacy policy

    About Wikipedia

    Disclaimers

    Contact Wikipedia

    Code of Conduct

    Developers

    Statistics

    Cookie statement

    Mobile view



    Wikimedia Foundation
    Powered by MediaWiki