Alignment Research Center

From Wikipedia, the free encyclopedia
 


Alignment Research Center
Formation: April 2021
Founder: Paul Christiano
Type: Nonprofit research institute
Legal status: 501(c)(3) tax-exempt charity
Purpose: AI alignment and safety research
Location: Berkeley, California
Website: alignment.org

The Alignment Research Center (ARC) is a nonprofit research institute based in Berkeley, California, dedicated to the alignment of advanced artificial intelligence with human values and priorities.[1] Established by former OpenAI researcher Paul Christiano, ARC focuses on recognizing and comprehending the potentially harmful capabilities of present-day AI models.[2][3]

Details

ARC's mission is to ensure that powerful machine learning systems of the future are designed and developed safely and for the benefit of humanity. It was founded in April 2021 by Paul Christiano and other researchers focused on the theoretical challenges of AI alignment.[4] Its researchers attempt to develop scalable methods for training AI systems to behave honestly and helpfully. A key part of their methodology is considering how proposed alignment techniques might break down or be circumvented as systems become more advanced.[5] ARC has been expanding from theoretical work into empirical research, industry collaborations, and policy.[6][7]

In March 2022, ARC received $265,000 from Open Philanthropy.[8] After the bankruptcy of FTX, ARC said it would return a $1.25 million grant from disgraced cryptocurrency financier Sam Bankman-Fried's FTX Foundation, stating that the money "morally (if not legally) belongs to FTX customers or creditors."[9]

In March 2023, OpenAI asked ARC to test GPT-4 in order to assess the model's capacity for power-seeking behavior.[10] ARC evaluated GPT-4's ability to strategize, replicate itself, gather resources, remain concealed on a server, and execute phishing operations.[11] As part of the test, GPT-4 was asked to solve a CAPTCHA puzzle.[12] It succeeded by hiring a human worker on TaskRabbit, a gig work platform; when the worker asked whether it was a robot, GPT-4 claimed to be a vision-impaired human.[13] ARC determined that GPT-4 responded impermissibly to prompts eliciting restricted information 82% less often than GPT-3.5, and hallucinated 60% less often than GPT-3.5.[14]


References
  1. MacAskill, William (2022-08-16). "How Future Generations Will Remember Us". The Atlantic. Retrieved 2023-04-23.
  2. Klein, Ezra (2023-03-12). "This Changes Everything". The New York Times. ISSN 0362-4331. Retrieved 2023-04-30.
  3. Piper, Kelsey (2023-03-29). "How to test what an AI model can — and shouldn't — do". Vox. Retrieved 2023-04-30.
  4. Christiano, Paul (2021-04-26). "Announcing the Alignment Research Center". Medium. Retrieved 2023-04-16.
  5. Christiano, Paul; Cotra, Ajeya; Xu, Mark (December 2021). "Eliciting Latent Knowledge: How to tell if your eyes deceive you". Alignment Research Center. Retrieved 2023-04-16.
  6. "Alignment Research Center". Alignment Research Center. Retrieved 2023-04-16.
  7. Pandey, Mohit (2023-03-17). "Stop Questioning OpenAI's Open-Source Policy". Analytics India Magazine. Retrieved 2023-04-23.
  8. "Alignment Research Center — General Support". Open Philanthropy. 2022-06-14. Retrieved 2023-04-16.
  9. Wallerstein, Eric (2023-01-07). "FTX Seeks to Recoup Sam Bankman-Fried's Charitable Donations". Wall Street Journal. ISSN 0099-9660. Retrieved 2023-04-30.
  10. OpenAI (2023-03-23). GPT-4 System Card (PDF). Retrieved 2023-04-16.
  11. Edwards, Benj (2023-03-15). "OpenAI checked to see whether GPT-4 could take over the world". Ars Technica. Retrieved 2023-04-30.
  12. "Update on ARC's recent eval efforts: More information about ARC's evaluations of GPT-4 and Claude". evals.alignment.org. Alignment Research Center. 2023-03-17. Retrieved 2023-04-16.
  13. Cox, Joseph (2023-03-15). "GPT-4 Hired Unwitting TaskRabbit Worker By Pretending to Be 'Vision-Impaired' Human". Vice News Motherboard. Retrieved 2023-04-16.
  14. Burke, Cameron (2023-03-20). "'Robot' Lawyer DoNotPay Sued For Unlicensed Practice Of Law: It's Giving 'Poor Legal Advice'". Yahoo Finance. Retrieved 2023-04-30.

Categories: Artificial intelligence, Existential risk from artificial general intelligence