Apache Iceberg






Original author(s): Ryan Blue, Daniel Weeks
Initial release: 10 August 2017
Written in: Java, Python
Operating system: Cross-platform
Type: Data warehouse, data lake
License: Apache License 2.0

Apache Iceberg is a high-performance, open-source table format for very large analytic tables. It enables the use of SQL tables for big data and lets engines such as Spark, Trino, Flink, Presto, Hive, Impala, StarRocks, Doris, and Pig safely work with the same tables at the same time.[1] Iceberg is released under the Apache License.[2] It addresses the performance and usability challenges of using Apache Hive tables in large and demanding data lake environments.[3] Vendors that support Apache Iceberg tables in their products include CelerData, Cloudera, Dremio, IOMETE, Snowflake, Starburst, Tabular,[4] and AWS.[5]

History


Iceberg was started at Netflix by Ryan Blue and Daniel Weeks. Many services and engines in the Netflix infrastructure used Hive, but Hive could not guarantee correctness and did not provide stable atomic transactions.[3] To avoid unintended consequences of the Hive format, many at Netflix avoided using these services and avoided making changes to the data.[3] Blue set out to address three problems with Hive tables by creating Iceberg:[3]

  1. Ensure the correctness of the data and support ACID transactions.
  2. Improve performance by allowing finer-grained operations at file granularity, which makes writes more efficient.
  3. Simplify and abstract away the general operation and maintenance of tables.
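
The first two goals can be illustrated with a toy model. The following is a minimal sketch, not Iceberg's actual implementation: it assumes a table whose state is a single pointer to an immutable snapshot that lists data files, and a commit that succeeds only if the pointer has not moved since the writer read it. The names `Snapshot`, `ToyTable`, and `commit`, and the file names, are hypothetical and chosen only for illustration.

```python
import threading
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Snapshot:
    """Immutable record of the data files that make up one table version."""
    snapshot_id: int
    data_files: FrozenSet[str]


class ToyTable:
    """Hypothetical table whose entire state is one snapshot pointer."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._current = Snapshot(0, frozenset())

    def current(self) -> Snapshot:
        return self._current

    def commit(self, base: Snapshot,
               added: Iterable[str] = (),
               removed: Iterable[str] = ()) -> bool:
        """Swap in a new snapshot only if `base` is still the current one.

        Changes are expressed as whole files added or removed. Returns
        False when another writer committed first, so the caller can
        re-read the table and retry.
        """
        new_files = (base.data_files - frozenset(removed)) | frozenset(added)
        candidate = Snapshot(base.snapshot_id + 1, new_files)
        with self._lock:
            if self._current is not base:
                return False  # stale base: reject instead of overwriting
            self._current = candidate
            return True


table = ToyTable()
base = table.current()
first = table.commit(base, added=["part-0001.parquet"])
# A second writer that started from the same, now stale, base is rejected.
second = table.commit(base, added=["part-0002.parquet"])
# After re-reading the current snapshot, the retry succeeds.
retry = table.commit(table.current(), added=["part-0002.parquet"])
print(first, second, retry, sorted(table.current().data_files))
```

Because readers only ever see a complete snapshot, a half-finished write is never visible in this model, and a conflicting write is rejected rather than silently merged.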

Iceberg development started in 2017.[6] The project was open-sourced and donated to the Apache Software Foundation in November 2018.[7] In May 2020, the Iceberg project graduated to become a top-level Apache project.[7]

Companies that use Iceberg include Airbnb,[8] Apple,[3] Expedia,[9] LinkedIn,[10] Adobe,[11] and Lyft.[12]


References


  1. "Apache Iceberg". iceberg.apache.org. Retrieved 5 October 2022.
  2. "apache/iceberg GitHub License". The Apache Software Foundation. 5 October 2022. Retrieved 5 October 2022.
  3. Woodie, Alex (8 February 2021). "Apache Iceberg: The Hub of an Emerging Data Service Ecosystem?". Datanami.
  4. "Vendors". iceberg.apache.org. Retrieved 5 May 2023.
  5. "Using Apache Iceberg tables – Amazon Athena". Amazon Web Services, Inc.
  6. "Initial public release in apache/iceberg". GitHub. Retrieved 5 October 2022.
  7. "Incubation Status Template - Apache Incubator". incubator.apache.org.
  8. Zhu, Ronnie (26 September 2022). "Upgrading Data Warehouse Infrastructure at Airbnb". The Airbnb Tech Blog.
  9. Mathiesen, Christine (26 January 2021). "A Short Introduction to Apache Iceberg". Expedia Group Technology. Retrieved 5 October 2022.
  10. "FastIngest: Low-latency Gobblin with Apache Iceberg and ORC format". engineering.linkedin.com.
  11. Bremner, Jaemi (3 December 2020). "Iceberg at Adobe". Medium.
  12. Council, Data. "Open Source Highlight: Apache Iceberg". www.datacouncil.ai. Retrieved 5 October 2022.

    This page was last edited on 24 May 2024, at 08:15 (UTC).
