Doug Cutting






Known for: Open-source software, The Apache Software Foundation
Awards: O'Reilly Open Source Award

Douglass Read Cutting is a software designer and an advocate for, and creator of, open-source search technology. He founded two technology projects: Lucene, and Nutch, the latter with Mike Cafarella. The Apache Software Foundation now manages both projects. Cutting and Cafarella were also co-founders of Apache Hadoop.[1]

Education and early career


Cutting graduated from Stanford University in 1985 with a bachelor's degree.[2][3]

Before developing Lucene, Cutting held search technology positions at Xerox PARC, where he worked on the Scatter/Gather algorithm[4][5] and on computational stylistics.[6] He also worked at Excite, where he was one of the chief designers of the search engine, and at Apple Inc., where he was the primary author of the V-Twin text search framework.[7]

Open source projects


Lucene, a search indexer, and Nutch, a spider or crawler, are the two key components of an open-source general search platform that first crawls the Web for content and then structures it into a searchable index. Cutting's leadership of these two projects extended the concepts and capabilities of general open-source software projects such as Linux and MySQL into the vertical domain of search.[8] In a 2017 article, Cutting was quoted as saying, "Open source is a requirement for business."[9]
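
The idea of structuring crawled content into a searchable index can be illustrated with a minimal inverted-index sketch in Python. This is not Lucene or Nutch code; the function names `build_index` and `search`, and the `pages` dictionary standing in for crawled content, are assumptions made purely for illustration.

```python
from collections import defaultdict

def build_index(pages):
    """Map each term to the set of page URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in text.lower().split():
            index[term].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# Hypothetical crawled pages (a real crawler would fetch these).
pages = {
    "http://example.org/a": "open source search engine",
    "http://example.org/b": "open source database",
}
index = build_index(pages)
hits = search(index, "open search")
```

A production indexer would also handle tokenization, ranking, and on-disk storage, none of which is attempted here.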

Use of MapReduce paradigm


In December 2004, Google Research published a paper on the MapReduce programming model, which allows very large-scale computations to be parallelized across large clusters of servers. Cutting and Mike Cafarella, realizing the importance of this paper to extending Lucene into the realm of extremely large search problems, created the open-source Hadoop framework. This framework allows applications based on the MapReduce paradigm to be run on large clusters of commodity hardware. Cutting was an employee of Yahoo!, where he led the Hadoop project full-time; he later went on to work for Cloudera.[10]
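
As a rough illustration of the MapReduce paradigm, the following single-process Python sketch counts words across documents. It does not use Hadoop's API; the names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and everything runs sequentially on one machine rather than on a cluster.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)

def word_count(documents):
    # Each map call depends only on its own document, so on a cluster
    # the calls could run on different machines; here they run in sequence.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["open source search", "open source software"])
```

Because the map step treats each document independently and the reduce step treats each key independently, both steps can be distributed, which is the property the paradigm relies on.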

Open source foundations and awards


In July 2009, Cutting was elected to the board of directors of the Apache Software Foundation, and in September 2010 he was elected chairman.[11]

In 2015, Cutting was awarded the O'Reilly Open Source Award.[12]

References

  1. Mike Cafarella, Ben Lorica, Doug Cutting (2016-03-31). "The next 10 years of Apache Hadoop". O'Reilly Media. Retrieved 2018-04-16.
  2. "Doug Cutting—The Father of Search - Code World". www.codetd.com. Retrieved 18 May 2022.
  3. "Cloudera management team". Cloudera. Retrieved 2016-08-17.
  4. Cutting, Douglass R., David R. Karger, Jan O. Pedersen, and John W. Tukey. "Scatter/gather: A cluster-based approach to browsing large document collections." SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval. (Reprinted in ACM SIGIR Forum, vol. 51, no. 2, pp. 148-159. ACM, 2017.)
  5. Pedersen, Jan O., David Karger, Douglass R. Cutting, and John W. Tukey. "Scatter-gather: a cluster-based method and apparatus for browsing large document collections." U.S. Patent 5,442,778, issued August 15, 1995.
  6. Karlgren, Jussi; Cutting, Douglass. "Recognizing text genres with simple metrics using discriminant analysis." Proceedings of the 15th conference on Computational linguistics-Volume 2. Association for Computational Linguistics, 1994.
  7. "The Lucene search engine: Powerful, flexible, and free". JavaWorld. 15 September 2000. Retrieved 2017-01-25. Cutting is the primary author of the V-Twin search engine (part of Apple's Copland operating system effort)…
  8. "Wikipedia: Powered by Lucene". Lucene. Retrieved September 5, 2007.
  9. "Doug Cutting, 'father' of Hadoop, talks about big data tech evolution". ComputerWeekly.com. Retrieved June 26, 2018.
  10. Handy, Alex (10 August 2009). "Hadoop creator goes to Cloudera". Software Development Times. Archived from the original on 13 March 2012. Retrieved 2011-03-22.
  11. Sally (2010-07-15). "The Apache Software Foundation Announces New Board Members". The Apache Software Foundation Blog. Retrieved 2023-05-02.
  12. "O'Reilly Open Source Awards - OSCON 2015". YouTube. O'Reilly. Archived from the original on 2021-12-14. Retrieved 27 July 2015.



    This page was last edited on 2 July 2024, at 06:02 (UTC).
