Article · Jun 9, 2008 · Closed access
Pig Latin
Indexed in Crossref
Abstract
There is a growing need for ad-hoc analysis of extremely large data sets, especially at internet companies where innovation critically depends on being able to analyze terabytes of data collected every day. Parallel database products, e.g., Teradata, offer a solution, but are usually prohibitively expensive at this scale. Besides, many of the people who analyze this data are entrenched procedural programmers, who find the declarative, SQL style to be unnatural. The success of the more procedural map-reduce programming model, and its associated scalable implementations on commodity hardware, is evidence of the above. However, the map-reduce paradigm is too low-level and rigid, and leads to a great deal of…
Citation impact
- Total citations: 1,744
- FWCI: 166.59
- Percentile: 100%
- References: 16
Authors: 5
Topics & keywords
Topics
Keywords
- Computer science
- Scalability
- Implementation
- Terabyte
- Programming paradigm
- SQL
- Code reuse
- Reuse
UN Sustainable Development Goals
- Industry, innovation and infrastructure