PLDB

Pig Latin

Pig Latin is a query language created in 2008.

#339 on PLDB · 14 years old · 1.3k users
0 books · 0 papers · 1k repos

Apache Pig is a high-level platform for creating programs that run on Apache Hadoop. The language for this platform is called Pig Latin. Pig can execute its Hadoop jobs in MapReduce, Apache Tez, or Apache Spark.


Example from the web:
input_lines = LOAD '/tmp/word.txt' AS (line:chararray);
words = FOREACH input_lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
filtered_words = FILTER words BY word MATCHES '\\w+';
word_groups = GROUP filtered_words BY word;
word_count = FOREACH word_groups GENERATE COUNT(filtered_words) AS count, group AS word;
ordered_word_count = ORDER word_count BY count DESC;
STORE ordered_word_count INTO '/tmp/results.txt';
Example from hello-world:
Hello WorldPIGHello World
Example from Linguist:
/**
 * sample.pig
 */
REGISTER $SOME_JAR;
A = LOAD 'person' USING PigStorage() AS (name:chararray, age:int); -- Load person
B = FOREACH A generate name;
DUMP B;
Example from Wikipedia:
input_lines = LOAD '/tmp/my-copy-of-all-pages-on-internet' AS (line:chararray);
-- Extract words from each line and put them into a pig bag
-- datatype, then flatten the bag to get one word on each row
words = FOREACH input_lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
-- filter out any words that are just white spaces
filtered_words = FILTER words BY word MATCHES '\\w+';
-- create a group for each word
word_groups = GROUP filtered_words BY word;
-- count the entries in each group
word_count = FOREACH word_groups GENERATE COUNT(filtered_words) AS count, group AS word;
-- order the records by count
ordered_word_count = ORDER word_count BY count DESC;
STORE ordered_word_count INTO '/tmp/number-of-words-on-internet';
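For readers who do not know Pig Latin, the word-count data flow in the Wikipedia example can be approximated in plain Python. This is a rough in-memory sketch, not how Pig executes the script on Hadoop. The function name `word_count` is hypothetical, whitespace splitting stands in for `TOKENIZE` (which may split on more delimiters), and the ASCII full-match is an assumption about how `MATCHES '\\w+'` behaves.

```python
import re
from collections import Counter


def word_count(lines):
    """Return (count, word) pairs ordered by count, highest first."""
    # TOKENIZE + FLATTEN: one word per row (whitespace split is a simplification).
    words = [word for line in lines for word in line.split()]
    # FILTER ... MATCHES '\\w+': keep tokens made up entirely of word characters
    # (assumed to be a whole-string, ASCII-only match).
    filtered = [w for w in words if re.fullmatch(r"\w+", w, flags=re.ASCII)]
    # GROUP ... BY word, then COUNT the entries in each group.
    counts = Counter(filtered)
    # ORDER ... BY count DESC.
    pairs = [(n, w) for w, n in counts.items()]
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    for count, word in word_count(["a b a", "b a c!"]):
        print(count, word)
```

Each statement in the Pig script names an intermediate relation, and the Python version mirrors that by keeping one variable per step.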

Language features

Feature               Supported  Example                               Token
Integers              Yes        -- [0-9]+L?
Floats                Yes        -- [0-9]*\.[0-9]+(e[0-9]+)?[fd]?
Hexadecimals          Yes        -- 0x[0-9a-f]+
MultiLine Comments    Yes        /* A comment */                       /* */
Comments              Yes        -- A comment
Line Comments         Yes        -- A comment                          --
Semantic Indentation  No
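The numeric literal patterns listed under Language features can be tried out directly. The sketch below copies the three regular expressions as they appear on this page and checks which ones match a whole literal. The helper name `classify` is hypothetical, and case-insensitive matching is an assumption rather than something the page states.

```python
import re

# Patterns copied from the feature list; the dictionary keys are illustrative names.
TOKEN_PATTERNS = {
    "integer": r"[0-9]+L?",
    "float": r"[0-9]*\.[0-9]+(e[0-9]+)?[fd]?",
    "hexadecimal": r"0x[0-9a-f]+",
}


def classify(literal):
    """Return the names of all patterns that match the entire literal."""
    return [
        name
        for name, pattern in TOKEN_PATTERNS.items()
        # Assumption: literals are matched case-insensitively.
        if re.fullmatch(pattern, literal, flags=re.IGNORECASE)
    ]


if __name__ == "__main__":
    for text in ["42L", "3.14f", ".5e3", "0x1F", "1e3"]:
        print(text, classify(text))
```

Note that, as written, the float pattern requires a decimal point, so a literal such as `1e3` matches none of the three patterns.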
