PIG : Reading data from file

To read data from a file, use the LOAD command. Assume there is a file named player.csv (a public dataset of English Premier League players, downloaded from an open data source).

Sample data from the player.csv file:

Player id,Player,Position,Number,Club,Club (country),D.O.B,Age,Height (cm),Country,Caps,International goals,Plays in home country
336722,Alan PULIDO,Forward,11,Tigres UANL,Mexico,08.03.1991,23,176,Mexico,5,4,TRUE
368902,Adam TAGGART,Forward,9,Newcastle United Jets FC,Australia,02.06.1993,21,172,Australia,4,3,TRUE
362641,Reza GHOOCHANNEJAD,Forward,16,Charlton Athletic FC,England,20.09.1987,26,181,Iran,13,9,FALSE

Pig script to load the data. We specify the record structure (schema) of the file in the AS clause.

grunt> player_data = LOAD 'player.csv' USING PigStorage(',') AS (player_id:int, player:chararray, position:chararray, number:int, club:chararray, club_country:chararray, d_o_b:chararr