
Hadoop for beginners

I just completed a Hadoop fundamentals course on Udemy.com. The videos were very well organized and give you a glimpse of the world of big data and of how the Hadoop framework can play a major role in processing it. The course recommended downloading the Hortonworks Hadoop development sandbox and working with it. Hortonworks provides a ready-made Hadoop environment as a download that can be loaded into a virtual machine. I downloaded the VirtualBox sandbox file.

The course gave a strong insight into the Hadoop architecture and the buzzwords around it. It also gave an in-depth idea of the Hive and Pig tools and how they play a key role in storing and processing data in the framework.


Popular posts from this blog

UNIX: How to get the record count from a zipped file

Sometimes we need to get the record count of a file. For that we can use the wc -l command with the file name. In some situations the file will be in compressed format, and wc -l will not work directly on zipped files. In this case we can zcat the file and pipe its output to wc -l. Example: let's say we have a file cricketData.dat.gz. To get the record count from the file, use: zcat cricketData.dat.gz | wc -l This prints the number of lines in the uncompressed data, which is the record count when each record sits on its own line.
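If zcat is not available, or the count is needed inside a script, roughly the same thing can be done with Python's standard gzip module. This is only a minimal sketch: the function name count_records is made up for illustration, and it assumes one record per line, just like wc -l.

```python
import gzip


def count_records(path):
    """Count newline characters in a gzip-compressed file, like `zcat file | wc -l`."""
    count = 0
    # Read the decompressed stream in 1 MiB chunks so large files
    # never have to fit in memory at once.
    with gzip.open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            count += chunk.count(b"\n")
    return count
```

Calling count_records("cricketData.dat.gz") should return the same number as the zcat pipeline. Like wc -l, it counts newline characters, so a last line without a trailing newline is not counted.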

Excel: How to pad zeros

Today I got a requirement to format a number in an Excel cell: left-pad the number with zeros. I find the following function very useful for this. Case one: to left-pad the number with zeros to four digits, use the formula =TEXT(A1,"0000") Case two: to left-pad with zeros and also show two decimal places, use the formula =TEXT(A2,"0000.00")
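For comparison, the same two padding cases can be sketched outside Excel with Python's built-in format function. The helper names pad_integer and pad_decimal are hypothetical, and the sketch assumes non-negative numbers; a minus sign would take up one of the padded positions.

```python
def pad_integer(value, width=4):
    """Left-pad a whole number with zeros, similar to =TEXT(A1,"0000")."""
    # Assumes value is already a whole number; int() would truncate a fraction.
    return format(int(value), f"0{width}d")


def pad_decimal(value, int_digits=4, decimals=2):
    """Left-pad with zeros and keep fixed decimals, similar to =TEXT(A2,"0000.00")."""
    # Total width = integer digits + decimal point + decimal digits.
    total_width = int_digits + 1 + decimals
    return format(value, f"0{total_width}.{decimals}f")


print(pad_integer(42))    # 0042
print(pad_decimal(42.5))  # 0042.50
```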

Scala

Scala is a programming language that combines the object-oriented and functional styles. Every value in Scala is treated as an object.