Hortonworks Certification Tips and Guidelines
I completed this certification on Oct 24, 2014 with a passing score of 88%. I am sharing the experience I gained along the way, and I have listed all the materials I went through for this certification. Please get some sandbox-level hands-on experience with these topics before you appear for the examination.
Read all the answers to a question very carefully before you select the correct one, because they are a little tricky. Also, 90 minutes is more than enough for this exam; you can easily complete it in 45 to 60 minutes. All the best.
Certification 1 – Hortonworks Certified Apache Hadoop Developer (Pig and Hive)
Exam Format
- Registration Cost – $200/attempt (Unlimited attempts are allowed)
- The exam consists of approximately 50 multiple-choice questions. The exam is delivered in English.
- You have to answer 38 questions correctly (75%) to get certified
- Certification reference at Hortonworks: http://hortonworks.com/training/hadoop-2-0-developer-certification/
Course Curriculum
Objective 1.1 – HDFS and Hadoop 2.0
- Explain Hadoop 2.0 and YARN
- Explain how HDFS Federation works in Hadoop 2.0
- Explain the various tools and frameworks in the Hadoop 2.0 ecosystem
- Use the Hadoop client to input data into HDFS
- Using HDFS commands
Study materials I referred to for this objective:
- Hadoop: The Definitive Guide by Tom White (Chapters 2, 3)
- Apache Hadoop YARN by Arun C. Murthy (Chapters 3, 4)
- http://m.dummies.com/how-to/content/how-to-launch-a-yarnbased-application.html
- http://hortonworks.com/hadoop/yarn/
- http://hortonworks.com/blog/an-introduction-to-hdfs-federation/
- http://hortonworks.com/blog/namenode-high-availability-in-hdp-2-0/
- http://hortonworks.com/hadoop-tutorial/using-commandline-manage-files-hdfs/
- http://hortonworks.com/hadoop/hdfs/
Objective 2.1 – MapReduce and YARN
- Explain the architecture of MapReduce
- Run a MapReduce job on Hadoop
- Monitor a MapReduce job
Study materials I referred to for this objective:
- http://hortonworks.com/hadoop/mapreduce/
- http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_using-apache-hadoop/content/running_mapreduce_examples_on_yarn.html
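On the MapReduce architecture side, it helps to know how the default partitioner decides which reducer receives a key: Hadoop's HashPartitioner computes (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks. Below is a minimal Python sketch of that formula. It assumes ASCII keys hashed the way java.lang.String does; real jobs usually use Text keys, whose hashCode() is computed over the UTF-8 bytes and can give different numbers. The function names are my own, for illustration only.

```python
def java_string_hash(s):
    """Mimic java.lang.String.hashCode() as a signed 32-bit integer.

    Assumption: ASCII input, so each character is one UTF-16 code unit.
    """
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # wrap around like a Java int
    return h - 0x100000000 if h >= 0x80000000 else h


def hash_partition(key, num_reduce_tasks):
    """Apply the HashPartitioner formula to pick a reducer index."""
    # Masking with Integer.MAX_VALUE clears the sign bit, so the result
    # of the modulo is never negative.
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reduce_tasks


if __name__ == "__main__":
    for key in ["pig", "hive", "sqoop", "oozie"]:
        print(key, "-> reducer", hash_partition(key, 3))
```

The point to remember for the exam is that equal keys always land on the same reducer, and the reducer index depends only on the key's hash and the number of reduce tasks.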
Objective 3.1 – Pig
- Write a Pig script to explore and transform data in HDFS
- Define advanced Pig relations
- Use Pig to apply structure to unstructured Big Data
- Invoke a Pig User-Defined Function
- Compute Quartiles with Pig
- Explore data with Pig
- Split a dataset with Pig
- Join datasets with Pig
- Use Pig to prepare data for Hive
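For the join objective, it is worth being clear on what each join type returns before writing any Pig Latin (for example, JOIN a BY id LEFT OUTER, b BY id). The following is a small Python sketch of the inner, left outer, right outer and full outer semantics, not Pig code. The relation names and sample rows are made up for illustration.

```python
def join(left, right, how="inner"):
    """Join two lists of (key, value) tuples on the key.

    how is one of "inner", "left", "right", "full". Fields from the
    side without a match are filled with None, like nulls in Pig.
    """
    out = []
    matched_right = set()
    for lk, lv in left:
        hit = False
        for i, (rk, rv) in enumerate(right):
            # A null (None) key never matches anything.
            if lk is not None and lk == rk:
                out.append((lk, lv, rk, rv))
                matched_right.add(i)
                hit = True
        if not hit and how in ("left", "full"):
            out.append((lk, lv, None, None))
    if how in ("right", "full"):
        for i, (rk, rv) in enumerate(right):
            if i not in matched_right:
                out.append((None, None, rk, rv))
    return out


# Hypothetical sample data: users (id, name) and orders (user_id, item).
users = [(1, "alice"), (2, "bob"), (3, "carol")]
orders = [(1, "book"), (1, "pen"), (4, "lamp")]

if __name__ == "__main__":
    for how in ("inner", "left", "right", "full"):
        print(how, join(users, orders, how))
```

Note how the inner join keeps only user 1 (twice, once per order), while the outer variants add the unmatched rows from one or both sides.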
Objective 4.1 – Hive and HCatalog
- Write a Hive query
- Understand how Hive tables are defined and implemented
- Use Hive to run SQL-like queries to perform data analysis
- Perform a multi-table select in Hive
- Design a proper schema for Hive
- Explain the uses and purpose of HCatalog
- Use HCatalog with Pig and Hive
- Computing ngrams with Hive
- Analyzing Big Data with Hive
- Understanding MapReduce in Hive
- Joining datasets with Hive
- Streaming data with Hive and Python
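To get a feel for the ngrams objective, here is a rough Python sketch of what a query such as ngrams(sentences(lower(text)), 2, 50) computes: the text is split into sentences, each sentence into words, and the most frequent word pairs are returned. This is only an illustration with exact counts; Hive's ngrams() returns estimated frequencies, and its tokenization rules are not reproduced exactly here. The helper names are my own.

```python
import re
from collections import Counter


def sentences(text):
    """Split text into sentences, each a list of lowercase words."""
    result = []
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = re.findall(r"[a-z0-9']+", sentence)
        if words:
            result.append(words)
    return result


def top_ngrams(text, n=2, k=50):
    """Return the k most frequent n-grams as ((word, ...), count) pairs."""
    counts = Counter()
    for words in sentences(text):
        # n-grams never cross a sentence boundary.
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts.most_common(k)


if __name__ == "__main__":
    sample = "Pig and Hive. Pig and Hive run on Hadoop. Hive and Pig."
    for gram, count in top_ngrams(sample, n=2, k=3):
        print(" ".join(gram), count)
```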
Study materials I referred to for objectives 3.1 and 4.1:
- Programming Pig by Alan Gates (Chapters 4, 5, 6)
- http://datafu.incubator.apache.org/docs/datafu/guide/statistics.html
- Programming Hive by Jason Rutherglen, Dean Wampler, Edward Capriolo (Chapters 3, 4, 5, 6, 14, 15)
- https://cwiki.apache.org/confluence/display/Hive/StatisticsAndDataMining
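For the "Streaming data with Hive and Python" objective, the key idea is that Hive's TRANSFORM clause pipes each row to the script's standard input as one tab-separated line and reads tab-separated lines back from standard output. A minimal sketch of such a script is below. The two-column layout (user_id, name) and the upper-casing logic are assumptions made for illustration.

```python
import io
import sys


def transform_row(line):
    """Turn one tab-separated input row into one output row."""
    user_id, name = line.rstrip("\n").split("\t")
    return f"{user_id}\t{name.upper()}"


def stream(rows, out):
    """Read rows from a file-like object and write transformed rows."""
    for line in rows:
        out.write(transform_row(line) + "\n")


# Local demo with in-memory data, so the sketch can be tried without Hive.
demo_out = io.StringIO()
stream(io.StringIO("1\talice\n2\tbob\n"), demo_out)
print(demo_out.getvalue(), end="")

# When the script is run by Hive, the demo above would be replaced by:
#     stream(sys.stdin, sys.stdout)
```

Assuming the script is saved as upper_name.py (a hypothetical name), it would be registered with ADD FILE upper_name.py; and called with something like SELECT TRANSFORM(user_id, name) USING 'python upper_name.py' AS (user_id, name) FROM users; where the table and column names are placeholders.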
Objective 5.1 – Hadoop Tools
- Use Sqoop to transfer data between Hadoop and a relational database
- Using Sqoop to transfer data between HDFS and a RDBMS
- Using HCatalog with Pig
- Define a workflow using Oozie
Study materials I referred to for the above objective:
- http://hortonworks.com/hadoop/sqoop/
- http://hortonworks.com/hadoop-tutorial/import-microsoft-sql-server-hortonworks-sandbox-using-sqoop/
- http://hortonworks.com/hadoop/oozie/
- https://cwiki.apache.org/confluence/display/Hive/HCatalog+UsingHCat
Additional Oozie References:
Focus More on:
- Functional flow of YARN: Client -> Resource Manager -> Application Master -> Containers.
- hadoop fs -get (-copyToLocal), -put (-copyFromLocal), and -cat commands
- Pig JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN (Please find the attachment below)
- Pig Replicated, Merge, Skew Joins
- REGISTER and INVOKE Pig UDFs
- Hive managed vs. external tables, and when their files are moved to the /apps/hive/warehouse directory.
- HCatLoader, HCatStorer
- Hive SORT BY
- MR Default Partitioner
- WebHDFS OPEN, CREATE, MKDIRS, LISTSTATUS, GETFILESTATUS operations
- Hive ngrams
- HDFS Federation vs DataNodes vs Namenode Failures.
- Get a clear programming-level understanding of Pig and Hive scripts
- Apache HCatalog
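Of the items above, the WebHDFS operations are the easiest to mix up, so here is a small Python sketch that maps each one to its HTTP method and URL. WebHDFS calls have the form http://<host>:<port>/webhdfs/v1/<path>?op=<OPERATION>; OPEN, LISTSTATUS and GETFILESTATUS use GET, while CREATE and MKDIRS use PUT. The sketch only builds the request and does not send it. The host name, port and paths are placeholders.

```python
from urllib.parse import quote, urlencode

# HTTP method used by each WebHDFS operation listed above.
HTTP_METHOD = {
    "OPEN": "GET",
    "GETFILESTATUS": "GET",
    "LISTSTATUS": "GET",
    "MKDIRS": "PUT",
    "CREATE": "PUT",
}


def webhdfs_request(host, port, path, op, **params):
    """Return (http_method, url) for a WebHDFS operation."""
    op = op.upper()
    method = HTTP_METHOD[op]
    query = urlencode({"op": op, **params})
    return method, f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


if __name__ == "__main__":
    # Hypothetical NameNode address and HDFS paths.
    host, port = "namenode.example.com", 50070
    print(webhdfs_request(host, port, "/user/hdp/data.txt", "open"))
    print(webhdfs_request(host, port, "/user/hdp", "liststatus"))
    print(webhdfs_request(host, port, "/user/hdp/new_dir", "mkdirs"))
    print(webhdfs_request(host, port, "/user/hdp/out.txt", "create", overwrite="true"))
```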
Some More Notables:
1. Pig picks the file from the current working directory
2. Pig SAMPLE command
3. Top 50 bigrams
4. HCatLoader and HCatStorer
5. NameNode Federation
6. YARN life cycle
I am sure you will be able to follow my guidelines for this Pig and Hive certification. Wish you all the best, and congratulations in advance.
Hi,
Thank you very much for the info. I am preparing for this exam. It would be of great help if you could suggest me some simulator or dumps too. Once again thank you very much for providing precise info.
Hortonworks Sandbox on Oracle VM. Sample questions will b there once u register for the exam. Also all topics have some hands on exercises.
Sample questions not visible even after registration
No sample questions are included in the blog.
Also, let me know if any more formal training is necessary or just following your method would suffice.
Formal training is not required, but you need a good understanding of the entire Hadoop ecosystem and YARN.
Giri,
Thank you very much for the quick response. I would just ask one more question about the hands-on part. Which areas should I practice hands-on using the VM?
MR, Pig, Hive, Flume, Sqoop, Oozie and hadoop fs commands.
Nice post ! Congratulations ! I’m preparing for this exam.
Best Regards
Marco Garcia
http://www.cetax.com.br
Hi Giri, I have registered for my exam, but I am not able to see sample questions anywhere after registration on the Hortonworks site or the Kryterion site. One of your comments says that “Sample questions will b there once u register for the exam”. Where can I get these sample questions? It would give us confidence if we at least knew what kind of questions will be asked. Thanks for the help!
If you go to the exam registration website, you can see hdp developer exam sample questions. Just browse through the exam registration link..u will find it.
Thanks for your quick response. But I am still not able to find them even in registration website . I have registered for Hortonworks Certified Apache Hadoop 2.x Developer. Will you be able to send an email with specifics as I have the exam registered for tomorrow. Any help will be really appreciated.
I don’t have the specifics on the sample questions. Actually they don’t have samples for 2.0. It is there for 1.x. However if you have gone through the topics in the list I have given, u should be good for the exam.
Looks like they have completely changed the pattern and it is not going to be MCQs anymore. Are you aware of any such change?
Yes.
Saurabh/ Giri- Can you please provide some more details about the new question pattern for Hortonworks developer (pig and hive) certification?
Hi Himadri,
Please review this page http://hortonworks.com/training/class/hdp-certified-developer-hdpcd-exam/
I have not taken the exam yet.
That was extensive, Giri. Please share a similar guide for the administrator certification as well.