If I skip the earlier step in the course that loads the data into HDFS, I get the same error when running that Pig program; once I run that earlier step, the Pig program works. That is likely the issue you are having. Try these steps:
- Open mRemoteNG from the desktop icon.
- In the connections panel at the top left of the mRemoteNG application, double-click the student@HadoopClient connection.
- On the Linux command line for that connection, submit this command:
hdfs dfs -put /workshop/dihdm/data /user/student/dihdm
When you execute that command, you may (or may not) see messages saying that some files or directories already exist; that is OK.
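If you want to double-check that the data landed in HDFS before running the Pig program, you can list the target directory. This is just a quick sanity check; the exact file names in the listing will depend on the workshop data set:

hdfs dfs -ls /user/student/dihdm

If that returns a listing of files rather than an error such as "No such file or directory", the -put step worked.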
Once the above is complete, you should be able to execute that Pig program (and the others). Repeat these steps whenever you start up a new Fresh image instead of starting from a saved image.
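For reference, once the data is in place, re-running the Pig script is just a matter of invoking pig with the script file. The script name below (wordcount.pig) is only a placeholder, so substitute the name of the script from the course exercise:

pig wordcount.pig

By default this runs in MapReduce mode against the cluster, so it will read the data you just uploaded to /user/student/dihdm in HDFS.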