Sassy_lady
Obsidian | Level 7

 

Hello, I was given this question for review and the solution, but I had no idea how to begin this problem.

 

Questions

  1. When you write a new SAS program, I thought you had to begin with a LIBNAME statement or a DATA step. When do you begin with one or the other? The solution automatically goes into data steps and statements.
  2. If a data set is in the data folder, do you reference the work folder? I sometimes have trouble writing a SET statement if the data set is not listed in Sashelp or the other libraries.
  3. What is the CR. referencing in this solution?
  4. What does it mean when the question asks for unique columns? Sometimes the way SAS words the questions is confusing.
  5. It says values in the Country column must be US or AU, but I thought you would write a WHERE statement and place 'US' and 'AU' in quotation marks.
  6. Overall, I am a very concrete thinker, so if the question does not specify NLEVELS, FORMAT, TABLES, etc., I will exclude it from the code unless I can interpret what they are asking for.
  7. What is the best way to interpret what a question is saying so that I arrive at the solution they are asking for?

I do highlight keywords, reference my syntax sheet, and review old lessons, but every question is worded differently. Some questions give more details than others. I know I asked a lot of questions, but I would appreciate anyone's help.

Write a new SAS program that uses procedures to evaluate whether the data in the employee_raw table meets the following requirements:

  • Values in the EmpID column must be unique.
  • Values in the Country column should be either US or AU.
  • There are 17 unique department names.
  • If TermDate has a known value, it should be after HireDate.

 

  • Which value of EmpID occurs more than once?

  • How many rows in the Country column violate the data rules?

  • What is the name of the department that has the most employees?

  • How many rows have a known value for TermDate that is before HireDate?



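/* Frequency counts per column, most frequent values first; NLEVELS reports the number of distinct values */
proc freq data=cr.employee_raw order=freq nlevels;
    tables EmpID Country Department;
run;

/* List rows where TermDate is known (not missing) but earlier than HireDate */
proc print data=cr.employee_raw;
    where TermDate ne . and HireDate>TermDate;
    format salary dollar10. TermDate HireDate BirthDate date9.;
run;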
ballardw
Super User

Generic comment:

If you do not specify a library with a data set name, SAS treats it as a one-level name and, by default, looks for (or creates) the data set in the WORK library. This default for one-level names can be changed, but that's pretty advanced and not commonly done.
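
As a small illustration of one-level versus two-level names, here is a sketch that assumes the SASHELP.CLASS sample table that ships with SAS is available; class_copy is just an illustrative name:

```sas
/* One-level name: by default SAS stores this as WORK.CLASS_COPY */
data class_copy;
    set sashelp.class;   /* two-level name: library SASHELP, data set CLASS */
run;

/* The same data set, referenced by its full two-level name */
proc print data=work.class_copy (obs=5);
run;
```

Both steps refer to the same data set, because a name without a library prefix is resolved in WORK unless that default has been changed.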

You can create a data set without referencing anything outside of a data step. Example:

data junk;
   x=3;
run;

Trivial, but if the variables are created and values assigned in the code, then you do not need to "set" anything from another data set.

Also data steps can read external files.
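
A minimal sketch of a data step reading an external file, with no SET statement involved. To keep it self-contained it first writes a small temporary file; the fileref demo, the data set name from_file, and the two sample records are made up for illustration:

```sas
/* Temporary fileref so the example does not depend on a real path */
filename demo temp;

/* Write two comma-separated records to the temporary file */
data _null_;
    file demo;
    put "101,US";
    put "102,AU";
run;

/* Read the external file into a data set; no SET statement is needed */
data work.from_file;
    infile demo dlm=',';
    input EmpID Country $;
run;
```

In practice the INFILE statement would usually point at an existing file rather than one created in the same program.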

 

The remainder of your questions reference something, and it is not obvious what that may be. You should phrase questions so that anything else needed is easy to find.

For example, the question: What is the CR. referencing in this solution? Are you asking about the code here:

proc freq data=cr.employee_raw order=freq nlevels ;

If you do not recognize CR. as a library name then you need to go back to earlier classes as that should have been one of the very earliest items in your training. If you hope to take one of the certification exams just following the review questions then be prepared for a rough road.
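
For reference, a libref such as CR is assigned with a LIBNAME statement before it is used. This is only a sketch: the path below is a placeholder, not the actual location of the course data.

```sas
/* Hypothetical path: replace with the folder that holds the course data */
libname cr "/path/to/course/data";

/* List the data sets in the CR library to confirm employee_raw is there */
proc contents data=cr._all_ nods;
run;
```

Once the libref is assigned, cr.employee_raw means "the data set employee_raw in the library CR".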

 

 

Another instance, your question: What does it mean when the question asks for unique columns?

There is nothing about "unique columns" that I see. I see unique values. At which point you are at the meaning of the word "unique" applied to a single variable.
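
One way to see what "unique values" means in practice is to count how often each EmpID occurs and list any value that occurs more than once. A minimal sketch, assuming the cr library is already assigned; work.empid_counts is an illustrative name:

```sas
/* Count how many times each EmpID value occurs, without printing the full table */
proc freq data=cr.employee_raw noprint;
    tables EmpID / out=work.empid_counts;
run;

/* Any EmpID listed here appears more than once, so it is not unique */
proc print data=work.empid_counts;
    where count > 1;
run;
```

If the second step prints no rows, every EmpID value is unique.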

 

These all relate to "evaluate whether the data in the employee_raw table meets the following requirements:".

Which means use one or more procedures to see if the following statements are true or not.

 

 

 

Sassy_lady
Obsidian | Level 7

Hello Ballardw,

 

I'll try to word my questions better next time. Well, if I have a long road ahead, I'll review the previous video lessons. Thank you for your help.

tarheel13
Rhodochrosite | Level 12

Unique means no duplicates, so the same ID should not appear twice in your dataset. PROC FREQ is fine for checking whether every value in the Country column is either US or AU. CR is the name of the library that the dataset is stored in. Yes, you do have to put US and AU in quotes if you're using them in a WHERE statement.
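
A sketch of the WHERE approach, assuming the cr library is already assigned. Character comparisons are case-sensitive, so this also lists lowercase values such as 'us' (and missing values); whether those count as violations depends on how the exercise defines the rule.

```sas
/* List rows whose Country value is anything other than exactly 'US' or 'AU' */
proc print data=cr.employee_raw;
    where Country not in ('US', 'AU');
    var EmpID Country;
run;
```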

