In the last practice of Lesson 5, I ran into trouble with the Challenge Practice.
My problem is that even with the solution code, I cannot easily find the answer to Question 2; it actually seems harder than without the IDGROUP option. I can't figure out what the IDGROUP option contributes to this question.
Below is the complete problem and the solution they give (without any modification).
1. Create a new program. Write a PROC MEANS step to analyze rows from pg1.np_multiyr and create a table named top3parks with the following attributes:
- Suppress the display of the PROC MEANS report.
- Analyze Visitors grouped by Region and Year.
- Drop the _FREQ_ and _TYPE_ columns from top3parks and keep only the rows that are a result of a combination of Region and Year.
- Create a column for TotalVisitors in the output table.
- In the output table, include the top three parks in terms of the number of visitors. Automatically resolve conflicts in the column names when the names are assigned to the new columns in the output table.
Note: Use SAS Help to learn about the IDGROUP option in the OUTPUT statement.
- Submit the program and view the output data.
2. Which park has the highest value of TotalVisitors?
Solution:
proc means data=pg1.np_multiyr noprint;
var Visitors;
class Region Year;
ways 2;
output out=top3list(drop=_freq_ _type_)
sum=TotalVisitors /*sum total visitors*/
idgroup(max(Visitors) /*find the max of visitors*/
out[3] /*top 3*/
(Visitors ParkName)=); /*output columns for top 3 parks*/
run;
Answer:
Golden Gate National Recreation Area
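For reference, here is a minimal sketch of one way to read the answer out of the output table. It assumes the solution above has been run and that IDGROUP with out[3] named the new columns Visitors_1 to Visitors_3 and ParkName_1 to ParkName_3 (check the actual column names in top3list); the table name top3sorted is hypothetical.

```sas
/* Sketch only: assumes top3list exists from the solution step above */
/* and contains the columns TotalVisitors, ParkName_1 and Visitors_1. */
proc sort data=top3list out=top3sorted; /* top3sorted is a made-up name */
    by descending TotalVisitors;        /* largest Region/Year total first */
run;

/* Print only the first row: the Region/Year combination with the    */
/* highest TotalVisitors, plus the top park recorded for that group. */
proc print data=top3sorted(obs=1);
    var Region Year TotalVisitors ParkName_1 Visitors_1;
run;
```

Under these assumptions, TotalVisitors is a sum over a whole Region and Year combination, while ParkName_1 is the park with the largest Visitors value inside that combination, so the two columns have to be read together.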