I am currently taking the Programming 2 course and working on the Level 1 practice in the Summarizing Data lesson. I understand the answer, but I am confused about question 2. When I write the following code:
proc sort data=pg2.np_yearlyTraffic
          out=sortedTraffic(keep=ParkType ParkName Location Count);
    by ParkType ParkName;
run;
SAS sorted the ParkType column, but when I check the output data, I see only 2 park types in the ParkType column. There are actually 5 park types. I am very confused by this. Can you please help me figure it out?
Hi:
I would recommend trying a PROC FREQ after your PROC SORT. When I do this:
I do see that there are 5 ParkType values in the sorted data.
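A check along these lines would show the distribution. This is only a sketch: it assumes the sorted table is work.sortedTraffic, as created by the PROC SORT in the question, and may not match the exact step that was run.

```sas
/* Count the rows for each ParkType value in the sorted table */
proc freq data=work.sortedTraffic;
    tables ParkType;
run;
```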
So then, when I run the next step for question #2:
I get 478 rows and 2 columns in the output table work.TypeTraffic to answer #4.
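For reference, a step of roughly this shape would produce a two-column table with a running total per park type. It is a sketch based on the fuller program later in this reply, with the KEEP list reduced to two variables as an assumption. It is not necessarily the exact practice solution.

```sas
data work.TypeTraffic;
    set work.sortedTraffic;
    by ParkType;
    /* Reset the accumulator at the start of each ParkType group */
    if first.ParkType=1 then TypeCount=0;
    /* Sum statement: adds Count and retains TypeCount across rows */
    TypeCount+Count;
    format TypeCount comma12.;
    /* Assumed KEEP list: only the two columns mentioned above */
    keep ParkType TypeCount;
run;
```

Because every input row is written out, the output has as many rows as work.sortedTraffic.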
Next, for question #4, I do THIS:
And then in my output (when I do a PROC PRINT), I see 5 rows, one for each value of ParkType, as shown in the PROC FREQ:
And the answer to #5 is shown in the PROC PRINT results as highlighted above.
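One way to get a single row per park type is a subsetting IF on last.ParkType. The sketch below assumes the same work.sortedTraffic input and the same two-variable KEEP list. It illustrates the technique and is not necessarily the exact code that was run.

```sas
data work.TypeTraffic;
    set work.sortedTraffic;
    by ParkType;
    if first.ParkType=1 then TypeCount=0;
    TypeCount+Count;
    /* Subsetting IF: write a row only for the last row of each group, */
    /* when TypeCount holds the full total for that ParkType           */
    if last.ParkType=1;
    format TypeCount comma12.;
    keep ParkType TypeCount;
run;

proc print data=work.TypeTraffic;
run;
```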
If you want to see the result of BY-group processing in a DATA step, modify the first program to save the values of first.ParkType and last.ParkType, as shown below:
data TypeTraffic;
    set work.sortedTraffic;
    by ParkType;
    if first.ParkType=1 then TypeCount=0;
    TypeCount+Count;
    ** save the values of the first. and last. variables;
    first_by = first.ParkType;
    last_by = last.ParkType;
    format typeCount comma12.;
    keep ParkType first_by last_by ParkName Count TypeCount;
run;
proc print data=TypeTraffic;
    where first_by = 1 or last_by = 1;
    var ParkType first_by last_by ParkName Count TypeCount;
run;
And then you'll see the first and last rows for each park type, with the values of Count and TypeCount at the start and end of each group:
If you want to see all the First.BY and Last.BY values on every row, then run the PROC PRINT without the WHERE statement.
I hope this helps you understand First. and Last. processing.
Cynthia
First, follow Maxim 2 and read the log. You will see that PROC SORT keeps all observations (since you did not use the nodup option).
Next, follow Maxim 3 (Know Your Data), and inspect your source dataset. Run
proc freq data=pg2.np_yearlyTraffic;
tables parktype;
run;
to see the distribution of values in your variable.
Thanks, Sir. Why do you use tables here rather than table? I tried table as well, and the result is no different. Thanks.
@Jianan_luna wrote:
Thanks, Sir. Why do you use tables here rather than table? I tried table as well, and the result is no different. Thanks.
In the documentation, the statement is TABLES.