I am currently taking the Programming 2 course and working on the Level 1 practice in the Summarizing Data lesson. I understand the answers overall, but I am confused about question 2. I wrote the following code:
proc sort data=pg2.np_yearlyTraffic
          out=sortedTraffic(keep=ParkType ParkName Location Count);
    by ParkType ParkName;
run;
SAS sorted the ParkType column, but when I check the output data, I see only 2 park types in the ParkType column, even though there are actually 5. I am very confused about this. Can you please help me figure it out?
Hi:
I would recommend trying a PROC FREQ after your PROC SORT. When I do this:
I do see that there are 5 ParkType values in the sorted data.
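The code for this check was not preserved in the post. A minimal sketch of what such a check might look like, assuming the sorted table is work.sortedTraffic as created by the PROC SORT in the question (the NOCUM and NOPERCENT options are only there to keep the output short):

```sas
/* Count the rows for each distinct ParkType in the sorted table. */
/* Assumes work.sortedTraffic was created by the PROC SORT above. */
proc freq data=work.sortedTraffic;
    tables ParkType / nocum nopercent;
run;
```

If the frequency table lists 5 values, the sorted table contains all 5 park types, even if only some of them are visible in the rows you happened to browse.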
So then, when I run the next step for question #2:
I get 478 rows and 2 columns in the output table work.TypeTraffic to answer #4.
Next, for question #4, I do THIS:
And then in my output (when I do a PROC PRINT), I see 5 rows, one for each value of ParkType, as shown in the PROC FREQ:
And the answer to #5 is shown in the PROC PRINT results as highlighted above.
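The program for this step was also not preserved. One possible way to end up with a single row per park type, sketched here under the assumption that the goal is to keep only the final accumulated total for each ParkType, is a subsetting IF on last.ParkType:

```sas
/* Sketch only: assumes work.sortedTraffic is sorted by ParkType. */
data TypeTraffic;
    set work.sortedTraffic;
    by ParkType;
    /* Reset the running total at the start of each BY group. */
    if first.ParkType then TypeCount = 0;
    /* Sum statement: TypeCount is retained from row to row. */
    TypeCount + Count;
    /* Subsetting IF: output only the last row of each BY group. */
    if last.ParkType;
    format TypeCount comma12.;
    keep ParkType TypeCount;
run;

proc print data=TypeTraffic;
run;
```

Because only the last row of each BY group is written, the output has one row per ParkType value.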
If you want to see the result of using BY-group processing in a DATA step, modify the first program to save the values of first.ParkType and last.ParkType, as shown below:
data TypeTraffic;
    set work.sortedTraffic;
    by ParkType;
    /* Reset the running total at the start of each ParkType group. */
    if first.ParkType=1 then TypeCount=0;
    TypeCount+Count;
    /* Save the values of the first. and last. variables, */
    /* which are otherwise dropped from the output table. */
    first_by = first.ParkType;
    last_by = last.ParkType;
    format TypeCount comma12.;
    keep ParkType first_by last_by ParkName Count TypeCount;
run;

/* Show only the first and last row of each ParkType group. */
proc print data=TypeTraffic;
    where first_by = 1 or last_by = 1;
    var ParkType first_by last_by ParkName Count TypeCount;
run;
And then you'll see the first row and the last row for each park type:
If you want to see all the First.BY and Last.BY values on every row, then run the PROC PRINT without the WHERE statement.
I hope this helps you understand first. and last. processing.
Cynthia
First, follow Maxim 2 and read the log. You will see that PROC SORT keeps all observations (since you did not use the nodup option).
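Besides reading the log, the row counts can also be compared directly. A small sketch, assuming the sorted table is work.sortedTraffic (the column aliases n_source and n_sorted are just illustrative names):

```sas
/* Compare the number of rows before and after sorting. */
proc sql;
    select count(*) as n_source from pg2.np_yearlyTraffic;
    select count(*) as n_sorted from work.sortedTraffic;
quit;
```

If the two counts match, the sort did not drop any observations.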
Next, follow Maxim 3 (Know Your Data), and inspect your source dataset. Run
proc freq data=pg2.np_yearlyTraffic;
    tables parktype;
run;
to see the distribution of values in your variable.
Thanks, Sir. Why do you use TABLES here rather than TABLE? I tried TABLE as well, and the result was no different. Thanks.
@Jianan_luna wrote:
Thanks, Sir. Why do you use TABLES here rather than TABLE? I tried TABLE as well, and the result was no different. Thanks.
In the documentation, the statement is TABLES.
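As the poster observed, SAS appears to accept both spellings. The sketch below runs the same request both ways so the results can be compared side by side. To my knowledge TABLE is treated as an alias for TABLES in PROC FREQ, but the documented form is TABLES:

```sas
/* Documented form of the statement. */
proc freq data=pg2.np_yearlyTraffic;
    tables ParkType;
run;

/* Singular spelling, which the poster reports gives the same result. */
proc freq data=pg2.np_yearlyTraffic;
    table ParkType;
run;
```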