BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Jianan_luna
Obsidian | Level 7

I am currently studying course Programming 2. In lesson (summarizing data), level 1 practice. I make sense the answer, but I am confused about question 2. When I write code as following:

 

proc sort data=pg2.np_yearlyTraffic   
          out=sortedTraffic(keep=ParkType ParkName Location Count);
    by ParkType ParkName;
run;

proc sort data=pg2.np_yearlyTraffic   
          out=sortedTraffic(keep=ParkType ParkName Location Count);
    by ParkType ParkName;
run;

 SAS sorted column Parktype, but when I check output data, there are only 2 type of park in ParkType column. Actually, there are 5 parktypes. I am super confused about it. Can you please help me figure it out?

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Cynthia_sas
Diamond | Level 26

Hi:

  I would recommend trying a PROC FREQ after your PROC SORT. When I do this:

Cynthia_sas_0-1595096565374.png

 

I do see that there are 5 ParkType values in the sorted data.

So then, when I run the next step for question #2:

Cynthia_sas_1-1595096663766.png

I get 478 rows and 2 columns in the output table work.TypeTraffic to answer #4.

 

  Next, for question #4, I do THIS:

Cynthia_sas_2-1595096988290.png

And then in my output (when I do a PROC PRINT), I see 5 rows, one for each value of ParkType, as shown in the PROC FREQ:

Cynthia_sas_3-1595097151218.png

And the answer to #5 is shown in the PROC PRINT results as highlighted above.

 

If you want to see the result of using BY group processing in a data step, modify the first program to save the values of first.ParkType and Last.ParkType, as shown below:

data TypeTraffic;
    set work.sortedTraffic;
    by ParkType;
    if first.ParkType=1 then TypeCount=0;
    TypeCount+Count;
	** save the values of the first. and last. variables;
	first_by = first.ParkType;
	last_by = last.ParkType;
    format typeCount comma12.;
    keep ParkType first_by last_by ParkName Count TypeCount;
run;
     
proc print data=TypeTraffic;
  where first_by = 1 or last_by = 1;
  var ParkType first_by last_by ParkName Count TypeCount;
run;

And then you'll see the beginning values for each park and the ending values for each park:

Cynthia_sas_4-1595098104687.png

 

If you want to see all the First.BY and Last.BY values on every row, then run the PROC PRINT without the WHERE statement.

 

I hope this helps you understand the first.and last. processing.

 

Cynthia

 

 

View solution in original post

4 REPLIES 4
Kurt_Bremser
Super User

First, follow Maxim 2 and read the log. You will see that PROC SORT keeps all observations (since you did not use the nodup option).

Next, follow Maxim 3 (Know Your Data), and inspect your source dataset. Run

proc freq data=pg2.np_yearlyTraffic;
tables parktype;
run;

to see the distribution of values in your variable.

 

Jianan_luna
Obsidian | Level 7

Thanks Sir, why do you use tables here rather than table? I try table again, it's noting different. Thanks.

Cynthia_sas
Diamond | Level 26

Hi:

  I would recommend trying a PROC FREQ after your PROC SORT. When I do this:

Cynthia_sas_0-1595096565374.png

 

I do see that there are 5 ParkType values in the sorted data.

So then, when I run the next step for question #2:

Cynthia_sas_1-1595096663766.png

I get 478 rows and 2 columns in the output table work.TypeTraffic to answer #4.

 

  Next, for question #4, I do THIS:

Cynthia_sas_2-1595096988290.png

And then in my output (when I do a PROC PRINT), I see 5 rows, one for each value of ParkType, as shown in the PROC FREQ:

Cynthia_sas_3-1595097151218.png

And the answer to #5 is shown in the PROC PRINT results as highlighted above.

 

If you want to see the result of using BY group processing in a data step, modify the first program to save the values of first.ParkType and Last.ParkType, as shown below:

data TypeTraffic;
    set work.sortedTraffic;
    by ParkType;
    if first.ParkType=1 then TypeCount=0;
    TypeCount+Count;
	** save the values of the first. and last. variables;
	first_by = first.ParkType;
	last_by = last.ParkType;
    format typeCount comma12.;
    keep ParkType first_by last_by ParkName Count TypeCount;
run;
     
proc print data=TypeTraffic;
  where first_by = 1 or last_by = 1;
  var ParkType first_by last_by ParkName Count TypeCount;
run;

And then you'll see the beginning values for each park and the ending values for each park:

Cynthia_sas_4-1595098104687.png

 

If you want to see all the First.BY and Last.BY values on every row, then run the PROC PRINT without the WHERE statement.

 

I hope this helps you understand the first.and last. processing.

 

Cynthia

 

 

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2191 views
  • 2 likes
  • 3 in conversation