BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Jianan_luna
Obsidian | Level 7

I am currently studying course Programming 2. In lesson (summarizing data), level 1 practice. I make sense the answer, but I am confused about question 2. When I write code as following:

 

proc sort data=pg2.np_yearlyTraffic   
          out=sortedTraffic(keep=ParkType ParkName Location Count);
    by ParkType ParkName;
run;

proc sort data=pg2.np_yearlyTraffic   
          out=sortedTraffic(keep=ParkType ParkName Location Count);
    by ParkType ParkName;
run;

 SAS sorted column Parktype, but when I check output data, there are only 2 type of park in ParkType column. Actually, there are 5 parktypes. I am super confused about it. Can you please help me figure it out?

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Cynthia_sas
SAS Super FREQ

Hi:

  I would recommend trying a PROC FREQ after your PROC SORT. When I do this:

Cynthia_sas_0-1595096565374.png

 

I do see that there are 5 ParkType values in the sorted data.

So then, when I run the next step for question #2:

Cynthia_sas_1-1595096663766.png

I get 478 rows and 2 columns in the output table work.TypeTraffic to answer #4.

 

  Next, for question #4, I do THIS:

Cynthia_sas_2-1595096988290.png

And then in my output (when I do a PROC PRINT), I see 5 rows, one for each value of ParkType, as shown in the PROC FREQ:

Cynthia_sas_3-1595097151218.png

And the answer to #5 is shown in the PROC PRINT results as highlighted above.

 

If you want to see the result of using BY group processing in a data step, modify the first program to save the values of first.ParkType and Last.ParkType, as shown below:

data TypeTraffic;
    set work.sortedTraffic;
    by ParkType;
    if first.ParkType=1 then TypeCount=0;
    TypeCount+Count;
	** save the values of the first. and last. variables;
	first_by = first.ParkType;
	last_by = last.ParkType;
    format typeCount comma12.;
    keep ParkType first_by last_by ParkName Count TypeCount;
run;
     
proc print data=TypeTraffic;
  where first_by = 1 or last_by = 1;
  var ParkType first_by last_by ParkName Count TypeCount;
run;

And then you'll see the beginning values for each park and the ending values for each park:

Cynthia_sas_4-1595098104687.png

 

If you want to see all the First.BY and Last.BY values on every row, then run the PROC PRINT without the WHERE statement.

 

I hope this helps you understand the first.and last. processing.

 

Cynthia

 

 

View solution in original post

4 REPLIES 4
Kurt_Bremser
Super User

First, follow Maxim 2 and read the log. You will see that PROC SORT keeps all observations (since you did not use the nodup option).

Next, follow Maxim 3 (Know Your Data), and inspect your source dataset. Run

proc freq data=pg2.np_yearlyTraffic;
tables parktype;
run;

to see the distribution of values in your variable.

 

Jianan_luna
Obsidian | Level 7

Thanks Sir, why do you use tables here rather than table? I try table again, it's noting different. Thanks.

Cynthia_sas
SAS Super FREQ

Hi:

  I would recommend trying a PROC FREQ after your PROC SORT. When I do this:

Cynthia_sas_0-1595096565374.png

 

I do see that there are 5 ParkType values in the sorted data.

So then, when I run the next step for question #2:

Cynthia_sas_1-1595096663766.png

I get 478 rows and 2 columns in the output table work.TypeTraffic to answer #4.

 

  Next, for question #4, I do THIS:

Cynthia_sas_2-1595096988290.png

And then in my output (when I do a PROC PRINT), I see 5 rows, one for each value of ParkType, as shown in the PROC FREQ:

Cynthia_sas_3-1595097151218.png

And the answer to #5 is shown in the PROC PRINT results as highlighted above.

 

If you want to see the result of using BY group processing in a data step, modify the first program to save the values of first.ParkType and Last.ParkType, as shown below:

data TypeTraffic;
    set work.sortedTraffic;
    by ParkType;
    if first.ParkType=1 then TypeCount=0;
    TypeCount+Count;
	** save the values of the first. and last. variables;
	first_by = first.ParkType;
	last_by = last.ParkType;
    format typeCount comma12.;
    keep ParkType first_by last_by ParkName Count TypeCount;
run;
     
proc print data=TypeTraffic;
  where first_by = 1 or last_by = 1;
  var ParkType first_by last_by ParkName Count TypeCount;
run;

And then you'll see the beginning values for each park and the ending values for each park:

Cynthia_sas_4-1595098104687.png

 

If you want to see all the First.BY and Last.BY values on every row, then run the PROC PRINT without the WHERE statement.

 

I hope this helps you understand the first.and last. processing.

 

Cynthia

 

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1483 views
  • 2 likes
  • 3 in conversation