melissagodfrey
Fluorite | Level 6

Proposed steps for Question1. 

data parks monuments;
    set pg1.np_summary;
    where type in ('NM', 'NP');
    Campers=sum(OtherCamping, RVCampers, TentCampers, BackcountryCampers);
    format campers comma20.;
    length Parktype $ 8;
    if type='NP' then do parktype='Park';
        output parks;
    end;
    else if type='NM' then do parktype='Monument';
        output monuments;
    end;
run;

I don't understand why we would use the code in pink. It made more sense to me to use the code in red below and not include the WHERE TYPE IN statement. Also, I included my answer for Question 2. Instead of having to figure out exactly how many digits the sum will contain, is it OK to simply use a high number (20) and see if any data is missing?

 

My code: 

data parks monuments;
    set PG1.NP_SUMMARY;
    campers=sum(OtherCamping, RVCampers, TentCampers, BackcountryCampers);
    format campers comma20.;
    length Parktype $ 8;
    if Type='NP' then do Parktype='Park';
        output parks;
    end;
    else if type='NM' then do ParkType='Monument';
        output monuments;
    end;
    keep Reg ParkName DayVisits OtherCamping Campers ParkType;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

A few comments ....

 

The code in red works (as you have probably seen by now), but would look strange to most programmers.  The suggestion by @novinosrin is the more common way of specifying a DO group.

 

Why use the code in pink?  It is mainly a matter of efficiency.  Consider the contents of the incoming data.  Maybe that data:

 

  • contains 5,000 variables
  • contains 1,000 different values for TYPE

In that case, much of the work of the DATA step would be reading in data that we never need.  The WHERE statement lets SAS check each observation against the condition before bringing it into the DATA step, so observations with other TYPE values are never read in, which saves time.
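
To make the difference concrete, here is a minimal sketch that contrasts WHERE with a subsetting IF. It uses sashelp.class and the SEX variable as a stand-in for pg1.np_summary and TYPE, and the output table names (girls_where, girls_if) are made up for illustration. Both steps should produce the same rows; the notes in the log about how many observations were read are where the difference shows up.

```sas
/* WHERE: the condition is checked before a row enters the DATA step */
data girls_where;
    set sashelp.class;
    where sex = 'F';
run;

/* Subsetting IF: every row is read, then rows failing the test are dropped */
data girls_if;
    set sashelp.class;
    if sex = 'F';
run;

/* Confirm that both approaches produce the same table */
proc compare base=girls_where compare=girls_if;
run;
```

On a table as small as sashelp.class the timing difference is negligible; the saving only becomes noticeable when most of the incoming rows are ones the step never needs.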


7 REPLIES
novinosrin
Tourmaline | Level 20
if Type='NP' then
    do;
        Parktype='Park';
        output parks;
    end;
else if type='NM' then
    do;
        ParkType='Monument';
        output monuments;
    end;
Reeza
Super User

I don't understand why we would use the code in pink. It made more sense to me to use the code in red below and not include the WHERE TYPE IN statement.

For this trivial example it likely doesn't matter, but leaving out the WHERE statement is less efficient. If you'll only be working with two types, reducing your data set to just those two types means the step will run faster, since you have fewer records to process.

 

 

Also, I included my answer for question2. Instead of having to figure out exactly how many digits the sum will contain, is it ok to simply use a high number (20) and see if any data is missing?

Yes, this is fine and a common thing to do. 
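
As a small illustration of why a generous width is safe: a format only controls how a value is displayed, not what is stored. The sketch below uses a made-up value (1234567) and a hypothetical second variable (doubled). With a width that is too small, SAS typically falls back to a different representation and writes a note to the log, but the stored number is unchanged.

```sas
data _null_;
    campers = 1234567;          /* made-up value for illustration */
    put campers= comma20.;      /* generous width: commas are shown */
    put campers= comma5.;       /* width too small for the full value */
    doubled = campers * 2;      /* arithmetic uses the stored value, not the formatted text */
    put doubled= comma20.;
run;
```

Checking the log after a run like this is a quick way to see whether a chosen width was too small for any value.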

 

As mentioned before, both pieces of code are incorrect and I'd expect you to get syntax errors. You have END statements but no DO statements that match. If you have a DO statement with something after DO, then it doesn't enter a DO/END situation, it does that command and the DO is finished. Review the documentation for a DO/END and DO statements.

melissagodfrey
Fluorite | Level 6
Thanks for the response.

I got 0 syntax errors. The first piece of code is the answer from SAS, and the second is my response. Both yielded exactly the same results.
novinosrin
Tourmaline | Level 20

Hi @Reeza, good afternoon. The reason there aren't any syntax errors is that SAS treats the DO statement as an iterative DO rather than as the start of a conditional DO group. Not needed, but funny.

 

Example:

 


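/* Iterative DO with one value: gender is the index variable, */
/* so the loop body runs once for each female row             */

data w;
    set sashelp.class;
    if sex='F' then do gender='Female';
    end;
run;

/* Iterative DO with two values: the body runs twice for each */
/* female row, so OUTPUT writes two rows per female           */

data w;
    set sashelp.class;
    if sex='F' then do gender='Female', "fem";
        output;
    end;
run;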
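
To see why the original poster got identical results from both styles, here is a minimal sketch that runs a simple DO group and a one-value iterative DO side by side and compares the outputs. The table names (simple_do, iterative_do) are made up for illustration, and sashelp.class stands in for the course data.

```sas
/* Simple DO group: DO is followed directly by a semicolon */
data simple_do;
    set sashelp.class;
    length gender $ 6;
    if sex='F' then do;
        gender='Female';
        output;
    end;
run;

/* Iterative DO over a one-item value list: gender is the loop index */
data iterative_do;
    set sashelp.class;
    length gender $ 6;
    if sex='F' then do gender='Female';
        output;
    end;
run;

/* The two tables should match, since the loop body runs exactly once */
proc compare base=simple_do compare=iterative_do;
run;
```

The two forms only diverge once the value list holds more than one item, as in the second example above.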
