melissagodfrey
Fluorite | Level 6

Proposed steps for Question1. 

data parks monuments;
    set pg1.np_summary;
    where type in ('NM', 'NP');
    Campers= sum(OtherCamping, RVCampers, TentCampers, BackcountryCampers);
    format campers comma20.;
    length Parktype $ 8;
    if type='NP' then do parktype= 'Park';
        output parks;
    end;
    else if type='NM' then do parktype='Monument';
        output monuments;
    end;
run;

I don't understand why we would use the code in pink (the WHERE statement). It made more sense to me to use the code in red below and leave out the WHERE TYPE IN statement. Also, I included my answer for Question 2: instead of having to figure out exactly how many digits the sum will contain, is it OK to simply use a high width (20) and check whether any data is missing?

 

My code: 

data parks monuments;
    set PG1.NP_SUMMARY;
    campers= sum(OtherCamping, RVCampers, TentCampers, BackcountryCampers);
    format campers comma20.;
    length Parktype $ 8;
    if Type= 'NP' then do Parktype='Park';
        output parks;
    end;
    else if type='NM' then do ParkType='Monument';
        output monuments;
    end;
    keep Reg ParkName DayVisits OtherCamping Campers ParkType;
run;

 

1 ACCEPTED SOLUTION

Astounding
Opal | Level 21

A few comments ....

 

The code in red works (as you have probably seen by now), but would look strange to most programmers.  The suggestion by @novinosrin is the more common way of specifying a DO group.

 

Why use the code in pink?  It is mainly a matter of efficiency.  Consider the contents of the incoming data.  Maybe that data:

 

  • contains 5,000 variables
  • contains 1,000 different values for TYPE

In that case, much of the work of the DATA step would be reading in data that we never need.  The WHERE statement lets SAS test each observation before it is read into the DATA step, so observations that fail the condition are never read in at all, which saves time.
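
As a rough illustration of that difference, here is a minimal sketch that uses sashelp.class as a stand-in for the course data (the output data set names are made up for this example). Both steps keep the same rows; the difference is where the filtering happens.

```sas
/* WHERE: the condition is applied before an observation is read */
/* into the DATA step, so non-matching rows are never read in.   */
data girls_where;
    set sashelp.class;
    where sex = 'F';
run;

/* Subsetting IF: every observation is read in first, and the    */
/* non-matching ones are discarded afterwards.                   */
data girls_if;
    set sashelp.class;
    if sex = 'F';
run;
```

Comparing the "observations read" notes in the log for the two steps should show the difference. One trade-off to keep in mind: WHERE can only test variables that already exist in the input data set, while a subsetting IF can also test variables computed inside the step.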


7 REPLIES
novinosrin
Tourmaline | Level 20
if Type= 'NP' then
    do;
        Parktype='Park';
        output parks;
    end;
else if type='NM' then
    do;
        ParkType='Monument';
        output monuments;
    end;
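
Placed in the full step from the question, that DO group might look like the sketch below. It assumes the pg1 libref from the course is already assigned, and it combines the WHERE statement from the first program with the KEEP statement from the second.

```sas
data parks monuments;
    set pg1.np_summary;
    where type in ('NM', 'NP');
    Campers = sum(OtherCamping, RVCampers, TentCampers, BackcountryCampers);
    format Campers comma20.;
    length ParkType $ 8;
    /* DO; with nothing after it starts a simple DO group */
    if type = 'NP' then do;
        ParkType = 'Park';
        output parks;
    end;
    else if type = 'NM' then do;
        ParkType = 'Monument';
        output monuments;
    end;
    keep Reg ParkName DayVisits OtherCamping Campers ParkType;
run;
```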
Reeza
Super User

I don't understand why we would use code in pink? It made more sense for me to do the below code in red and not include the where type in statement?

For this trivial example it likely doesn't matter; however, leaving out the WHERE statement is less efficient. If you'll only be working with two types, reducing your data set to just those two types means it will be faster, since you have fewer records to process.

 

 

Also, I included my answer for question2. Instead of having to figure out exactly how many digits the sum will contain, is it ok to simply use a high number (20) and see if any data is missing?

Yes, this is fine and a common thing to do. 
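
A small sketch of why a generous width is the safer choice, using a made-up value rather than anything from the course data:

```sas
data _null_;
    campers = 1234567;      /* hypothetical value, for illustration only */
    put campers comma20.;   /* wide enough: written with commas, right-aligned */
    put campers comma5.;    /* too narrow for 1,234,567: SAS cannot show it as requested */
run;
```

When the width is too small, SAS does not raise an error. It falls back to a different representation of the number and typically writes a note about it to the log, so an overly narrow format is easy to miss. An overly wide one only adds leading blanks.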

 

As mentioned before, both pieces of code are incorrect and I'd expect you to get syntax errors. You have END statements but no DO statements that match. If you have a DO statement with something after DO, then it doesn't enter a DO/END situation, it does that command and the DO is finished. Review the documentation for a DO/END and DO statements.

melissagodfrey
Fluorite | Level 6
Thanks for the response.

I got 0 syntax errors. The first piece of code is the answer from SAS. The second piece of code is my response. Both yielded exactly the same results.
novinosrin
Tourmaline | Level 20

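Hi @Reeza, good afternoon. The reason there aren't any syntax errors is that SAS treats the DO statement as an iterative DO rather than a simple DO group: in do parktype='Park'; the variable parktype is the index variable and 'Park' is a one-item value list, so the loop body runs exactly once. Not needed, but funny.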

 

Example:

 


/* Iterates once: the value list has a single item */

data w;
    set sashelp.class;
    if sex='F' then do gender='Female';
    end;
run;

/* Iterates twice: the value list has two items, so each female row is output twice */

data w;
    set sashelp.class;
    if sex='F' then do gender='Female',"fem";
        output;
    end;
run;
