BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
alepage
Barite | Level 11

Hello,

 

we have many SAS programs which contains a by statement follow by a if first statement.

At the begining, the variables into the by statement seems to have a certain logic (good variables grouping)

but with time, we need to add more variables and I wonder how can we check if this new grouping does not change

the logic as it was first set.

 

Old version:

DATA TEMP.POLENTR;
     SET TEMP.POLENTR;
     BY MOIS TERR PROVINCE_CD CIE UNDERWRITING_COMPANY_CD BRANCH BROKER TRANSFER APER POLICY DEBU_TER EFFDAT TRSNUM CODE_TX
     	DESCENDING PRIME TYPE3 POLTYP ;
     POLCOUNT=INPUT(PUT(0,15.5),15.5);
     IF FIRST.code_tx ;
     IF CODE_TX IN ('-','1','5','3')  THEN POLCOUNT= TERME/12;
     ELSE IF CODE_TX IN ('2','6','4') THEN POLCOUNT= -1*TERME/12;
RUN;

New Version:

DATA TEMP.POLENTR;
   SET TEMP.POLENTR;
   BY MOIS TERR PROVINCE_CD CIE UNDERWRITING_COMPANY_CD BRANCH AFFINITY_IND GCNA_BRANCH BROKER TRANSFER APER
      POLICY POLTYP DEBU_TER EFFDAT TRSNUM TYPE3 POLICY_CATEGORY NETWORK_PROVIDER CODE_TX DESCENDING PRIME ;
   POLCOUNT=INPUT(PUT(0,15.5),15.5);
   IF FIRST.code_tx ;
   IF CODE_TX IN ('-','1','5','3')  THEN POLCOUNT= TERME/12;
   ELSE IF CODE_TX IN ('2','6','4') THEN POLCOUNT= -1*TERME/12;
RUN;
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

It is not clear what you are asking, or even what the purpose of the code it.

 

It kind of looks like you are trying  take the distinct observations based on a long list of key variables.  To make sure which of the multiple observations you are taking you appear to have added a variable named PRIME ( and possibly others) so that the one you want is the first.

 

Which set of variables are changing?  The key variables that define the BY group?  Or the variables that set the ordering of the repetitions within the by groups? 

 

You could place the lists into macro variables:

%let keys=MOIS TERR PROVINCE_CD CIE UNDERWRITING_COMPANY_CD BRANCH BROKER TRANSFER APER 
          POLICY DEBU_TER EFFDAT TRSNUM CODE_TX
;
%let sorters=DESCENDING PRIME TYPE3 POLTYP ;
%let last_key = %scan(&keys,-1,%str( ));

And then you can use the macro variables to generate the code.  That way you only have the make changes to the lists in one place.

proc sort data=temp.polentr;
  by  &keys &sorters;
run;
DATA TEMP.POLENTR;
     SET TEMP.POLENTR;
     BY &keys &sorters;
     IF FIRST.&last_key ;

     POLCOUNT=INPUT(PUT(0,15.5),15.5);
     IF CODE_TX IN ('-','1','5','3')  THEN POLCOUNT= TERME/12;
     ELSE IF CODE_TX IN ('2','6','4') THEN POLCOUNT= -1*TERME/12;

RUN;

View solution in original post

3 REPLIES 3
PaigeMiller
Diamond | Level 26

Since we don't know your data, and can't answer your concerns directly, let me ask you ... what happens if you run this code ... does it do what you want?

 

Also, I strongly recommend that you not overwrite existing data sets by creating a new data set with the exact same data set name, like this:

 

DATA TEMP.POLENTR;
     SET TEMP.POLENTR;

 

This is poor programming practice. Better would be:

 

DATA TEMP.POLENTR1;
     SET TEMP.POLENTR;

or

 

DATA work.POLENTR;
     SET TEMP.POLENTR;

 

--
Paige Miller
ballardw
Super User

Without data and definitions of what "good variables grouping" means in terms of the data this really can't be answered by us.

 

One comment, as it relates to even testing the behavior of this code is that this structure use the same data set name on Set and Data statements completely replaces the data set.


DATA TEMP.POLENTR;
     SET TEMP.POLENTR;

 

I would be tempted to run both of your codes (with appropriate sorts in between) generating a set with the "old" form as one data set and the "new" into a different. Then compare the values of the variables that you consider needed to define "good variable grouping" between the two.

 

Pretty much can guarantee that unless the added variables only have one value with the BY variables that precede them that you will have more instances of the "old" By variable combinations.

Tom
Super User Tom
Super User

It is not clear what you are asking, or even what the purpose of the code it.

 

It kind of looks like you are trying  take the distinct observations based on a long list of key variables.  To make sure which of the multiple observations you are taking you appear to have added a variable named PRIME ( and possibly others) so that the one you want is the first.

 

Which set of variables are changing?  The key variables that define the BY group?  Or the variables that set the ordering of the repetitions within the by groups? 

 

You could place the lists into macro variables:

%let keys=MOIS TERR PROVINCE_CD CIE UNDERWRITING_COMPANY_CD BRANCH BROKER TRANSFER APER 
          POLICY DEBU_TER EFFDAT TRSNUM CODE_TX
;
%let sorters=DESCENDING PRIME TYPE3 POLTYP ;
%let last_key = %scan(&keys,-1,%str( ));

And then you can use the macro variables to generate the code.  That way you only have the make changes to the lists in one place.

proc sort data=temp.polentr;
  by  &keys &sorters;
run;
DATA TEMP.POLENTR;
     SET TEMP.POLENTR;
     BY &keys &sorters;
     IF FIRST.&last_key ;

     POLCOUNT=INPUT(PUT(0,15.5),15.5);
     IF CODE_TX IN ('-','1','5','3')  THEN POLCOUNT= TERME/12;
     ELSE IF CODE_TX IN ('2','6','4') THEN POLCOUNT= -1*TERME/12;

RUN;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 427 views
  • 0 likes
  • 4 in conversation