If then do statement help

Reply
Occasional Contributor
Posts: 15

If then do statement help

I have the following code that I am running in SAS EG and here is my issue:  When I get my results, it is placing the master_ipa into one of the classes as indicated for each activity_date. The problem is that I am only looking for the master_ipa to have one class and if the MM figure changes to a different level for one of the activity_dates, then it will show two (or more) class levels for that master_ipa which means it will get counted two or more times in subsequent queries. How can I resolve? Any help is greatly appreciated.

Mike

proc sort data=NEW_SUM;

by master_ipa descending activity_date;

quit;

data NEW_SUM_CLASS;

set NEW_SUM;

by master_ipa descending activity_date;

length class $100;

retain class;

if first.master_ipa then do;

      class='';

      end;

where risk_type is not null;

if 0<=MM<=50 then class='1. 0-50';

else if 51<=MM<=100 then class='2. 51-100';

else if 101<=MM<=500 then class='3. 101-500';

else if 501<=MM<=1000 then class='4. 501-1000';

else if 1001<=MM<=5000 then class='5. 1001-5000';

else if MM>=5001 then class='6. 5001+';

run;

PROC Star
Posts: 7,363

Re: If then do statement help

Not sure I understand what you are trying to do but, given the code you provided, I'll try to summarize what I think you want:

You want the first most recent activity_date record that ISN'T missing a value for risk_type and, for those records, you want class assigned.

Is that a correct restatement of what you're looking for?

Occasional Contributor
Posts: 15

Re: If then do statement help

Hi Arthur,

Basically, by sorting activity_date by descending, what I am looking for is to just take the first master_ipa record in that grouping and based on what the MM is, assign the class to it and that class should be the same for the rest of the activity_dates.  Here is an example of my data. As you can see, in activity_date 9/1/2011, the class value changes and all I want to do is use the value for the very first field which in this case is activity_date 12/1/2011. It may be different for other master_ipa's as the first field may actually be a different activity_date, but nonetheless, I still just want to use the data in the first record.

master_ipaMMPremiumFundingExpensesDisbursementsclassrisk_typeactivity date
91197$29,290.18 $24,017.95 ($19,008.24).2. 51-100PERC12/1/2011
91195$29,361.23 $24,076.21 ($30,137.51).2. 51-100PERC11/1/2011
91197$29,819.89 $24,452.31 ($31,945.87).2. 51-100PERC10/1/2011
911102$32,752.48 $26,857.03 ($13,572.23).3. 101-500PERC9/1/2011
911104$33,637.76 $27,582.96 ($22,949.40).3. 101-500PERC8/1/2011
911104$32,627.84 $26,754.83 ($17,295.37).3. 101-500PERC7/1/2011
911108$33,339.40 $27,338.31 ($52,679.11).3. 101-500PERC6/1/2011
911106$32,004.77 $26,243.91 ($22,862.32)3693. 101-500PERC5/1/2011
911103$31,209.10 $25,591.46 ($20,956.96)3843. 101-500PERC4/1/2011
91199$30,212.86 $24,774.55 ($28,853.23)23132. 51-100PERC3/1/2011
911101$29,382.39 $24,093.56 ($10,173.90).3. 101-500PERC2/1/2011
911103$29,775.17 $24,415.64 ($13,528.55).3. 101-500PERC1/1/2011
Super Contributor
Posts: 1,636

Re: If then do statement help

Hi Art,

do you think

"else if 51<=MM<=100 then class='2. 51-100';"

should be

"else if 50<MM<=100 then class='2. 51-100';"

Thank you!

Super User
Posts: 5,081

Re: If then do statement help

The programming for this is relatively easy, and I'm sure somebody else is typing up a solution at the same time I'm typing this.  Here are a couple of efficiency considerations to think about.  They only matter if your data set is large.

Do you really want a length of 100 for CLASS?

You can cut the number of comparisons in half by reversing the order:

if mm >= 5001 then class='6.  5001+';

else if mm >= 1001 then class='5.  1001-5000';

else if mm >= 501 then class='4.  501-1000';

else etc.

Good luck.

Occasional Contributor
Posts: 15

Re: If then do statement help

Hi Astounding...I had them placed in that order so that when I export my results to Excel, I can just copy and paste into another report format that I have set up in Excel. I am a very new user to SAS EG (as in no previous experience whatsoever up until 2 months ago). For length of 100 for class, is that length referring to number of characters? If so, then my length probably can be limited to about 15 instead.

Super User
Posts: 5,081

Re: If then do statement help

Yes, the length of 100 indicates the number of characters.

The order of your IF/THEN statement will not impact the results that go to Excel (unless you are actually exporting the program itself, rather than the data).  Regardless of the order of the programming statements, CLASS would end up with the same value on each row.

Valued Guide
Posts: 765

Re: If then do statement help

hi ... one suggestion ...

if 0<=MM<=50           then class='1. 0-50';

else if 51<=MM<=100    then class='2. 51-100';

else if 101<=MM<=500   then class='3. 101-500';

else if 501<=MM<=1000  then class='4. 501-1000';

else if 1001<=MM<=5000 then class='5. 1001-5000';

else if MM>=5001       then class='6. 5001+';

as ...

if ^missing(MM) and

        MM<=50   then class='1. 0-50';

else if MM<=100  then class='2. 51-100';

else if MM<=500  then class='3. 101-500';

else if MM<=1000 then class='4. 501-1000';

else if MM<=5000 then class='5. 1001-5000';

else                  class='6. 5001+';

since you don not have to check the lower bound if the intervals (except for the first) given how IF-THEN-ELSE works

e.g.  if MM is 950, the first three values of CLASS are skipped (not true) and the statement stops executing at the fourth value (true)

(you don't have to reverse the order , just change the statements)

also, what do you want to happen if MM is missing (or is does MM always have a value)

or, just use a format (no need to worry about assigning a length of new variable CLASS)

proc format;

value mm

.         = 'UNKNOWN'

0-50      = '1. 0-50'

51-100    = '2. 51-100'

101-500   = '3. 101-500'

501-1000  = '4. 501-1000'

1001-5000 = '5. 1001-5000'

5001-high = '6. 5001+'

;

run;

data new_sum_class;

retain class;

set new_sum;

* no need for the descending date stuff ... assignment of MM is just within a master_ipa group;

by master_ipa;

where risk_type is not null;

* assign value to CLASS once ... when 1st observation in a group is encountered;

if first.master_ipa then class = put(mm,mm.);

run;T

Occasional Contributor
Posts: 15

Re: If then do statement help

Thanks Mike. I could change those statements although not sure it will correct my other problem. And MM should always have a value because the preceding queries are keying in on Active master_ipa's only.

Super User
Posts: 5,081

Re: If then do statement help

Mike,

Picky details here and there ...

Reversing the order of the IF/THEN statements makes the logic much easier to work with.  For example, the IF/THEN code that you used would actually put missing values into the category of "2.  51-100".

Also note, that with a small change, you can eliminate the RETAIN statement.  Just use:

if first.master_ipa;

class = put(mm, mm.);

It's not really faster, just a matter of style as to which is easier to read.

Valued Guide
Posts: 765

Re: If then do statement help

hi ... correct about missing (my error ... it's not picky, but it's correct) ... me, I'd use the format

not sure if this is what's wanted ...

* this is a subsetting IF;

if first.master_ipa;

class = put(mm, mm.);

that would limit the new data set to having only the first observation within a group

I think he wants all of them and each should have the same date (assigned based on first obs in a group)

your code answers a different question, yes/no?

Super User
Posts: 5,081

Re: If then do statement help

Yes, you're right.  All the records are needed.  So keeping the first one and eliminating RETAIN isn't a possibility.  Must be late in the day for both of us!

Occasional Contributor
Posts: 15

Re: If then do statement help

So there has to be a way to repeat the class value from the first record in that group and apply to each succeeding record in that group until it changes. For my sample data above, I need that class of  2. 51-100 to be repeated for all of the remaining records for master_ipa 911. Then when the next master_ipa appears, it will do the same thing for that grouping. It seems simple, but I have no clue how to write it in the code.

Valued Guide
Posts: 765

Re: If then do statement help

hi ... that's what this does ...

proc format;

value mm

.         = 'UNKNOWN'

0-50      = '1. 0-50';

51-100    = '2. 51-100';

101-500   = '3. 101-500';

501-1000  = '4. 501-1000';

1001-5000 = '5. 1001-5000';

5001-high = '6. 5001+'

;

run;

data new_sum_class;

retain class;

set new_sum;

* no need for the descending date stuff ... assignment of MM is just within a master_ipa group;

by master_ipa;

where risk_type is not null;

* assign value to CLASS once ... when 1st observation in a group is encountered;

if first.master_ipa then class = put(mm,mm.);

run;

Occasional Contributor
Posts: 15

Re: If then do statement help

Thanks Mike. I will try this revised code, but one question. I think I still need the descending activity date because I need to have the MM be equal to the latest activity_date month. For example, in my data above, if I don't have the descending activity_date and the data is sorted by activity_date automatically in ascending order, the MM would kick out the class '3. 101-500' instead of '2 51-100' Would your code above still work then and should I keep the sort in there?

Ask a Question
Discussion stats
  • 20 replies
  • 749 views
  • 0 likes
  • 7 in conversation