DATA Step, Macro, Functions and more

do loops

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 87
Accepted Solution

do loops

Hi, I have Amt field and I want to group it in 100's increments and get the field amt_grp. I could do it using If statements like below but I would like to have a more efficient way like do loops or something to code it. Because I have data spreading upto $100,000 I cannot write if statements 1000 times. Please help. 

 

if amt < 100 then amt_grp = '> 0';

else if amt < 200  then amt_grp = '> 100';

.

.

.

else if amt < 50000 then amt_grp = '>50000'; 

 

Amt Amt_grp
$834.19 > 800
$202.85 > 200
$342.70 > 300
$790.16 > 700
$51.03 > 0
$129.55 > 100
$65.84 > 0
$79.98 > 0
$148.38 >100
$825.92 > 800
$551.32 > 500
$59.88 > 0
$319.29 > 300
$112.70 > 100
$666.60 > 600
$306.12 > 300
$1,025.51 > 1000
$145.44 > 100
$58.08 > 0
$616.79 > 600
$152.51 > 100

Accepted Solutions
Solution
‎11-17-2016 12:58 PM
Super User
Super User
Posts: 7,401

Re: do loops

You can do this by simple use of division and rounding:

data want;
  input amt;
  amt_group=cat("> ",strip(put(floor(amt/100)*100,best.)));
datalines;
834.19
202.85
;
run;

View solution in original post


All Replies
Solution
‎11-17-2016 12:58 PM
Super User
Super User
Posts: 7,401

Re: do loops

You can do this by simple use of division and rounding:

data want;
  input amt;
  amt_group=cat("> ",strip(put(floor(amt/100)*100,best.)));
datalines;
834.19
202.85
;
run;
Trusted Advisor
Posts: 1,372

Re: do loops

Alternative way is to use formats, then you can define different formats to gather into different groups size :

proc format;

     value grp100f    /* format name must end with a letter */

         0 - 100 = '<= 100'

     100 - 200 = '<= 200'

     200 - 300 = '<= 300'

      ...

     800 - 900 = '<= 800'

     900 -1000 = '<=1000'

      other = '> 1000'

  ; run;

 

data want;

   input amt;

    amt_group = put(amt, grp100f.);

run;

Super User
Posts: 10,495

Re: do loops

In a pedantic mode, since 800 is also > 200 the single > (value) is missleading as your label. Better would be to indicate a range:

 

data want;
  input amt;
  amt_group=catx(" - ",floor(amt/100)*100, (1+floor(amt/100))*100-1 );
datalines;
834.19
202.85
;
run;
Super User
Posts: 5,081

Re: do loops

One way is to create a format.  Because PROC FORMAT can use a SAS data set as input, it wouldn't take a long program.  One assumption I'm building in here is that your amounts only go to two decimal places.  (It could still be done if there are more decimal places, but it takes a slightly more complex program.)

 

data groups;

length label $ 8;

do start=0 to 100000 by 100;

   end = start + 99.99;

   label = '> ' || left(put(start, 6.));

   output;

end;

retain fmtname 'amtgroup';

run;

 

This gives you a SAS data set with all the categories defined, in a form that PROC FORMAT can use.  Next, create the format:

 

proc format cntlin=groups;

run;

 

Finally use it.  You could add this in either a DATA step or any procedure:

 

format amt amtgroup.;

 

Or you could create the character variable you originally asked for in a DATA step:

 

data want;

set have;

amt_grp = put(amt, amtgroup.);

run;

 

It's untested code so it may need to be tweaked.  But it should be at least 98% of the way there.

 

Also note that your original code kept 100 exactly in the "> 100" category so I continued that approach here.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 202 views
  • 3 likes
  • 5 in conversation