BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
EricM0628
Calcite | Level 5

This is probably a basic question, but I'm having trouble finding an answer when searching previously asked questions.  I have a data set with a customer identifier and an employee count, and I'm attempting to create a macro that will assign data data ranges.  I don't use macros often, and I'm having trouble creating one to accomplish this issue.  I'm attempting to create a low range value, and a high range value, that I can then concatenate into a range.  I'm doing this by every 100 employees from 0 to 30,000.

 

I'd appreciate any help you can provide.  FYI, I'm using SAS EG version 6.1.

 

Data Sample

 

Customer_number Employee_Count
1240 4453
1241 279
1242 94
1243 392
1246 79
1248 225
1258 105
1263 80
1265 325
1267 330
1270 365
1272 368
1273 3531
1274 348
1276 193

 

 

Macro attempt:

 

%macro a;
%do i=0 %to 30000 by 100;
%do x=100 %to 30000 by 100;

data median2;
set median1;
if pbt_customer_ee > &i and pbt_customer_ee < &x then low_band = &i;
if pbt_customer_ee > &i and pbt_customer_ee < &x then high_band = &x;

run;

%end;
%end;
%mend;

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Basic math instead?

 

data have;
input count;
cards;
4453
279
94
392
79
;
run;

data want;
set have;

lower_bound = int(count/100)*100;
upper_bound = lower_bound + 100;
range = catx('-', lower_bound, upper_bound);
run;

Results:

 

Obs count lower_bound upper_bound range
1 4453 4400 4500 4400-4500
2 279 200 300 200-300
3 94 0 100 0-100
4 392 300 400 300-400
5 79 0 100 0-100

 


@EricM0628 wrote:

Here is the sample data with the new value I'm trying to create from the employee count value.  I'm not sure creating a data set with the values would work, as I'd still need to link the new data set to the original data, which I imagine would take a lot of case statements (or if then data steps).  Unless you know a better way to link the data?

 

Customer_number Employee_Count New Value
1240 4453 4400 - 4500
1241 279 200 - 300
1242 94 0 - 100
1243 392 300 - 400
1246 79 0 - 100
1248 225 200 - 300
1258 105 100 - 200
1263 80 0 - 100
1265 325 300 - 400
1267 330 300 - 400
1270 365 300 - 400
1272 368 300 - 400
1273 3531 3500 - 3600
1274 348 300 - 400
1276 193 100 - 200

 

View solution in original post

8 REPLIES 8
Astounding
PROC Star

Don't use macros for this purpose!  They're not needed, and they (in this case) will actually run 90,000 DATA steps.  As you can imagine, that  would take a while to run.

 

A much better approach would be to create a data set to be used with CNTLIN in PROC FORMAT. 

 

If you had a data set with these ranges in it, would that allow you to logically complete the task, or is your task more complex than that?

 

1 - 100

101 - 200

201 - 300

...

29801 - 29900

29901 - 30000

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

To get a good answer, provide test data in the form of a datastep, and what the output should be.  It definately is not the best method to use macro for such a task.  There are specific procedures and methods within base sas which are built specifically to do such a thing.  For instance, a proc means or summary by customer number can output min/max values in a tiny fraction of the time of one datastep, let alone one per group.  Use Base SAS to process your data, only resort to macro code when it actually adds something other than obfuscation to the code.

ballardw
Super User

First thing before attempting to use a macro variables or macro coding for anything is to get a version that works without any macro involvement. Then figure out the parts that can be replaced by the text substitution that macros do.

 

I your specific code you are going to create the data set median2 roughly 90,000 times AND the final result will be the one with the last value of the &I and &x variables.

 

I actually do not see what "range" you may actually be trying to create. It does not exactly help that you show two variables in example data whose names do not appear in the code. So we don't really know what values you are actually manipulating.

 

It helps to show what the result for your example input data should look like, both start and end and describe the rules. Code that does not work is sort of suboptimal in describing what the rules might be. For instance what, in words, would be the definition of "low range value" and "high range value"? Does that definition actually change for each customer? You current code would make it appear so. Intuitively "customer_number" is an identifier and those seldom, if ever, should be involved in arithmetic.

 

Grouping values is sometimes a job for custom formats.

 

EricM0628
Calcite | Level 5

Here is the sample data with the new value I'm trying to create from the employee count value.  I'm not sure creating a data set with the values would work, as I'd still need to link the new data set to the original data, which I imagine would take a lot of case statements (or if then data steps).  Unless you know a better way to link the data?

 

Customer_number Employee_Count New Value
1240 4453 4400 - 4500
1241 279 200 - 300
1242 94 0 - 100
1243 392 300 - 400
1246 79 0 - 100
1248 225 200 - 300
1258 105 100 - 200
1263 80 0 - 100
1265 325 300 - 400
1267 330 300 - 400
1270 365 300 - 400
1272 368 300 - 400
1273 3531 3500 - 3600
1274 348 300 - 400
1276 193 100 - 200
Reeza
Super User

Basic math instead?

 

data have;
input count;
cards;
4453
279
94
392
79
;
run;

data want;
set have;

lower_bound = int(count/100)*100;
upper_bound = lower_bound + 100;
range = catx('-', lower_bound, upper_bound);
run;

Results:

 

Obs count lower_bound upper_bound range
1 4453 4400 4500 4400-4500
2 279 200 300 200-300
3 94 0 100 0-100
4 392 300 400 300-400
5 79 0 100 0-100

 


@EricM0628 wrote:

Here is the sample data with the new value I'm trying to create from the employee count value.  I'm not sure creating a data set with the values would work, as I'd still need to link the new data set to the original data, which I imagine would take a lot of case statements (or if then data steps).  Unless you know a better way to link the data?

 

Customer_number Employee_Count New Value
1240 4453 4400 - 4500
1241 279 200 - 300
1242 94 0 - 100
1243 392 300 - 400
1246 79 0 - 100
1248 225 200 - 300
1258 105 100 - 200
1263 80 0 - 100
1265 325 300 - 400
1267 330 300 - 400
1270 365 300 - 400
1272 368 300 - 400
1273 3531 3500 - 3600
1274 348 300 - 400
1276 193 100 - 200

 

EricM0628
Calcite | Level 5

This worked perfectly.  Thanks for your help!

PaigeMiller
Diamond | Level 26
data want;
    set have;
    length new_value $ 16;
    lower = 100 * floor(employee_count/100);  
    upper = 100 * ceil(employee_count/100);
    new_value = catx('-',lower,upper);
run;

As you can see, no need for macros, and no need for looping.

 

I do wonder how this new_value is going to be used in any meaningful way, usually there's no need to create such a text string in SAS for additional analysis purposes or for reporting purposes.

--
Paige Miller
ballardw
Super User

Your range boundaries as shown would imply to me that if you have a value of exactly 4400 you need to choose between 4300-4400 and 4400-4500 values. So what is your boundary value rule?

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2760 views
  • 6 likes
  • 6 in conversation