I have a dataset on which I am going to be performing a calculation. The calculation will need to be performed for each individual.
The structure of the data is similar to
data app;
input ID $ A1_CITY $ A2_CITY & NUM_OF_CUSTS
datalines;
001 LONDON LONDON 2
002 LONDON LEEDS 2
003 MANCHESTER LONDON 2
004 LEEDS LONDON 2
;
run;
As the calculation is quite long I was thinking of using a do loop within a macro for this. This is what I have come up with...
data calc;
set app;
%let I=1;
%macro calc;
%do %while (&I <= NUM_OF_CUSTS);
%if A&i._CITY = "LONDON" %THEN LONDON = "Y"; %else LONDON = "N"
/*rest of calc here*/
/*%if London = "Y" %then... %else...*/
%let I = %eval(&I+1);
%end;
%mend;
run;
%calc;
the statement
%do %while (&I <= NUM_OF_CUSTS);
also isn't being resolved as expected for some reason
Any help as to where I'm going wrong would be greatly appreciated
Macro code is used to generate SAS code. So first show what SAS code you want to generate and then explain any pattern to the code generation.
It is hard to tell from your description what you want to do but it does not look like it needs any macro code at all.
data calc;
set app;
if a1_city = 'LONDON' or a2_city='LONDON' then london='Y';
else london='N';
run;
So you aim to see in which cities an individual has customers?
so essentially I want to perform a different calculation based on the city in which that particular applicant resides. Currently there are only two calculations - one for 'London' and for 'not London' but eventually there could be a calculation for each city.
Due to the number of potential calculations I wanted to be able code each calculation once and then loop through it for each applicant.
The data step already does that loop, by working through the dataset (where you have one observation per applicant, as I see it) one observation after the other.
@twenty7 wrote:
so essentially I want to perform a different calculation based on the city in which that particular applicant resides. Currently there are only two calculations - one for 'London' and for 'not London' but eventually there could be a calculation for each city.
Due to the number of potential calculations I wanted to be able code each calculation once and then loop through it for each applicant.
Sounds like an XY problem.
Please explain the larger context of what you are trying to do to get a better solution. Provide example output data for the given input data.
I need to be able to calculate the expenditure for each person on an application. The expenditure calculation is different based on the city in which the person resides. In the data I have available I am able to determine the number of persons on the application and the city. The maximum number of persons on an application is 3.
the current solution I have is
if a1_city = London then a1_affordability_calc = 'a' else
if a1_city in (Manchester, Liverpool) then a1_affordability_calc = 'b' else
if a1_city in (Birmingham) then a1_affordability_calc = 'c' else
if a1_city in (Glasgow) then a1_affordability_calc = 'd' else
a1_affordability_calc = 'e';
if a1_affordability_calc = 'a' then a1_calc_result = (x+y+z)*0.25;
if a1_affordability_calc = 'b' then a1_calc_result = (x+y+z)*0.10;
if a1_affordability_calc = 'c' then a1_calc_result = (x+y+z)*0.21;
if a1_affordability_calc = 'd' then a1_calc_result = (x+y+z)*0.23;
if a1_affordability_calc = 'e' then a1_calc_result = (x+y+z)*0.20;
/*REPEAT ABOVE REPLACING THE PREFIX a1 with a2 and a3*/
The calculations will be a lot more complicated but hopefully this explains the problem better
What I wanted to avoid in my potential solution is repeating the same section of code with the calculations for each of the 3 potential applicants.
A data step operates on every observation. So if by applicant you mean observation then you don't have to do anything for that.
If by APPLICANT you mean the 1,2,3 that you have coded into the variable names then perhaps you should just first transpose the data.
data app;
input ID $ NUM_OF_CUSTS @;
do applicant=1 to num_of_custs ;
input city &:$20. @;
output;
end;
datalines;
001 2 LONDON LONDON
002 2 LONDON LEEDS
003 2 MANCHESTER LONDON
004 2 LEEDS LONDON
;
Otherwise use an ARRAY so that you can use the same code for every variable. Example (notice how much easier it is when you put the number at the END of the name):
array city city1 - city3 ;
To assign a grouping based on value of the CITY variable you might want to use a format.
proc format ;
value $calc_type
'London' = 'a'
'Manchester', 'Liverpool' = 'b'
'Birmingham' = 'c'
'Glasgow' = 'd'
other = 'e'
;
run;
Now your calculation code is simple.
data want;
set app;
select (put(city,$calc_type.));
when ('a') calc_result = (x+y+z)*0.25;
when ('b') calc_result = (x+y+z)*0.10;
when ('c') calc_result = (x+y+z)*0.21;
when ('d') calc_result = (x+y+z)*0.23;
when ('e') calc_result = (x+y+z)*0.20;
end;
run;
There is probably even more you could do to simplify.
Notice that macro code is not needed.
data app;
length A1_CITY A2_CITY $20.;
input ID $ A1_CITY $ A2_CITY $ NUM_OF_CUSTS;
datalines;
001 LONDON LONDON 2
002 LONDON LEEDS 2
003 MANCHESTER LONDON 2
004 LEEDS LONDON 2
;
run;
data want;
set app;
array cities $1 LONDON LEEDS MANCHESTER BIRMINGHAM GLASGOW;
array A_CITY A1_CITY A2_CITY;
array afford $1. a1_affordability_calc a2_affordability_calc;
array increment(5) _TEMPORARY_ (0.25 0.10 0.21 0.23 0.20);
array calc $20. a1_calc_result a2_calc_result;
do i=1 to dim(cities);
cities(i)=byte(96+i); /* a, b, c, ... */
end;
do j=1 to dim(A_CITY);
do i=1 to dim(cities);
if A_CITY(j)=vname(cities(i)) then do;
afford(j)=cities(i);
leave;
end;
end;
if i le dim(cities) then calc(j)=cats("(x+y+z)+", increment(i));
end;
drop i j;
run;
Hello,
data app;
length A1_CITY A2_CITY $20.;
input ID $ A1_CITY $ A2_CITY $ NUM_OF_CUSTS;
datalines;
001 LONDON LONDON 2
002 LONDON LEEDS 2
003 MANCHESTER LONDON 2
004 LEEDS LONDON 2
;
run;
data want;
set app;
array cities(3) $1 LONDON LEEDS MANCHESTER;
array A_CITY A1_CITY A2_CITY;
do i=1 to dim(cities);
if vname(cities(i)) in A_CITY then cities(i)="Y";
else cities(i)="N";
end;
drop i;
run;
It seems that you haven't really grabbed the main concepts of SAS macros.
Macros only generate text substitutions and are handled prior to the
program execution. Hence, they have no knowledge of what is inside SAS
datasets.
Thus, the instruction
%do %while (&I <= NUM_OF_CUSTS);
makes no sense as &i. is an integer which is compared to the string "NUM_OF_CUSTS".
Also, while defining a macro inside a datastep is allowed :
data have;
set sashelp.class;
%macro a;
put age;
%mend;
%a
run;
it doesn't make sense to do so as the macro is defined globally (independently
of the datastep), so doing so only obfuscates the code.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.