BookmarkSubscribeRSS Feed
twenty7
Obsidian | Level 7

I have a dataset on which I am going to be performing a calculation. The calculation will need to be performed for each individual. 

 

The structure of the data is similar to 

data app;
input ID $ A1_CITY $ A2_CITY & NUM_OF_CUSTS
datalines;
001 LONDON LONDON 2
002 LONDON LEEDS 2
003 MANCHESTER LONDON 2
004 LEEDS LONDON 2
;
run;

As the calculation is quite long I was thinking of using a do loop within a macro for this. This is what I have come up with...

 

data calc;
set app;

%let I=1;

%macro calc;
%do %while (&I <= NUM_OF_CUSTS);
%if A&i._CITY = "LONDON" %THEN LONDON = "Y"; %else LONDON = "N"

/*rest of calc here*/
/*%if London = "Y" %then... %else...*/ %let I = %eval(&I+1); %end; %mend; run; %calc;

the statement 

%do %while (&I <= NUM_OF_CUSTS);

also isn't being resolved as expected for some reason

 

Any help as to where I'm going wrong would be greatly appreciated 

10 REPLIES 10
Tom
Super User Tom
Super User

Macro code is used to generate SAS code.  So first show what SAS code you want to generate and then explain any pattern to the code generation.

 

It is hard to tell from your description what you want to do but it does not look like it needs any macro code at all.

data calc;
  set app;
  if a1_city = 'LONDON' or a2_city='LONDON' then london='Y';
  else london='N';
run;
twenty7
Obsidian | Level 7

so essentially I want to perform a different calculation based on the city in which that particular applicant resides. Currently there are only two calculations - one for 'London' and for 'not London' but eventually there could be a calculation for each city.

 

Due to the number of potential calculations I wanted to be able code each calculation once and then loop through it for each applicant.

Kurt_Bremser
Super User

The data step already does that loop, by working through the dataset (where you have one observation per applicant, as I see it) one observation after the other.

Tom
Super User Tom
Super User

@twenty7 wrote:

so essentially I want to perform a different calculation based on the city in which that particular applicant resides. Currently there are only two calculations - one for 'London' and for 'not London' but eventually there could be a calculation for each city.

 

Due to the number of potential calculations I wanted to be able code each calculation once and then loop through it for each applicant.


Sounds like an XY problem.

Please explain the larger context of what you are trying to do to get a better solution.  Provide example output data for the given input data.

twenty7
Obsidian | Level 7

I need to be able to calculate the expenditure for each person on an application. The expenditure calculation is different based on the city in which the person resides. In the data I have available I am able to determine the number of persons on the application and the city. The maximum number of persons on an application is 3.

 

the current solution I have is

 

if a1_city = London then a1_affordability_calc = 'a' else 
if a1_city in (Manchester, Liverpool) then a1_affordability_calc = 'b' else 
if a1_city in (Birmingham) then a1_affordability_calc = 'c' else 
if a1_city in (Glasgow) then a1_affordability_calc = 'd' else 
a1_affordability_calc = 'e';

if a1_affordability_calc = 'a' then a1_calc_result = (x+y+z)*0.25;
if a1_affordability_calc = 'b' then a1_calc_result = (x+y+z)*0.10;
if a1_affordability_calc = 'c' then a1_calc_result = (x+y+z)*0.21;
if a1_affordability_calc = 'd' then a1_calc_result = (x+y+z)*0.23;
if a1_affordability_calc = 'e' then a1_calc_result = (x+y+z)*0.20;

/*REPEAT ABOVE REPLACING THE PREFIX a1 with a2 and a3*/

The calculations will be a lot more complicated but hopefully this explains the problem better

 

What I wanted to avoid in my potential solution is repeating the same section of code with the calculations for each of the 3 potential applicants.

 

 

Tom
Super User Tom
Super User

A data step operates on every observation.  So if by applicant you mean observation then you don't have to do anything for that.

If by APPLICANT you mean the 1,2,3 that you have coded into the variable names then perhaps you should just first transpose the data. 

data app;
  input ID $ NUM_OF_CUSTS @;
  do applicant=1 to num_of_custs ;
     input city &:$20. @;
     output;
  end;
datalines;
001 2 LONDON LONDON 
002 2 LONDON LEEDS 
003 2 MANCHESTER LONDON 
004 2 LEEDS LONDON 
;

Otherwise use an ARRAY so that you can use the same code for every variable.  Example (notice how much easier it is when you put the number at the END of the name):

array city city1 - city3 ;

To assign a grouping based on value of the CITY variable you might want to use a format.

proc format ;
value $calc_type 
  'London' = 'a' 
  'Manchester', 'Liverpool' = 'b'
  'Birmingham' = 'c'
  'Glasgow' = 'd'
  other = 'e'
;
run;

Now your calculation code is simple.

data want;
  set app;
  select (put(city,$calc_type.));
    when ('a')  calc_result = (x+y+z)*0.25;
    when ('b')  calc_result = (x+y+z)*0.10;
    when ('c')  calc_result = (x+y+z)*0.21;
    when ('d')  calc_result = (x+y+z)*0.23;
    when ('e')  calc_result = (x+y+z)*0.20;
  end;
run;

There is probably even more you could do to simplify.

Notice that macro code is not needed.

gamotte
Rhodochrosite | Level 12
data app;
    length A1_CITY A2_CITY $20.;
    input ID $ A1_CITY $ A2_CITY $ NUM_OF_CUSTS;
    datalines;
001 LONDON LONDON 2
002 LONDON LEEDS 2
003 MANCHESTER LONDON 2
004 LEEDS LONDON 2
;
run;

data want;
    set app;
    array cities $1 LONDON LEEDS MANCHESTER BIRMINGHAM GLASGOW;
    array A_CITY A1_CITY A2_CITY;
    array afford $1. a1_affordability_calc a2_affordability_calc;
    array increment(5) _TEMPORARY_ (0.25 0.10 0.21 0.23 0.20);
    array calc $20. a1_calc_result a2_calc_result;

    do i=1 to dim(cities);
        cities(i)=byte(96+i); /* a, b, c, ... */
    end;

    do j=1 to dim(A_CITY);
        do i=1 to dim(cities);
            if A_CITY(j)=vname(cities(i)) then do;
                afford(j)=cities(i);
                leave;
            end;
        end;

        if i le dim(cities) then calc(j)=cats("(x+y+z)+", increment(i));
    end;

	drop i j;
run;
gamotte
Rhodochrosite | Level 12

Hello,

 

data app;
    length A1_CITY A2_CITY $20.;
    input ID $ A1_CITY $ A2_CITY $ NUM_OF_CUSTS;
    datalines;
001 LONDON LONDON 2
002 LONDON LEEDS 2
003 MANCHESTER LONDON 2
004 LEEDS LONDON 2
;
run;

data want;
    set app;
    array cities(3) $1 LONDON LEEDS MANCHESTER;
    array A_CITY A1_CITY A2_CITY;

    do i=1 to dim(cities);
        if vname(cities(i)) in A_CITY then cities(i)="Y";
        else cities(i)="N";
    end;

	drop i;
run;
gamotte
Rhodochrosite | Level 12

It seems that you haven't really grabbed the main concepts of SAS macros.

Macros only generate text substitutions and are handled prior to the

program execution. Hence, they have no knowledge of what is inside SAS

datasets.

Thus, the instruction

%do %while (&I <= NUM_OF_CUSTS);

makes no sense as &i. is an integer which is compared to the string "NUM_OF_CUSTS".

 

Also, while defining a macro inside a datastep is allowed :

data have;
set sashelp.class;

%macro a;
   put age;
%mend;

%a

run;

it doesn't make sense to do so as the macro is defined globally (independently

of the datastep), so doing so only obfuscates the code.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2190 views
  • 1 like
  • 4 in conversation