BookmarkSubscribeRSS Feed
tom12122
Obsidian | Level 7

I need to calculate sum of columns multiplied by proper macro variable.

columns are a, x, b, m etc (names can be different and the number of columns may change)

each of columns have corresponding macro value &m1, &m2, &m3  .... &mn (n- number of columns = number of macro vairables)

i need to calculate sum for each row: a*&m1 + x*&m2 + b*&m3 + m*&m4 + .....  and write results to result table

Can anybody help.

7 REPLIES 7
Alpay
Fluorite | Level 6

One way would be by using arrays.

Zafer

data new;

    a = 1;

    b = 2;

    c = 3;

    d = 4;

    e = 5;

    output;

    a = 2;

    b = 3;

    c = 5;

    d = 6;

    e = 1;

    output;

run;

%let M1=1;

%let M2=2;

%let M3=3;

%let M4=4;

%let M5=5;

%let vars = a b c d e;

data x;

    set new;

    array _a_

  • &vars;
  •     dim = DIM(_a_);

        do i=1 to dim;

            Result = SUM(Result, _a_*SYMGET(CATS('M',i)));

        end;

        drop dim i;

    run;

    data_null__
    Jade | Level 19

    This is what PROC SCORE can do.  This examples uses the data macro vars from Alpay's example.

    data coef;

       if 0 then set new(keep=&vars);

       array _n

  • &vars;
  •    length _TYPE_ _NAME_ $8;

       retain _TYPE_ 'SCORE' _NAME_ 'RESULT';

       do i = 1 to dim(_n);

          _n = symgetN(cats('M',i));

          end;

       output;

       stop;

       drop i;

       run;

    Proc print;

       run;

    Proc score data=new score=coef out=scored;

       var &vars;

       run;

    proc print;

       run;

    Ksharp
    Super User

    NULL, symgetN() is very low efficient , that means your code is very slow . I prefer to make a marco varible to hold such expression .

    data _null_;
     set sashelp.vcolumn(keep=libname name memname type where=(libname='SASHELP' and memname='CLASS' and type='num')) end=last; 
     length list $ 4000;
     retain list;
     list=catx(',',list,cats(name,'*','&m ',_n_));
     if last then call symputx('list',list);
    run;
    
    %put %superq(list);
    
    %let M1=1;
    %let M2=2;
    %let M3=3;
    data want;
     set sashelp.class;
     sum=sum(&list );
    run;
    
    

    Ksharp

    data_null__
    Jade | Level 19

    Ksharp wrote:

    NULL, symgetN() is very low efficient , that means your code is very slow . I prefer to make a marco varible to hold such expression .

    The argument that my program is inefficient appears to be based on your misunderstanding of the data step to create COEF.  I believe if you study it more closely you will see SYMGETN would need to be very slow indeed to have any significant impact on performance.

    Ksharp
    Super User

    NULL,

    No, I am not misunderstanding . I know how slow the symgetn() would be at any situation .

    You put it on every single variable and every single observation which make your code very slow .

    Ksharp

    data_null__
    Jade | Level 19

    Yes it is obvious you don't understand because you don't know what you're talking about.

    How many observations do you think COEF has?

    ScottBass
    Rhodochrosite | Level 12

    Excerpts from above:

    %let M1=1;

    %let M2=2;

    %let M3=3;

    %let M4=4;

    %let M5=5;

    %let vars = a b c d e;

    This is what PROC SCORE can do.  This examples uses the data macro vars from Alpay's example.

    data coef;

       if 0 then set new(keep=&vars);

       array _n

  • &vars;
  •    length _TYPE_ _NAME_ $8;

       retain _TYPE_ 'SCORE' _NAME_ 'RESULT';

       do i = 1 to dim(_n);

          _n = symgetN(cats('M',i));

          end;

       output;

       stop;

       drop i;

       run;

    So, reviewing the data coef datastep:

    1) No records are read from set new.  It's only used to set the PDV variable attributes for &vars (a b c d e).

    2) Create an array for a b c d e.

    3) Create additional variables _TYPE_='SCORE' and _NAME_='RESULT'

    4) Loop over the 5 element array, setting a b c d e to &m1, &m2, ..., &m5 respectively.

    5) Output the record

    6) Stop the data step (actually not needed but makes the code crystal clear).

    So, there is only one loop through the data step, and symgetn is called 5 times.  No records are read from an input dataset.

    I gotta agree with Data:  where do you think the poor performance would lie?  Your statement that symgetn is slow is in fact correct, in principle, but not in this instance due to the construction of the data step.

    On the other hand, your code:

    data _null_;

    set sashelp.vcolumn(keep=libname name memname type where=(libname='SASHELP' and memname='CLASS' and type='num')) end=last;

    length list $ 4000;

    retain list;

    list=catx(',',list,cats(name,'*','&m ',_n_));

    if last then call symputx('list',list);

    run;

    will need to process *every column* from *every dataset* in *every allocated library*!  And if some of those libraries are RDBMS libraries it will perform even more poorly.  Even though you're applying a where clause, the sashelp view will not use indexes, and in general will perform badly.  Honestly, allocate 20 libraries or so, with say 100 datasets per library, with say an average of 20 variables per dataset, and see how sashelp.vcolumn will perform.

    If you're going to use your approach, at least consider using PROC CONTENTS against your desired dataset, i.e. proc contents data=sashelp.class out=columns, then use your approach.

    But of the two approaches shown, yours and Data's, I believe Data's will perform much faster.

    Regards,

    Scott


    Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
    I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

    sas-innovate-2024.png

    Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

    Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

     

    Register now!

    How to Concatenate Values

    Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

    Find more tutorials on the SAS Users YouTube channel.

    Click image to register for webinarClick image to register for webinar

    Classroom Training Available!

    Select SAS Training centers are offering in-person courses. View upcoming courses for:

    View all other training opportunities.

    Discussion stats
    • 7 replies
    • 1822 views
    • 0 likes
    • 5 in conversation