DATA Step, Macro, Functions and more

Macro to sum different variable based on dataset name

Accepted Solution Solved
Reply
Contributor
Posts: 60
Accepted Solution

Macro to sum different variable based on dataset name

libname fds "path1";
proc sql;
create table count_fds as
select MEMNAME AS DS_NAME,sum(nobs) as DS_COUNT,
substr(MEMNAME,5,(anydigit(MEMNAME,1)-5)) as comm_Var1
from dictionary.tables
where libname eq 'FDS' and MEMNAME CONTAINS "20170801"
group by memname;
quit;

proc sql;
select ds_name into : ds_nm separated by ' ' from count_fds;
run;

options symbolgen mprint;
%macro tot_soa;
%let i=1;
%do %while (%scan(&ds_nm,&i,' ') ne );
proc sql;
create table work.cnt_&i as select cat("%scan(&ds_nm,&i,' ')") as soa_ds ,
sum((case when cat("%scan(&ds_nm,&i,' ')") like '%_type1_%' then var1
when cat("%scan(&ds_nm,&i,' ')") like '%_type2_%' then var9
when cat("%scan(&ds_nm,&i,' ')") like '%_type3_%' then var14
when cat("%scan(&ds_nm,&i,' ')") like '%_type4_%' then var18
ELSE var21 end)) as soa_sum
from fds.%scan(&ds_nm,&i,' ') t1;
quit;
%put %scan(&ds_nm,&i,' ');
%let i=%eval(&i+1);
%end;
%mend tot_soa;
%tot_soa;


Accepted Solutions
Solution
Monday
Super User
Posts: 10,483

Re: Macro to sum different variable based on dataset name


atul_desh wrote:

Thanks,

but it is giving me error :var1 is not present in other than type1 and var9 is not presnt in other than type2 ... so on

 

ERROR: The following columns were not found in the contributing tables


ANY variable listed in the Proc SQL code will have to exist, either created in the proc or from a contributing data set.

 

I have to assume at this point you did not have working base code before wandering down the macro path as you would have had the error. From the code shown I was assuming that variables variables var1, var9, var14 and var18 were in all of the data sets and you were just selecting which one to calculate. So you do not want a case statement. You have to do the comparison in pure macro code, or perhaps a look up of some time to generate something like:

select "&TempDS" as soa_ds , var1 as soa_sum

 

One approach might be:

%do i=1 %to %sysfunc(countw(&ds_nm));
   %let TempDS = %scan(&ds_nm,&i);
   /* look up desired variable*/
   %if %index (&TempDS,_type1_)>0 %then %let target= var1;
   %if %index (&TempDS,_type2_)>0 %then %let target= var9;
   %if %index (&TempDS,_type3_)>0 %then %let target= var14;
   %if %index (&TempDS,_type4_)>0 %then %let target= var18;

   proc sql;
   create table work.cnt_&i as 
   select "&TempDS" as soa_ds ,&target as soa_sum
   from fds.&TempDS t1;
   quit;
%end;

which will only work if the _type1_ and such appear in the name. You might want a check if target is blank after the look up to skip the SQL and provide a warning in your log. This may be needed if whatever creates your set names has a problem and you end up with _type1 instead. Which may happen if your name builder attempts to build something longer than 32 characters.

 

View solution in original post


All Replies
Super User
Super User
Posts: 7,400

Re: Macro to sum different variable based on dataset name

Very enlightening.  Maybe some test data in the form of a datastep, what you want out, what is not working, logs etc.?

 

As for summing varaibles, why not use proc summary, thats pretty much what the procedure is for, and you can then use _numeric_ for instance to sum all numeric varaibles, much simpler than trying to hash it yourself.

http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473735.htm

Contributor
Posts: 60

Re: Macro to sum different variable based on dataset name

There are multiple dataset in one directory and I need to create consolidated dataset in which dataset name is present and sum of one particular varialbe should be present. when I'm running this code it is giving me error "column no found" why it is not going into one particular case and then exit.
New Contributor
Posts: 2

Re: Macro to sum different variable based on dataset name

Can you please post an dummy example? 

New Contributor
Posts: 2

Re: Macro to sum different variable based on dataset name

Sorry havent seen previous messages. My bad.

Super User
Posts: 5,081

Re: Macro to sum different variable based on dataset name

It's telling you why.  One (or more) of your data sets doesn't contain the variable that you expect.  For example, a data set with the name pattern %_type1_% might not contain VAR1.  It could be any of the conditions that isn't finding the expected variable.  You can probably figure out which one, by examining the log.

 

Secondarily, why use CAT when you aren't concatenating anything?

 

Super User
Super User
Posts: 7,400

Re: Macro to sum different variable based on dataset name

Right, now you have provided some minimal explanation, the simplest action would be:

data main;
  set sashelp.vtable (where=(libname="<yourlib>") keep=memname);
run;

proc sql;
  create table sums (memname char(200),res num);
quit;

data _null_;
  set sashelp.vcolumn (where=(libname="<yourlib>" and name="<thevartosum>"));
  call execute('proc sql;  
                  insert into sums 
                  set memname="'||strip(memname)||'",
                  res=(select sum('||strip(name)||') from <yourlib>.'||strip(memname)||'; 
                quit;');
run;

data final;
merge main sums;
by memname;
run;

What this does is select all dataset names from yourlib (note must by upcase).  Then it selects from those datasets the ones containing the variable in question and for each one generates an insert into a main table.  The main table is finally merged to these sums (as some datasets don't have the variable.  If you only need the one with the sums then you can drop main and the merge at the end.

Super User
Posts: 10,483

Re: Macro to sum different variable based on dataset name

It is very good idea to pass values into a macro as a parameter instead of having them magically appear in the body. For one thing you can get odd behaviors because this macro requires recompiling every time you use a different data set. If you have the name as parameter to pass then you could call this macro for muliple sets such as manually to test behavior/results with a small example set.

 

instead of calling the same code multiple times such as (%scan(&ds_nm,&i,' ') create a single temporary variable so the code is easier to read and under stand. And you might consider for the loop code

%do i=1 %to %sysfunc(countw(&ds_nm));
   %let TempDS = %scan(&ds_nm,&i);
   proc sql;
   create table work.cnt_&i as 
   select "&TempDS" as soa_ds ,
      sum((case when "&TempDS" like '%_type1_%' then var1
                when "&TempDS" like '%_type2_%' then var9
                when "&TempDS" like '%_type3_%' then var14
                when "&TempDS" like '%_type4_%' then var18
                ELSE var21 end)) as soa_sum
   from fds.&TempDS t1;
   quit;
%end;

But I think this seems like an awful complicated way to generate the equivalent of a rename statement.

 

Contributor
Posts: 60

Re: Macro to sum different variable based on dataset name

[ Edited ]

Thanks,

but it is giving me error :var1 is not present in other than type1 and var9 is not presnt in other than type2 ... so on

 

ERROR: The following columns were not found in the contributing tables

Solution
Monday
Super User
Posts: 10,483

Re: Macro to sum different variable based on dataset name


atul_desh wrote:

Thanks,

but it is giving me error :var1 is not present in other than type1 and var9 is not presnt in other than type2 ... so on

 

ERROR: The following columns were not found in the contributing tables


ANY variable listed in the Proc SQL code will have to exist, either created in the proc or from a contributing data set.

 

I have to assume at this point you did not have working base code before wandering down the macro path as you would have had the error. From the code shown I was assuming that variables variables var1, var9, var14 and var18 were in all of the data sets and you were just selecting which one to calculate. So you do not want a case statement. You have to do the comparison in pure macro code, or perhaps a look up of some time to generate something like:

select "&TempDS" as soa_ds , var1 as soa_sum

 

One approach might be:

%do i=1 %to %sysfunc(countw(&ds_nm));
   %let TempDS = %scan(&ds_nm,&i);
   /* look up desired variable*/
   %if %index (&TempDS,_type1_)>0 %then %let target= var1;
   %if %index (&TempDS,_type2_)>0 %then %let target= var9;
   %if %index (&TempDS,_type3_)>0 %then %let target= var14;
   %if %index (&TempDS,_type4_)>0 %then %let target= var18;

   proc sql;
   create table work.cnt_&i as 
   select "&TempDS" as soa_ds ,&target as soa_sum
   from fds.&TempDS t1;
   quit;
%end;

which will only work if the _type1_ and such appear in the name. You might want a check if target is blank after the look up to skip the SQL and provide a warning in your log. This may be needed if whatever creates your set names has a problem and you end up with _type1 instead. Which may happen if your name builder attempts to build something longer than 32 characters.

 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 116 views
  • 0 likes
  • 5 in conversation