Looking to do something like this more efficiently:
proc sql;
create table newtable as
select names,
a.category1-b.category1 as category1,
a.category2-b.category2 as category2,
a.category3-b.category3 as category3,
...
a.category100-b.category100 as category100
from a.tablea as a left join b.tableb as b
on a.names=b.names;
quit;
I'm trying to create just one table and decrease the amount of lines I take up in my code. Figured that macro do loops may help, but not sure how. Please help! Thank you!!
Do this in a DATA step, with ARRAYs. Do-loops are available in a DATA step. Do-loops do not exist in SQL. The whole idea of processing lots of variables through some mathematical or statistical operation SCREAMS ... do this in a DATA step. Do not use SQL for processing lots of variables through some mathematical or statistical operation. Get out of your mind the idea of doing lots of math or statistics in SQL. Just don't go there.
data newtable;
merge tablea tableb(rename=(category1-category100=bcategory1-bcategory100));
by names;
array a category1-category100;
array b bcategory1-bcategory100;
array diff diff1-diff100;
do i=1 to dim(a);
diff(i)=a(i)-b(i);
end;
drop i;
run;
Assumes both data sets are sorted by variable called NAMES. Code is UNTESTED as you did not provide sample data.
Unlike @PaigeMiller , I do not necessarily think it is a bad idea to do some simple additions or subtractions in SQL. What you want to do can be done with a macro, e.g.:
%macro category(n);
%local i;
%do i=1 %to &n;
/* I put the comma first, as there is a comma after the key variable anyway */
,a.category&i-b.category&i as category&i
%end;
%mend;
proc sql;
create table newtable as
select names /* no comma here, comes with the macro */
%category(100)
from a.tablea as a left join b.tableb as b
on a.names=b.names;
quit;
Or you can create the expression you want using SQL SELECT INTO:
proc sql noprint;
select catx(' ','a.'!!a.name,'-','b.'!!a.name,'as',a.name) into :categories separated by ','
from
dictionary.columns(where=(libname='A' and memname='TABLEA')) a join
dictionary.columns(where=(libname='B' and memname='TABLEB')) b
on a.name=b.name and a.name like 'CATEGORY%';
create table newtable as
select names, &categories
from a.tablea as a left join b.tableb as b
on a.names=b.names;
quit;
But I didn't say I have a problem doing simple additions or subtractions in SQL. I said I have a problem doing LOTS OF simple additions or subtractions in SQL.
Macros are another valid approach, although I think users would still be better off thinking of DATA steps for this type of task, rather than SQL with macros.
If you have a one-to-one relationship, you should also consider PROC COMPARE.
Generally in this situation of super-wide tables the easiest way to decrease the amount of code you have to write is to restructure your data.
Currently it looks like you have one row per name, with variables category1-category100.
If you can transpose it to make your data have 100 rows per name, with three variables: name, categoryID (1-100), category, then your code should become much easier to write. And as an added bonus, when someone adds category101, you won't have to update your code.
@Quentin wrote:
Generally in this situation of super-wide tables the easiest way to decrease the amount of code you have to write is to restructure your data.
Currently it looks like you have one row per name, with variables category1-category100.
If you can transpose it to make your data have 100 rows per name, with three variables: name, categoryID (1-100), category, then your code should become much easier to write. And as an added bonus, when someone adds category101, you won't have to update your code.
All excellent points! Long beats wide!
That would be easy for SAS/IML if it is one to one or one to many matched.
data tablea;
input name $ category1 category2 category3;
cards;
a 1 2 2
a 1 2 2
b 2 3 3
b 2 3 3
c 5 6 3
d 2 7 3
d 2 7 3
;
data tableb;
input name $ category1 category2 category3;
cards;
d 32 37 3
b 22 23 3
e 22 23 3
;
proc iml;
use tablea;
read all var{name} ;
read all var _num_ into cat1[c=vnames];
close;
use tableb;
read all var{name} into _name;
read all var _num_ into cat2;
close;
c=j(nrow(cat1),ncol(cat1),.);
do i=1 to nrow(_name);
idx=loc(name=_name[i]);
if ^ isempty(idx) then c[idx,]=repeat(cat2[i,],ncol(idx));
end;
want=cat1-c;
create want from name want[c=('name'||vnames)];
append from name want;
close;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.