BookmarkSubscribeRSS Feed
ccc1431
Calcite | Level 5

Looking to do something like this more efficiently:

proc sql;
create table newtable as
select names,
a.category1-b.category1 as category1,
a.category2-b.category2 as category2,
a.category3-b.category3 as category3,
...
a.category100-b.category100 as category100
from a.tablea as a left join b.tableb as b
on a.names=b.names;
quit;

I'm trying to create just one table and decrease the amount of lines I take up in my code. Figured that macro do loops may help, but not sure how. Please help! Thank you!!

7 REPLIES 7
PaigeMiller
Diamond | Level 26

Do this in a DATA step, with ARRAYs. Do-loops are available in a DATA step. Do-loops do not exist in SQL. The whole idea of processing lots of variables through some mathematical or statistical operation SCREAMS ... do this in a DATA step. Do not use SQL for processing lots of variables through some mathematical or statistical operation. Get out of your mind the idea of doing lots of math or statistics in SQL. Just don't go there.

 

data newtable;
    merge tablea tableb(rename=(category1-category100=bcategory1-bcategory100));
    by names;
    array a category1-category100;
    array b bcategory1-bcategory100;
    array diff diff1-diff100;
    do i=1 to dim(a);
         diff(i)=a(i)-b(i);
    end;
    drop i;
run;

Assumes both data sets are sorted by variable called NAMES. Code is UNTESTED as you did not provide sample data.

--
Paige Miller
s_lassen
Meteorite | Level 14

Unlike @PaigeMiller , I do not necessarily think it is a bad idea to do some simple additions or subtractions in SQL. What you want to do can be done with a macro, e.g.:

%macro category(n);
  %local i;
  %do i=1 %to &n;
    /* I put the comma first, as there is a comma after the key variable anyway */
    ,a.category&i-b.category&i as category&i
    %end;
%mend;
proc sql;
create table newtable as
select names /* no comma here, comes with the macro */
%category(100) 
from a.tablea as a left join b.tableb as b
on a.names=b.names;
quit;

Or you can create the expression you want using SQL SELECT INTO:

proc sql noprint;
  select catx(' ','a.'!!a.name,'-','b.'!!a.name,'as',a.name) into :categories separated by ','
  from
    dictionary.columns(where=(libname='A' and memname='TABLEA')) a join
    dictionary.columns(where=(libname='B' and memname='TABLEB')) b
    on a.name=b.name and a.name like 'CATEGORY%';
  create table newtable as
    select names, &categories
  from a.tablea as a left join b.tableb as b
      on a.names=b.names;
quit;
PaigeMiller
Diamond | Level 26

But I didn't say I have a problem doing simple additions or subtractions in SQL. I said I have a problem doing LOTS OF simple additions or subtractions in SQL.

 

Macros are another valid approach, although I think users would still be better off thinking of DATA steps for this type of task, rather than SQL with macros.

--
Paige Miller
Quentin
Super User

Generally in this situation of super-wide tables the easiest way to decrease the amount of code you have to write is to restructure your data.  

 

Currently it looks like you have one row per name, with variables category1-category100.

 

If you can transpose it to make your data have 100 rows per name, with three variables: name, categoryID (1-100), category, then your code should become much easier to write. And as an added bonus, when someone adds category101, you won't have to update your code.

BASUG is hosting free webinars Next up: Mike Sale presenting Data Warehousing with SAS April 10 at noon ET. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
PaigeMiller
Diamond | Level 26



@Quentin wrote:

Generally in this situation of super-wide tables the easiest way to decrease the amount of code you have to write is to restructure your data.  

 

Currently it looks like you have one row per name, with variables category1-category100.

 

If you can transpose it to make your data have 100 rows per name, with three variables: name, categoryID (1-100), category, then your code should become much easier to write. And as an added bonus, when someone adds category101, you won't have to update your code.


All excellent points! Long beats wide!

--
Paige Miller
Ksharp
Super User

That would be easy for SAS/IML  if it is one to one or one to many matched.

 

data tablea;
input name $ category1 category2 category3;
cards;
a 1 2 2
a 1 2 2
b 2 3 3
b 2 3 3
c 5 6 3
d 2 7 3
d 2 7 3
;
data tableb;
input name $ category1 category2 category3;
cards;
d 32 37 3
b 22 23 3
e 22 23 3
;

proc iml;
use tablea;
read all var{name} ;
read all var _num_ into cat1[c=vnames];
close;
use tableb;
read all var{name} into _name;
read all var _num_ into cat2;
close;
c=j(nrow(cat1),ncol(cat1),.);
do i=1 to nrow(_name);
 idx=loc(name=_name[i]);
 if ^ isempty(idx) then c[idx,]=repeat(cat2[i,],ncol(idx));
end;
want=cat1-c;
create want from name want[c=('name'||vnames)];
append from name want;
close;
quit;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 384 views
  • 5 likes
  • 6 in conversation