04-08-2014 04:28 AM
Hi friends, I need to do the following and I was wondering whether I could write a macro with a proc sql loop inside or something like that:
- I have a data set in which one of the columns is a the name of a data set
- I need to do a left join in a proc sql with that data set in the column I mentioned to retrieve a date for each person in the original data set
So, the data set I have looks like this:
And I need to append a column to this dataset with a date for each person. And that date is located in the corresponding dataset.
Now, how can I do that?
Hope you can understand what I'm trying to do.
Thank you very much in advance,
04-08-2014 04:58 AM
Not exactly sure, perhaps some sample input and desired out data would help. And, the underlying requirement.
From the brief description it sounds like you are doing some low level programming - perhaps changing the data model would make your task easier.
04-08-2014 05:14 AM
Hi LinusH, thanks for your reply.
The thing is easy, maybe I didn't make myself understood.
The dataset I have looks like this:
So, I need to retrieve a date for each of the names, which is in the dataset specified in the column 'dataset'. The date for Mike is in the dataset 'Oct2011_dates'; the date for Susan is in the dataset 'Mar2010_dates', etc...
And at the end, my original dataset will look like this:
name dataset date
-------- ---------------------- ------------------
Mike Oct2011_dates 2010-11-28
John Nov2012_dates 2012-10-12
Susan Mar2010_dates 2009-03-26
Liz Oct2011_dates 2010-07-19
Hope it's clear now,
04-08-2014 05:59 AM
You can union them all together by SET,then do left join.
data have; input name $ dataset : $20.; cards; Mike Oct2011_dates John Nov2012_dates Susan Mar2010_dates Liz Oct2011_dates ; run; data Oct2011_dates; input date : $20. name $; cards; 2010-11-28 Mike ; run; data Nov2012_dates; input date : $20. name $; cards; 2012-10-12 John ; run; data x; length dsname ds $ 20; set Oct2011_dates Nov2012_dates indsname=dsname; ds=scan(dsname,-1,'.'); run; proc sql; create table want as select a.*,b.date from have as a left join x as b on upcase(dataset)= ds and a.name=b.name ; quit;
Message was edited by: xia keshan
04-08-2014 06:28 AM
The thing is that I was looking for a method to do it with a loop inside a macro, in such a way that I can pass the dataset name as an argument to the sql proc, because I have about 150.000 names (well, actually id numbers) and maybe 70 or 80 different datasets to look for the date :-)
04-08-2014 07:18 AM
OK. That be more simple.
%macro match(dsname=); data x; set &dsname ; retain ds "&dsname"; run; proc sql; create table want as select a.*,b.date from have as a left join x as b on dataset= ds and a.name=b.name ; quit; %mend match; %match(dsname=Oct2011_dates)
04-08-2014 08:30 AM
Yes, LinusH, but in this way I must call the macro as many times as different datasets (with dates) I have.
What I was trying to find is a way of forcing the macro to take the dataset itself each time, since I already know in which dataset to look for the date, for a given person.
Don't know, maybe I'm trying to complicate things, but i cannot figure out a way of doing it without writing loads of proc sqls.
Just trying to automatize it with a macro...
04-08-2014 07:55 AM
As I might slipped in my first answer, I'm more into the original suggestion. Fetching individual rows might be a useful technique when building OLTP /data entry-type applications - not ideal for data management/reporting. Seems much simpler to keep all data in the same table.
04-08-2014 08:54 AM
Almost, but not quite :-(
I'm running your macro but there are errors in the proc sql inside the loop saying that the macro references for dataset and name weren't resolved. I'm trying to figure it out.
04-09-2014 04:41 AM
Just to add another method avoiding macros and macro variables:
attrib calc_date format=$20.;
call execute('proc sql;
set CALC_DATE=(select DATE from WORK.'||strip(dataset)||' where NAME="'||strip(name)||'")
04-08-2014 09:11 AM
And I would use.
%macro match(list=); data x; length dsname ds $ 20; set &list indsname=dsname; ds=scan(dsname,-1,'.'); run; proc sql; create table want as select a.*,b.date format=yymmdd10. from have as a left join x as b on upcase(dataset)= ds and a.name=b.name ; quit; %mend match; data d; input dsn : $20.; cards; Oct2011_dates Nov2012_dates ; run; proc sql; select dsn into : list separated by ' ' from d; quit; %match(list=&list)