I'm doing a bit of research on different methods to match/merge data in SAS. I came across the following example in a paper. Is this a DOW loop?
DATA alldata0;
SET ae;
DROP _: match; * Drop temporary variables;
match=0;
DO i=1 TO xnobs;
SET patdata (rename=(subject=_subject)) NOBS=xnobs POINT=i;
IF subject=_subject THEN DO;
match=1; OUTPUT; END;
END;
IF match=0 THEN DO; * Output AE record if no match in CM;
CALL MISSING(trt_code); OUTPUT; END;
RUN;
xnobs is populated with the number of observations at data step compile time. So it is already known at the very beginning of the execution phase of the data step.
No, it is not a DOW loop. It is rather a set/set point= combination, which accomplishes a many-to-many join.
In a DOW loop you should see something similar to this:
do until(condition);
set dataset;
...
end;
DOW loop itself is not a technique for joining or merging. It is rather reading and analyzing data in blocks (the block is sometimes the whole dataset). It can be slightly faster then the standard data step, because it skips a couple of "maintenance tasks" that data step is doing by default at the beginning of each loop (setting variables to missing).
But you can put a merge statement inside the loop if you want. Then it merges datasets.
almost your do loop has a problem as the xnobs may not yet be known at that moment ( set is not executed yet).
Could be solved by a getvarn or a while construct.
Thanks,
I didn't write the code, i copied it directly from a paper. Ultimately I am trying to figure out if a DOW loop can be used to merge two data sets and what the code would look like. I have read a few papers on the DOW loop but merging two data sets is not explicitly mentioned.
xnobs is populated with the number of observations at data step compile time. So it is already known at the very beginning of the execution phase of the data step.
No, it is not a DOW loop. It is rather a set/set point= combination, which accomplishes a many-to-many join.
In a DOW loop you should see something similar to this:
do until(condition);
set dataset;
...
end;
DOW loop itself is not a technique for joining or merging. It is rather reading and analyzing data in blocks (the block is sometimes the whole dataset). It can be slightly faster then the standard data step, because it skips a couple of "maintenance tasks" that data step is doing by default at the beginning of each loop (setting variables to missing).
But you can put a merge statement inside the loop if you want. Then it merges datasets.
If you are examining merging techniques it may be time to look into Proc SQL and various types of joins as they may be easier to understand and maintain.
Is it a DOW loop? Not in my opinion. DOW loops should not be using point= (but you may encounter other opinions about that).
Is it good code? Definitely not. It may work (xnobs will not be a problem, since nobs= variables are populated during the compilation phase of the DATA step), but it will run through 100% of the observations in PATDATA for each observation in AE. Moreover, it loses some functionality because it cannot detect when PATDATA contains multiple observations for the same SUBJECT. A MERGE would automatically print a note in the log if both data sets contained multiple observations for a SUBJECT:
data alldata0;
merge ae (in=in1) patdata (in=in2);
by subject;
if in1;
treat = in2;
run;
My interpretation: the author was either playing around or was shooting for job security. Or maybe wanted to achieve a many-to-many match but didn't know SQL.
Depending on what you are trying to document, you may be interested in searching for look-up techniques. That would include formats and hash tables, not just merge.
Good luck.
Thanks,
I already looked into Format and Hash tables and I am very familiar with SQL. I'm just exploring different techniques for my general knowledge. Thanks for the input
OK. For general knowledge, here's an example of a DOW loop that use AE data. The intent is to output all AEs for a subject that had at least one "Serious" AE.
data want;
do until (last.subject);
set have;
by subject;
if severity='Serious' then flag=1;
end;
do until (last.subject);
set have;
by subject;
if flag=1 then output;
end;
run;
The first loop determines whether the subject had a serious AE. Then the second loop reads the same records and outputs if needed.
for merging joining there are some older principles as "balance line" they can be nicely programmed using this kind of technique.
You can solve some problems with this that are not easily possible in a other way.
Recently there was some question on getting a record back/forward within the same dataset with a timed value condition.
The number of records that could be there are unknown, it just a condition to find
.
I came up with this one a merge on the own dataset moving the readpointer up/backward :
data fixings300;
set fixings nobs=fn_nobs ;
Retain eventlk 1 ; check=1 ;
do while (eventlk < fn_nobs & check ) ;
set fixings (rename=(bidprc=bidprc5 offerprc=offerprc5m eventid=event5m dtcur=dtcur5m) ) point=eventlk end=end5m ;
if ( dtcur+300 > dtcur5m ) then eventlk=eventlk+1 ; else check=0;
end;
run;
Another more a merge. but same kind of question. Just a condition to find not knowing any fixed merged value.
I had the problem of the value not being defined here when validating the code. Getting back tot the attrn function.
%let dsid = %sysfunc(open(prices));
%let pr_nobs =%sysfunc(attrn(&dsid,NOBS));
%let rc = %sysfunc(close(&dsid));
%put prices has &pr_nobs observations;
/* using assumption oredered events - prices */
data results;
set Events nobs=evn_nobs ;
Retain eventlk 1 ;
check=1 ;
do while (eventlk < &pr_nobs & check ) ;
set Prices (rename=(Ticker=pTicker TradeDate=PriceDate TradeTime=Pricetime ) ) point=eventlk end=endprc ;
if ( Ticker > PTicker | ( Ticker = PTicker & TradeDate > PriceDate) |( Ticker = PTicker & TradeDate = PriceDate & TradeTime > Pricetime ) ) then eventlk=eventlk+1 ;
else check=0;
end;
if not endprc then eventlk=eventlk-1 ; /* one back that has passed set equation */
set Prices (rename=(Ticker=pTicker TradeDate=PriceDate TradeTime=Pricetime ) ) point=eventlk end=endprc ;
IF ( Ticker=PTicker & TradeDate = PriceDate) then output; /* */
run;
proc print;
run;
http://support.sas.com/resources/papers/proceedings09/038-2009.pdf
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.