Hello,
I need to create a new variable that flags last assay date prior to exposure date.
it seems simple, but I am unsure of the best method?
can you provide some suggestions?
thank you in advanced!
Sample data:
SUBJECT ASSAY_NAME ASSAY_DATE EXPOSURE_DATE WANT: BASELINE
001 Alanine Aminotransferase 2017-07-05T12:01 2017-07-19T09:55
001 Alanine Aminotransferase 2017-07-19T09:22 2017-07-19T09:55 Y
001 Albumin 2017-08-15T09:38 2017-07-19T09:55
001 Albumin 2017-08-30T09:09 2017-07-19T09:55
001 Albumin 2017-07-15T08:10 2017-07-19T09:55 Y
Are your dates numeric SAS dates?
Hmm so, the dates are character format in this example. I have a previous date vale in sas numeric format but not the time. However, I can create numeric sas date time values.
@mglogan wrote:
Hmm so, the dates are character format in this example. I have a previous date vale in sas numeric format but not the time. However, I can create numeric sas date time values.
If the dates are as in your post it doesn't matter if they are valid datetime values with an ISO format attached to make them print like that or just character strings in that ISO format. Either way they will sort the same since the lexical ordering of that string is the same as the numeric ordering of the dates they represent.
The logic is very simple
Let me know if you can understand the following
proc sql;
create table want(drop=dif) as
select *, ifn(EXPOSURE_DATE-ASSAY_DATE<0,.,EXPOSURE_DATE-ASSAY_DATE) as dif,ifc(min(calculated dif)=calculated dif,'Y',' ') as want
from have
group by subject,ASSAY_NAME ;
quit;
Here is an simple way. Just sort the data by descending date and flag the first one in the group.
proc sort data=have ;
by SUBJECT ASSAY_NAME EXPOSURE_DATE descending ASSAY_DATE ;
run;
data want ;
set have;
by SUBJECT ASSAY_NAME EXPOSURE_DATE ;
if first.expose_date then BASELINE='Y';
run;
Thank for the quick response! I'll try this method too.
Hi @mglogan Here is a test with taking sample as numeric sas dates as yymmdd10.
data have;
input SUBJECT ASSAY_NAME & $40. ASSAY_DATE : yymmdd10. EXPOSURE_DATE : yymmdd10.;
format ASSAY_DATE EXPOSURE_DATE yymmdd10.;
cards;
001 Alanine Aminotransferase 2017-07-05T12:01 2017-07-19T09:55
001 Alanine Aminotransferase 2017-07-19T09:22 2017-07-19T09:55 Y
001 Albumin 2017-08-15T09:38 2017-07-19T09:55
001 Albumin 2017-08-30T09:09 2017-07-19T09:55
001 Albumin 2017-07-15T08:10 2017-07-19T09:55
;
proc sql;
create table want(drop=dif) as
select *, ifn(EXPOSURE_DATE-ASSAY_DATE<0,.,EXPOSURE_DATE-ASSAY_DATE) as dif,ifc(min(calculated dif)=calculated dif,'Y',' ') as want
from have
group by subject,ASSAY_NAME ;
quit;
Feel free to to use an order by clause after group by to get the order the way want.
So I noticed that I will need to include time in the calculation, since there are assays performed on the day of exposure but prior to exposure time. In this case can I use the raw sas datetime value or do I need to apply a datetime format?
thank you!
No need to apply format. As long as your date or datetime value are numbers, the code will comfortably work
If your data are sorted by subject/assay_name/assay_date, the a single data step can do what you need:
data want (drop=nxt_:);
set have (keep=subject assay_name);
by subject assay_name ;
merge have
have (firstobs=2 keep=assay_date rename=(assay_date=nxt_assay_date));
if assay_date<exposure_date and
(last.assay_name=1 or nxt_assay_date>exposure_date)
then flag='Y';
run;
Hello everyone, thank you for your suggestions.
After looking at my data, it turns out that I will need to flag baseline records by subject, test (i.e. glucose, albumin etc.) and category (chemistry, hematology, urinalysis, serology etc.), Date values are in character format.
My output sample data is attached. The code I am working on is below, but it doesn't quite do what I want. It is flagging all dates prior to exposure by test category, but I only need the latest (or first.lbcat), and I have not been able to get it to flag correctly.
data lbflag;
set _lbflag;
by usubjid lbtest lbcat descending lbdtc;
retain flag;
if first.lbtest then flag=0;
*** flag test dates prior to exposure ;
if flag=0 and lbdtc <= rfxstdtc and lbstat ~='NOT DONE' then flag=1 ;
*** if test is prior to exposure, flag by lbcat ;
if flag=1 then do;
retain _flag ;
if first.lbcat then _flag=0;
if lbdtc <= rfxstdtc
then do ;
_flag=1 ;
lbblfl = 'Y' ;
end;
end;
run;
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.