BookmarkSubscribeRSS Feed
mglogan
Obsidian | Level 7

Hello,

 

I need to create a new variable that flags last assay date prior to exposure date.

 

it seems simple, but I am unsure of the best method?

 

can you provide some suggestions?

 

thank you in advanced!

 

 

Sample data:

SUBJECT   ASSAY_NAME                      ASSAY_DATE          EXPOSURE_DATE   WANT: BASELINE

001             Alanine Aminotransferase     2017-07-05T12:01    2017-07-19T09:55
001             Alanine Aminotransferase     2017-07-19T09:22    2017-07-19T09:55     Y
001             Albumin                                 2017-08-15T09:38    2017-07-19T09:55
001             Albumin                                 2017-08-30T09:09    2017-07-19T09:55
001             Albumin                                 2017-07-15T08:10    2017-07-19T09:55     Y

 

11 REPLIES 11
novinosrin
Tourmaline | Level 20

Are your dates numeric SAS dates?

mglogan
Obsidian | Level 7

Hmm so, the dates are character format in this example. I have a previous date vale in sas numeric format but not the time. However, I can create numeric sas date time values.

Tom
Super User Tom
Super User

@mglogan wrote:

Hmm so, the dates are character format in this example. I have a previous date vale in sas numeric format but not the time. However, I can create numeric sas date time values.


If the dates are as in your post it doesn't matter if they are valid datetime values with an ISO format attached to make them print like that or just character strings in that ISO format. Either way they will sort the same since the lexical ordering of that string is the same as the numeric ordering of the dates they represent.

novinosrin
Tourmaline | Level 20

The logic is very simple

 

Let me know if you can understand the following

 

proc sql;
create table want(drop=dif) as
select *, ifn(EXPOSURE_DATE-ASSAY_DATE<0,.,EXPOSURE_DATE-ASSAY_DATE) as dif,ifc(min(calculated dif)=calculated dif,'Y',' ') as want
from have
group by subject,ASSAY_NAME	;
quit;
Tom
Super User Tom
Super User

Here is an simple way.  Just sort the data by descending date and flag the first one in the group.

proc sort data=have ;
  by SUBJECT ASSAY_NAME    EXPOSURE_DATE  descending ASSAY_DATE ;
run;

data want ;
  set have;
  by SUBJECT ASSAY_NAME    EXPOSURE_DATE ;
  if first.expose_date then BASELINE='Y';
run;
mglogan
Obsidian | Level 7

Thank for the quick response! I'll try this method too.

novinosrin
Tourmaline | Level 20

Hi @mglogan  Here is a test with taking sample as numeric sas dates as yymmdd10.

 

data have;
input SUBJECT   ASSAY_NAME   & $40.                   ASSAY_DATE  : yymmdd10.        EXPOSURE_DATE : yymmdd10.;  
format 	ASSAY_DATE EXPOSURE_DATE yymmdd10.;
cards;
001             Alanine Aminotransferase     2017-07-05T12:01    2017-07-19T09:55
001             Alanine Aminotransferase     2017-07-19T09:22    2017-07-19T09:55     Y
001             Albumin                                 2017-08-15T09:38    2017-07-19T09:55
001             Albumin                                 2017-08-30T09:09    2017-07-19T09:55
001             Albumin                                 2017-07-15T08:10    2017-07-19T09:55 
;

proc sql;
create table want(drop=dif) as
select *, ifn(EXPOSURE_DATE-ASSAY_DATE<0,.,EXPOSURE_DATE-ASSAY_DATE) as dif,ifc(min(calculated dif)=calculated dif,'Y',' ') as want
from have
group by subject,ASSAY_NAME	;
quit;

Feel free to to use an order by clause after group by to get the order the way want.

 

mglogan
Obsidian | Level 7

So I noticed that I will need to include time in the calculation, since there are assays performed on the day of exposure but prior to exposure time. In this case can I use the raw sas datetime value or do I need  to apply a datetime format? 

 

thank you!

novinosrin
Tourmaline | Level 20

No need to apply format. As long as your date or datetime value are numbers, the code will comfortably work

mkeintz
PROC Star

If your data are sorted by subject/assay_name/assay_date, the a single data step can do what you need:

 

data want (drop=nxt_:);
  set have (keep=subject assay_name);
  by subject assay_name ;

  merge have
        have (firstobs=2 keep=assay_date rename=(assay_date=nxt_assay_date));

  if assay_date<exposure_date and
     (last.assay_name=1 or nxt_assay_date>exposure_date)
  then flag='Y';
run;

 

  1.  The set statement only reads the BY variables  subject and assay_name.  This provides the automatic dummy variable last.assay_name which is used later to test whether the record-in-hand is the last record for that subject/assay_name combination.
  2.  The MERGE statement provide a way to simultaneously read the current record and the next record (the assay_date variable only, renamed to nxt_assay_date).  This allows a test to determine whether the current record is the last one with assay_date < exposure_date.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mglogan
Obsidian | Level 7

Hello everyone, thank you for your suggestions.

 

After looking at my data, it turns out that I will need to flag baseline records by subject, test (i.e. glucose, albumin etc.) and category (chemistry, hematology, urinalysis, serology etc.), Date values are in character format.

 

My output sample data is attached. The code I am working on is below, but it doesn't quite do what I want. It is flagging all dates prior to exposure by test category, but I only need the latest (or first.lbcat), and I have not been able to get it to flag correctly. 

 

data lbflag; 
  set _lbflag;
  by usubjid lbtest lbcat descending lbdtc;
  retain flag;
  if first.lbtest then flag=0;

  *** flag test dates prior to exposure ;
  if flag=0 and lbdtc <= rfxstdtc and lbstat ~='NOT DONE' then flag=1 ; 

  *** if test is prior to exposure, flag by lbcat ;
  if flag=1 then do;
  retain _flag ;
  if first.lbcat then _flag=0;
   if lbdtc <= rfxstdtc
	then do ;
	 _flag=1 ;
	 lbblfl = 'Y' ;
        end;
     end;
run;

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 3541 views
  • 0 likes
  • 4 in conversation