BookmarkSubscribeRSS Feed
knveraraju91
Barite | Level 11

Dear,

 

In my dataset, the following is present. I need assign a base line flag to all OBS done before treatmentstartdate and the test must be done at least after once after treatment date.

 

ID         test                         testdate                          treatmendate               period

1           a                          01AUG2015                     15AUG2016                     1

1           b                          02AUG2015                     15AUG2016                      1

1           c                          03AUG2015                     15AUG2016                      1

1           d                          04AUG2015                     15AUG2016                      1

1           e                          11AUG2015                     15AUG2016                       1

1           f                          12AUG2015                     15AUG2016                       1

1           a                          20AUG2015                     15AUG2016                       2

1           b                          21AUG2015                     15AUG2016                       2

1           c                          23AUG2015                     15AUG2016                        2

1           d                          22AUG2015                     15AUG2016                       2

 

output needed;

ID         test                         testdate                          treatmenstarttdate       period                   flag

1           a                          01AUG2015                     15AUG2016                     1                          Y

1           b                          02AUG2015                     15AUG2016                      1                         Y

1           c                          03AUG2015                     15AUG2016                      1                         Y

1           d                          04AUG2015                     15AUG2016                      1                         Y

1           e                          11AUG2015                     15AUG2016                       1

1           f                          12AUG2015                     15AUG2016                       1

1           a                          20AUG2015                     15AUG2016                       2

1           b                          21AUG2015                     15AUG2016                       2

1           c                          23AUG2015                     15AUG2016                        2

1           d                          22AUG2015                     15AUG2016                       2

 

The flag not assigned to OBS 5 and 6 because the test are not repeated after treatment date. Please help. Thanks.

 

 

9 REPLIES 9
Astounding
PROC Star

I'm assuming that a test ON the treatment date is the same as a test AFTER.  But it's not that difficult to change it up.

 

Also assuming that the dates are SAS dates:

 

proc sort data=have;

by id test testdate;

run;

 

data flag_these:

set have;

by id test;

where testdate >= treatmentdate;

if last.test;

keep id test;

run;

 

data want;

merge have flag_these (in=set_flag);

by id test;

if set_flag and testdate < treatmentdate then flag='Y';

run;

 

You can always re-sort the data set at the end if that's a requirement.

 

It wouldn't surprise me if there are some clever SQL programmers out there that can make this happen it one step.

mkeintz
PROC Star

@knveraraju91

 

I don't think this will do what you intend.  Testdate (all in 2015) is never >= treatmendate (all in 2016)

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
s_lassen
Meteorite | Level 14

It is easy to do in a single step:

 

proc sort;
  by id test testdate;
run;


data want;
  merge have(where=(testdate>treatmendate) in=after) have;
  by id test;
  if first.test and after then flag='Y';
run;

 

Just remember to put the un-subsetted table last in the MERGE statement, so that the variables overwrite the ones from the subset.

 

Ksharp
Super User

I don't understand how to define " test are not repeated after treatment date"

and why period=2 don't have flag=Y ?

knveraraju91
Barite | Level 11

Sorry. The Treatmentdate 15AUG2015. 

mkeintz
PROC Star

I take it the treatmenstartdate in your output is the earliest value of treatmendate for a given id/test.  In your sample, it doesn't matter since you treatmendate is constant, but this program assume it can change, even within a given id/test.

 

This program needs your dataset to be sorted by any of:  (id/testdate  id/test/testdate testdate/id).  It's a single pass through the data, and it uses two hash objects (think lookup tables): one with the last testdate for each id/test, and one with the earliest treatmendate for each id/test.

 


data want;
  set have (drop=treatmendate)
      have (obs=0 
            keep=testdate treatmendate 
            rename=(testdate=last_tdate treatmendate=treatmenstartdate));

  if _n_=1 then do;
    /* Get lookup table for last testdate, by id/test */
    declare hash last_td (dataset:'have (keep=id test testdate rename=(testdate=last_tdate))'
                         ,duplicate:'Replace');
	  last_td.definekey('id','test');
	  last_td.definedata('last_tdate');
	  last_td.definedone();

	/* Get lookup table for first treatmendate, b id/test */
	declare hash strtdate (dataset:'have (keep=id test treatmendate rename=(treatmendate=treatmenstartdate))');
	  strtdate.definekey('id','test');
	  strtdate.definedata('treatmenstartdate');
	  strtdate.definedone();
  end;

  last_td.find();  /* retrieve last_tdate*/
  strtdate.find(); /* retrieve treatmenstartdate*/
  if testdate<=treatmenstartdate and testdate<last_tdate then flag='Y';
  drop last_tdate;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
Yeah. If you don't have big table ,you could try SQL, otherwise try Hash Table.


data have;
input ID         test  $          testdate : date9.      treatmendate : date9.   period;
format testdate treatmendate date9.;
cards;
1           a                          01AUG2015                     15AUG2015                     1
1           b                          02AUG2015                     15AUG2015                      1
1           c                          03AUG2015                     15AUG2015                      1
1           d                          04AUG2015                     15AUG2015                      1
1           e                          11AUG2015                     15AUG2015                       1
1           f                          12AUG2015                     15AUG2015                       1
1           a                          20AUG2015                     15AUG2015                       2
1           b                          21AUG2015                     15AUG2015                       2
1           c                          23AUG2015                     15AUG2015                        2
1           d                          22AUG2015                     15AUG2015                       2
;
run;
proc sql;
create table want as
select *,case when testdate le treatmendate and
exists(select * from have 
where test=a.test and testdate gt treatmendate) then 'Y' 
else ' ' end as flag
 from have as a;
quit;

rogerjdeangelis
Barite | Level 11
Flag earliest test date when a test has at least one test date after treatment end date

Flag earliest date when a test has at
least one test date after treatment end date

I sorted the original have datset by test testdate

HAVE
====

Up to 40 obs from havsrt total obs=10

Obs    TEST     TESTDATE    TREATMENDATE

  1     a      01AUG2015     15AUG2015
  2     a      20AUG2015     15AUG2015

  3     b      02AUG2015     15AUG2015
  4     b      21AUG2015     15AUG2015

  5     c      03AUG2015     15AUG2015
  6     c      23AUG2015     15AUG2015

  7     d      04AUG2015     15AUG2015
  8     d      22AUG2015     15AUG2015

  9     e      11AUG2015     15AUG2015

 10     f      12AUG2015     15AUG2015

WANT
====

Up to 40 obs WORK.WANT total obs=10

Obs    TEST    TESTDATE    TREATMENDATE    FLAG

  1     a        20301         20315        Y    Flag eariest date when test 'a' has at
  2     a        20320         20315             least one testdate after treatment end date

  3     b        20302         20315        Y
  4     b        20321         20315

  5     c        20303         20315        Y
  6     c        20323         20315

  7     d        20304         20315        Y
  8     d        20322         20315

  9     e        20311         20315
 10     f        20312         20315


 WORKING CODE
===============

 do until (last.test);
    if testdate > treatmendate then flg='Y';
 end;
 do until (last.test);
    if first.test and flg='Y' then flag='Y';
    else flag=' ';
 end;


FULL SOLUTION
=============

data have;
input test $ testdate date10. treatmendate date10.;
cards4;
a 01AUG2015 15AUG2015
b 02AUG2015 15AUG2015
c 03AUG2015 15AUG2015
d 04AUG2015 15AUG2015
e 11AUG2015 15AUG2015
f 12AUG2015 15AUG2015
a 20AUG2015 15AUG2015
b 21AUG2015 15AUG2015
c 23AUG2015 15AUG2015
d 22AUG2015 15AUG2015
;;;;
run;quit;


libname wrk "%sysfunc(pathname(work))";
proc sort data=wrk.have out=havsrt;
by test testdate;
run;quit;

data wrk.want;
 retain flg "N";
 do until (last.test);
    set havsrt;
    by test;
    if testdate > treatmendate then flg="Y";
 end;
 do until (last.test);
    set havsrt;
    by test;
    if first.test and flg="Y" then flag="Y";
    else flag=" ";
    output;
 end;
 flg="N";
 drop flg;
run;quit;

rogerjdeangelis
Barite | Level 11

 

Flag earliest test date when a test has at least one test date after treatment end date

Flag earliest date when a test has at
least one test date after treatment end date

I sorted the original have datset by test testdate

HAVE
====

Up to 40 obs from havsrt total obs=10

Obs    TEST     TESTDATE    TREATMENDATE

  1     a      01AUG2015     15AUG2015
  2     a      20AUG2015     15AUG2015

  3     b      02AUG2015     15AUG2015
  4     b      21AUG2015     15AUG2015

  5     c      03AUG2015     15AUG2015
  6     c      23AUG2015     15AUG2015

  7     d      04AUG2015     15AUG2015
  8     d      22AUG2015     15AUG2015

  9     e      11AUG2015     15AUG2015

 10     f      12AUG2015     15AUG2015

WANT
====

Up to 40 obs WORK.WANT total obs=10

Obs    TEST    TESTDATE    TREATMENDATE    FLAG

  1     a        20301         20315        Y    Flag eariest date when test 'a' has at
  2     a        20320         20315             least one testdate after treatment end date

  3     b        20302         20315        Y
  4     b        20321         20315

  5     c        20303         20315        Y
  6     c        20323         20315

  7     d        20304         20315        Y
  8     d        20322         20315

  9     e        20311         20315
 10     f        20312         20315


 WORKING CODE
===============

 do until (last.test);
    if testdate > treatmendate then flg='Y';
 end;
 do until (last.test);
    if first.test and flg='Y' then flag='Y';
    else flag=' ';
 end;


FULL SOLUTION
=============

data have;
input test $ testdate date10. treatmendate date10.;
cards4;
a 01AUG2015 15AUG2015
b 02AUG2015 15AUG2015
c 03AUG2015 15AUG2015
d 04AUG2015 15AUG2015
e 11AUG2015 15AUG2015
f 12AUG2015 15AUG2015
a 20AUG2015 15AUG2015
b 21AUG2015 15AUG2015
c 23AUG2015 15AUG2015
d 22AUG2015 15AUG2015
;;;;
run;quit;


libname wrk "%sysfunc(pathname(work))";
proc sort data=wrk.have out=havsrt;
by test testdate;
run;quit;

data wrk.want;
 retain flg "N";
 do until (last.test);
    set havsrt;
    by test;
    if testdate > treatmendate then flg="Y";
 end;
 do until (last.test);
    set havsrt;
    by test;
    if first.test and flg="Y" then flag="Y";
    else flag=" ";
    output;
 end;
 flg="N";
 drop flg;
run;quit;

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1269 views
  • 4 likes
  • 6 in conversation