BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Jhan1
Calcite | Level 5

Hello,

 

I'm working with a wide dataset. Each row represents a person and includes treatment records as well as corresponding dates in which treatments are performed. My data essentially looks like this:

 

ID        treatment1 treatment2 treatment3 treatment4 treatment5  date1   date2    date3    date4    date5

John           A              B                   B              B               A            Jan1    Jan15   Jan16   Jan17    Feb1

David          A              A                   B             B                A            Jan1   March15 Jan5   Jan5      Feb15

 

What I need to do is for each person is to select the latest date in which he received treatment A (date5 for John and date2 for David in this case). I thought a simple array with a do-loop would do the trick but I realized that the dates1-5 are not necessarily recorded in chronological order. Any help would be greatly appreciated!

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Further improvised:

 

data have;
infile cards truncover;
input (ID        treatment1 treatment2 treatment3 treatment4 treatment5) ($) (date1   date2    date3    date4    date5) (:date9.);
format date: date9.;
cards;
John           A              B                   B              B               A            01Jan2018    15Jan2018   16Jan2018   17Jan2018    01Feb2018
David          A              A                   B             B                A            01Jan2018   15Mar2018 05Jan2018   05Jan2018      15Feb2018
;

data want;
set have;
array j(*) treatment:;
array t(*) date:;
do _n_=dim(j) to 1 by -1;
if j(_n_)= 'A' then latest_date=max(latest_date,t(_n_));
end;
format latest_date date9.;
keep id latest_Date;
run;

View solution in original post

10 REPLIES 10
novinosrin
Tourmaline | Level 20

 

@Jhan1 wrote:

Hello,

 

I'm working with a wide dataset. Each row represents a person and includes treatment records as well as corresponding dates in which treatments are performed. My data essentially looks like this:

perhaps-->

 

data want;

set have;

latest_date=max(of date:  );

format latest_date your_format;

run;

 

Reeza
Super User

@novinosrin only for treatment A though. 

It's probably easiest to transpose it to a long format. Then you can sort and get it pretty quickly. 

 

 

novinosrin
Tourmaline | Level 20

Thank you @Reeza let me go back to starb and ask for refund 

novinosrin
Tourmaline | Level 20
data have;
infile cards truncover;
input (ID        treatment1 treatment2 treatment3 treatment4 treatment5) ($) (date1   date2    date3    date4    date5) (:date9.);
format date: date9.;
cards;
John           A              B                   B              B               A            01Jan2018    15Jan2018   16Jan2018   17Jan2018    01Feb2018
David          A              A                   B             B                A            01Jan2018   15Mar2018 05Jan2018   05Jan2018      15Feb2018
;

data want;
set have;
array j(*) treatment:;
array t(*) date:;
do _n_=dim(j) to 1 by -1;
if j(_n_) ne 'A' then call missing(j(_n_),t(_n_));
end;
latest_date=max(of t(*));
format latest_date date9.;
keep id latest_Date;
run;
CharlotteCain
Quartz | Level 8

Great solution. Super fast performance!!!

novinosrin
Tourmaline | Level 20

Further improvised:

 

data have;
infile cards truncover;
input (ID        treatment1 treatment2 treatment3 treatment4 treatment5) ($) (date1   date2    date3    date4    date5) (:date9.);
format date: date9.;
cards;
John           A              B                   B              B               A            01Jan2018    15Jan2018   16Jan2018   17Jan2018    01Feb2018
David          A              A                   B             B                A            01Jan2018   15Mar2018 05Jan2018   05Jan2018      15Feb2018
;

data want;
set have;
array j(*) treatment:;
array t(*) date:;
do _n_=dim(j) to 1 by -1;
if j(_n_)= 'A' then latest_date=max(latest_date,t(_n_));
end;
format latest_date date9.;
keep id latest_Date;
run;
Jhan1
Calcite | Level 5

Just wanted say thank-you for your input!

I've tried different ways you've suggested and others (including transposing the data) - this suggestion fit my data the best...I had other criteria I had to consider and working with a very large dataset and wanted to avoid transposing data if I could although that was an excellent suggestion as well.  

 

Thanks to others who responded - it was first time posting on this community and I'm glad I chose to!

 

 

AlokR
Fluorite | Level 6

Its easy to solve this problem for specific hard coded treatment using array. But for a generic solution I will suggest to use proc transpose. Something like this should work:

Data Have;
	infile cards truncover;
	input (ID Treatment1 Treatment2 Treatment3 Treatment4 Treatment5) ($) (Date1 Date2 Date3 Date4 Date5) (:date9.);
	format date: date9.;
	cards;
John A B B B A '01Jan2018'd '15Jan2018'd '16Jan2018'd '17Jan2018'd '01Feb2018'd
David A A B B A '01Jan2018'd '15Mar2018'd '05Jan2018'd '05Jan2018'd '05Feb2018'd
;
run;

proc sort data=Have;
	by ID;
run;

proc transpose data=Have out=Treatments(Drop=_Name_ Rename=(Treatment1=Treatment)) Prefix=Treatment;
	by ID;
	Var Treatment:;
run;

proc transpose data=Have out=Dates(Drop=_Name_ rename=(Date1=Date)) Prefix=Date;
	by ID;
	Var Date:;
run;

Data Have_Transposed;
	merge Treatments Dates;
	by ID;
run;

proc sql;
	Select ID, Treatment,max(Date) as Max_Date format date9. from Have_Transposed 
		group by ID, Treatment;
quit;
novinosrin
Tourmaline | Level 20

Some generic fun for finding latest_Date for all treatments:

 

data have;
infile cards truncover;
input (ID        treatment1 treatment2 treatment3 treatment4 treatment5) ($) (date1   date2    date3    date4    date5) (:date9.);
format date: date9.;
cards;
John           A              B                   B              B               A            01Jan2018    15Jan2018   16Jan2018   17Jan2018    01Feb2018
David          A              A                   B             B                A            01Jan2018   15Mar2018 08Jan2018   05Jan2018      15Feb2018
;

data want;
if _n_=1 then do;
if 0 then set have;
length treat $8 dt 8;
format dt latest_date date9.;
dcl hash H (ordered: "A",multidata:'y') ;
   h.definekey  ("treat") ;
   h.definedata ("treat","dt") ;
   h.definedone () ;
   call missing(treat,dt);
end;
set have ;
array j(*) treatment1-treatment5;
array t(*) date1-date5;
do i=1 to dim(t);
	rc=h.add(key:j(i),data:j(i),data:t(i));
end; 
do i=1 to dim(j);
	rc=h.check(key:j(i));
	if rc=0 then do;
	do _iorc_=h.find(key:j(i)) by 0 while(_iorc_=0);
	latest_date=max(latest_date,dt);
	_iorc_=h.find_next();
	end;
	output;
	h.remove(key:j(i));
	call missing(latest_date,dt);
	end;
end;
keep id treat latest_date;
run;
novinosrin
Tourmaline | Level 20
data have;
infile cards truncover;
input (ID        treatment1 treatment2 treatment3 treatment4 treatment5) ($) (date1   date2    date3    date4    date5) (:date9.);
format date: date9.;
cards;
John           A              B                   B              B               A            01Jan2018    15Jan2018   16Jan2018   17Jan2018    01Feb2018
David          A              A                   B             B                A            01Jan2018   15Mar2018 08Jan2018   05Jan2018      15Feb2018
;

data temp;
set have;
array j(*) treatment1-treatment5;
array t(*) date1-date5;
do i=1 to dim(t);
treatment=j(i);
dt=t(i);
output;
end;
keep id treatment dt;
format dt date9.;
run;

proc sql;
create table want as
select id,treatment,max(dt) as latest_date format=date9.
from temp
group by id,treatment;
quit;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1939 views
  • 3 likes
  • 5 in conversation