BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Jhan1
Calcite | Level 5

Hello,

 

I'm working with a wide dataset. Each row represents a person and includes treatment records as well as corresponding dates in which treatments are performed. My data essentially looks like this:

 

ID        treatment1 treatment2 treatment3 treatment4 treatment5  date1   date2    date3    date4    date5

John           A              B                   B              B               A            Jan1    Jan15   Jan16   Jan17    Feb1

David          A              A                   B             B                A            Jan1   March15 Jan5   Jan5      Feb15

 

What I need to do is for each person is to select the latest date in which he received treatment A (date5 for John and date2 for David in this case). I thought a simple array with a do-loop would do the trick but I realized that the dates1-5 are not necessarily recorded in chronological order. Any help would be greatly appreciated!

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Further improvised:

 

data have;
infile cards truncover;
input (ID        treatment1 treatment2 treatment3 treatment4 treatment5) ($) (date1   date2    date3    date4    date5) (:date9.);
format date: date9.;
cards;
John           A              B                   B              B               A            01Jan2018    15Jan2018   16Jan2018   17Jan2018    01Feb2018
David          A              A                   B             B                A            01Jan2018   15Mar2018 05Jan2018   05Jan2018      15Feb2018
;

data want;
set have;
array j(*) treatment:;
array t(*) date:;
do _n_=dim(j) to 1 by -1;
if j(_n_)= 'A' then latest_date=max(latest_date,t(_n_));
end;
format latest_date date9.;
keep id latest_Date;
run;

View solution in original post

10 REPLIES 10
novinosrin
Tourmaline | Level 20

 

@Jhan1 wrote:

Hello,

 

I'm working with a wide dataset. Each row represents a person and includes treatment records as well as corresponding dates in which treatments are performed. My data essentially looks like this:

perhaps-->

 

data want;

set have;

latest_date=max(of date:  );

format latest_date your_format;

run;

 

Reeza
Super User

@novinosrin only for treatment A though. 

It's probably easiest to transpose it to a long format. Then you can sort and get it pretty quickly. 

 

 

novinosrin
Tourmaline | Level 20

Thank you @Reeza let me go back to starb and ask for refund 

novinosrin
Tourmaline | Level 20
data have;
infile cards truncover;
input (ID        treatment1 treatment2 treatment3 treatment4 treatment5) ($) (date1   date2    date3    date4    date5) (:date9.);
format date: date9.;
cards;
John           A              B                   B              B               A            01Jan2018    15Jan2018   16Jan2018   17Jan2018    01Feb2018
David          A              A                   B             B                A            01Jan2018   15Mar2018 05Jan2018   05Jan2018      15Feb2018
;

data want;
set have;
array j(*) treatment:;
array t(*) date:;
do _n_=dim(j) to 1 by -1;
if j(_n_) ne 'A' then call missing(j(_n_),t(_n_));
end;
latest_date=max(of t(*));
format latest_date date9.;
keep id latest_Date;
run;
CharlotteCain
Quartz | Level 8

Great solution. Super fast performance!!!

novinosrin
Tourmaline | Level 20

Further improvised:

 

data have;
infile cards truncover;
input (ID        treatment1 treatment2 treatment3 treatment4 treatment5) ($) (date1   date2    date3    date4    date5) (:date9.);
format date: date9.;
cards;
John           A              B                   B              B               A            01Jan2018    15Jan2018   16Jan2018   17Jan2018    01Feb2018
David          A              A                   B             B                A            01Jan2018   15Mar2018 05Jan2018   05Jan2018      15Feb2018
;

data want;
set have;
array j(*) treatment:;
array t(*) date:;
do _n_=dim(j) to 1 by -1;
if j(_n_)= 'A' then latest_date=max(latest_date,t(_n_));
end;
format latest_date date9.;
keep id latest_Date;
run;
Jhan1
Calcite | Level 5

Just wanted say thank-you for your input!

I've tried different ways you've suggested and others (including transposing the data) - this suggestion fit my data the best...I had other criteria I had to consider and working with a very large dataset and wanted to avoid transposing data if I could although that was an excellent suggestion as well.  

 

Thanks to others who responded - it was first time posting on this community and I'm glad I chose to!

 

 

AlokR
Fluorite | Level 6

Its easy to solve this problem for specific hard coded treatment using array. But for a generic solution I will suggest to use proc transpose. Something like this should work:

Data Have;
	infile cards truncover;
	input (ID Treatment1 Treatment2 Treatment3 Treatment4 Treatment5) ($) (Date1 Date2 Date3 Date4 Date5) (:date9.);
	format date: date9.;
	cards;
John A B B B A '01Jan2018'd '15Jan2018'd '16Jan2018'd '17Jan2018'd '01Feb2018'd
David A A B B A '01Jan2018'd '15Mar2018'd '05Jan2018'd '05Jan2018'd '05Feb2018'd
;
run;

proc sort data=Have;
	by ID;
run;

proc transpose data=Have out=Treatments(Drop=_Name_ Rename=(Treatment1=Treatment)) Prefix=Treatment;
	by ID;
	Var Treatment:;
run;

proc transpose data=Have out=Dates(Drop=_Name_ rename=(Date1=Date)) Prefix=Date;
	by ID;
	Var Date:;
run;

Data Have_Transposed;
	merge Treatments Dates;
	by ID;
run;

proc sql;
	Select ID, Treatment,max(Date) as Max_Date format date9. from Have_Transposed 
		group by ID, Treatment;
quit;
novinosrin
Tourmaline | Level 20

Some generic fun for finding latest_Date for all treatments:

 

data have;
infile cards truncover;
input (ID        treatment1 treatment2 treatment3 treatment4 treatment5) ($) (date1   date2    date3    date4    date5) (:date9.);
format date: date9.;
cards;
John           A              B                   B              B               A            01Jan2018    15Jan2018   16Jan2018   17Jan2018    01Feb2018
David          A              A                   B             B                A            01Jan2018   15Mar2018 08Jan2018   05Jan2018      15Feb2018
;

data want;
if _n_=1 then do;
if 0 then set have;
length treat $8 dt 8;
format dt latest_date date9.;
dcl hash H (ordered: "A",multidata:'y') ;
   h.definekey  ("treat") ;
   h.definedata ("treat","dt") ;
   h.definedone () ;
   call missing(treat,dt);
end;
set have ;
array j(*) treatment1-treatment5;
array t(*) date1-date5;
do i=1 to dim(t);
	rc=h.add(key:j(i),data:j(i),data:t(i));
end; 
do i=1 to dim(j);
	rc=h.check(key:j(i));
	if rc=0 then do;
	do _iorc_=h.find(key:j(i)) by 0 while(_iorc_=0);
	latest_date=max(latest_date,dt);
	_iorc_=h.find_next();
	end;
	output;
	h.remove(key:j(i));
	call missing(latest_date,dt);
	end;
end;
keep id treat latest_date;
run;
novinosrin
Tourmaline | Level 20
data have;
infile cards truncover;
input (ID        treatment1 treatment2 treatment3 treatment4 treatment5) ($) (date1   date2    date3    date4    date5) (:date9.);
format date: date9.;
cards;
John           A              B                   B              B               A            01Jan2018    15Jan2018   16Jan2018   17Jan2018    01Feb2018
David          A              A                   B             B                A            01Jan2018   15Mar2018 08Jan2018   05Jan2018      15Feb2018
;

data temp;
set have;
array j(*) treatment1-treatment5;
array t(*) date1-date5;
do i=1 to dim(t);
treatment=j(i);
dt=t(i);
output;
end;
keep id treatment dt;
format dt date9.;
run;

proc sql;
create table want as
select id,treatment,max(dt) as latest_date format=date9.
from temp
group by id,treatment;
quit;

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 3202 views
  • 3 likes
  • 5 in conversation