BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lizwarr
Fluorite | Level 6

I have a data set containing ID, Visit Type, and Date.

 

IDVisitDate
1clinic20060505
1hospital20060506
1hospital20061217
2clinic20060301
2clinic20060305
2clinic20070503
2clinic20070506
2clinic20100505
3clinic20061112
4clinic20080103
4clinic20081012
4hospital20081227
5clinic20050325
5hospital20050412
5hospital20070510
5clinic20061010
5clinic20061231
5clinic20070125
6clinic20060718
6clinic20060817
6clinic20070918

 

Now based on this, I want to define "Index Date."

For patients who visited both hospital and clinic, I want the index date to be the first visit of hospital.

For patients who only visited clinics, I want the index date to be the first visit of clinic.

 

So the new table I want would look something like this

 

IDVisitDateIndex Date
1clinic2006050520060506
1hospital2006050620060506
1hospital2006121720060506
2clinic2006030120060301
2clinic2006030520060301
2clinic2007050320060301
2clinic2007050620060301
2clinic2010050520060301
3clinic2006111220061112
4clinic2008010320081227
4clinic2008101220081227
4hospital2008122720081227
5clinic2005032520050412
5hospital2005041220050412
5hospital2007051020050412
5clinic2006101020050412
5clinic2006123120050412
5clinic2007012520050412
6clinic2006071820060718
6clinic2006081720060718
6clinic2007091820060718

 

I tried using DO loop but seems like I just can't get it right.

How to solve this?!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
infile cards expandtabs;
input ID	Visit $	Date : yymmdd.;
format date yymmddn8.;
cards;
1	clinic	20060505
1	hospital	20060506
1	hospital	20061217
2	clinic	20060301
2	clinic	20060305
2	clinic	20070503
2	clinic	20070506
2	clinic	20100505
3	clinic	20061112
4	clinic	20080103
4	clinic	20081012
4	hospital	20081227
5	clinic	20050325
5	hospital	20050412
5	hospital	20070510
5	clinic	20061010
5	clinic	20061231
5	clinic	20070125
6	clinic	20060718
6	clinic	20060817
6	clinic	20070918
;

proc sort data=have out=temp;
by id descending visit date;
run;
proc sort data=temp out=temp1 nodupkey;
by id ;
run;
data want;
 merge have temp1(keep=id date rename=(date=index_date));
 by id;
run;

View solution in original post

8 REPLIES 8
ed_sas_member
Meteorite | Level 14

HI @lizwarr 

 

You can try the following code.

However, in your output, I don't understand why index_date for ID 3 should be 2006-03-01 and not 2006-11-12, and why index_date for ID 5 should not be 2005-04-12 for all the records?

Is it a typing mistake?

Best,

proc sql;

	create table want as
	
	/*For patients who only visited clinics*/
	select *, min(date) as index_date format=YYMMDD10.
	from have
	group by id
	having count(distinct visit) =1 and visit = "clinic"

	union corr all 

	/*For patients who visited both clinics and hospitals*/
	select b.*, a.index_date
	from (select distinct id, min(date) as index_date format=YYMMDD10.
		from (select *
			 from have
			 group by id
			 having count(distinct visit) =2 and visit in("clinic","hospital"))
		where visit = "hospital"
		group by id) as a
		inner join
		(select *
		from have
		group by id
		having count(distinct visit) =2 and visit in("clinic","hospital")) as b
		on a.id = b.id
		
	order by id, date;
quit;
lizwarr
Fluorite | Level 6
Yes, those are my terrible typos, since I was in a rush to get somewhere.. Sorry for the confusion. Thank you for your quick response! I tried the code you provided, but I keep getting the error code at the end saying "character expression requires a character format" for the line "order by id, date;" My ID is numeric variable, whereas the date is character..
ed_sas_member
Meteorite | Level 14

Hi @lizwarr 

 

Thank you for the clarification.

I thought date was a regular SAS date.

As it is actually character, you can remove 'format=YYMMDD10." as follows.

NB: there is no need to convert it in a valid SAS date as it is in the form YYMMDD, so the alphabetical order is OK to get the minimal value.

Best,

proc sql;

	create table want as
	
	/*For patients who only visited clinics*/

	select *, min(date) as index_date
	from have
	group by id
	having count(distinct visit) =1 and visit = "clinic"

	union corr all 

	/*For patients who visited both clinics and hospitals*/
	select b.*, a.index_date
	from (select distinct id, min(date) as index_date
		from (select *
			 from have
			 group by id
			 having count(distinct visit) =2 and visit in("clinic","hospital"))
		where visit = "hospital"
		group by id) as a
		inner join
		(select *
		from have
		group by id
		having count(distinct visit) =2 and visit in("clinic","hospital")) as b
		on a.id = b.id
		
	order by id, date;
quit;

 

Ksharp
Super User
data have;
infile cards expandtabs;
input ID	Visit $	Date : yymmdd.;
format date yymmddn8.;
cards;
1	clinic	20060505
1	hospital	20060506
1	hospital	20061217
2	clinic	20060301
2	clinic	20060305
2	clinic	20070503
2	clinic	20070506
2	clinic	20100505
3	clinic	20061112
4	clinic	20080103
4	clinic	20081012
4	hospital	20081227
5	clinic	20050325
5	hospital	20050412
5	hospital	20070510
5	clinic	20061010
5	clinic	20061231
5	clinic	20070125
6	clinic	20060718
6	clinic	20060817
6	clinic	20070918
;

proc sort data=have out=temp;
by id descending visit date;
run;
proc sort data=temp out=temp1 nodupkey;
by id ;
run;
data want;
 merge have temp1(keep=id date rename=(date=index_date));
 by id;
run;
lizwarr
Fluorite | Level 6
Ah! As simple as that!! Thank you so much!
novinosrin
Tourmaline | Level 20

Hi @lizwarr  The loop you mean perhaps could be the following. Also, I request you to kindly please review the contents of your description and make sure the expected output matches the description as mentioned by @ed_sas_member .  The below code took the Date as numeric. No big deal, just remove the format statement in the code if your date variable is actually character.



data have;
infile cards expandtabs;
input ID	Visit $	Date : yymmdd.;
format date yymmddn8.;
cards;
1	clinic	20060505
1	hospital	20060506
1	hospital	20061217
2	clinic	20060301
2	clinic	20060305
2	clinic	20070503
2	clinic	20070506
2	clinic	20100505
3	clinic	20061112
4	clinic	20080103
4	clinic	20081012
4	hospital	20081227
5	clinic	20050325
5	hospital	20050412
5	hospital	20070510
5	clinic	20061010
5	clinic	20061231
5	clinic	20070125
6	clinic	20060718
6	clinic	20060817
6	clinic	20070918
;

data want;
 _iorc_=.;
 do _n_=1 by 1 until(last.id);
  set have;
  by id visit notsorted;
  if first.visit then _iorc_+1;
  if not _h_date and visit='hospital' then _h_date=date;
  if not _c_date and visit='clinic' then _c_date=date;
 end;
 do _n_=1 to _n_;
  set have;
  if _iorc_>=2 then index_date=_h_date;
  else index_date=_c_date;
  output;
 end;
 format index_date yymmdd10.;
 drop _:;
run;

proc print noobs;run;
novinosrin
Tourmaline | Level 20

Generally elders/wise SAS veterans often advice that SAS functionality works best when you have dates as numeric SAS dates. This helps in computing intervals of various kinds etc for many kinds of analysis. 

So if numeric-->Proc SQL can be handy



proc sql;
create table want(drop=_n_) as
select *,min(ifn(not _n_,.,_n_)) as index_date format=yymmddn8.
from 
(select *, ifn(count(distinct Visit)=2,date*(visit='hospital'),date*(visit='clinic')) as _n_
from have 
group by id)
group by id
order by id,date;
quit;
Shmuel
Garnet | Level 18

Try next code:

proc sort data=have;
   by ID descending visit date;
run;
data want;
 set have;
  by ID;
      retain index_date;
      if first.ID then index_date = date;
run;

If need you can sort output to desired order - for example:

proc sort data=want; by ID date; run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 1665 views
  • 8 likes
  • 5 in conversation