BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ck3
Calcite | Level 5 ck3
Calcite | Level 5

Hi, in the following data I have the first seven variables on the left from ID to FLAGSPEC. I want to make the last four variables on the right starting with EP_N. The records are sorted by ascending ID and DATE. One row per ID-DATE is permitted. A person does not have multiple rows for the same date.

 

The data are event rows for coding into episodes. People can have more than one episode.

A new episode starts,
(1) when a person's very first data row is encountered, or
(2) when, for the same person, there are >60 days between one row's date and the next row's date.

 
 

IDDATEDIF_DATEA_TYPEFLAGANYFLAGREGFLAGSPECEP_NEP_TYPEEP_TIMEEP_FREQ
106Jan2017.011010403
110Feb201735000010403
115Feb20175000010403
111May20178501102011
118Sep201713001103172
125Sep2017710003172
104Dec20177011014111
202Jul2016.01101172
209Jul2016710001172

 

 

I have these:
ID: identifies people, two here.
DATE: is the service date.
DIF_DATE: is the interval from one date to the next, for the same person.
A_TYPE: is a dichotomous qualitative variable.
FLAGANY: codes an episode's first row. If the episode only has one row then it's also the last row.
FLAGREG: if it's the episode's first row and A_TYPE=0, then FLAGREG=1, otherwise FLAGREG=0.
FLAGSPEC: if it's the episode's first row and A_TYPE=1, then FLAGSPEC=1, otherwise FLAGSPEC=0.

I want to make these:
EP_N: an episode counter. It writes the same number to each episode's row(s); increments on the row when the person's next episode starts (when FLAGANY=1); and resets to 1 when encountering a new person.


EP_TYPE: characterizes the episode as regular or special. If all rows in the episode have A_TYPE=0, then EP_TYPE=0. If any row in the episode has A_TYPE=1, then EP_TYPE=1.

 

EP_TIME: the interval between the date of the episode's first and last rows, in days. If the episode is one row then EP_TIME=1.


EP_FREQ: the number of rows in the episode, written to each episode row.

 

I think that making EP_N is a key step because it would be a BY-group variable with ID. Thanks for any suggestions.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@novinosrin

 

See what happens in your code if you replace

select *,max(A_TYPE=1) as EP_TYPE,ifn(intck('days',min(date),max(date))=0,1,intck('days',min(date),max(date))) as EP_TIME,count(EP_N) as EP_FREQ

with

 

select *,max(A_TYPE=1) as EP_TYPE,Range(date) as EP_TIME,count(EP_N) as EP_FREQ

 

Range is max(value) - min(value) and returns 0 if all the values are the same (one date)

View solution in original post

5 REPLIES 5
Astounding
PROC Star

You are absolutely on the right path ... create EP_N first, then go back and create the rest.  Here's an easy way.

 

data want;

set have;

by id;

if first.id then ep_n = 1;

else if dif_date > 60 then ep_n + 1;

run;

 

Technically, you could compute DIF_DATE as you go ... it wouldn't need to be part of the incoming data.  But it's easy when DIF_DATE is already there.

novinosrin
Tourmaline | Level 20
data have;
input ID	DATE :date7.	DIF_DATE	A_TYPE	FLAGANY	FLAGREG	FLAGSPEC;
format date date7.;
datalines;
1	6-Jan-17	.	0	1	1	0
1	10-Feb-17	35	0	0	0	0
1	15-Feb-17	5	0	0	0	0
1	11-May-17	85	0	1	1	0
1	18-Sep-17	130	0	1	1	0
1	25-Sep-17	7	1	0	0	0
1	4-Dec-17	70	1	1	0	1
2	2-Jul-16	.	0	1	1	0
2	9-Jul-16	7	1	0	0	0
;

data temp;
if 0 then set have;
EP_N=0;
do until(last.id);
set have;
by id date;
if FLAGANY then EP_N+1;
output;
end;
run;

proc sql;
create table want as
select *,max(A_TYPE=1) as EP_TYPE,ifn(intck('days',min(date),max(date))=0,1,intck('days',min(date),max(date))) as EP_TIME,count(EP_N) as EP_FREQ
from temp
group by id,EP_N;
quit;
ballardw
Super User

@novinosrin

 

See what happens in your code if you replace

select *,max(A_TYPE=1) as EP_TYPE,ifn(intck('days',min(date),max(date))=0,1,intck('days',min(date),max(date))) as EP_TIME,count(EP_N) as EP_FREQ

with

 

select *,max(A_TYPE=1) as EP_TYPE,Range(date) as EP_TIME,count(EP_N) as EP_FREQ

 

Range is max(value) - min(value) and returns 0 if all the values are the same (one date)

novinosrin
Tourmaline | Level 20

@ballardw My oh My, Thank you sir. I envy your striking presence of mind. Jeez, why didn't i think of it. Thanks again for touch class catch. Simply neat......Cheers!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1738 views
  • 0 likes
  • 4 in conversation