BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
OJohn_StaT
Obsidian | Level 7

Hello ,

This is my dataset;

data Member_Data;
input VISIT_DATE :date9. MEMBER_ID $ VISIT_YEAR VISIT_STATUS;
format  VISIT_DATE :date9.;
datalines;
06NOV2013 AA10450Z 2013 0 
05JUN2014 AA10450Z 2014 1 
26JUN2014 AA10450Z 2014 1 
21AUG2014 AA10450Z 2014 1 
18SEP2014 AA10450Z 2014 1 
03DEC2014 AA10450Z 2014 1 
11MAR2015 AA10450Z 2015 1 
15APR2015 AA10450Z 2015 1 
07OCT2015 AA10450Z 2015 1 
13OCT2015 AA10450Z 2015 1 
26OCT2015 AA10450Z 2015 1 
09NOV2015 AA10450Z 2015 1 
07DEC2015 AA10450Z 2015 1 
21DEC2015 AA10450Z 2015 1 
19APR2017 AA10450Z 2017 1 
10SEP2018 AA10450Z 2018 0 
15APR2019 AA10450Z 2019 1 
;
run;



data Member_Data_New;
    set Member_Data;
    by member_id visit_year;
    delta_days=visit_date-lag(visit_date);
    if first.visit_year then do;
        delta_days=.;
        counter=0;
    end;
    counter+1;
run;

OJohn_StaT_2-1657806169467.png

 

And I am trying to create a new variable for total number of visit for each calendar year. Could you please help me for this issue because I am struggling. Here is a sample of the wanted dataset(below). Thank you for your time and your help in advance.

 

OJohn_StaT_0-1657806103753.png

 

1 ACCEPTED SOLUTION

Accepted Solutions
tarheel13
Rhodochrosite | Level 12

I think this is what you want.

data Member_Data;
input VISIT_DATE :date9. MEMBER_ID $ VISIT_YEAR VISIT_STATUS;
format  VISIT_DATE :date9.;
datalines;
06NOV2013 AA10450Z 2013 0 
05JUN2014 AA10450Z 2014 1 
26JUN2014 AA10450Z 2014 1 
21AUG2014 AA10450Z 2014 1 
18SEP2014 AA10450Z 2014 1 
03DEC2014 AA10450Z 2014 1 
11MAR2015 AA10450Z 2015 1 
15APR2015 AA10450Z 2015 1 
07OCT2015 AA10450Z 2015 1 
13OCT2015 AA10450Z 2015 1 
26OCT2015 AA10450Z 2015 1 
09NOV2015 AA10450Z 2015 1 
07DEC2015 AA10450Z 2015 1 
21DEC2015 AA10450Z 2015 1 
19APR2017 AA10450Z 2017 1 
10SEP2018 AA10450Z 2018 0 
15APR2019 AA10450Z 2019 1 
;
run;

*only keep records that had disease;
proc sort data=member_data out=member2;
    by member_id visit_date visit_year;
    where visit_status=1;
run;

data member3;
    set member2;
    by member_id  visit_year;
    format prev_visit date9.;
    retain prev_visit;
    if first.visit_year then do;
        counter=0;
        delta_days=.;
        prev_visit=visit_date;
    end;
    counter+1;
    *only keep first and second visits;
    if counter in (1,2);
    
    *calculate number of days between first and second visit;
    if counter=2 then do;
        delta_days=visit_date-prev_visit;
    end;
run;

proc sort data=member3;
    by member_id visit_year descending delta_days;
run;

*remove duplicates by year and member_id;
data member4;
    set member3;
    by member_id visit_year;
    if first.visit_year;
run;

*count # of visits in same year;
proc sql number;
    create table num_visits as 
    select member_id, visit_year, count(distinct visit_date) as visit_number
        from member2 
        group by member_id, visit_year;
quit;

proc sql;
    select m.member_id, m.visit_year, m.visit_status, m.counter, m.delta_days, n.visit_number
        from member4 as m
        join num_visits as n 
        on m.member_id=n.member_id and n.visit_year=m.visit_year
        order by m.member_id, m.visit_year;
quit;

View solution in original post

9 REPLIES 9
PaigeMiller
Diamond | Level 26

You need to tell us the logic used.

 

Please explain in the final data set, how DELTA_DAYS and COUNTER and VISIT_STATUS is computed from the original data set.

 

Please explain in the final data set, why 2013 and 2017 don't appear.

--
Paige Miller
OJohn_StaT
Obsidian | Level 7

Hello PaigeMiller,

 

Thank you for your time and help. Sorry there was a mistake and I fixed it. Here is my dataset.

 

data Member_Data;
input VISIT_DATE :date9. MEMBER_ID $ VISIT_YEAR VISIT_STATUS;
format  VISIT_DATE :date9.;
datalines;
06NOV2013 AA10450Z 2013 0 
05JUN2014 AA10450Z 2014 1 
26JUN2014 AA10450Z 2014 1 
21AUG2014 AA10450Z 2014 1 
18SEP2014 AA10450Z 2014 1 
03DEC2014 AA10450Z 2014 1 
11MAR2015 AA10450Z 2015 1 
15APR2015 AA10450Z 2015 1 
07OCT2015 AA10450Z 2015 1 
13OCT2015 AA10450Z 2015 1 
26OCT2015 AA10450Z 2015 1 
09NOV2015 AA10450Z 2015 1 
07DEC2015 AA10450Z 2015 1 
21DEC2015 AA10450Z 2015 1 
19APR2017 AA10450Z 2017 1 
10SEP2018 AA10450Z 2018 0 
15APR2019 AA10450Z 2019 1 
;
run;



data want;
    set Member_Data;
    by member_id visit_year;
    delta_days=visit_date-lag(visit_date);
    if first.visit_year then do;
        delta_days=.;
        counter=0;
    end;
    counter+1;
	do;
	if VISIT_STATUS=0 then counter=0;else counter=counter;
end;
run;

I created the following dataset using the above code;

 

OJohn_StaT_0-1657810796625.png

 

The data set below is the data set I want to create;

OJohn_StaT_1-1657811016104.png

What I would like to find are those who visit at least once a year while showing symptoms of the disease. If anyone has visited while showing symptoms more than once during the year, then how many times did they visit in total and the difference in days between their first visit and their second visit.

 

delta_days=day difference between the current visit date and the previous visit date

Visit_status =showing symptoms of illness when visiting

counter=Calculation of the number of visits while showing symptoms of the disease in the same year

 

Thank you so much

PaigeMiller
Diamond | Level 26

I think this still leaves some questions. I don't see the logic you are using to create the 2nd table.

 

I don't know what you mean by "while showing symptoms", where is that in the input data set?

 

How do you get the final numbers for COUNTER in the output data set? What is the logic? What is the math?

--
Paige Miller
OJohn_StaT
Obsidian | Level 7

Hello PaigeMiller,

 

I created second dataset by hand which means deleted some rows to show what I need to have as a final dataset. So Second dataset is what I want to create and I could not created the logic so that is why I asked for help.

while showing symptoms=visit_status. 

This data for doctor visit. Each row is the for each visit. Sometimes patient has symptoms sometimes it does not have it. So if visit_status=1 then patient has symptoms during the dr visit. 

Thank you

PaigeMiller
Diamond | Level 26

@OJohn_StaT wrote:

Hello PaigeMiller,

 

I created second dataset by hand which means deleted some rows to show what I need to have as a final dataset. So Second dataset is what I want to create and I could not created the logic so that is why I asked for help.

while showing symptoms=visit_status. 

This data for doctor visit. Each row is the for each visit. Sometimes patient has symptoms sometimes it does not have it. So if visit_status=1 then patient has symptoms during the dr visit. 

Thank you


You must know what you want, or you couldn't have created the 2nd data set. So, I am not asking for the logic to create the 2nd data set in SAS code, I am asking for you to describe the steps to create the 2nd data set in words. If you can't explain that logic in words, then code cannot be written. I still don't know how COUNTER in the 2nd data set is created from the original data.

--
Paige Miller
tarheel13
Rhodochrosite | Level 12

I think this is what you want.

data Member_Data;
input VISIT_DATE :date9. MEMBER_ID $ VISIT_YEAR VISIT_STATUS;
format  VISIT_DATE :date9.;
datalines;
06NOV2013 AA10450Z 2013 0 
05JUN2014 AA10450Z 2014 1 
26JUN2014 AA10450Z 2014 1 
21AUG2014 AA10450Z 2014 1 
18SEP2014 AA10450Z 2014 1 
03DEC2014 AA10450Z 2014 1 
11MAR2015 AA10450Z 2015 1 
15APR2015 AA10450Z 2015 1 
07OCT2015 AA10450Z 2015 1 
13OCT2015 AA10450Z 2015 1 
26OCT2015 AA10450Z 2015 1 
09NOV2015 AA10450Z 2015 1 
07DEC2015 AA10450Z 2015 1 
21DEC2015 AA10450Z 2015 1 
19APR2017 AA10450Z 2017 1 
10SEP2018 AA10450Z 2018 0 
15APR2019 AA10450Z 2019 1 
;
run;

*only keep records that had disease;
proc sort data=member_data out=member2;
    by member_id visit_date visit_year;
    where visit_status=1;
run;

data member3;
    set member2;
    by member_id  visit_year;
    format prev_visit date9.;
    retain prev_visit;
    if first.visit_year then do;
        counter=0;
        delta_days=.;
        prev_visit=visit_date;
    end;
    counter+1;
    *only keep first and second visits;
    if counter in (1,2);
    
    *calculate number of days between first and second visit;
    if counter=2 then do;
        delta_days=visit_date-prev_visit;
    end;
run;

proc sort data=member3;
    by member_id visit_year descending delta_days;
run;

*remove duplicates by year and member_id;
data member4;
    set member3;
    by member_id visit_year;
    if first.visit_year;
run;

*count # of visits in same year;
proc sql number;
    create table num_visits as 
    select member_id, visit_year, count(distinct visit_date) as visit_number
        from member2 
        group by member_id, visit_year;
quit;

proc sql;
    select m.member_id, m.visit_year, m.visit_status, m.counter, m.delta_days, n.visit_number
        from member4 as m
        join num_visits as n 
        on m.member_id=n.member_id and n.visit_year=m.visit_year
        order by m.member_id, m.visit_year;
quit;
OJohn_StaT
Obsidian | Level 7

Thank you Tarhell13,

 

I think this is it. It work and I will try it with a bigger sample. I will let you now ASAP after I applied it.

tarheel13
Rhodochrosite | Level 12

Okay, let me know if it works for you!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 9 replies
  • 1203 views
  • 4 likes
  • 3 in conversation