BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
sas21
Calcite | Level 5

I have two datasets.  One is a roster (one record per person) which I then used to pull their medical records.  I applied a case definition to their medical records and then kept their incident medical encounter.  So, the medical data has one record per person as well.  Everyone in the medical data is in the roster but not everyone in the roster is in the medical data.  

Examples of the data imports for the datasets below:

data medical_data;

    input team $ patientSSN $ Validation cy;

    datalines;

1234567891 1 2018

12345678912 1 2015

1234567893 1 2017

1234567894 1  2019

1234567895 1 2018

1234567896 1 2015

1234567897 1 2016

1234567898 1 2019

1234567899 1 2017

;

run;

 

data roster;

    input team $ SSN $country;

    datalines;

1234567891 qa

12345678912 af

1234567893 au

1234567894 qa

1234567895 af

1234567896 af

1234567897 bh

1234567898 qa

1234567899 qa

1234567818 qa

1234567819 bh

 

;

run;

An example of the code that I created to try to structure the data how I want:

 
proc sql;
create table roster_and_medical as
select *
from roster as a
left join medical_data as b
on a.ssn=b.patientssn ;
quit;
 
data want;
set roster_and_medical ; 
 
do i=2015 to 2020;
numerator=0;
denominator=0;
 
if (cy eq i) then
do;
numerator=1;
denominator=1;
end;
else
do;
if (i lt cy)  or (cy= .) then
do;
denominator=1;
end;
else;
end;
 
output;
end;
 
if cy=. then
do;
cy=i;
end;
 
run;
 

It is not structuring the data how I want it to look but it's close.  I want to count the incident encounter in the numerator and then count them in the denominator for years 2015-2020 up until their incident encounter or count them in the denominator for the entire timeframe if they don't have an incident encounter.  I need the cases to drop out of the denominator once they are no longer considered "at risk" because they have already developed the condition.  If someone could help with what I currently have going or if they have another idea that would work better or be more efficient, I would greatly appreciate it.  Thank you in advance.

 
An example how I want the data to look:
 
patientssnssnvalidationcy icountrynumerator denominator
12345678911234567891..2015qa01
12345678911234567891..2016qa01
12345678911234567891..2017qa01
12345678911234567891120182018qa11
12345678911234567891..2019qa00
12345678911234567891..2020qa00
 1234567818..2015qa01
 1234567818..2016qa01
 1234567818..2017qa01
 1234567818..2018qa01
 1234567818..2019qa01
 1234567818..2020qa01
1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

Your initial data steps did not work as intended, so I rewrote them - they are here, in case somebody else want to help with this:

data medical_data;
  length patientSSN $20;
    input patientSSN Validation cy;
    datalines;
1234567891 1 2018
12345678912 1 2015
1234567893 1 2017
1234567894 1  2019
1234567895 1 2018
1234567896 1 2015
1234567897 1 2016
1234567898 1 2019
1234567899 1 2017
;
run;
 
data roster;
  length SSN $20 country $2;
    input  SSN country 
    ;
    datalines;
1234567891 qa
12345678912 af
1234567893 au
1234567894 qa
1234567895 af
1234567896 af
1234567897 bh
1234567898 qa
1234567899 qa
1234567818 qa
1234567819 bh
;
run;

I rewrote your last data step to this:


data want;
  set roster_and_medical ; 
  numerator=0;
  denominator=missing(cy) or cy ge 2015;
  do i=2015 to 2020;
    if (cy eq i) then
      numerator=1;
    else if numerator=1 then do;
      numerator=0;
      denominator=0;
      end;
    output;
    end;
run;

which seems to give the result you want, as simple as I could make it.

View solution in original post

1 REPLY 1
s_lassen
Meteorite | Level 14

Your initial data steps did not work as intended, so I rewrote them - they are here, in case somebody else want to help with this:

data medical_data;
  length patientSSN $20;
    input patientSSN Validation cy;
    datalines;
1234567891 1 2018
12345678912 1 2015
1234567893 1 2017
1234567894 1  2019
1234567895 1 2018
1234567896 1 2015
1234567897 1 2016
1234567898 1 2019
1234567899 1 2017
;
run;
 
data roster;
  length SSN $20 country $2;
    input  SSN country 
    ;
    datalines;
1234567891 qa
12345678912 af
1234567893 au
1234567894 qa
1234567895 af
1234567896 af
1234567897 bh
1234567898 qa
1234567899 qa
1234567818 qa
1234567819 bh
;
run;

I rewrote your last data step to this:


data want;
  set roster_and_medical ; 
  numerator=0;
  denominator=missing(cy) or cy ge 2015;
  do i=2015 to 2020;
    if (cy eq i) then
      numerator=1;
    else if numerator=1 then do;
      numerator=0;
      denominator=0;
      end;
    output;
    end;
run;

which seems to give the result you want, as simple as I could make it.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 759 views
  • 0 likes
  • 2 in conversation