Hi,
My data looks like this.
PATIENT_ID | EVENT_ID | VISIT_AGE | VISIT_DT | SEX | Genital Exposure | Oral Exposure | Rectal Exposure | Sex With Female | Sex With Male | Sex with Transgender |
5855 | 57927 | 36 | 9/12/2017 | Male | Y | B | 0 | 1 | 0 | 0 |
5855 | 58140 | 36 | 9/22/2017 | Male | 0 | 0 | 0 | |||
7490 | 54781 | 38 | 4/17/2017 | Female | Y | B | R | 0 | 1 | 0 |
7490 | 54923 | 38 | 4/24/2017 | Female | 0 | 0 | 0 | 0 | 0 | 0 |
8013 | 54361 | 42 | 3/27/2017 | Female | Y | B | 0 | 0 | 1 | 0 |
8013 | 54499 | 42 | 4/3/2017 | Female | 0 | 0 | 0 | |||
8013 | 54642 | 42 | 4/10/2017 | Female | 0 | 0 | 0 |
Please note that same patient(patient_id=5855) in their next visit has incomplete info for example in the table above sex with female info is missing or says 0 in second visit on 09/22. i want it to 1 as it was on 09/12 visit. Similarly, patient id 8013 visited three times but sex with male info is 1 in only on first visit, i want it to be 1 all three visits. How do i code to get the data i want?
@Dhana18 wrote:
Yes, if in any visit sex with male or female is recorded as 1, and no info on other visits all visits should be 1
Here's one way with a simpler data set:
data have; input patient_id swf swm; datalines; 5855 1 0 5855 0 0 5855 0 0 7777 0 1 7777 0 0 7777 0 0 8888 0 0 8888 1 0 8888 0 0 8888 0 1 ; proc sql; create table want as select a.patient_id, b.swf, b.swm from have as a left join (select patient_id, max(swf)as swf, max(swm) as swm from have group by patient_id) as b on a.patient_id =b.patient_id ; quit;
HAVE is your existing data. you would include the other variables other than the ones that are to be reset using the a.<variable> on the first select.
The
(select patient_id, max(swf)as swf, max(swm) as swm from have group by patient_id) as b
finds the largest value for the variables involved for each patient and refers the set of the results as b. This merges the base data with the matching largest.
Please note the use of a data step to provide example data which can be used to test code.
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
@Dhana18 wrote:
Hi,
My data looks like this.
PATIENT_ID EVENT_ID VISIT_AGE VISIT_DT SEX Genital Exposure Oral Exposure Rectal Exposure Sex With Female Sex With Male Sex with Transgender 5855 57927 36 9/12/2017 Male Y B 0 1 0 0 5855 58140 36 9/22/2017 Male 0 0 0 7490 54781 38 4/17/2017 Female Y B R 0 1 0 7490 54923 38 4/24/2017 Female 0 0 0 0 0 0 8013 54361 42 3/27/2017 Female Y B 0 0 1 0 8013 54499 42 4/3/2017 Female 0 0 0 8013 54642 42 4/10/2017 Female 0 0 0 Please note that same patient(patient_id=5855) in their next visit has incomplete info for example in the table above sex with female info is missing or says 0 in second visit on 09/22. i want it to 1 as it was on 09/12 visit. Similarly, patient id 8013 visited three times but sex with male info is 1 in only on first visit, i want it to be 1 all three visits. How do i code to get the data i want?
Does your rule only involve maintaining a value of 1 if it was in the First visit? Or if any of the previous visits had a 1?
Consider if you have a patient_id that has 0 for "sex with male" for the first 3 visits and then has a 1 on the 4th visit. Would the 5th and later visits have to be 1? Would the first 3 have to be recoded to 1 as well?
@Dhana18 wrote:
Yes, if in any visit sex with male or female is recorded as 1, and no info on other visits all visits should be 1
So just take the max over the patient.
@Dhana18 wrote:
Yes, if in any visit sex with male or female is recorded as 1, and no info on other visits all visits should be 1
Here's one way with a simpler data set:
data have; input patient_id swf swm; datalines; 5855 1 0 5855 0 0 5855 0 0 7777 0 1 7777 0 0 7777 0 0 8888 0 0 8888 1 0 8888 0 0 8888 0 1 ; proc sql; create table want as select a.patient_id, b.swf, b.swm from have as a left join (select patient_id, max(swf)as swf, max(swm) as swm from have group by patient_id) as b on a.patient_id =b.patient_id ; quit;
HAVE is your existing data. you would include the other variables other than the ones that are to be reset using the a.<variable> on the first select.
The
(select patient_id, max(swf)as swf, max(swm) as swm from have group by patient_id) as b
finds the largest value for the variables involved for each patient and refers the set of the results as b. This merges the base data with the matching largest.
Please note the use of a data step to provide example data which can be used to test code.
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
You could possibly try to retain the variables Sex_With_Female Sex_With_Male Sex_with_Transgender
The below code is a untested one
data want;
set have;
by patient_id;
retain Sex_With_Female2 Sex_With_Male2 Sex_with_Transgender2;
array var1(3) Sex_With_Female Sex_With_Male Sex_with_Transgender;
array var2(3) Sex_With_Female2 Sex_With_Male2 Sex_with_Transgender2;
do i = 1 to dim(var1);
if var1(i)=0 then var1(i)=.;
if first.patient_id then var2(i)=.;
if var1(i) ne . then var2(i)=var2(i);
end;
run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.