Hi,
I have the following data with these variables. For some IDs, the race/ethnicity differs across admissions when it is supposed to be the same.
ID_person | Admit Date | RaceEthnicity |
1234 | 1/03/2019 | Non-Hispanic Black |
1234 | 2/06/2019 | Non-Hispanic White |
3421 | 5/14/2020 | Hispanic |
3421 | 3/21/2021 | Non-Hispanic White |
4536 | 6/4/2020 | Unknown |
To standardize the race/ethnicity information, I decided to make the race/ethnicity for each ID the same as it was at the earliest admission date. How do I code this?
I need the data to be like this:
ID_person | Admit Date | RaceEthnicity | Raceeth_new |
1234 | 1/03/2019 | Non-Hispanic Black | Non-Hispanic Black |
1234 | 2/06/2019 | Non-Hispanic White | Non-Hispanic Black |
3421 | 5/14/2020 | Hispanic | Hispanic |
3421 | 3/21/2021 | Non-Hispanic White | Hispanic |
4536 | 6/4/2020 | Unknown | Unknown |
Assuming your data is sorted by ID_person and Admit_Date:
data want;
    set have;
    by id_person;
    /* carry the value from the first (earliest) row of each ID down the group */
    retain raceeth_new;
    if first.id_person then raceeth_new=raceethnicity;
run;
The code is untested because I cannot test code against screen captures, only against SAS data sets. From now on, please provide your data as working SAS data step code, which you can type in yourself or generate by following these instructions.
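For reference, the sample table from the question could be supplied as data step code along these lines. This is only a sketch: the variable names, the `:$20.` length, and the `mmddyy10.` informat are assumptions based on the table as posted, not on the real data set.

```sas
/* Sample data typed in from the table in the question.
   Variable names, lengths and the date informat are assumptions. */
data have;
    infile datalines dlm='|' truncover;
    input id_person admit_date :mmddyy10. raceethnicity :$20.;
    format admit_date mmddyy10.;
    datalines;
1234|1/03/2019|Non-Hispanic Black
1234|2/06/2019|Non-Hispanic White
3421|5/14/2020|Hispanic
3421|3/21/2021|Non-Hispanic White
4536|6/4/2020|Unknown
;

/* BY-group processing needs the data sorted by ID, then by admission date */
proc sort data=have;
    by id_person admit_date;
run;
```

Reading the date with an informat makes it a true SAS date, so the sort puts the earliest admission first within each ID.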
Hi, this worked great! But what if the earliest admission date has 'Unknown' race/ethnicity? How do I code it to take the race/ethnicity from the next admission date? Thank you!
Delete the rows with missing RACEETHNICITY and then run my code.
I cannot delete the rows because other variables in them hold important information. Is there any other way?
Delete the rows, run my code, keep one row per ID, and then merge the resulting Raceeth_new back into the original data set by ID. That way no rows are lost.
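The steps just described might look roughly like the following. It is an untested sketch that assumes HAVE is already sorted by id_person and admit_date, and that unknown values are stored as the text 'Unknown' (if they are blank instead, the WHERE condition would need to change). The data set name first_known is made up for illustration.

```sas
/* Step 1: drop the 'Unknown' rows and keep one row per ID,
   namely the earliest admission with a known value.
   FIRST_KNOWN is a hypothetical name. */
data first_known (keep=id_person raceeth_new);
    set have (where=(raceethnicity ne 'Unknown'));
    by id_person;
    if first.id_person;
    raceeth_new = raceethnicity;
run;

/* Step 2: merge the one-row-per-ID lookup back onto every original row */
data want;
    merge have (in=in_have) first_known;
    by id_person;
    if in_have;
    /* Assumption: IDs with no known value at all (such as 4536) stay 'Unknown' */
    if missing(raceeth_new) then raceeth_new = 'Unknown';
run;
```

Because the filtering happens only in the lookup step, every row and every other variable in HAVE is still present in WANT.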
This worked!! Thank you!