I have a dataset with several fields. There are two fields for county - County and County_1. I want to use County_1. However, there are 5 text strings in County_1 that need to be replaced with the corresponding value from County as well as missing values in county_1 provided county has a corresponding value. Not every observation between the two sets is populated. There will end up being a few missing.
So for example this is what the data look like: The five strings in county_1 that need replacement are North, Warm, Confederated, OPHD, and Yellowhawk as well as missing.
County_1 County
Polk
Baker Baker
North Columbia
Siskiyou
Warm Baker
Confederated Washington
Malheur
Yellowhawk Jefferson
Wasco
Warm
OPHD Yamhill
So the two missing obs in county_1 need to be replaced with text from county as well as when the other
erroneous text appear (Confederated, OPHD, Warm, North, Yellowhawk). There may be some like the example with Warm in county_1 and a missing value in county. That is okay, since there will inevitably be some missing values. The idea is to populate as many observations in county_1 with acceptable substitutes.
Thank you for your help.
wlierman
Do you want "Warm" replaced with "Baker" when County is missing? Meaning "always replace Warm with Baker"? or does the value of County vary for "Warm"?
One is just as easy to program as the other. Replace with county:
Data want; set have; select (county_1); when ("","Confederated", "OPHD", "Warm", "North", "Yellowhawk") county_1=County; otherwise; end; run;
Replace with fixed value
Data want; set have; select (county_1); when ("") county_1=County; when ("Confederated") county_1="Washington"; when ("OPHD") county_1="Yamhill"; when ("Warm") county_1="Baker"; when ("North") county_1="Columbia"; when ("Yellowhawk") county_1="Jefferson"; otherwise; end; run;
Do you want "Warm" replaced with "Baker" when County is missing? Meaning "always replace Warm with Baker"? or does the value of County vary for "Warm"?
One is just as easy to program as the other. Replace with county:
Data want; set have; select (county_1); when ("","Confederated", "OPHD", "Warm", "North", "Yellowhawk") county_1=County; otherwise; end; run;
Replace with fixed value
Data want; set have; select (county_1); when ("") county_1=County; when ("Confederated") county_1="Washington"; when ("OPHD") county_1="Yamhill"; when ("Warm") county_1="Baker"; when ("North") county_1="Columbia"; when ("Yellowhawk") county_1="Jefferson"; otherwise; end; run;
Thank you very much for the quick response and solution to this question.
I really appreciate your help (and this hasn't been the first time nor will it be the last.)
wlierman
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.