Folks,
I've a bit of a query which maybe people can clear up. I have two string variables one being an id and the other being a location. What I would like to do is if two ids are the same then geographic location gets written to all identical ids.
So from something like this;
| ID | County | 
| 41A55F2DD6000000 | |
| 41A55F2DD6000000 | KILKENNY | 
| 41A7508F18000000 | |
| 41A7508F18000000 | |
| 41CA171A0A000000 | |
| 41CA171A0A000000 | DUBLIN | 
| 41CD59FD4B000000 | GALWAY | 
| 41CD59FD4B000000 | GALWAY | 
| 41D0A741A4400000 | KILDARE | 
| 41D0A741A4400000 | 
To this;
| ID | County | 
| 41A55F2DD6000000 | KILKENNY | 
| 41A55F2DD6000000 | KILKENNY | 
| 41A7508F18000000 | |
| 41A7508F18000000 | |
| 41CA171A0A000000 | DUBLIN | 
| 41CA171A0A000000 | DUBLIN | 
| 41CD59FD4B000000 | GALWAY | 
| 41CD59FD4B000000 | GALWAY | 
| 41D0A741A4400000 | KILDARE | 
| 41D0A741A4400000 | KILDARE | 
Any help is appreicated.
Do you ever have a case in your data where an ID has two different non-missing values for the county/ geography?
If not:
data have;
   infile datalines missover;
   informat id $17. county $10.;
   input ID County ;
datalines;
41A55F2DD6000000   
41A55F2DD6000000 KILKENNY 
41A7508F18000000   
41A7508F18000000   
41CA171A0A000000   
41CA171A0A000000 DUBLIN 
41CD59FD4B000000 GALWAY 
41CD59FD4B000000 GALWAY 
41D0A741A4400000 KILDARE 
41D0A741A4400000   
;
run;
proc sql;
   create table want as
   select a.id, b.county
   from have as a left join
        (select distinct id,county from have 
         where not missing(county)) as b
       on a.id=b.id;
quit;
The proc sql part is the important one, the data step is just to have something to test the code with.
Please check the new_county variable with the expected output
data have;
infile cards missover;
input ID$20.	County$15.;
cards;
41A55F2DD6000000	 
41A55F2DD6000000	KILKENNY
41A7508F18000000	 
41A7508F18000000	 
41CA171A0A000000	 
41CA171A0A000000	DUBLIN
41CD59FD4B000000	GALWAY
41CD59FD4B000000	GALWAY
41D0A741A4400000	KILDARE
41D0A741A4400000	      
;
proc sort data=have;
by id descending County;
run;
data want;
set have;
by id descending County;
retain new_county;
if first.id then new_county=county;
run;
data have; infile datalines missover; informat id $17. county $10.; input ID County ; datalines; 41A55F2DD6000000 41A55F2DD6000000 KILKENNY 41A7508F18000000 41A7508F18000000 41CA171A0A000000 41CA171A0A000000 DUBLIN 41CD59FD4B000000 GALWAY 41CD59FD4B000000 GALWAY 41D0A741A4400000 KILDARE 41D0A741A4400000 ; run; data want; merge have(keep=id) have(where=(county is not missing)); by id; run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
