Folks,
I have a query that maybe people can clear up. I have two string variables, one being an ID and the other a location. What I would like to do is this: if two or more rows share the same ID, the geographic location gets written to every row with that ID.
So from something like this:
ID | County |
41A55F2DD6000000 | |
41A55F2DD6000000 | KILKENNY |
41A7508F18000000 | |
41A7508F18000000 | |
41CA171A0A000000 | |
41CA171A0A000000 | DUBLIN |
41CD59FD4B000000 | GALWAY |
41CD59FD4B000000 | GALWAY |
41D0A741A4400000 | KILDARE |
41D0A741A4400000 | |
To this:
ID | County |
41A55F2DD6000000 | KILKENNY |
41A55F2DD6000000 | KILKENNY |
41A7508F18000000 | |
41A7508F18000000 | |
41CA171A0A000000 | DUBLIN |
41CA171A0A000000 | DUBLIN |
41CD59FD4B000000 | GALWAY |
41CD59FD4B000000 | GALWAY |
41D0A741A4400000 | KILDARE |
41D0A741A4400000 | KILDARE |
Any help is appreciated.
Do you ever have a case in your data where an ID has two different non-missing values for the county/geography?
If not:
data have;
  infile datalines missover;
  informat id $17. county $10.;
  input ID County;
  datalines;
41A55F2DD6000000
41A55F2DD6000000 KILKENNY
41A7508F18000000
41A7508F18000000
41CA171A0A000000
41CA171A0A000000 DUBLIN
41CD59FD4B000000 GALWAY
41CD59FD4B000000 GALWAY
41D0A741A4400000 KILDARE
41D0A741A4400000
;
run;

proc sql;
  create table want as
  select a.id, b.county
  from have as a
  left join
    (select distinct id, county
     from have
     where not missing(county)) as b
  on a.id = b.id;
quit;
The proc sql part is the important one; the data step is only there to give you something to test the code with.
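If you are not sure whether any ID carries two different non-missing counties, a quick check along these lines may help before running the join. This is only a sketch: the table name conflicts is made up for illustration, and it assumes the have dataset from the test step above.

```sas
/* Hypothetical check: list IDs that have more than one distinct
   non-missing county. COUNT(DISTINCT ...) ignores missing values. */
proc sql;
  create table conflicts as
  select id,
         count(distinct county) as n_counties
  from have
  group by id
  having count(distinct county) > 1;
quit;
```

If conflicts comes back empty, each ID has at most one county and the left join returns one row per input row. If not, the join would duplicate rows for those IDs, so they would need to be resolved first.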
Please check the new_county variable against the expected output.
data have;
infile cards missover;
/* colon modifier: read each value up to the next blank, not a fixed column width */
input ID :$20. County :$15.;
cards;
41A55F2DD6000000
41A55F2DD6000000 KILKENNY
41A7508F18000000
41A7508F18000000
41CA171A0A000000
41CA171A0A000000 DUBLIN
41CD59FD4B000000 GALWAY
41CD59FD4B000000 GALWAY
41D0A741A4400000 KILDARE
41D0A741A4400000
;
proc sort data=have;
by id descending County;
run;
data want;
set have;
by id descending County;
retain new_county;
if first.id then new_county=county;
run;
data have;
  infile datalines missover;
  informat id $17. county $10.;
  input ID County;
  datalines;
41A55F2DD6000000
41A55F2DD6000000 KILKENNY
41A7508F18000000
41A7508F18000000
41CA171A0A000000
41CA171A0A000000 DUBLIN
41CD59FD4B000000 GALWAY
41CD59FD4B000000 GALWAY
41D0A741A4400000 KILDARE
41D0A741A4400000
;
run;

data want;
  merge have(keep=id) have(where=(county is not missing));
  by id;
run;