Folks,
I've a bit of a query which maybe people can clear up. I have two string variables one being an id and the other being a location. What I would like to do is if two ids are the same then geographic location gets written to all identical ids.
So from something like this;
| ID | County |
| 41A55F2DD6000000 | |
| 41A55F2DD6000000 | KILKENNY |
| 41A7508F18000000 | |
| 41A7508F18000000 | |
| 41CA171A0A000000 | |
| 41CA171A0A000000 | DUBLIN |
| 41CD59FD4B000000 | GALWAY |
| 41CD59FD4B000000 | GALWAY |
| 41D0A741A4400000 | KILDARE |
| 41D0A741A4400000 |
To this;
| ID | County |
| 41A55F2DD6000000 | KILKENNY |
| 41A55F2DD6000000 | KILKENNY |
| 41A7508F18000000 | |
| 41A7508F18000000 | |
| 41CA171A0A000000 | DUBLIN |
| 41CA171A0A000000 | DUBLIN |
| 41CD59FD4B000000 | GALWAY |
| 41CD59FD4B000000 | GALWAY |
| 41D0A741A4400000 | KILDARE |
| 41D0A741A4400000 | KILDARE |
Any help is appreicated.
Do you ever have a case in your data where an ID has two different non-missing values for the county/ geography?
If not:
data have;
infile datalines missover;
informat id $17. county $10.;
input ID County ;
datalines;
41A55F2DD6000000
41A55F2DD6000000 KILKENNY
41A7508F18000000
41A7508F18000000
41CA171A0A000000
41CA171A0A000000 DUBLIN
41CD59FD4B000000 GALWAY
41CD59FD4B000000 GALWAY
41D0A741A4400000 KILDARE
41D0A741A4400000
;
run;
proc sql;
create table want as
select a.id, b.county
from have as a left join
(select distinct id,county from have
where not missing(county)) as b
on a.id=b.id;
quit;
The proc sql part is the important one, the data step is just to have something to test the code with.
Please check the new_county variable with the expected output
data have;
infile cards missover;
input ID$20. County$15.;
cards;
41A55F2DD6000000
41A55F2DD6000000 KILKENNY
41A7508F18000000
41A7508F18000000
41CA171A0A000000
41CA171A0A000000 DUBLIN
41CD59FD4B000000 GALWAY
41CD59FD4B000000 GALWAY
41D0A741A4400000 KILDARE
41D0A741A4400000
;
proc sort data=have;
by id descending County;
run;
data want;
set have;
by id descending County;
retain new_county;
if first.id then new_county=county;
run;
data have; infile datalines missover; informat id $17. county $10.; input ID County ; datalines; 41A55F2DD6000000 41A55F2DD6000000 KILKENNY 41A7508F18000000 41A7508F18000000 41CA171A0A000000 41CA171A0A000000 DUBLIN 41CD59FD4B000000 GALWAY 41CD59FD4B000000 GALWAY 41D0A741A4400000 KILDARE 41D0A741A4400000 ; run; data want; merge have(keep=id) have(where=(county is not missing)); by id; run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.