Hi folks,
Could you please help me convert the summary rows in the municipality column (current structure) into a separate variable, shown as 'district' in the desired output in the image below? I need the district names in an 'ID1NAME' variable and the city names in an 'IDNAME' variable for the data linkages that follow.
data have;
input municipality : $9. count pop rate;
cards;
district1 1000 10000 10
city1 300 2500 12
city2 200 6000 3.3
city3 500 1500 33.3
district2 100 500 20
city4 50 200 25
city5 50 300 16.7
;
Hi @Cruise
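You can do this with the RETAIN statement. RETAIN carries ID1NAME over from one observation to the next, so each city row inherits the name from the most recent district row, and the district rows themselves are deleted: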
data want;
length ID1NAME $ 20;
set have;
retain ID1NAME;
if find(municipality,"district")>0 then do;
ID1NAME=municipality;
delete;
end;
run;
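Since the question asks for the city names in IDNAME, the step above could be adapted along the lines of the following sketch. It assumes the district rows can still be recognized by the text "district", as in the sample data. The RENAME= data set option and the output name want_named are additions for illustration and are not part of the original answer.

```sas
/* Sketch only: same RETAIN logic, with municipality renamed to IDNAME on input. */
data want_named;                            /* hypothetical output name */
  length ID1NAME $ 20;
  set have (rename=(municipality=IDNAME));  /* city names arrive as IDNAME */
  retain ID1NAME;
  /* A district row updates the retained name and is not written out. */
  if find(IDNAME, "district") > 0 then do;
    ID1NAME = IDNAME;
    delete;
  end;
run;
```

With the sample data, this should leave the five city rows, with city1 to city3 carrying district1 and city4 and city5 carrying district2.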
Thank you @ed_sas_member. Sorry for the ambiguity. The values of the municipality variable don't share a common 'district' prefix; they are the actual district names. There are 9 districts that I could list for the 'find' function, but some cities have the same name as their district. Is there a solution to the updated problem?
data have;
input municipality : $9. count pop rate;
cards;
Daejeon 1000 10000 10
A_city 300 2500 12
Daejeon 200 6000 3.3
B_city 500 1500 33.3
Busan 100 500 20
C_city 50 200 25
Busan 50 300 16.7
;
Given your structure, there is nothing to distinguish a district observation from a city observation when they have the same name.
If the line containing the district is always the first one in its group, you can do this with two lookup hash tables:
data have;
input municipality : $9. count pop rate;
cards;
Daejeon 1000 10000 10
A_city 300 2500 12
Daejeon 200 6000 3.3
B_city 500 1500 33.3
Busan 100 500 20
C_city 50 200 25
Busan 50 300 16.7
;
data districts;
input municipality :$9.;
datalines;
Daejeon
Busan
;
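data want;
set have;
if _n_ = 1 then do;
  /* di: names that are allowed to appear as districts */
  declare hash di (dataset:"districts");
  di.definekey("municipality");
  di.definedone();
  /* df: district names whose district row has already been seen */
  declare hash df ();
  df.definekey("municipality");
  df.definedone();
end;
retain district;
/* the first occurrence of a district name is the district row */
if di.check() = 0 and df.check() ne 0 then do;
  district = municipality;
  df.add();
  delete;
end;
run;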
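di holds the names that can appear as districts, and df records which of those names have already been used as a district. The first occurrence of a district name is therefore treated as the district row, and any later occurrence as a city.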
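The original question asked for the variable names ID1NAME and IDNAME, while the step above produces district and municipality. One possible way to get there, shown as a sketch and assuming the want data set from that step sits in the WORK library, is to rename the variables in place with PROC DATASETS:

```sas
/* Sketch only: rename the output variables to the names requested in the question. */
proc datasets library=work nolist;
  modify want;
  rename district     = ID1NAME
         municipality = IDNAME;
quit;

/* Quick visual check of the result. */
proc print data=want;
run;
```

With the second sample data set, the result should hold five rows: A_city, Daejeon and B_city under Daejeon, and C_city and Busan under Busan.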