BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Cruise
Ammonite | Level 13

Hi Folks: 

 

Could you please help me convert summary rows under municipality column (Current structure) to a variable shown as 'district' (Desired output) in the image below? I need district names as 'ID1NAME' variable and city names as 'IDNAME' variable for the ensuing data linkages.    

 

table wanted.png

data have;
input municipality : $9. count pop rate;
cards;
district1 1000 10000 10
city1 300 2500 12
city2 200 6000 3.3
city3 500 1500 33.3
district2 100 500 20
city4 50 200 25
city5 50 300 16.7
;
1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @Cruise 

You can do this, using the retain statement:

data want;
	length ID1NAME $ 20;
	set have;
	retain ID1NAME;
	if find(municipality,"district")>0 then do;
		ID1NAME=municipality;
		delete;
	end;
run;

Capture d’écran 2020-05-02 à 20.39.05.png

View solution in original post

5 REPLIES 5
ed_sas_member
Meteorite | Level 14

Hi @Cruise 

You can do this, using the retain statement:

data want;
	length ID1NAME $ 20;
	set have;
	retain ID1NAME;
	if find(municipality,"district")>0 then do;
		ID1NAME=municipality;
		delete;
	end;
run;

Capture d’écran 2020-05-02 à 20.39.05.png

Cruise
Ammonite | Level 13

Thank you @ed_sas_member . Sorry for the ambiguity. The values that variable municipality takes don't share prefix: district but different district names. There are 9 districts that I could specify for the 'find' function. But some cities takes same names as their districts. Is there any solution to the update problem? 

data have;
input municipality : $9. count pop rate;
cards;
Daejeon 1000 10000 10
A_city 300 2500 12
Daejeon 200 6000 3.3
B_city 500 1500 33.3
Busan 100 500 20
C_city 50 200 25
Busan 50 300 16.7
;

 

Cruise
Ammonite | Level 13
True, however, District gets the sum value of cities combined, or mean rates of cities combined. Can this logic be used here?
Kurt_Bremser
Super User

If the line containing the district will always be the first for such a group, you can do this, using 2 lookup hash tables:

data have;
input municipality : $9. count pop rate;
cards;
Daejeon 1000 10000 10
A_city 300 2500 12
Daejeon 200 6000 3.3
B_city 500 1500 33.3
Busan 100 500 20
C_city 50 200 25
Busan 50 300 16.7
;

data districts;
input municipality :$9.;
datalines;
Daejeon
Busan
;

data want;
set have;
if _n_ = 1
then do;
  declare hash di (dataset:"districts");
  di.definekey("municipality");
  di.definedone();
  declare hash df ();
  df.definekey("municipality");
  df.definedone();
end;
retain district;
if di.check() = 0 and df.check() ne 0
then do;
  district = municipality;
  df.add();
  delete;
end;
run;

di holds the reference which names can appear as districts, and df holds the reference if a certain name was already used for the district.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 843 views
  • 2 likes
  • 3 in conversation