BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Cruise
Ammonite | Level 13

Hi Folks: 

 

Could you please help me convert summary rows under municipality column (Current structure) to a variable shown as 'district' (Desired output) in the image below? I need district names as 'ID1NAME' variable and city names as 'IDNAME' variable for the ensuing data linkages.    

 

table wanted.png

data have;
input municipality : $9. count pop rate;
cards;
district1 1000 10000 10
city1 300 2500 12
city2 200 6000 3.3
city3 500 1500 33.3
district2 100 500 20
city4 50 200 25
city5 50 300 16.7
;
1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @Cruise 

You can do this, using the retain statement:

data want;
	length ID1NAME $ 20;
	set have;
	retain ID1NAME;
	if find(municipality,"district")>0 then do;
		ID1NAME=municipality;
		delete;
	end;
run;

Capture d’écran 2020-05-02 à 20.39.05.png

View solution in original post

5 REPLIES 5
ed_sas_member
Meteorite | Level 14

Hi @Cruise 

You can do this, using the retain statement:

data want;
	length ID1NAME $ 20;
	set have;
	retain ID1NAME;
	if find(municipality,"district")>0 then do;
		ID1NAME=municipality;
		delete;
	end;
run;

Capture d’écran 2020-05-02 à 20.39.05.png

Cruise
Ammonite | Level 13

Thank you @ed_sas_member . Sorry for the ambiguity. The values that variable municipality takes don't share prefix: district but different district names. There are 9 districts that I could specify for the 'find' function. But some cities takes same names as their districts. Is there any solution to the update problem? 

data have;
input municipality : $9. count pop rate;
cards;
Daejeon 1000 10000 10
A_city 300 2500 12
Daejeon 200 6000 3.3
B_city 500 1500 33.3
Busan 100 500 20
C_city 50 200 25
Busan 50 300 16.7
;

 

Cruise
Ammonite | Level 13
True, however, District gets the sum value of cities combined, or mean rates of cities combined. Can this logic be used here?
Kurt_Bremser
Super User

If the line containing the district will always be the first for such a group, you can do this, using 2 lookup hash tables:

data have;
input municipality : $9. count pop rate;
cards;
Daejeon 1000 10000 10
A_city 300 2500 12
Daejeon 200 6000 3.3
B_city 500 1500 33.3
Busan 100 500 20
C_city 50 200 25
Busan 50 300 16.7
;

data districts;
input municipality :$9.;
datalines;
Daejeon
Busan
;

data want;
set have;
if _n_ = 1
then do;
  declare hash di (dataset:"districts");
  di.definekey("municipality");
  di.definedone();
  declare hash df ();
  df.definekey("municipality");
  df.definedone();
end;
retain district;
if di.check() = 0 and df.check() ne 0
then do;
  district = municipality;
  df.add();
  delete;
end;
run;

di holds the reference which names can appear as districts, and df holds the reference if a certain name was already used for the district.

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Get the $99 certification deal.jpg

 

 

Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 393 views
  • 2 likes
  • 3 in conversation