BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
righcoastmike
Quartz | Level 8

HI all, 

 

I am trying to create new variables in my dataset indicating whether or not a person lives in in a rural or urban area. The way that urban/rural status is determined is if the persons postal code has a '0' in the middle of it then it is a rural (eg. B0A), otherwise it is urban.  I could do it if it was the first character, however it's using the middle character that has got me stuck. Any thoughts would be much appreciated! 

 

Thanks so much

 

Mike 

 

Existing Dataset  

data have;
input studyID postal_code $;
cards;
1 B0A
2 B2Y
3 F0N
4 B4T
5 M4T
;

 

Sample output: 

Patient ID   Postal Code  Urban   Rural

      1              B0A               0           1     

      2              B2Y               1           0

      3              F0N               0           1

      4              B4T               1           0

      5              M4T              1           0

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20
data have;
input studyID postal_code $;
cards;
1 B0A
2 B2Y
3 F0N
4 B4T
5 M4T
;

data want;
set have;
if index(postal_code,'0')=0 then do;urban=1;rural=0;end;
else if index(postal_code,'0')>0 then do;urban=0;rural=1;end;
run;

View solution in original post

10 REPLIES 10
ballardw
Super User

If the postalcode is always 3 characters you can get the value with

 

middle = substr(postal_code,2,1);

 

the 2 means start reading at the second position and the 1 is read one character.

righcoastmike
Quartz | Level 8

Thanks so much for this, I was not familiar with the 'substr' command. I'm sure it will come in handy in the future! 

 

Mike 

 

 

Reeza
Super User

SUBSTR() or CHAR() functions will allow you to access the individual characters.

 


@righcoastmike wrote:

HI all, 

 

I am trying to create new variables in my dataset indicating whether or not a person lives in in a rural or urban area. The way that urban/rural status is determined is if the persons postal code has a '0' in the middle of it then it is a rural (eg. B0A), otherwise it is urban.  I could do it if it was the first character, however it's using the middle character that has got me stuck. Any thoughts would be much appreciated! 

 

Thanks so much

 

Mike 

 

Existing Dataset  

data have;
input studyID postal_code $;
cards;
1 B0A
2 B2Y
3 F0N
4 B4T
5 M4T
;

 

Sample output: 

Patient ID   Postal Code  Urban   Rural

      1              B0A               0           1     

      2              B2Y               1           0

      3              F0N               0           1

      4              B4T               1           0

      5              M4T              1           0


 

novinosrin
Tourmaline | Level 20
data have;
input studyID postal_code $;
cards;
1 B0A
2 B2Y
3 F0N
4 B4T
5 M4T
;

data want;
set have;
if index(postal_code,'0')=0 then do;urban=1;rural=0;end;
else if index(postal_code,'0')>0 then do;urban=0;rural=1;end;
run;
righcoastmike
Quartz | Level 8

Worked like a charm! 

 

Thank you. 

 

Mike 

 

 

righcoastmike
Quartz | Level 8

I had a quick question about this code. It works great, but it looks like it's doing the opposite of what I would expect. 

 

To me lit looks like:

 

If the middle character of the postal code  =0 then Urban=1 Rural=0

Else

If the middle character of the postal code  > 0 then Urban=0 Rural=1 

 

which would have the 1's and zero's reversed, but for some reason the code get's everything in the right place. Can you explain what I'm missing? 

 

Thanks again. 

 

Mike 

 

 

 

novinosrin
Tourmaline | Level 20

The code doesn't even check for middle char =0 i'm afraid. if your first and second both happen to be  0, then the code will fail. If that's the case I would have to tweak a bit, which is no big deal. I assumed your sample is a good representative of your real and went with a simple check of 0 in the string following with a true(1) and false(0) assignment for rural and urban.

 

For example, please review the test below:

data have;
input studyID postal_code $;
cards;
1 B0A
2 02Y /*notice the change here I made to the sample*/
3 F0N
4 B4T
5 M4T
;

/*now ,please notice the result for ID 2*/

data want;
set have;
if index(postal_code,'0')=0 then do;urban=1;rural=0;end;
else if index(postal_code,'0')>0 then do;urban=0;rural=1;end;
run;

righcoastmike
Quartz | Level 8

Understood, and the assumption holds.

 

The only postal codes with a '0' in them are rural postal codes so that works. 

 

so in essence, if the program finds a 0 in the string then it sets rural to true (1) and urban to false (0) and vice versa? 

 

Thanks again. 

 

Mike

novinosrin
Tourmaline | Level 20

Indeed Sir. 🙂

 

But please feel free to revert back should you need any changes. It's very simple program to make changes. No biggie.

 

The only request to everybody is that please provide a good representative sample to avoid going back and forth ding dong lol

ballardw
Super User

For added fun consider:

data want;
   set have;
   urban= (index(postal_code,'0')=0) ;
   rural = not(urban);
run;

or

 

data want;
   set have;
   rural= (substr(postal_code,2,1)='0') ;
   urban = not(rural);
run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 967 views
  • 0 likes
  • 4 in conversation