DATA Step, Macro, Functions and more

How to create a variable based on the middle character of a postal code

Accepted Solution Solved
Reply
Contributor
Posts: 46
Accepted Solution

How to create a variable based on the middle character of a postal code

HI all, 

 

I am trying to create new variables in my dataset indicating whether or not a person lives in in a rural or urban area. The way that urban/rural status is determined is if the persons postal code has a '0' in the middle of it then it is a rural (eg. B0A), otherwise it is urban.  I could do it if it was the first character, however it's using the middle character that has got me stuck. Any thoughts would be much appreciated! 

 

Thanks so much

 

Mike 

 

Existing Dataset  

data have;
input studyID postal_code $;
cards;
1 B0A
2 B2Y
3 F0N
4 B4T
5 M4T
;

 

Sample output: 

Patient ID   Postal Code  Urban   Rural

      1              B0A               0           1     

      2              B2Y               1           0

      3              F0N               0           1

      4              B4T               1           0

      5              M4T              1           0


Accepted Solutions
Solution
a month ago
PROC Star
Posts: 1,836

Re: How to create a variable based on the middle character of a postal code

Posted in reply to righcoastmike
data have;
input studyID postal_code $;
cards;
1 B0A
2 B2Y
3 F0N
4 B4T
5 M4T
;

data want;
set have;
if index(postal_code,'0')=0 then do;urban=1;rural=0;end;
else if index(postal_code,'0')>0 then do;urban=0;rural=1;end;
run;

View solution in original post


All Replies
Super User
Posts: 13,583

Re: How to create a variable based on the middle character of a postal code

Posted in reply to righcoastmike

If the postalcode is always 3 characters you can get the value with

 

middle = substr(postal_code,2,1);

 

the 2 means start reading at the second position and the 1 is read one character.

Contributor
Posts: 46

Re: How to create a variable based on the middle character of a postal code

Thanks so much for this, I was not familiar with the 'substr' command. I'm sure it will come in handy in the future! 

 

Mike 

 

 

Super User
Posts: 23,776

Re: How to create a variable based on the middle character of a postal code

Posted in reply to righcoastmike

SUBSTR() or CHAR() functions will allow you to access the individual characters.

 


@righcoastmike wrote:

HI all, 

 

I am trying to create new variables in my dataset indicating whether or not a person lives in in a rural or urban area. The way that urban/rural status is determined is if the persons postal code has a '0' in the middle of it then it is a rural (eg. B0A), otherwise it is urban.  I could do it if it was the first character, however it's using the middle character that has got me stuck. Any thoughts would be much appreciated! 

 

Thanks so much

 

Mike 

 

Existing Dataset  

data have;
input studyID postal_code $;
cards;
1 B0A
2 B2Y
3 F0N
4 B4T
5 M4T
;

 

Sample output: 

Patient ID   Postal Code  Urban   Rural

      1              B0A               0           1     

      2              B2Y               1           0

      3              F0N               0           1

      4              B4T               1           0

      5              M4T              1           0


 

Solution
a month ago
PROC Star
Posts: 1,836

Re: How to create a variable based on the middle character of a postal code

Posted in reply to righcoastmike
data have;
input studyID postal_code $;
cards;
1 B0A
2 B2Y
3 F0N
4 B4T
5 M4T
;

data want;
set have;
if index(postal_code,'0')=0 then do;urban=1;rural=0;end;
else if index(postal_code,'0')>0 then do;urban=0;rural=1;end;
run;
Contributor
Posts: 46

Re: How to create a variable based on the middle character of a postal code

Posted in reply to novinosrin

Worked like a charm! 

 

Thank you. 

 

Mike 

 

 

Contributor
Posts: 46

Re: How to create a variable based on the middle character of a postal code

Posted in reply to novinosrin

I had a quick question about this code. It works great, but it looks like it's doing the opposite of what I would expect. 

 

To me lit looks like:

 

If the middle character of the postal code  =0 then Urban=1 Rural=0

Else

If the middle character of the postal code  > 0 then Urban=0 Rural=1 

 

which would have the 1's and zero's reversed, but for some reason the code get's everything in the right place. Can you explain what I'm missing? 

 

Thanks again. 

 

Mike 

 

 

 

PROC Star
Posts: 1,836

Re: How to create a variable based on the middle character of a postal code

[ Edited ]
Posted in reply to righcoastmike

The code doesn't even check for middle char =0 i'm afraid. if your first and second both happen to be  0, then the code will fail. If that's the case I would have to tweak a bit, which is no big deal. I assumed your sample is a good representative of your real and went with a simple check of 0 in the string following with a true(1) and false(0) assignment for rural and urban.

 

For example, please review the test below:

data have;
input studyID postal_code $;
cards;
1 B0A
2 02Y /*notice the change here I made to the sample*/
3 F0N
4 B4T
5 M4T
;

/*now ,please notice the result for ID 2*/

data want;
set have;
if index(postal_code,'0')=0 then do;urban=1;rural=0;end;
else if index(postal_code,'0')>0 then do;urban=0;rural=1;end;
run;

Contributor
Posts: 46

Re: How to create a variable based on the middle character of a postal code

Posted in reply to novinosrin

Understood, and the assumption holds.

 

The only postal codes with a '0' in them are rural postal codes so that works. 

 

so in essence, if the program finds a 0 in the string then it sets rural to true (1) and urban to false (0) and vice versa? 

 

Thanks again. 

 

Mike

PROC Star
Posts: 1,836

Re: How to create a variable based on the middle character of a postal code

Posted in reply to righcoastmike

Indeed Sir. Smiley Happy

 

But please feel free to revert back should you need any changes. It's very simple program to make changes. No biggie.

 

The only request to everybody is that please provide a good representative sample to avoid going back and forth ding dong lol

Super User
Posts: 13,583

Re: How to create a variable based on the middle character of a postal code

Posted in reply to righcoastmike

For added fun consider:

data want;
   set have;
   urban= (index(postal_code,'0')=0) ;
   rural = not(urban);
run;

or

 

data want;
   set have;
   rural= (substr(postal_code,2,1)='0') ;
   urban = not(rural);
run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 103 views
  • 0 likes
  • 4 in conversation