Solved
Contributor
Posts: 46

# How to create a variable based on the middle character of a postal code

HI all,

I am trying to create new variables in my dataset indicating whether or not a person lives in in a rural or urban area. The way that urban/rural status is determined is if the persons postal code has a '0' in the middle of it then it is a rural (eg. B0A), otherwise it is urban.  I could do it if it was the first character, however it's using the middle character that has got me stuck. Any thoughts would be much appreciated!

Thanks so much

Mike

Existing Dataset

``````data have;
input studyID postal_code \$;
cards;
1 B0A
2 B2Y
3 F0N
4 B4T
5 M4T
;``````

Sample output:

Patient ID   Postal Code  Urban   Rural

1              B0A               0           1

2              B2Y               1           0

3              F0N               0           1

4              B4T               1           0

5              M4T              1           0

Accepted Solutions
Solution
a month ago
PROC Star
Posts: 1,836

## Re: How to create a variable based on the middle character of a postal code

``````data have;
input studyID postal_code \$;
cards;
1 B0A
2 B2Y
3 F0N
4 B4T
5 M4T
;

data want;
set have;
if index(postal_code,'0')=0 then do;urban=1;rural=0;end;
else if index(postal_code,'0')>0 then do;urban=0;rural=1;end;
run;``````

All Replies
Super User
Posts: 13,583

## Re: How to create a variable based on the middle character of a postal code

If the postalcode is always 3 characters you can get the value with

middle = substr(postal_code,2,1);

the 2 means start reading at the second position and the 1 is read one character.

Contributor
Posts: 46

## Re: How to create a variable based on the middle character of a postal code

Thanks so much for this, I was not familiar with the 'substr' command. I'm sure it will come in handy in the future!

Mike

Super User
Posts: 23,776

## Re: How to create a variable based on the middle character of a postal code

SUBSTR() or CHAR() functions will allow you to access the individual characters.

@righcoastmike wrote:

HI all,

I am trying to create new variables in my dataset indicating whether or not a person lives in in a rural or urban area. The way that urban/rural status is determined is if the persons postal code has a '0' in the middle of it then it is a rural (eg. B0A), otherwise it is urban.  I could do it if it was the first character, however it's using the middle character that has got me stuck. Any thoughts would be much appreciated!

Thanks so much

Mike

Existing Dataset

``````data have;
input studyID postal_code \$;
cards;
1 B0A
2 B2Y
3 F0N
4 B4T
5 M4T
;``````

Sample output:

Patient ID   Postal Code  Urban   Rural

1              B0A               0           1

2              B2Y               1           0

3              F0N               0           1

4              B4T               1           0

5              M4T              1           0

Solution
a month ago
PROC Star
Posts: 1,836

## Re: How to create a variable based on the middle character of a postal code

``````data have;
input studyID postal_code \$;
cards;
1 B0A
2 B2Y
3 F0N
4 B4T
5 M4T
;

data want;
set have;
if index(postal_code,'0')=0 then do;urban=1;rural=0;end;
else if index(postal_code,'0')>0 then do;urban=0;rural=1;end;
run;``````
Contributor
Posts: 46

## Re: How to create a variable based on the middle character of a postal code

Worked like a charm!

Thank you.

Mike

Contributor
Posts: 46

## Re: How to create a variable based on the middle character of a postal code

I had a quick question about this code. It works great, but it looks like it's doing the opposite of what I would expect.

To me lit looks like:

If the middle character of the postal code  =0 then Urban=1 Rural=0

Else

If the middle character of the postal code  > 0 then Urban=0 Rural=1

which would have the 1's and zero's reversed, but for some reason the code get's everything in the right place. Can you explain what I'm missing?

Thanks again.

Mike

PROC Star
Posts: 1,836

## Re: How to create a variable based on the middle character of a postal code

[ Edited ]

The code doesn't even check for middle char =0 i'm afraid. if your first and second both happen to be  0, then the code will fail. If that's the case I would have to tweak a bit, which is no big deal. I assumed your sample is a good representative of your real and went with a simple check of 0 in the string following with a true(1) and false(0) assignment for rural and urban.

For example, please review the test below:

data have;
input studyID postal_code \$;
cards;
1 B0A
2 02Y /*notice the change here I made to the sample*/
3 F0N
4 B4T
5 M4T
;

/*now ,please notice the result for ID 2*/

data want;
set have;
if index(postal_code,'0')=0 then do;urban=1;rural=0;end;
else if index(postal_code,'0')>0 then do;urban=0;rural=1;end;
run;

Contributor
Posts: 46

## Re: How to create a variable based on the middle character of a postal code

Understood, and the assumption holds.

The only postal codes with a '0' in them are rural postal codes so that works.

so in essence, if the program finds a 0 in the string then it sets rural to true (1) and urban to false (0) and vice versa?

Thanks again.

Mike

PROC Star
Posts: 1,836

## Re: How to create a variable based on the middle character of a postal code

Indeed Sir.

But please feel free to revert back should you need any changes. It's very simple program to make changes. No biggie.

The only request to everybody is that please provide a good representative sample to avoid going back and forth ding dong lol

Super User
Posts: 13,583

## Re: How to create a variable based on the middle character of a postal code

```data want;
set have;
urban= (index(postal_code,'0')=0) ;
rural = not(urban);
run;```

or

```data want;
set have;
rural= (substr(postal_code,2,1)='0') ;
urban = not(rural);
run;```
☑ This topic is solved.