Amy0223
Quartz | Level 8

Hi, I don't know if I'm answering the question below correctly. I tried, but my code just looks weird. I'd like to hear your advice. Thank you very much!

 

Variable zipcode is read with format$10. It reads zip codes in the form 07417 or 07417-1280. Create variable_9digit. It equals 1 if zipcode has a hyphen separating the fifth and sixth digits. Otherwise it equals 0. Write the statements three ways, using the length, index, and substr functions.

/* using length function*/
data zipcode;
length zipcode $10;
input zipcode ;
cards;
07417
07417-1280
;
run;
DATA zipcode1;
Set zipcode;
if zipcode= '07417-1280' then zipcode2= '1';
else zipcode2= '0';
proc print noobs;
title 'Using length function';
run;

/* using substr function*/
DATA zipcode2;
format zipcode $10.;
zipcode= '07417-1280';
form1 = Substr( zipcode, 1, 5);
if form1= '07417' then newform1='0';
form2 = Substr( zipcode, 1);
if form2= '07417-1280'  then newform2= '1' ;
proc print noobs;
title 'Using substr function';
RUN; 

/* using index function*/
Data zipcode3;
format zipcode $10.;
zipcode= '07417-1280';
form1 = Substr( zipcode,1, index(zipcode, '-')-1);
if form1= '07417'  then newform1= '0' ;
form2 = Substr( zipcode, 1);
if form2= '07417-1280'  then newform2= '1' ;
proc print noobs;
title 'Using index function';
RUN; 

 

ed_sas_member
Meteorite | Level 14

Hi @Amy0223 

 

It is a typical use case for regular expressions.

The prxmatch() function as written below checks whether the zipcode variable matches the following pattern: 5 digits (\d), 1 hyphen, 4 digits.

 

data zipcode_flag;
	set zipcode;
	if prxmatch('/\d{5}\-\d{4}/',zipcode) then variable_9digit = 1;
	else variable_9digit = 0;
run;

The issue with your 3 tests is that each one hard-codes one particular zip code instead of testing the general pattern.

 

Best,

Amy0223
Quartz | Level 8
Thank you for taking your time to check my code and provide helpful feedback. I greatly appreciate your help!
Amy0223
Quartz | Level 8

Below is my updated code. Do you think this answers the problem?

/* using length function*/
data zipcode;
length zipcode $10;
input zipcode ;
cards;
07417
07417-1280
;
run;

DATA zipcode1;
Set zipcode;
if prxmatch('/\d{5}\-\d{4}/',zipcode) then variable_9digit = 1;
else variable_9digit = 0;
proc print noobs;
title 'Using length function';
run;

/* using index function*/
data zipcode2;	
set zipcode;
if index(zipcode,'-') then variable_9digit = 1;
else variable_9digit=0;
proc print noobs;
title 'Using index function';
RUN; 

/* using substr function*/
DATA zipcode3;
set zipcode;
if Substr( zipcode, 1, 5) then variable_9digit = 1;
else variable_9digit = 0;
proc print noobs;
title 'Using substr function';
RUN; 
ed_sas_member
Meteorite | Level 14

Hi @Amy0223

 

I don't understand why you need to use the different functions separately, because in my opinion, what defines the zipcode pattern is the conjunction of 4 conditions:
- length of the zipcode = 10
- characters 1 to 5 = a number
- characters 7 to 10 = a number
- 6th character = a hyphen

The prxmatch function is a more efficient way to do that, but you can also use the traditional length(), index() and substr() functions to create the flag variable. It doesn't make sense to use them separately.

data zipcode_check;
	set zipcode;

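	/* notdigit() returns 0 when a string contains only digits; this avoids */
	/* implicit character-to-numeric conversion and accepts 0000 and 9999 */
	if length(zipcode)= 10 and
	   notdigit(substr(zipcode, 1, 5)) = 0 and
	   notdigit(substr(zipcode, 7, 4)) = 0 and
	   index(zipcode,"-")= 6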
	   
	   then variable_9digit=1;
	   
	else variable_9digit=0;
run;

 

Amy0223
Quartz | Level 8
Wow, I didn't know I could use them together. I thought I had to use the length, index, and substr functions separately because the question said to write the statements three ways. I was really confused. Thank you very much for showing me a more efficient way!
ed_sas_member
Meteorite | Level 14
You're welcome @Amy0223!
Could you please set the topic as answered so that it can be accessible to the community? Thank you
Tom
Super User

You didn't use the LENGTH() function as the problem requested.

You cannot use a character expression as if it were a boolean expression. But you can use a numeric expression, since SAS will treat 0 (or missing) as FALSE and any other number as TRUE.

 

Also why are you making a character variable instead of a numeric one?

Numeric is a lot easier since SAS will evaluate boolean expressions to 1 for TRUE and 0 for FALSE.

variable_9digit = (  '-' = substr( zipcode, 6, 1) ) ;
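
For what it's worth, the same boolean-assignment idea can be applied to each of the three functions the exercise names. The following is only a sketch: the data set name zipcode_flags and the variables flag_length, flag_index and flag_substr are made up for illustration, and each test on its own checks a single property of the value (none of them verifies that the remaining characters are digits).

```sas
/* Sample data: one 5-digit and one ZIP+4 value, as in the original post */
data zipcode;
  length zipcode $10;
  input zipcode;
  cards;
07417
07417-1280
;
run;

data zipcode_flags;  /* hypothetical data set name */
  set zipcode;
  /* LENGTH ignores trailing blanks: 10 for ZIP+4, 5 for a plain zip code */
  flag_length = (length(zipcode) = 10);
  /* INDEX returns the position of the first hyphen, or 0 if there is none */
  flag_index  = (index(zipcode, '-') = 6);
  /* SUBSTR extracts the single character in position 6 */
  flag_substr = (substr(zipcode, 6, 1) = '-');
run;

proc print data=zipcode_flags noobs;
  title 'Three ways to flag a 9-digit zip code';
run;
```

With the two sample rows, all three flags should be 0 for 07417 and 1 for 07417-1280.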
Tom
Super User

Do you have access to SAS to test your programs? Actually trying is the best way to learn, especially when your programs don't work, as you will learn more from the mistakes than from the code you get right the first time.

 

You can download a copy for free from SAS for use in learning.

https://www.sas.com/en_us/software/university-edition.html 

Tom
Super User

Did your instructor really write:

Variable zipcode is read with format$10. 

If so, you can explain to them that in SAS you use an INFORMAT to read text into values. FORMATs are used to convert values into text for display.
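
A small sketch of the difference, in case it helps. The data set name zip_demo and the numeric variable zip5 are hypothetical, added only to show an informat acting on input and a format acting on display.

```sas
data zip_demo;  /* hypothetical data set name */
  /* $10. in the INPUT statement is an informat: it controls how text is read */
  input zipcode $10.;
  /* The INPUT function applies the 5. informat to turn text into a number */
  zip5 = input(substr(zipcode, 1, 5), 5.);
  /* The FORMAT statement attaches a display format: Z5. prints leading zeros */
  format zip5 z5.;
  cards;
07417
07417-1280
;
run;

proc print data=zip_demo noobs;
run;
```

Here zip5 is stored as the number 7417, and the Z5. format displays it as 07417.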

Amy0223
Quartz | Level 8
Thank you for sharing the link! I learned so much from this SAS community. This question is exactly what my instructor wrote. I will explain to her about the format.
