BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
GKati
Pyrite | Level 9

Dear SAS Users,

 

I have a large dataset for which I have to create a variable resident, that should be =1 if the individual's zipcode is equal to any one from the list below. I thought I would use the in:() statement but it doesn't seem to be working. 

Is there a better way to do this?

 

data new;

set old;

input id zipcode;

3023

2 1071

3 3191 

4 1081

;

resident=0

zipcode=in:('3197', '3197', '3198', '3198','3151','3191', '3191', '3192', '3193','3104', '3199', '3195', '3011', '3012', '3013', '3014', '3015', '3016', '3021', '3022', '3023', '3024', '3025', '3026', '3027', '3028', '3029', '3030', '3030', '3031',  '3032',  '3033', '3034', '3035', '3036', '3037', '3038', '3039', '3041', '3042', '3043', '3044',  '3045', '3046', '3047', '3051', '3052', '3053', '3054', '3055', '3056', '3057', '3058', '3059', '3060', '3061', '3062', '3063', '3064', '3065', '3066', '3067', '3068', '3069', '3070', '3071', '3072', '3073', '3074', '3075', '3076', '3077', '3181', '3196');

if zipcode then resident=1;

 

output;

RUN;

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Your code looks like pseudo code and, of course, wouldn't run as stated.

 

The following will run correctly if your zipcode is a character variable. Otherwise, you'd have to remove the quotes from the values after the in operator or else you'll get a long note in your log regarding all of the instances where the strings were converted to numbers.

 

data new;
  input id zipcode $;
  if zipcode in ('3197', '3197', '3198', '3198','3151','3191', '3191', '3192', '3193','3104', '3199', '3195', '3011', '3012', '3013', '3014', '3015', '3016', '3021', '3022', '3023', '3024', '3025', '3026', '3027', '3028', '3029', '3030', '3030', '3031',  '3032',  '3033', '3034', '3035', '3036', '3037', '3038', '3039', '3041', '3042', '3043', '3044',  '3045', '3046', '3047', '3051', '3052', '3053', '3054', '3055', '3056', '3057', '3058', '3059', '3060', '3061', '3062', '3063', '3064', '3065', '3066', '3067', '3068', '3069', '3070', '3071', '3072', '3073', '3074', '3075', '3076', '3077', '3181', '3196')
   then resident=1;
  else resident=0;
  cards;
1 3023
2 1071
3 3191 
4 1081
;

HTH,

Art, CEO, AnalystFinder.com

 

View solution in original post

4 REPLIES 4
art297
Opal | Level 21

Your code looks like pseudo code and, of course, wouldn't run as stated.

 

The following will run correctly if your zipcode is a character variable. Otherwise, you'd have to remove the quotes from the values after the in operator or else you'll get a long note in your log regarding all of the instances where the strings were converted to numbers.

 

data new;
  input id zipcode $;
  if zipcode in ('3197', '3197', '3198', '3198','3151','3191', '3191', '3192', '3193','3104', '3199', '3195', '3011', '3012', '3013', '3014', '3015', '3016', '3021', '3022', '3023', '3024', '3025', '3026', '3027', '3028', '3029', '3030', '3030', '3031',  '3032',  '3033', '3034', '3035', '3036', '3037', '3038', '3039', '3041', '3042', '3043', '3044',  '3045', '3046', '3047', '3051', '3052', '3053', '3054', '3055', '3056', '3057', '3058', '3059', '3060', '3061', '3062', '3063', '3064', '3065', '3066', '3067', '3068', '3069', '3070', '3071', '3072', '3073', '3074', '3075', '3076', '3077', '3181', '3196')
   then resident=1;
  else resident=0;
  cards;
1 3023
2 1071
3 3191 
4 1081
;

HTH,

Art, CEO, AnalystFinder.com

 

GKati
Pyrite | Level 9

Thanks! This works well. 

Astounding
PROC Star

On a side note, you mention that you are working with a large data set.  If that's the case, you would speed up the program if you were to take care to remove duplicates from the list.

 

From the emoticon in your post, it looks like you intended to add a colon:  in : (

 

That's fine, but you should note a couple of related features.  First, your ZIPCODE variable must be character, not numeric.  Art's code handles that properly.  And second, you should make sure that both ZIPCODE and the list are constructed properly and you haven't dropped leading zeros.

 

ballardw
Super User

You might have better luck with some more semicolons, indicate the start of the data with a datalines or cards statement, end the data with semicolon and place any of the other code BEFORE the datalines/ cards data.

 

You have a lot of repeated values in your list, why?

 Your Zipcode is numeric, you seem to want to compare it to a list of CHARACTER values but the structure is wrong. Are you trying to see if your zipcode begins with characters or is in a range?

 

With NUMERIC values    zipcode in (3021:3033) checks to see if zipcode is the range from 3021 to 3033.

So you might do IF zipcode in (3021:3033) then resident=1;

or even Resident =  zipcode in (3021:3033);

you can specify multiple ranges and or separate values

in (3019, 3021:3033, 3045, 3049:3066) for instance, but only with numeric values with any reliability.

 

IF you happen to be looking for zipcodes in a specific state you might check on the ZIPSTATE function:

 

If zipstate(zipcode)='NH' then resident=1; for example.

 

If your zips are the first 4 characters of a 5 digit value then you need to tell us that.

 

 

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 41252 views
  • 3 likes
  • 4 in conversation