DATA Step, Macro, Functions and more

remove value that contain 21

Accepted Solution Solved
Reply
Contributor
Posts: 72
Accepted Solution

remove value that contain 21

I have a dataset which has more than 10 million observations.  In this, I have variable name NAICS it has different values. I want to remove all those values that start with 21. It contains different no of digits like 21, 2145, 210454 etc. So My intention is to remove all values with start with 21.  There are some values of NAICS variable which end with 21 but I don't want them to be affected. Thanking in anticipation.

 


Accepted Solutions
Solution
‎05-24-2017 04:41 AM
PROC Star
Posts: 551

Re: remove value that contain 21

data have;
input NAICS;
datalines;
21234
2134664
42355
235353
214356
;

data want;
	set have;
	if substr(left(NAICS),1,2) NE '21';
run;

View solution in original post


All Replies
PROC Star
Posts: 171

Re: remove value that contain 21

[ Edited ]

data want;

set have;

if NAICS =:'21' then delete;

run;

 

Assuming that variable is a character variable

PROC Star
Posts: 63

Re: remove value that contain 21

@Jahanzaib: Is your variable numeric or character? And what do you mean by "removing" a value? Do you mean setting the variable to missing, or do you mean deleting the whole observation? In your example, all the values that you want to get rid of start with the number 21, does that mean that you do not want to get rid of values like 18212 or 321?

 

 

Contributor
Posts: 72

Re: remove value that contain 21

@s_lassen I mean deleting whole observation. It's a character variable. Right. I don't want to delete values like 18212 or 321. delete only those which start with 21 not the others.
PROC Star
Posts: 551

Re: remove value that contain 21

I assumed a numeric variable, but my solution will work for a character variable as well.

PROC Star
Posts: 63

Re: remove value that contain 21

@Jahanzaib: Then the fastest and simplest is probably using a WHERE clause:

data want;
  set have;
  where NAICS not like '21%';
run;

- this assuming that the values are left aligned, otherwise use

where left(NAICS) not like '21%';
Solution
‎05-24-2017 04:41 AM
PROC Star
Posts: 551

Re: remove value that contain 21

data have;
input NAICS;
datalines;
21234
2134664
42355
235353
214356
;

data want;
	set have;
	if substr(left(NAICS),1,2) NE '21';
run;
Super User
Super User
Posts: 7,401

Re: remove value that contain 21

I would really advise you to try both methods on the 10mill records as I suspect this (from @s_lassen):

data want;
  set have (where=(substr(naics,1,2) ne '21'));
run;

Would be faster than the given solution from @draycut:

data want;
  set have;
  if substr(left(naics),1,2) ne '21';
run;

I can't prove this offhand, but vaguely remember something aboutthe set reading in the data for each row and then outputting on the if, where the where clause restricts what is coming in, so slightly earlier in the process.  If so then this fractional saving would add up of millions of records.  But do test.

Contributor
Posts: 72

Re: remove value that contain 21

Thankyou @RW9 How to delete if first two digits are 90 or greater than 90?
PROC Star
Posts: 171

Re: remove value that contain 21

From your responses, i assume you are not familiar with wild card operators or colon modifiers. If you have a grasping you would have chosen @s_lassen 's answer without having to deal with functions that makes SAS work more.

If you understand collating sequences , the below solution is easy

data want;

set have;

if var>=:'90' then delete;

run;

 

Regards,

Naveen Srinivasan

PROC Star
Posts: 171

Re: remove value that contain 21

Also, the same  can be applied to a where clause:

data want;

set have;

where not (var>=:'90');

run;

Super User
Super User
Posts: 7,401

Re: remove value that contain 21

Well 90-99 all consist of 9x, so you can just do:

data want;
  set have (where=(substr(naics,1,2) ne '21' and char(naics,1) ne '9'));
run;
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 210 views
  • 5 likes
  • 5 in conversation