05-24-2017 04:20 AM
I have a dataset which has more than 10 million observations. In this, I have variable name NAICS it has different values. I want to remove all those values that start with 21. It contains different no of digits like 21, 2145, 210454 etc. So My intention is to remove all values with start with 21. There are some values of NAICS variable which end with 21 but I don't want them to be affected. Thanking in anticipation.
05-24-2017 04:28 AM
@Jahanzaib: Is your variable numeric or character? And what do you mean by "removing" a value? Do you mean setting the variable to missing, or do you mean deleting the whole observation? In your example, all the values that you want to get rid of start with the number 21, does that mean that you do not want to get rid of values like 18212 or 321?
05-24-2017 04:34 AM
05-24-2017 04:38 AM
@Jahanzaib: Then the fastest and simplest is probably using a WHERE clause:
data want; set have; where NAICS not like '21%'; run;
- this assuming that the values are left aligned, otherwise use
where left(NAICS) not like '21%';
05-24-2017 04:51 AM
I would really advise you to try both methods on the 10mill records as I suspect this (from @s_lassen):
data want; set have (where=(substr(naics,1,2) ne '21')); run;
Would be faster than the given solution from @draycut:
data want; set have; if substr(left(naics),1,2) ne '21'; run;
I can't prove this offhand, but vaguely remember something aboutthe set reading in the data for each row and then outputting on the if, where the where clause restricts what is coming in, so slightly earlier in the process. If so then this fractional saving would add up of millions of records. But do test.
05-24-2017 05:17 AM
From your responses, i assume you are not familiar with wild card operators or colon modifiers. If you have a grasping you would have chosen @s_lassen 's answer without having to deal with functions that makes SAS work more.
If you understand collating sequences , the below solution is easy
if var>=:'90' then delete;
Need further help from the community? Please ask a new question.