SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Phil_NZ
Barite | Level 11

Hi SAS Users,

 

I am trying to delete some observations to filter my data.

 

My code is as below:

OPTIONS MPRINT;
data arg_merge2;
  set 'C:\Users\pnguyen\Desktop\arg_merge2';

run;
data screen1;
 set arg_merge2;
 if TYPE_1 ne 'EQ' then delete;
 if INDC3 = 'UTILS' or INDC3='BANKS' or INDC3='FINSV' 
  or INDC3='RLEST' or INDC3='INSUR'
 then delete;
run;

I have three questions relating to the code above:

 

1. If I want to delete observations satisfying multiple conditions, can I list two or more "if" like that? I mean, I want to delete observation with TYPE_1 <> 'EQ' and after that, when I got the result, I want to delete observations has INDC3 = 'UTILS' or 'BANKS',....

 

2. Regarding the code 

 

 if indc3 = 'UTILS' OR INDC3='BANKS' or INDC3='FINSV' 
  or INDC3='RLEST' or INDC3='INSUR'

is there any chance to shorten the code?

 

3. I know there is a way to delete the observation in current dataset without creating a new one by using PROC MEANS , can you please tell me how to use it in my case specifically?

 

Thank you in advance!

 

 

 

 

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

1 and 2. The answer to both is Yes. However, I think it would be more intuituve to specify when out want to output an observation rather than when you want to delete it. Also, you can use the Where Statement instead of the If Statement. The Where Statement specifies what observations are read, so you don't even read them into the data step.

 

data screen1;
   set arg_merge2;
   if TYPE_1 = "EQ" and INDC3 not in ('UTILS', 'BANKS', 'FINSV', 'RLEST', 'INSUR');
run;

 

3. I think what you want to do is to use Proc Means directly on your original data set and filter the observations directly here. You can use the Where= Data Set Option to do so. See if you can use this as a template.

 

proc means data = screen1(where=(TYPE_1 = "EQ" and INDC3 not in ('UTILS', 'BANKS', 'FINSV', 'RLEST', 'INSUR')));
   /* More code here */
run;

 

 

View solution in original post

8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

1 and 2. The answer to both is Yes. However, I think it would be more intuituve to specify when out want to output an observation rather than when you want to delete it. Also, you can use the Where Statement instead of the If Statement. The Where Statement specifies what observations are read, so you don't even read them into the data step.

 

data screen1;
   set arg_merge2;
   if TYPE_1 = "EQ" and INDC3 not in ('UTILS', 'BANKS', 'FINSV', 'RLEST', 'INSUR');
run;

 

3. I think what you want to do is to use Proc Means directly on your original data set and filter the observations directly here. You can use the Where= Data Set Option to do so. See if you can use this as a template.

 

proc means data = screen1(where=(TYPE_1 = "EQ" and INDC3 not in ('UTILS', 'BANKS', 'FINSV', 'RLEST', 'INSUR')));
   /* More code here */
run;

 

 

Phil_NZ
Barite | Level 11

Hi @PeterClemmensen 

Thank you for your insightful answer, regarding the question 3, the code should be

proc means data = arg_merge2(where=(TYPE_1 = "EQ" and INDC3 not in ('UTILS', 'BANKS', 'FINSV', 'RLEST', 'INSUR')));
run;

I think the data should be arg_merge2 rather than screen1, I just want to cross-check to be sure quite a bit

Can I just write such code and do not write any further "More code" as you mentioned? And the two codes you posted equal to each other ?

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
Phil_NZ
Barite | Level 11
Do you mean it should be arg_merge2 ?
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
PeterClemmensen
Tourmaline | Level 20

And no, the two code snippets are not equal at all. One creates data (the data step) and one calculates desciptive statistics (PROC MEANS).

 

For Proc Means code example, check the Documentation.

 

Feel free to ask 🙂

Phil_NZ
Barite | Level 11

Hi @PeterClemmensen 

 

So, the results from proc means and the data step should be the same, am I correct ?. I mean afterward, if I access the datasets screen1 from data step or arg_merge2 from proc means, they should be the same, am I correct? Or Proc Means just have the action inside this procedure and does not change the original dataset ?

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
PeterClemmensen
Tourmaline | Level 20

No. As I say above, they are not the same 🙂

 

 

ballardw
Super User

One thing, by TYPE_1 <> 'EQ' do you mean Type_1 is not equal to 'EQ'?

Except in a Where expression you will find that <> does not do what you think in a data step.

The <> operator is only valid for numeric values in a data step otherwise:

260  data example;
261     y='EQUAL';
262     if y <> 'EQ' then put Y= 'is <> to EQ';
NOTE: The "<>" operator is interpreted as "MAX".
263  run;

NOTE: Character values have been converted to numeric
      values at the places given by: (Line):(Column).
      262:9
NOTE: Invalid numeric data, 'EQUAL' , at line 262 column 9.
y=EQUAL _ERROR_=1 _N_=1
NOTE: The data set WORK.EXAMPLE has 1 observations and 1
      variables.

So for character value comparisons in a data step for "not equal" us NE ,   ~=   or   ^=

 

I would also be very careful with not equal conditions in your use case in general. Missing values may need additional consideration for example if really mean "not missing and not equal to 'EQ' ".

Less than conditions may also need explicit treatment as missing is always less than any explicit value.

sas-innovate-white.png

Join us for our biggest event of the year!

Four days of inspiring keynotes, product reveals, hands-on learning opportunities, deep-dive demos, and peer-led breakouts. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 6618 views
  • 4 likes
  • 3 in conversation