Hi,
I want to drop ID variable if more than 2 observation for var a are zero.e.g. in this case I would want to remove ID 2 and 3 frm my data
Can someone guide me how to do that?
ID | Year | a | b | c | d | e |
1 | 2008 | -3.107 | 5.5707 | 186 | 0.87 | 9.08 |
1 | 2009 | -4.506 | 4.3754 | 200 | 0.83 | 10.811 |
1 | 2010 | -6.811 | 3.2732 | 146 | 0.68 | 9.173 |
2 | 2008 | -4.569 | 10.5325 | 143 | 1.22 | 0.004 |
2 | 2009 | |||||
2 | 2010 | |||||
3 | 2008 | 3.049 | 3.7264 | 96 | 1.15 | 0.02 |
3 | 2009 | 42 | 1.27 | . | ||
3 | 2010 | . | ||||
4 | 2008 | 9.989 | 6.2903 | 4.29 | 0.76 | 0.057 |
4 | 2009 | -9.162 | 3.1848 | 22.61 | 0.88 | 0.071 |
4 | 2010 | 4.367 | 4.847 | 5.65 | 0.8 | 0.2 |
The data illustrated is different than the result you describe. You describe:
"if more than 2 observations for var a are zero"
The data show:
"if more than 1 observation for var a is missing"
Here's an approach to use, but you may need to change it slightly depending on which observations you want. This assumes your data set is sorted by ID:
data want;
n_missing = 0;
do until (last.ID);
set have;
by ID;
if a=. then n_missing + 1;
end;
do until (last.ID);
set have;
by ID;
if n_missing < 2 then output;
end;
drop n_missing;
run;
The top loop processes observations for a single ID, and counts how many missing values exist for A. The bottom loop re-reads the same observation, and outputs according to the results of the top loop.
The following error appears,
99 data want;
100 n_missing = 0;
101 do until (last.ID);
102 ? ?set thdata;
-
180
103 ? ?by ID;
-
180
104 ? ?if te=. then n_missing + 1;
-
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
105 end;
106 do until (last.ID);
107 ? ?set thdata;
-
180
108 ? ?by ID;
-
180
109 ? ?if n_missing < 2 then output;
-
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
110 end;
111 drop n_missing;
112 run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.WANT may be incomplete. When this step was stopped there were 0
observations and 0 variables.
WARNING: Data set WORK.WANT was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
run;
Why do you have question marks in the middle of your program? There should be blanks at that point.
There are not zero in A.there are only missing value. data have; infile cards truncover; input ID Year a b c d e; cards; 1 2008 -3.107 5.5707 186 0.87 9.08 1 2009 -4.506 4.3754 200 0.83 10.811 1 2010 -6.811 3.2732 146 0.68 9.173 2 2008 -4.569 10.5325 143 1.22 0.004 2 2009 2 2010 3 2008 3.049 3.7264 96 1.15 0.02 3 2009 . 42 1.27 . 3 2010 . 4 2008 9.989 6.2903 4.29 0.76 0.057 4 2009 -9.162 3.1848 22.61 0.88 0.071 4 2010 4.367 4.847 5.65 0.8 0.2 ; run; proc sql; select * from have group by id having sum(missing(a))<2; quit;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.