DATA Step, Macro, Functions and more

dropping low freq panel data

Reply
Contributor
Posts: 50

dropping low freq panel data

Hi,

 

I want to drop ID variable if more than 2 observation for var a are zero.e.g. in this case I would want to remove ID 2 and 3 frm my data

Can someone guide me how to do that?

 

IDYearabcde
12008-3.1075.57071860.879.08
12009-4.5064.37542000.8310.811
12010-6.8113.27321460.689.173
22008-4.56910.53251431.220.004
22009     
22010     
320083.0493.7264961.150.02
32009  421.27.
32010 .   
420089.9896.29034.290.760.057
42009-9.1623.184822.610.880.071
420104.3674.8475.650.80.2
Super User
Posts: 6,628

Re: dropping low freq panel data

The data illustrated is different than the result you describe.  You describe:

 

"if more than 2 observations for var a are zero"

 

The data show:

 

"if more than 1 observation for var a is missing"

 

Here's an approach to use, but you may need to change it slightly depending on which observations you want.  This assumes your data set is sorted by ID:

 

data want;

n_missing = 0;

do until (last.ID);

   set have;

   by ID;

   if a=. then n_missing + 1;

end;

do until (last.ID);

   set have;

   by ID;

   if n_missing < 2 then output;

end;

drop n_missing;

run;

 

The top loop processes observations for a single ID, and counts how many missing values exist for A.  The bottom loop re-reads the same observation, and outputs according to the results of the top loop.

Contributor
Posts: 50

Re: dropping low freq panel data

Posted in reply to Astounding

The following error appears,

 

99   data want;
100  n_missing = 0;
101  do until (last.ID);
102  ? ?set thdata;
     -
     180
103  ? ?by ID;
     -
     180
104  ? ?if te=. then n_missing + 1;
     -
     180
ERROR 180-322: Statement is not valid or it is used out of proper order.

105  end;
106  do until (last.ID);
107  ? ?set thdata;
     -
     180
108  ? ?by ID;
     -
     180
109  ? ?if n_missing < 2 then output;
     -
     180
ERROR 180-322: Statement is not valid or it is used out of proper order.

110  end;
111  drop n_missing;
112  run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.WANT may be incomplete.  When this step was stopped there were 0
         observations and 0 variables.
WARNING: Data set WORK.WANT was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds

 

run;

Super User
Posts: 6,628

Re: dropping low freq panel data

Why do you have question marks in the middle of your program?  There should be blanks at that point.

Super User
Posts: 10,686

Re: dropping low freq panel data

There are not zero in A.there are only missing value.

data have;
infile cards truncover;
input ID	Year	 a	b	c	d	e;
cards;
1	2008	-3.107	5.5707	186	0.87	9.08
1	2009	-4.506	4.3754	200	0.83	10.811
1	2010	-6.811	3.2732	146	0.68	9.173
2	2008	-4.569	10.5325	143	1.22	0.004
2	2009	 	 	 	 	 
2	2010	 	 	 	 	 
3	2008	3.049	3.7264	96	1.15	0.02
3	2009	 	. 	42	1.27	.
3	2010	 	.	 	 	 
4	2008	9.989	6.2903	4.29	0.76	0.057
4	2009	-9.162	3.1848	22.61	0.88	0.071
4	2010	4.367	4.847	5.65	0.8	0.2
;
run;
proc sql;
select *
 from have
  group by id 
   having sum(missing(a))<2;
quit;

Ask a Question
Discussion stats
  • 4 replies
  • 87 views
  • 0 likes
  • 3 in conversation