how to choose a subsample of data according to some rules

Reply
Frequent Contributor
Posts: 122

how to choose a subsample of data according to some rules

Hi, this question seems to be easy, but I just don't know how to do. I have the following data

permno                    month                    return                    i

1                              1                              0.1                      1

1                              2                              0.1                      .

2                              1                              0.5                      .

2                              2                              0.6                      1

2                              3                              0.2                      .

3                              1                              0.5                      .

....

Let me explain. permno is an ID number, month the number of month, it corresponds to something like Aug, 2000, etc.

i is an identification number.

I marked i=1 for something I want to use to compare. First, I wan to choose the month when i=1. In this case, month 1 and 2. Then I would like to choose any observation with permno NOT equal to 1 or 2 (That i identifies) AND month equal to 1 and 2 (That i identifies). Calculate the average of return, compare this average return to the average of returns, of which i=1 identifies. That is permno 3 with month 1(the last observation) meets my requirement. I would like to mark i as 0 for this observation. The average mean is 0.5 (one number). The average return of i=1 is (0.1+0.6)/2=0.35.

I basicly want to mark down the last observation with i=0. How to do?

Trusted Advisor
Posts: 1,203

Re: how to choose a subsample of data according to some rules

proc sql;
create table want as
select permno,month,return,case when i=. then i=0 end as i from have
where permno not in (select permno from have where i=1 or
month not in (select month from have where i=1))
union
select * from have where i=1;
quit;

proc means data=want;
var return;
class i;
run;

Grand Advisor
Posts: 9,578

Re: how to choose a subsample of data according to some rules

data have;
input permno                    month                    return                    i ;
cards;
1                              1                              0.1                      1
1                              2                              0.1                      .
2                              1                              0.5                      .
2                              2                              0.6                      1
2                              3                              0.2                      .
3                              1                              0.5                      .
;
run;
data want;
 if _n_ eq 1 then do;
  if 0 then set have;
  declare hash ha1(dataset:'have(where=(i=1))');
  ha1.definekey('permno');
  ha1.definedone();
  declare hash ha2(dataset:'have(where=(i=1))');
  ha2.definekey('month');
  ha2.definedone();
end;
set have;
if ha1.check() ne 0 and ha2.check() eq 0 then i=0;
run;

Xia Keshan

Ask a Question
Discussion stats
  • 2 replies
  • 237 views
  • 0 likes
  • 3 in conversation