Marking next and previous observations with conflicts

Occasional Contributor
Posts: 6


Hi all,

 

I want to create dummy variables in SAS. Every time AnnouncementDate is not missing (AnnouncementDate ne .), I want to set a variable event=2 on that observation and on the next one, and event=1 on the five previous observations. In addition, a conflict flag should be set whenever another announcement date falls inside that window (the five previous observations through the day after the announcement). All of this should happen within a BY group (by stock, for example).

output data

 

date    announcementdate  event  conflict
1115    .                 .      .
2115    .                 .      .
3115    .                 .      .
4115    .                 .      .
5115    .                 .      .
6115    .                 1      .
7115    .                 1      .
8115    .                 1      .
9115    .                 1      .
10115   .                 1      .
11115   1115              2      .
12115   .                 2      .
13115   .                 .      .
14115   .                 1      .
16115   .                 1      .
17115   .                 1      .
18115   .                 1      .
19115   .                 1      .
20115   20115             2      .
21115   .                 1      Y
21115   .                 1      Y
22115   22115             2      Y
23115   .                 2      Y
24115   .                 .      .
25115   .                 .      .

If this now switches to the next stock, it should start over and not take the previous observations into account.

My current code builds a lead term by merging just the event column back onto the data with firstobs=2. However, I am stuck on the five-observation lag, and on flagging the conflict when there are earlier announcements inside the window.

 

Thankful for any input or hint on what to use here.

Grand Advisor
Posts: 17,313

Re: Marking next and previous observations with conflicts

I can't follow that logic. 

 

Can you post what you have and what you need as separate data sets?

Respected Advisor
Posts: 4,606

Re: Marking next and previous observations with conflicts

Here is a solution using arrays:

 

data have;
stock = 1;
input date announcementDate;
datalines;
1115 . . .
2115 . . .
3115 . . .
4115 . . .
5115 . . .
6115 . 1 .
7115 . 1 .
8115 . 1 .
9115 . 1 .
10115 . 1 .
11115 1115 2 .
12115 . 2 .
13115 . .
14115 . 1 .
16115 . 1 .
17115 . 1 .
18115 . 1 .
19115 . 1 .
20115 20115 2 
21115 . 1 Y
21115 . 1 Y
22115 22115 2 Y
23115 . 2 Y
24115 . .
25115 . .
;

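data want;
/* One array slot per observation of a stock; 999 must exceed the largest stock. */
array d{999};
array a{999};
array e{999};
array c{999} $1;
/* Read one whole stock into the arrays, flagging as we go. */
do n = 1 by 1 until(last.stock);
    set have; by stock notsorted;
    d{n} = date;
    a{n} = announcementDate;
    if not missing(announcementDate) then do; 
        call missing (conflictPos);
        /* Up to five previous observations; the loop is empty when n = 1. */
        do i = max(n-5,1) to n-1;
            if missing(a{i}) then e{i} = 1; 
                else conflictPos = i;  /* earlier announcement inside the window */
            end;
        e{n} = 2;
        e{n+1} = 2;
        if conflictPos then
            do i = conflictPos + 1 to n+1;
                c{i} = "Y";
                end;
        end;
    end;
/* Write the stock back out, one observation per array slot. */
do i = 1 to n;
    date = d{i};
    announcementDate = a{i};
    event = e{i};
    conflict = c{i};
    output;
    end;
keep stock date announcementDate event conflict;
run;

proc print; run;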
PG
Occasional Contributor
Posts: 6

Re: Marking next and previous observations with conflicts

@PGStats

Thanks, PG. But since I have several million observations and about 100 announcement dates, creating the 999 variables and opening the data set (to double-check the results) takes forever.

 

@Reeza

My dataset has several variables (including Stock, Date and some others) plus the announcement date. Every time ann_date is not . (so there is an announcement), I want to flag that observation, the next one, and the 5 previous ones (the 5 previous ones with a separate value). And if there is another ann_date inside that window, then mark that separately.

Respected Advisor
Posts: 4,606

Re: Marking next and previous observations with conflicts

Array dimensions should be large enough to accommodate the maximum number of dates for a single stock. Instead of checking the program's results by hand on your whole dataset, build a small test set with all possible sequences, including extreme cases, and check the results.
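
As a rough sketch of both suggestions, something like the following could serve as a starting point. The data set name test, its values, and the array size of 1000 (which assumes no stock has more than 999 observations) are made up for illustration. It also swaps in two techniques that are not in the program above: _temporary_ arrays, so that no extra variables are created, and a second SET statement that re-reads each stock, so that any other variables in the data set are carried through unchanged.

```sas
/* Hypothetical test set: an announcement on the first observation of a stock,
   two announcements close together, and one on the last observation. */
data test;
    input stock date announcementDate;
    datalines;
1 1 1
1 2 .
1 3 .
1 4 4
1 5 .
2 1 .
2 2 .
2 3 3
;

data want;
    /* _temporary_ arrays create no data set variables.
       Assumption: at most 999 observations per stock (slot n+1 is also used). */
    array a{1000} _temporary_;     /* announcement dates of the current stock */
    array e{1000} _temporary_;     /* event codes */
    array c{1000} $1 _temporary_;  /* conflict flags */

    /* Temporary arrays are retained, so reset them before each stock. */
    do i = 1 to dim(a);
        a{i} = .;
        e{i} = .;
        c{i} = ' ';
    end;

    /* First pass over the stock: work out the flags. */
    do n = 1 by 1 until (last.stock);
        set test;
        by stock notsorted;
        a{n} = announcementDate;
        if not missing(announcementDate) then do;
            conflictPos = .;
            /* Up to five previous observations; empty loop when n = 1. */
            do i = max(n - 5, 1) to n - 1;
                if missing(a{i}) then e{i} = 1;
                else conflictPos = i;
            end;
            e{n} = 2;
            e{n + 1} = 2;
            if conflictPos then do i = conflictPos + 1 to n + 1;
                c{i} = 'Y';
            end;
        end;
    end;

    /* Second pass over the same stock: attach the flags and write out. */
    do i = 1 to n;
        set test;
        event = e{i};
        conflict = c{i};
        output;
    end;
    drop i n conflictPos;
run;

proc print data=want; run;
```

With a test set this small, every row of the printed output can be checked by eye against the rule before running the step on the full data. The things to look at are that the flags start over at stock 2 and that an announcement on the last observation of a stock does not spill into the next one.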

PG