Hello,
I ran into a problem when I was trying to set the 2 consecutive observations before and after a certain observation equal to one. the dataset looks like this:
Date TradingDate TradingDummy What I want
1/1/23 . . .
1/2/23 . . 1
1/3/23 . . 1
1/4/23 1/4/23 1 1
1/5/23 . . 1
1/6/23 . . 1
1/7/23 . . .
1/8/23 . . .
1/9/23 . . 1
1/10/23 . . 1
1/11/23 1/11/23 1 1
1/12/23 . . 1
1/13/23 . . 1
1/14/23 . . .
... ... ... ...
Basically what I want is for each trading date, I want to create a dummy that equals 1 for 2 days before and after (including the trading date). For this example, I want the trading dummy to equal 1 from 1/2 to 1/6, and from 1/9 to 1/13.
Any help would be greatly appreciated!
Best,
skyland
Does your input file already have all trading dates (and no non trading dates)? If so, the programming would be easy.
But the data you show has no date gaps, so you apparently are starting with all dates, not just trading dates.
How do you intend to distinguish between trading dates and non trading dates?
Hi mkeintz,
Thank you for your reply. Yes all the dates in my data file are trading dates as well (and no non-trading dates), and what I want is two consecutive trading dates before and after each trading date to be a dummy of one. Sorry for not being clear to begin with.
Best,
skyland
@skyland1991 wrote:
Hi mkeintz,
Thank you for your reply. Yes all the dates in my data file are trading dates as well (and no non-trading dates), and what I want is two consecutive trading dates before and after each trading date to be a dummy of one. Sorry for not being clear to begin with.
Best,
skyland
I suggest being a bit clearer about exactly what the "before" data looks like. When you only provide "want" you may be showing / implying stuff that is not actually in your data and we don't know that. By clearer I mean data in the form of working data step code. This particularly important with date, time and datetime variables because code that works will with SAS date values doesn't do well with character variables masquerading as dates.
2-digit years are sub-optimal as you are then very heavily reliant on things like the proper YEARCUTOFF option and depending on other choices you may not even realize that a "date" is actually a century off.
If your data only has trading dates (edited addition: and has all the trading dates - i.e. no trading date gaps), and the data is sorted by date, then:
data want (drop=_:);
merge have
have (firstobs=2 keep=tradingdummy rename=(tradingdummy=_nxt_dummy1))
have (firstobs=3 keep=tradingdummy rename=(tradingdummy=_nxt_dummy2));
newvar = max(lag2(tradingdummy),lag(tradingdummy,dummy,_nxt_dummy1,_nxt_dummy2);
run;
Given the tradingdummy is 1 or 0, this program looks for any value of 1 within 2 observations of the observation in hand.
Below works for your sample data.
data have;
infile datalines truncover dsd dlm=' ';
input (Date TradingDate) (:mmddyy10.) TradingDummy WhatIwant;
format Date TradingDate date9.;
datalines;
1/1/23 . . .
1/2/23 . . 1
1/3/23 . . 1
1/4/23 1/4/23 1 1
1/5/23 . . 1
1/6/23 . . 1
1/7/23 . . .
1/8/23 . . .
1/9/23 . . 1
1/10/23 . . 1
1/11/23 1/11/23 1 1
1/12/23 . . 1
1/13/23 . . 1
;
proc sql;
select distinct
l.*,
not missing(r.Date) as WhatIDerived_1,
case when missing(r.Date) then . else 1 end as WhatIDerived_2
from
have l
left join
( select Date from have where not missing(TradingDate) ) r
on l.Date between r.Date-2 and r.Date+2
;
quit;
data have;
infile datalines truncover dsd dlm=' ';
input (Date TradingDate) (:mmddyy10.) TradingDummy ;
format Date TradingDate date9.;
datalines;
1/1/23 . . .
1/2/23 . . 1
1/3/23 . . 1
1/4/23 1/4/23 1 1
1/5/23 . . 1
1/6/23 . . 1
1/7/23 . . .
1/8/23 . . .
1/9/23 . . 1
1/10/23 . . 1
1/11/23 1/11/23 1 1
1/12/23 . . 1
1/13/23 . . 1
;
data want;
if _n_=1 then do;
if 0 then set have;
declare hash h(dataset:'have(where=(TradingDate is not missing))');
h.definekey('TradingDate');
h.definedone();
end;
set have;
do d=Date-2 to Date+2;
if h.check(key:d)=0 then want=1;
end;
drop d;
run;
So assuming you have data with one observation per trading date and a variable to indicate that an EVENT of some type occurred on that date.
data have;
input TradingDay :yymmdd. Event ;
format TradingDay yymmdd10.;
cards;
2023-01-01 .
2023-01-02 .
2023-01-03 .
2023-01-04 1
2023-01-05 .
2023-01-06 .
2023-01-07 .
2023-01-08 .
2023-01-09 .
2023-01-10 .
2023-01-11 1
2023-01-12 .
2023-01-13 .
2023-01-14 .
;
You can use a combination of LAG and LEAD variables to generate your event window flags.
data want;
set have;
lag2=lag2(event);
lag1=lag1(event);
set have(firstobs=2 keep=event rename=(event=lead1)) have(obs=1 drop=_all_);
set have(firstobs=3 keep=event rename=(event=lead2)) have(obs=2 drop=_all_);
EventWindow = max(lag2,lag1,event,lead1,lead2);
run;
You can drop the LAG and LEAD variables if you don't need them anymore.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.