07-14-2017 10:01 AM
Hi! I have an original dataset as the following:
id start_date end_date
1 2016-06-01 2016-07-01
2 2013-02-01 2014-04-06
2 2015-01-03 2016-03-07
3 2013-01-01 2015-03-06
For each row, I would like to compute the days between the start date and the end date that is between year 2014-01-01 to 2014-12-31.
I was thinking that I can create two new variables per row so that:
id start_date end_date date1 date2
1 2016-06-01 2016-07-01 . .
2 2013-02-01 2014-04-06 2014-01-01 2014-04-06
2 2015-01-03 2016-03-07 . .
3 2013-01-01 2015-03-06 2014-01-01 2014-12-31
and then I can drop the rows with missing value and compute the difference between date2 and date1. But the following code does not work:
format date1 date2 YYMMDD10.;
if start_date <= 2004-01-01 then date1 = 2004-01-01;
else if start_date <= 2004-12-31 then date1 = start_date;
else date1 = .;
if end_date < 2004-01-01 then date2 =.;
else if end_date <= 2004-12-31 then date2 = end_date;
else date2 = 2004-12-31;
Could you please help me fixing the above code or give me some suggesting about how to acheive my goal? Thank you so much! I really appreciate it.
07-14-2017 10:11 AM
Once you think about your formulas, you'll realize that 2014-01-01 = 2014 - 1 - 1 = 2012.
To refer to a specific day, use a different form: '01Jan2014'd
Do not put a space between the closing quote and the letter d
And be careful about using 2004 vs. using 2014
07-14-2017 10:22 AM
07-14-2017 10:53 AM
Thank you so much for you reply! But the date format in the original dataset is YYMMDD10.
Do I need to reformat the date in the original file first?
I changed into the '01Jan2014'd in the if-else statement but still I got weird outputs
Format has NOTHING to do with arithmetic. Formats are for display or grouping.
Show the code and the results that are "weird" and the expected results.
07-14-2017 11:47 AM
Weird results can be debugged. Show what you did, and what the result was. So far, you are moving in right direction and reformatting is not necessary.