- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi! I have an original dataset as the following:
id start_date end_date
1 2016-06-01 2016-07-01
2 2013-02-01 2014-04-06
2 2015-01-03 2016-03-07
3 2013-01-01 2015-03-06
For each row, I would like to compute the days between the start date and the end date that is between year 2014-01-01 to 2014-12-31.
I was thinking that I can create two new variables per row so that:
id start_date end_date date1 date2
1 2016-06-01 2016-07-01 . .
2 2013-02-01 2014-04-06 2014-01-01 2014-04-06
2 2015-01-03 2016-03-07 . .
3 2013-01-01 2015-03-06 2014-01-01 2014-12-31
and then I can drop the rows with missing value and compute the difference between date2 and date1. But the following code does not work:
data want;
set original;
format date1 date2 YYMMDD10.;
if start_date <= 2004-01-01 then date1 = 2004-01-01;
else if start_date <= 2004-12-31 then date1 = start_date;
else date1 = .;
if end_date < 2004-01-01 then date2 =.;
else if end_date <= 2004-12-31 then date2 = end_date;
else date2 = 2004-12-31;
run;
Could you please help me fixing the above code or give me some suggesting about how to acheive my goal? Thank you so much! I really appreciate it.
Yiyi
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Once you think about your formulas, you'll realize that 2014-01-01 = 2014 - 1 - 1 = 2012.
To refer to a specific day, use a different form: '01Jan2014'd
Do not put a space between the closing quote and the letter d
And be careful about using 2004 vs. using 2014
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Do I need to reformat the date in the original file first?
I changed into the '01Jan2014'd in the if-else statement but still I got weird outputs
Thanks again!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@yiyizhao86 wrote:
Thank you so much for you reply! But the date format in the original dataset is YYMMDD10.
Do I need to reformat the date in the original file first?
I changed into the '01Jan2014'd in the if-else statement but still I got weird outputs
Thanks again!
Format has NOTHING to do with arithmetic. Formats are for display or grouping.
Show the code and the results that are "weird" and the expected results.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Weird results can be debugged. Show what you did, and what the result was. So far, you are moving in right direction and reformatting is not necessary.