data cig.date_duration;
set cig.ALL;
by HHID purchase_date;
if first.HHID then do;
if first.HHID then first_date = purchase_date;
if last.HHID then last_date = purchase_date;
end;
duration = first_date - last_date + 1;
if last.HHID then output;
run;
What is wrong with my code? Basically, I want to calculate the difference between last purchase_date - first purchase_date for each household id (HHID) from a data file called cig.ALL.
With the above code, I get last_date only when it's a last_date = first_date (i.e., purchase is made on only one day).
Can someone please help?
Thanks,
C
I'll answer your question but, first, I forgot to include an important line. A retain statment is needed. The code should have read:
data cig.date_duration;
set cig.ALL;
by HHID;
retain first_date;
if first.HHID then first_date = purchase_date;
if last.HHID then do;
last_date = purchase_date;
duration =last_date-first_date + 1;
output;
end;
run;
The output statement, because it is within the do loop where last.HHID is true, causes SAS to only output that one record, rather than all of the records.
You have left out the chance to get the date when the last record in a group is read and I think that you reversed the way you want to do the calculation. I think you are actually trying to do something like:
data cig.date_duration;
set cig.ALL;
by HHID;
if first.HHID then first_date = purchase_date;
if last.HHID then do;
last_date = purchase_date;
duration =last_date-first_date + 1;
output;
end;
run;
thank you, sir. what does "output;" do? I get the same output without it.
I'll answer your question but, first, I forgot to include an important line. A retain statment is needed. The code should have read:
data cig.date_duration;
set cig.ALL;
by HHID;
retain first_date;
if first.HHID then first_date = purchase_date;
if last.HHID then do;
last_date = purchase_date;
duration =last_date-first_date + 1;
output;
end;
run;
The output statement, because it is within the do loop where last.HHID is true, causes SAS to only output that one record, rather than all of the records.
now, it records the first_date and last_date. But, I still need to collapse it into the table in the post below...
Is there a way I can do it in the same data step without creating extra "sql" command?
many thanks!
I'm not sure what your question is. If it is just that you only want to keep the fields you identified, you can either only keep those fields when you set the dataset, keep them in the data statement, or drop the other fields. I think the most efficient way would be to use something like:
data cig.date_duration;
set cig.ALL (keep=HHID purchase_date);
by HHID;
retain first_date;
format first_date last_date date9.;
if first.HHID then first_date = purchase_date;
if last.HHID then do;
last_date = purchase_date;
duration =last_date-first_date + 1;
purchase_date=first_date;
output;
end;
run;
The data ALL looks like this:
HHID Purchase Date
1 1/2/2002
1 1/11/2002
1 1/13/2002
2
2
I want:
HHID Purchase Date First_Date Last_Date Duration
1 1/2/2002 1/2/2002 1/13/2002 12
How to decide which date is PurchaseDate.
Art's code can not work?
data cig.date_duration;
set cig.ALL;
by HHID;
retain first_date ;
if first.HHID then first_date = purchase_date;
if last.HHID then do;
last_date = purchase_date;
duration =last_date-first_date + 1;
purchase_date=first_date;
output;
end;
run;
Code not test.
Ksharp
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.