dear experts,
Please help me with the following:
2 Data Sets
1. I have three different kinds of dates in a demographic data set called cig.mergedpanel: work_date, restaurant_date and bar_date.
2. I have a purchase data set called cig.panelsample.
I need to count total number of purchases for each householdsin cig.panelsample before each of the three dates: work_date, restaurant_date and bar_date.
data cig.panelsample_b4law;
SET cig.panelsample;
by household_id;
STEP1
if first.household_id then npb4w=0 npb4r=0 npb4b=0;
STEP2
if cig.panelsample(purchase_date)<cig.mergedpanel(work_date) then npb4w+1;
if cig.panelsample(purchase_date)<cig.mergedpanel(restaurant_date) then npb4r+1;
if cig.panelsample(purchase_date)<cig.mergedpanel(bar_date) then npb4b+1;
run;
This wouldn't work for many reasons that are unknown to me (first time user). One obvious reason is that I have multiple statements after then (STEP1).
Also, i'm not sure how to make comparisons between two different data sets without merging the two (STEP2).
any help would be greatly appreciated!
best,
c
Then, I have to think that your merge didn't create what you think it did. Your code appears to be correct, other than you probably want to add a statement that only outputs the results for the last record for each household.
Without seeing the data one can only guess. My first guesses would be that some or all of the dates haven't been read in as SAS dates or that you simply don't have any records that meet the conditions that you specified.
Take a look at the following. It appears to be doing what you want to do and the only difference between it and your code is the addition of the suggested output statement:
data cig.Merged2;
informat purchase_date
work_date
restaurant_date
bar_date
date9.;
input household_id
purchase_date
work_date
restaurant_date
bar_date
;
cards;
1 10AUG2011 11AUG2011 12AUG2011 10AUG2011
1 09AUG2011 11AUG2011 12AUG2011 10AUG2011
1 15AUG2011 11AUG2011 12AUG2011 10AUG2011
2 06AUG2011 11AUG2011 12AUG2011 10AUG2011
2 09AUG2011 11AUG2011 12AUG2011 10AUG2011
2 08AUG2011 11AUG2011 12AUG2011 10AUG2011
;
data cig.Merged2_bna_law;
SET cig.Merged2;
by household_id;
if first.household_id then do;
npb4w=0;
npb4r=0;
npb4b=0;
end;
if purchase_date<work_date then npb4w+1;
if purchase_date<restaurant_date then npb4r+1;
if purchase_date<bar_date then npb4b+1;
if last.household_id then output;
run;
You have to provide more information.
The step 1 part is easy to correct. It has to be in the form:
if first.household_id then do;
npb4w=0;
npb4r=0;
npb4b=0;
end;
However, regarding step 2, you probably have to merge the two files together, but one can't show you how without knowing the structure of the two files. Does the mergedpanel file include the same household_id's as the panelsample file or, if not, what does it look like?
thank you, sir. same household_id's.
i will merged the two first. could you please answer as if everything is in the same file?
each household has different work_date, restaurant_date and bar_date.
many thanks,
c
If you merge the two files together (and that can be separately or in the same datastep), then you would just use code something like:
if purchase_date<work_date then npb4w+1;
if purchase_date<restaurant_date then npb4r+1;
if purchase_date<bar_date then npb4b+1;
I've tried the following, but I get 0's for all three variables and all rows.
data cig.Merged2_bna_law;
SET cig.Merged2;
by household_id;
if first.household_id then do;
npb4w=0;
npb4r=0;
npb4b=0;
end;
if purchase_date<work_date then npb4w+1;
if purchase_date<restaurant_date then npb4r+1;
if purchase_date<bar_date then npb4b+1;
run;
Then, I have to think that your merge didn't create what you think it did. Your code appears to be correct, other than you probably want to add a statement that only outputs the results for the last record for each household.
Without seeing the data one can only guess. My first guesses would be that some or all of the dates haven't been read in as SAS dates or that you simply don't have any records that meet the conditions that you specified.
Take a look at the following. It appears to be doing what you want to do and the only difference between it and your code is the addition of the suggested output statement:
data cig.Merged2;
informat purchase_date
work_date
restaurant_date
bar_date
date9.;
input household_id
purchase_date
work_date
restaurant_date
bar_date
;
cards;
1 10AUG2011 11AUG2011 12AUG2011 10AUG2011
1 09AUG2011 11AUG2011 12AUG2011 10AUG2011
1 15AUG2011 11AUG2011 12AUG2011 10AUG2011
2 06AUG2011 11AUG2011 12AUG2011 10AUG2011
2 09AUG2011 11AUG2011 12AUG2011 10AUG2011
2 08AUG2011 11AUG2011 12AUG2011 10AUG2011
;
data cig.Merged2_bna_law;
SET cig.Merged2;
by household_id;
if first.household_id then do;
npb4w=0;
npb4r=0;
npb4b=0;
end;
if purchase_date<work_date then npb4w+1;
if purchase_date<restaurant_date then npb4r+1;
if purchase_date<bar_date then npb4b+1;
if last.household_id then output;
run;
I agree with art297. I tried with simple numbers instead of dates and can see the three variables correctly accumulated within each household.
thank you very much, sir! 
When was it changed that these types of group by summarizations no longer required a retain statement containing the summarizing variables? I tested both ways and results are the same (using SAS under Unix 9.2). However I remember in previous versions or possible different circumstances where if the retain statement was not used results were incorrect.
data cig.Merged2_bna_law;
SET cig.Merged2;
by household_id;
retain npb4w npb4r npb4b;
if first.household_id then do;
npb4w=0;
npb4r=0;
npb4b=0;
end;
if purchase_date<work_date then npb4w+1;
if purchase_date<restaurant_date then npb4r+1;
if purchase_date<bar_date then npb4b+1;
if last.household_id then output;
run;
When you use the form
npb4w+1
the variable is automatically retained and the summation is equivalent to using the sum function.
Ah, that's right, thank you for the clarification.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
