Hi,someone know summary statement please help me with the explanation of line with number below.
I have written the annotation as I understood my way,however they are not the situation when I check it back in the excel file.
Thanks in advance!
data ccm_base_trans_3m;
retain txn_amt_rmb_08 txn_amt_rmb_09 0;
set ccm_base_trans(drop=event_id org where=("1jul2014"d le eff_date le "30sep2014"d));
if "1aug2014"d le eff_date le "31aug2014"d then do;
txn_amt_rmb_08=txn_amt_rmb;
end;
if "1sep2014"d le eff_date le "30sep2014"d then do;
txn_amt_rmb_09=txn_amt_rmb;
txn_date=datepart(eff_date);
end;
length mcc_desc $200.;
mcc_desc=put(mcc,$mcc.);
run;
proc summary data=ccm_base_trans_3m(keep=becif_custid txn_amt_rmb txn_amt_rmb_08 txn_amt_rmb_09 eff_date) nway missing;
/*each becif_custid*/
class becif_custid;
var txn_amt_rmb txn_amt_rmb_08 txn_amt_rmb_09 eff_date;
output out=ccm_base_trans_3m(rename=(_freq_=p3m_sum_cnt) drop=_type_)
/*1.count of txn_amt_rmb for each becif_custid*/
sum(txn_amt_rmb) =p3m_sum_cnt
/*2.sum of txn_amt_rmb_08 for each becif_custid*/
sum(txn_amt_rmb_08)=txn_amt_rmb_08
/*3.sum of txn_amt_rmb_09 for each becif_custid*/
sum(txn_amt_rmb_09)=txn_amt_rmb_09
/*4.last date of eff_date for each becif_custid*/
max(eff_date)=last_txn_date;
run;
@VDD - the usage of date1 LE any_date LE last_date is equivalent to say: any_date BETWEEN date1 and last_date.
@Geo- - I understand that you got some different results using sas vs excel.
Is the difference in all four variables -
txn_amt_rmb txn_amt_rmb_08 txn_amt_rmb_09 eff_date
are dates equal ?
while txn_amt_rmb_08 contains august values, txn_amt_rmb_09 contains september values -
pay attention that in sas, according to your code, txn_amt_rmb contains values of the whole period.
You have not supplied information what have you done in excel - maybe you summed july only ?
If you want to sum each month separately you could do, either:
data ccm_base_trans_3m;
/* retain txn_amt_rmb_08 txn_amt_rmb_09 0; <<< NO NEED */
set ccm_base_trans(drop=event_id org where=("1jul2014"d le eff_date le "30sep2014"d));
if "1aug2014"d le eff_date le "31aug2014"d then do;
txn_amt_rmb_08=txn_amt_rmb;
txn_amt_rmb = 0; /* line added */
end; ELSE
if "1sep2014"d le eff_date le "30sep2014"d then do;
txn_amt_rmb_09=txn_amt_rmb;
txn_amt_rmb = 0; /* line added */
/* txn_date=datepart(eff_date); <<< is eff_date not a date variable ? */
end;
length mcc_desc $200.;
mcc_desc=put(mcc,$mcc.);
run;
or you do all in one step:
proc summary data=ccm_base_trans
(keep=becif_custid txn_amt_rmb txn_amt_rmb_08 txn_amt_rmb_09 eff_date
where=("1jul2014"d le eff_date le "30sep2014"d)) nway missing;
class becif_custid eff_date;
format eff_date yymms7.;
var txn_amt_rmb txn eff_date;
output out=ccm_base_trans_3m
(rename=(_freq_=p3m_sum_cnt) drop=_type_)
sum(txn_amt_rmb) =p3m_sum_cnt
max(eff_date)=last_txn_date;
run;
Question are you sure that you don't want the beginning of the month to be greater than or equal to rather than less than or equal to?
@Geo-has "1jul2014"d le eff_date le "30sep2014"d
would this be the soliton: "1jul2014"d => eff_date <= "30sep2014"d to change the starting of the test to be within the bounds.
the test @Geo- has only allows records that meet the condition of less than or equal to the beginning of the month in the first agreement.
@VDD - the usage of date1 LE any_date LE last_date is equivalent to say: any_date BETWEEN date1 and last_date.
@Geo- - I understand that you got some different results using sas vs excel.
Is the difference in all four variables -
txn_amt_rmb txn_amt_rmb_08 txn_amt_rmb_09 eff_date
are dates equal ?
while txn_amt_rmb_08 contains august values, txn_amt_rmb_09 contains september values -
pay attention that in sas, according to your code, txn_amt_rmb contains values of the whole period.
You have not supplied information what have you done in excel - maybe you summed july only ?
If you want to sum each month separately you could do, either:
data ccm_base_trans_3m;
/* retain txn_amt_rmb_08 txn_amt_rmb_09 0; <<< NO NEED */
set ccm_base_trans(drop=event_id org where=("1jul2014"d le eff_date le "30sep2014"d));
if "1aug2014"d le eff_date le "31aug2014"d then do;
txn_amt_rmb_08=txn_amt_rmb;
txn_amt_rmb = 0; /* line added */
end; ELSE
if "1sep2014"d le eff_date le "30sep2014"d then do;
txn_amt_rmb_09=txn_amt_rmb;
txn_amt_rmb = 0; /* line added */
/* txn_date=datepart(eff_date); <<< is eff_date not a date variable ? */
end;
length mcc_desc $200.;
mcc_desc=put(mcc,$mcc.);
run;
or you do all in one step:
proc summary data=ccm_base_trans
(keep=becif_custid txn_amt_rmb txn_amt_rmb_08 txn_amt_rmb_09 eff_date
where=("1jul2014"d le eff_date le "30sep2014"d)) nway missing;
class becif_custid eff_date;
format eff_date yymms7.;
var txn_amt_rmb txn eff_date;
output out=ccm_base_trans_3m
(rename=(_freq_=p3m_sum_cnt) drop=_type_)
sum(txn_amt_rmb) =p3m_sum_cnt
max(eff_date)=last_txn_date;
run;
1 - p3m_sum_cnt seems to be erroneous and display either the frequency (count) or the summary of txt_amt_rmb per customer ID as same variable name was given to both.
2, 3 and 4 - are interpreted correctly;
There is no summary for the 1st month in the quarter.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.