02-02-2016 11:23 PM
I need a query help for the following requirement. I have a dataset with 4 variables that looks like the following:
id date charge_mode code_no
a1 2-Oct-15 5 127
a1 5-Oct-15 5 104
a1 5-Oct-15 5 510
a1 5-Oct-15 5 511
a1 1-Nov-15 5 128
a1 5-Nov-15 5 330
a2 2-Oct-15 6 430
a2 5-Oct-15 6 230
a2 5-Oct-15 6 128
a2 5-Oct-15 6 127
a2 1-Nov-15 6 520
a2 5-Nov-15 6 210
This is a transaction data for each custid, I need to find if any of transactions that has the code_no(127,128,104) which contributes to flag1 and flag2 variables that I need to create for each "month" of a customer id.
The monthly cycle(/*most important*/) depends on the variable charge_mode, like in the above example the chargemode for id a1 is 5 and a2 is 6, meaning :
1. For A1,5th of oct to 4th of nov is the period to be considered for October month and 5th of Nov to 4th of Dec to be considered for November and so on.
2. For A2, the period would be from 6th to 5th and so on.
The logic for flags:
If in a month, all the 3 code_no(127,128,104) exist then flag_2=1 and flag_1=1;
if in a month, only 104 exist, then flag_2=1,flag_1=1
If in a month, only (127 and 128) exist, and if 128 is the latest found record for the month, then flag_1=0,flag_2=0
If in a month, only (127 and 128) exist and if 127 is the latest found for the month, then flag_1=1, flag_2=0
1. If only 127 is there , then flag_1=1, flag_2=0
2. if only 128, exists, then flag_1=0. flag_2=0
Want should be in the following format:
id flag1 flag2 month
a1 0 1 oct
a1 1 0 nov
a1 0 0 dec
followed a2 and so on....…..
02-03-2016 03:51 AM - edited 02-03-2016 03:52 AM
Your "want" does not meet your rules.
a1 in "October" has codes 104 and 12, therefore no flags should be set.
At the moment, my code would be:
data have; informat id $2. date date9. charge_mode 1. code_no 3. ; format date date9.; input id date charge_mode code_no ; cards; a1 2-Oct-15 5 127 a1 5-Oct-15 5 104 a1 5-Oct-15 5 510 a1 5-Oct-15 5 511 a1 1-Nov-15 5 128 a1 5-Nov-15 5 330 a2 2-Oct-15 6 430 a2 5-Oct-15 6 230 a2 5-Oct-15 6 128 a2 5-Oct-15 6 127 a2 1-Nov-15 6 520 a2 5-Nov-15 6 210 ; run; data int; set have; length month $6.; if day(date) >= charge_mode then month = put(year(date),z4.) !! put(month(date),z2.); else do; if month(date) = 1 then month = put(year(date)-1,z4.) !! '12'; else month = put(year(date),z4.) !! put(month(date)-1,z2.); end; run; proc sort data=int; by id month date; run; data want /*(keep=id month flag1 flag2)*/; set int; by id month; retain code127 code128 code104 ; if first.month then do; code127 = 0; code128 = 0; code104 = 0; end; if code_no = 127 then code127 = 1; if code_no = 128 then code128 = 1; if code_no = 104 then code104 = 1; if last.month then do; flag1 = 0; flag2 = 0; if code127 + code128 + code104 = 3 or (code104 = 1 and code127 + code128 = 0) then do; flag1 = 1; flag2 = 1; end; else if code104 = 0 and code127 + code128 = 2 then do; if code_no = 128 then do; /*flag1 = 0; flag2 = 0;*/ end; else if code_no = 127 then do; flag1 = 1; /*flag2 = 0;*/ end; end; else if code127 = 1 and code104 + code128 = 0 then do; flag1 = 1; /*flag2 = 0;*/ end; else if code128 = 1 and code104 + code127 = 0 then do; /*flag1 = 0;*/ flag2 = 1; end; output; end; run;
02-03-2016 11:26 AM
Will this process ever run on data with more than one calendar year? If so you may want your output and criteria to include a year requirement. I believe your critera may some potential for headaches with December data in general. And do leap days make any changes for February?