@CynthiaWei wrote:
Hi,
Thank you so much for the code! I was not able to work on this project in the past weeks so sorry for my late reply.
I think your understanding is quite close to my task, but one criterion needs to be added: if a given combination of id*first_date*edu*class has only one record, do nothing, i.e. leave the last_date value as is. If that combination has more than one record and one of them has a date difference of 0, then assign last_date=first_date for the rest. Finally, if the combination has more than one record but every record has a date difference >0 (none has a date difference of 0 within this ID), then leave them unchanged. An example follows (there is no record like "5 8-19-2017 8-19-2017 1 1" for this person):
ID First_date Last_date EDU Class
5 8-19-2017 11-11-2017 1 1
5 8-19-2017 12-22-2017 1 1
Could you please advise me on what the code would look like when incorporating the above situation? I apologize for not specifying this before.
So instead of assigning last_date=first_date for all combinations that have multiple records, as I had programmed before, you want to do it only for combinations that have multiple records of which at least one already has last_date=first_date.
In that case, during the first pass the program should still count records by combination. In addition, whenever a record has first_date=last_date, add (say) 1,000 to the count. I chose 1,000 because I assume no combination is likely to have anywhere near 1,000 records.
This means:
A single record would have count=1 or 1,001
Multiple records would have counts 2,3,4,... or 1,002+
Therefore, during the second pass, only count>1,001 requires assigning last_date=first_date
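data want;
  /* Read each ID's records twice: first to count, then to update */
  set have (in=firstpass) have (in=secondpass);
  by id;
  /* One counter per first_date*edu*class; dates 2015-2017, edu and class each 1-3 */
  array _n_fdates {%sysevalf("01jan2015"d):%sysevalf("31dec2017"d),3,3} _temporary_;
  if first.id then call missing(of _n_fdates{*});
  if firstpass then do;
    _n_fdates{first_date,edu,class}+1;
    /* Flag combinations that already contain a record with last_date=first_date */
    if first_date=last_date then _n_fdates{first_date,edu,class}+1000;
  end;
  /* Output only second-pass records */
  if secondpass;
  if _n_fdates{first_date,edu,class}>1001 then last_date=first_date;
run;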
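To check the logic, here is a small made-up HAVE dataset to run the step above against. ID 5 is the example from the question. IDs 6 and 7 and all of their values are hypothetical, added only to cover the other two cases. The data are already sorted by ID, which the BY statement requires.

```sas
/* Hypothetical test data (only the ID 5 rows come from the question):
   ID 5: two records, none with last_date=first_date
   ID 6: two records, one with last_date=first_date
   ID 7: a single record                                            */
data have;
  input id first_date :mmddyy10. last_date :mmddyy10. edu class;
  format first_date last_date mmddyy10.;
  datalines;
5 8-19-2017 11-11-2017 1 1
5 8-19-2017 12-22-2017 1 1
6 3-01-2016 3-01-2016 2 1
6 3-01-2016 6-15-2016 2 1
7 5-05-2015 9-09-2015 3 2
;
run;
```

With these rows the counts should come out as 2 for ID 5, 1,002 for ID 6, and 1 for ID 7, so only ID 6 exceeds 1,001 and only its second record should have last_date changed to first_date.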
And if the variables that are going to be involved in a given combination are varA, varB, varC, varD, varE, varF and varG, I just need to put all of them in the "{}" in the if then statement, right?
In short, yes. Make provision for each variable in the array statement (one dimension per variable, sized to cover that variable's range of values), and then use those variable names as the array indexes in the subsequent statements.
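As a rough sketch of that extension, the step below keys the counter on first_date plus three of the variables. The names varA, varB, varC come from the question, but their ranges (1-3, 1-3, 1-2) and the array name _counts are assumptions for illustration. The index variables must hold integer values within the declared bounds.

```sas
/* Sketch only: the ranges of varA, varB, varC are assumed, not known */
data want;
  set have (in=firstpass) have (in=secondpass);
  by id;
  /* One dimension per variable in the combination */
  array _counts {%sysevalf("01jan2015"d):%sysevalf("31dec2017"d),3,3,2} _temporary_;
  if first.id then call missing(of _counts{*});
  if firstpass then do;
    _counts{first_date,varA,varB,varC}+1;
    if first_date=last_date then _counts{first_date,varA,varB,varC}+1000;
  end;
  if secondpass;
  if _counts{first_date,varA,varB,varC}>1001 then last_date=first_date;
run;
```

Keep in mind that the number of array elements is the product of all the dimension sizes, so each added variable multiplies the memory the temporary array needs. With seven variables it is worth checking that total before running.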