Hello everyone,
I was wondering if I could get some assistance and to be honest I don't
really know where to start. I have data like below and what I would like to
determine is the last stage an individual completed. The stages should be completed
in order from start to end and no individual in my data should have an end date
because no one has completed the process.
In some instances, individuals have dates that occur out of order for example id 001, 002, 004 etc.
and I do not want to exclude those individuals but maybe use a different flag variable to identify
those people.
What I would like to do is create a flag for where the person stopped. So for instance, id 007
made it through to review complete but did not make it to health date. Therefore, the
last stage that person completed was review complete. I would flag that person as being stopped at
review complete. The overall goal is to get a count of how many people got stopped in each stage.
Any help would be greatly appreciated.
ID |
Start |
Review |
Review Complete |
Health date |
Health date end |
End |
001 |
10/11/2019 |
. |
. |
10/24/2019 |
10/24/2019 |
. |
002 |
11/14/2019 |
. |
. |
11/15/2019 |
11/15/2019 |
. |
003 |
10/24/2019 |
10/29/2019 |
10/29/2019 |
11/20/2019 |
11/20/2019 |
. |
004 |
01/08/2019 |
. |
. |
02/19/2019 |
. |
. |
005 |
08/16/2019 |
08/16/2019 |
09/27/2019 |
09/20/2019 |
09/20/2019 |
. |
006 |
02/11/2019 |
. |
. |
. |
. |
. |
007 |
02/12/2019 |
02/13/2019 |
02/13/2019 |
. |
. |
. |
You need to use ARRAYs here, and VNAME().
This isn't dealing with 1/2/4 situation but I'll leave that to you or someone else. You can loop through the remainder of the array to check or you can use some basic math to check the number missing and where you are in the loop to determine if you need to exclude that record.
Here's a tutorial on using Arrays in SAS
https://stats.idre.ucla.edu/sas/seminars/sas-arrays/
data want;
set have;
length lastStage $50.;
array _myDates(*) start review review_complete health_date health_date_end end;
ShouldBeLast = dim(_myDates) - nmiss(of _mydates(*))+1;
*if you have 6 dates and 3 are missing, 4 should be the first missing record if in order;
do i=1 to dim(_myDates) while(last_flag=0);
if missing(_mydates(i)) then do;
last_flag=1;
lastStage = vname(_mydates(i));
if i ne ShouldBeLast then lastStage = "Out of Order";
end;
end;
run;
@luvscandy27 wrote:
Hello everyone,
I was wondering if I could get some assistance and to be honest I don't
really know where to start. I have data like below and what I would like to
determine is the last stage an individual completed. The stages should be completed
in order from start to end and no individual in my data should have an end date
because no one has completed the process.
In some instances, individuals have dates that occur out of order for example id 001, 002, 004 etc.
and I do not want to exclude those individuals but maybe use a different flag variable to identify
those people.
What I would like to do is create a flag for where the person stopped. So for instance, id 007
made it through to review complete but did not make it to health date. Therefore, the
last stage that person completed was review complete. I would flag that person as being stopped at
review complete. The overall goal is to get a count of how many people got stopped in each stage.Any help would be greatly appreciated.
ID
Start
Review
Review Complete
Health date
Health date end
End
001
10/11/2019
.
.
10/24/2019
10/24/2019
.
002
11/14/2019
.
.
11/15/2019
11/15/2019
.
003
10/24/2019
10/29/2019
10/29/2019
11/20/2019
11/20/2019
.
004
01/08/2019
.
.
02/19/2019
.
.
005
08/16/2019
08/16/2019
09/27/2019
09/20/2019
09/20/2019
.
006
02/11/2019
.
.
.
.
.
007
02/12/2019
02/13/2019
02/13/2019
.
.
.
You need to use ARRAYs here, and VNAME().
This isn't dealing with 1/2/4 situation but I'll leave that to you or someone else. You can loop through the remainder of the array to check or you can use some basic math to check the number missing and where you are in the loop to determine if you need to exclude that record.
Here's a tutorial on using Arrays in SAS
https://stats.idre.ucla.edu/sas/seminars/sas-arrays/
data want;
set have;
length lastStage $50.;
array _myDates(*) start review review_complete health_date health_date_end end;
ShouldBeLast = dim(_myDates) - nmiss(of _mydates(*))+1;
*if you have 6 dates and 3 are missing, 4 should be the first missing record if in order;
do i=1 to dim(_myDates) while(last_flag=0);
if missing(_mydates(i)) then do;
last_flag=1;
lastStage = vname(_mydates(i));
if i ne ShouldBeLast then lastStage = "Out of Order";
end;
end;
run;
@luvscandy27 wrote:
Hello everyone,
I was wondering if I could get some assistance and to be honest I don't
really know where to start. I have data like below and what I would like to
determine is the last stage an individual completed. The stages should be completed
in order from start to end and no individual in my data should have an end date
because no one has completed the process.
In some instances, individuals have dates that occur out of order for example id 001, 002, 004 etc.
and I do not want to exclude those individuals but maybe use a different flag variable to identify
those people.
What I would like to do is create a flag for where the person stopped. So for instance, id 007
made it through to review complete but did not make it to health date. Therefore, the
last stage that person completed was review complete. I would flag that person as being stopped at
review complete. The overall goal is to get a count of how many people got stopped in each stage.Any help would be greatly appreciated.
ID
Start
Review
Review Complete
Health date
Health date end
End
001
10/11/2019
.
.
10/24/2019
10/24/2019
.
002
11/14/2019
.
.
11/15/2019
11/15/2019
.
003
10/24/2019
10/29/2019
10/29/2019
11/20/2019
11/20/2019
.
004
01/08/2019
.
.
02/19/2019
.
.
005
08/16/2019
08/16/2019
09/27/2019
09/20/2019
09/20/2019
.
006
02/11/2019
.
.
.
.
.
007
02/12/2019
02/13/2019
02/13/2019
.
.
.
I ran the code above but all of the columns are empty accept for the should be last column? I'm not sure what I'm doing incorrectly.
@luvscandy27 wrote:
Hello everyone,
I was wondering if I could get some assistance and to be honest I don't
really know where to start. I have data like below and what I would like to
determine is the last stage an individual completed. The stages should be completed
in order from start to end and no individual in my data should have an end date
because no one has completed the process.
In some instances, individuals have dates that occur out of order for example id 001, 002, 004 etc.
and I do not want to exclude those individuals but maybe use a different flag variable to identify
those people.
I don't see any values "out of order". The right most values are all later than or the same as one to right when present.
So are you asking to identify the missing values?
It might help to provide what you would expect to see as a result.
You also say that "no individual in my data should have an end date because no one has completed the process". Are you expecting a check to see if that is true? What to do if there happens to be a value in the End variable?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.