BookmarkSubscribeRSS Feed
nxmogil
Obsidian | Level 7

Hi all, my data looks like this 

id     phase    date

01    Active   20jan2022

02   closed    21Feb2022

03   Active    20Mar2022

 

I need to create a error flag for  the records only after the closed phase which means that even the entries after the date 21feb2022. 

Please help

12 REPLIES 12
PaigeMiller
Diamond | Level 26

Is the variable named DATE numeric or character? What role does variable ID play in this problem? Could there be more than one "closed" record? What should the code do then?

 

Can you provide a larger (say 20 rows) example that is realistic? I am reluctant to work on this small example, as it seems as if you have greatly oversimplified any real-world problem. 

 

Please provide data as working SAS data step code (examples and instructions)

--
Paige Miller
nxmogil
Obsidian | Level 7
Hi Thank you so much for the quick reply.
Date is a character and Id is the identification number
data test;
input ID$ phase $ date $;
datalines;
001 Active 08/31/2023
001 Closed 09/29/2023
001 Active 10/23/2023
001 Active 11/25/2023
;
run;

I need the output with the last two rows as
they have been recorded after the phase is closed
Here I have given the dates in order, but in raw dataset its not sorted
PaigeMiller
Diamond | Level 26

I remain reluctant to write code for this greatly simplified example, as I think whatever code I write will need to be changed for the real-world problem.


What should be done if there is more than one ID? What should be done if within an ID, there is more than one 'Closed' record? 

 

Can you provide a larger (say 20 rows) example that is more realistic?

 

In your real data (not this simplified example data), are the dates numeric or character, according to PROC CONTENTS? Dates should be numeric, but we can deal with that after we address the above concerns.

--
Paige Miller
andreas_lds
Jade | Level 19

Step 1: Fix the data type of the variable date, changing it to a numeric variable with a date format attached.

Untested:

data work.date_fixed;
  set work.test(rename=(date = str_date));
  date = input(str_date, mmddyy10.);
  format date date9.;
  drop str_date;
run;

Step 2: Use a retained variable to store the date variable when phase = "closed", a subsetting-if can be used to keep only obs with date > retained_date. If you have multiple ids in the data, i am sure your need retained_date set to missing if the id changes.

nxmogil
Obsidian | Level 7

I have tried like this 

 

If phase = "Closed" and input (date,mmddyy10.) < max(input (date,mmddyy10.) then do;

error = "Account_closed";

output;

end;

 

but it dint thrown me any results ..

nxmogil
Obsidian | Level 7
I cant create a datastep , i was asked to create a validation rule need that if statement which compares for any account if the phase is closed and have records after that
PaigeMiller
Diamond | Level 26

@nxmogil wrote:
I cant create a datastep , i was asked to create a validation rule need that if statement which compares for any account if the phase is closed and have records after that

But you did create a data step. I am asking for a more realistic data set. It doesn't have to be real data; it can be fake data but it does have to be more representative of the real world problem.

 

Anyway, I have asked for you to provide a bunch of information; you have addressed some of my points but not others, and so I await the other information I asked for.

--
Paige Miller
nxmogil
Obsidian | Level 7
data test;
input ID$ phase $ date $;
datalines;
001 Active 08/31/2023
001 Closed 09/29/2023
001 Active 10/23/2023
001 Active 11/25/2023
002 Active 07/22/2023
002 Active 08/13/2023
003 Active 02/12/2022
003 Active 03/14/2022
003 Active 04/15/2022
003 Active 01/10/2022
004 Active 03/15/2022
004 Active 01/16/2022
004 Closed 04/23/2022
004 Active 06/26/2022
004 Active 07/23/2022
005 Active 12/01/2022
005 Active 01/10/2023
005 Active 04/23/2023
005 Closed 02/12/2023
005 Active 03/12/2023
;
run

This is my dataset and need to create a validation rule to throw all improper records after the account is closed for each account



PaigeMiller
Diamond | Level 26

So you still have not addressed the issue of whether or not the dates in your REAL data set are numeric or character. I will assume the dates in your real data set are numeric (but really, I shouldn't have to assume, but I asked for this information, and you didn't provide it)

 

data test;
input ID$ phase $ date mmddyy10.;
format date date9.;
datalines;
001 Active 08/31/2023
001 Closed 09/29/2023
001 Active 10/23/2023
001 Active 11/25/2023
002 Active 07/22/2023
002 Active 08/13/2023
003 Active 02/12/2022
003 Active 03/14/2022
003 Active 04/15/2022
003 Active 01/10/2022
004 Active 03/15/2022
004 Active 01/16/2022
004 Closed 04/23/2022
004 Active 06/26/2022
004 Active 07/23/2022
005 Active 12/01/2022
005 Active 01/10/2023
005 Active 04/23/2023
005 Closed 02/12/2023
005 Active 03/12/2023
;

data want;
    set test;
    by id;
    retain flag closed_date;
    if first.id then flag=0;
    if phase='Closed' then do;
        flag=1;
        closed_date=date;
    end;
    if flag=1 and date>closed_date then after_closed=1;
    format closed_date date9.;
run;
--
Paige Miller
nxmogil
Obsidian | Level 7
Hi sorry for the late reply, but i did mentioned in previous messages that dat e is charecter, I will try this
Kurt_Bremser
Super User

@nxmogil wrote:
Hi sorry for the late reply, but i did mentioned in previous messages that dat e is charecter, I will try this

So your first step has to be to convert these strings to numeric SAS date values; still better, fix the data import process so that the data is correctly read into the SAS environment in the first place.

Patrick
Opal | Level 21

@nxmogil wrote:
I cant create a datastep , i was asked to create a validation rule need that if statement which compares for any account if the phase is closed and have records after that

Do you need to implement this logic in the context of some SAS product like the DI Studio data validation transformation or a rule engine used by some SAS solution? If so please tell including the version of the product/solution and the version of SAS.

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 2304 views
  • 0 likes
  • 5 in conversation