BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
NewbieTom
Calcite | Level 5

Hi all,

 

Below is the original dataset that I want to split it into 3 different datasets as pass, open, and exception. Could someone kindly show me how to go about doing this? Any help is greatly appreciated.

 

1. Pass will contain all the ones with status = success and the difference between finished date and started date is equal or less than 1 day for the same client.

2. Open will contain all the ones with status=failed and no success for the same client.

3. Exception will contain all the ones with mix of failed and success status, but the finished date of last success is more than 1 day from the started date of first failure.

See the example output below.

 

Original dataset:

 

ClientStatusStartedFinished
adfs02.ccnb.comfailed2/27/20222/27/2022
adfs02.ccnb.comsuccess2/28/20222/28/2022
devcsssql01.ccnb.comfailed2/27/20222/27/2022
devcsssql01.ccnb.comsuccess2/28/20222/28/2022
devdbasql01.ccnb.comfailed2/14/20222/14/2022
devdbasql01.ccnb.comfailed2/15/20222/15/2022
devdbasql01.ccnb.comfailed2/15/20222/15/2022
devdbasql01.ccnb.comfailed2/15/20222/15/2022
devdlakesql01.ccnb.comfailed2/8/20222/9/2022
devdlakesql01.ccnb.comsuccess2/9/20222/9/2022
devdlakesql01.ccnb.comfailed2/9/20222/9/2022
devdlakesql01.ccnb.comfailed2/10/20222/10/2022
devdlakesql01.ccnb.comsuccess2/10/20222/10/2022
devdlakesql01.ccnb.comfailed2/12/20222/13/2022
devdlakesql01.ccnb.comfailed2/13/20222/13/2022
devdlakesql01.ccnb.comfailed2/13/20222/14/2022
devdlakesql01.ccnb.comfailed2/14/20222/14/2022
devdlakesql01.ccnb.comsuccess2/14/20222/15/2022
devdlakesql01.ccnb.comfailed2/28/20222/28/2022
devspdb01.ccnb.comfailed2/28/20223/1/2022
prddlakesql01.ccnb.comfailed2/28/20222/28/2022

 

Pass:

ClientStatusStartedFinished
adfs02.ccnb.comfailed2/27/20222/27/2022
adfs02.ccnb.comsuccess2/28/20222/28/2022
devcsssql01.ccnb.comfailed2/27/20222/27/2022
devcsssql01.ccnb.comsuccess2/28/20222/28/2022
devdlakesql01.ccnb.comfailed2/8/20222/9/2022
devdlakesql01.ccnb.comsuccess2/9/20222/9/2022
devdlakesql01.ccnb.comfailed2/9/20222/9/2022
devdlakesql01.ccnb.comfailed2/10/20222/10/2022
devdlakesql01.ccnb.comsuccess2/10/20222/10/2022

 

Open:

 

ClientStatusStartedFinished
devdbasql01.ccnb.comfailed2/14/20222/14/2022
devdbasql01.ccnb.comfailed2/15/20222/15/2022
devdbasql01.ccnb.comfailed2/15/20222/15/2022
devdbasql01.ccnb.comfailed2/15/20222/15/2022
devdlakesql01.ccnb.comfailed2/28/20222/28/2022
devspdb01.ccnb.comfailed2/28/20223/1/2022
prddlakesql01.ccnb.comfailed2/28/20222/28/2022

 

Exception:

 

ClientStatusStartedFinished
devdlakesql01.ccnb.comfailed2/12/20222/13/2022
devdlakesql01.ccnb.comfailed2/13/20222/13/2022
devdlakesql01.ccnb.comfailed2/13/20222/14/2022
devdlakesql01.ccnb.comfailed2/14/20222/14/2022
devdlakesql01.ccnb.comsuccess2/14/20222/15/2022
1 ACCEPTED SOLUTION

Accepted Solutions
mtnbikerjoshua
Obsidian | Level 7

It seems like you are looking at the rows in groups of all the failures for a certain client up until one success. I was able to accomplish your desired output by adding a group variable and then using the having clause in proc sql. I also had to add a sort order variable in order to keep the rows in the original order. Let me know if this is what you are looking for, or if I misunderstand.

 

 

data original;
  length client $50 status $10;
  input Client $ Status $ Started Finished;
  informat started finished mmddyy10.;
  format started finished mmddyy10.;
  datalines;
adfs02.ccnb.com failed 2/27/2022 2/27/2022
adfs02.ccnb.com success 2/28/2022 2/28/2022
devcsssql01.ccnb.com failed 2/27/2022 2/27/2022
devcsssql01.ccnb.com success 2/28/2022 2/28/2022
devdbasql01.ccnb.com failed 2/14/2022 2/14/2022
devdbasql01.ccnb.com failed 2/15/2022 2/15/2022
devdbasql01.ccnb.com failed 2/15/2022 2/15/2022
devdbasql01.ccnb.com failed 2/15/2022 2/15/2022
devdlakesql01.ccnb.com failed 2/8/2022 2/9/2022
devdlakesql01.ccnb.com success 2/9/2022 2/9/2022
devdlakesql01.ccnb.com failed 2/9/2022 2/9/2022
devdlakesql01.ccnb.com failed 2/10/2022 2/10/2022
devdlakesql01.ccnb.com success 2/10/2022 2/10/2022
devdlakesql01.ccnb.com failed 2/12/2022 2/13/2022
devdlakesql01.ccnb.com failed 2/13/2022 2/13/2022
devdlakesql01.ccnb.com failed 2/13/2022 2/14/2022
devdlakesql01.ccnb.com failed 2/14/2022 2/14/2022
devdlakesql01.ccnb.com success 2/14/2022 2/15/2022
devdlakesql01.ccnb.com failed 2/28/2022 2/28/2022
devspdb01.ccnb.com failed 2/28/2022 3/1/2022
prddlakesql01.ccnb.com failed 2/28/2022 2/28/2022
;
run;

data grouped;
  set original;
  by client;
  sortorder = _n_;
  if first.client then group = 1;
  output;
  if status = "success" then group + 1;
run;

proc sql;
  create table pass as
    select client, status, started, finished
    from grouped
    group by client, group
    having sum(status = "success") > 0 and max(finished) - min(started) <= 1
    order by sortorder
  ;
  create table open as
    select client, status, started, finished
    from grouped
    group by client, group
    having sum(status = "success") = 0
    order by sortorder
  ;
  create table exception as
    select client, status, started, finished
    from grouped
    group by client, group
    having sum(status = "success") > 0 and max(finished) - min(started) > 1
    order by sortorder
  ;
run;

 

 

View solution in original post

5 REPLIES 5
mtnbikerjoshua
Obsidian | Level 7

It seems like you are looking at the rows in groups of all the failures for a certain client up until one success. I was able to accomplish your desired output by adding a group variable and then using the having clause in proc sql. I also had to add a sort order variable in order to keep the rows in the original order. Let me know if this is what you are looking for, or if I misunderstand.

 

 

data original;
  length client $50 status $10;
  input Client $ Status $ Started Finished;
  informat started finished mmddyy10.;
  format started finished mmddyy10.;
  datalines;
adfs02.ccnb.com failed 2/27/2022 2/27/2022
adfs02.ccnb.com success 2/28/2022 2/28/2022
devcsssql01.ccnb.com failed 2/27/2022 2/27/2022
devcsssql01.ccnb.com success 2/28/2022 2/28/2022
devdbasql01.ccnb.com failed 2/14/2022 2/14/2022
devdbasql01.ccnb.com failed 2/15/2022 2/15/2022
devdbasql01.ccnb.com failed 2/15/2022 2/15/2022
devdbasql01.ccnb.com failed 2/15/2022 2/15/2022
devdlakesql01.ccnb.com failed 2/8/2022 2/9/2022
devdlakesql01.ccnb.com success 2/9/2022 2/9/2022
devdlakesql01.ccnb.com failed 2/9/2022 2/9/2022
devdlakesql01.ccnb.com failed 2/10/2022 2/10/2022
devdlakesql01.ccnb.com success 2/10/2022 2/10/2022
devdlakesql01.ccnb.com failed 2/12/2022 2/13/2022
devdlakesql01.ccnb.com failed 2/13/2022 2/13/2022
devdlakesql01.ccnb.com failed 2/13/2022 2/14/2022
devdlakesql01.ccnb.com failed 2/14/2022 2/14/2022
devdlakesql01.ccnb.com success 2/14/2022 2/15/2022
devdlakesql01.ccnb.com failed 2/28/2022 2/28/2022
devspdb01.ccnb.com failed 2/28/2022 3/1/2022
prddlakesql01.ccnb.com failed 2/28/2022 2/28/2022
;
run;

data grouped;
  set original;
  by client;
  sortorder = _n_;
  if first.client then group = 1;
  output;
  if status = "success" then group + 1;
run;

proc sql;
  create table pass as
    select client, status, started, finished
    from grouped
    group by client, group
    having sum(status = "success") > 0 and max(finished) - min(started) <= 1
    order by sortorder
  ;
  create table open as
    select client, status, started, finished
    from grouped
    group by client, group
    having sum(status = "success") = 0
    order by sortorder
  ;
  create table exception as
    select client, status, started, finished
    from grouped
    group by client, group
    having sum(status = "success") > 0 and max(finished) - min(started) > 1
    order by sortorder
  ;
run;

 

 

NewbieTom
Calcite | Level 5

Hi mtnbikerjoshua,

 

Thank you for the response. I think we are almost there. So, what I want to accomplish, and the challenge is to analyze the result of each row as well as the results of the group.

 

For "Pass" table, it would be

1) If the difference between the finished date of the current "success" and the finished date of the last "success" was less or equal to 1 day. Then 

all the preceding failures as well as the current success would be dumped into "Pass" table.

Would it be something like this?

a) If current client <> previous client and current statis = success and current Finished - current Started <= 1

b) If current client = previous client and current statis = success and previous status = success and current Finished - current Started <= 1

 

For "Open" table, it would be

1)  If there was no "success" in the group. I think we got this.

 

For "Exception" table, it would be

1)  If the difference between the finished date of the current "success" and the finished date of the last "success" was greater than 1 day. Then all the preceding failures (could be more than one) as well as the current success would be dumped into "Exception" table.

 

mtnbikerjoshua
Obsidian | Level 7

Hi @NewbieTom,

 

I'm not sure I totally understand you. You say that you want to analyze each row as well as the whole group. In that case, you could add the group condition to the data as a new column and then filter by it later, something like this:

 

 

proc sql;
  create table pass1 as
    select client, status, started, finished, 
      sum(status = "success") > 0 and max(finished) - min(started) <= 1 as pass 
    from grouped
    group by client, group
    order by sortorder
  ;
quit;

data pass;
  set pass1;
  where pass /* and other criteria */;
  drop pass;
run;

 

 

I'm not sure how that connects to the rest of your post.

 

For "Pass" table, it would be

1) If the difference between the finished date of the current "success" and the finished date of the last "success" was less or equal to 1 day. Then 

all the preceding failures as well as the current success would be dumped into "Pass" table. Do you mean to say that instead of comparing the finished date of the current success and the start date of the first of the preceding series of consecutive failures, you want to compare the finished date of the current success to the finished date of the preceding success? Is that true even if the previous success was for a different client?

Would it be something like this?

a) If current client <> previous client and current statis = success and current Finished - current Started <= 1 Is this referring to the case where the first row for a certain client is a success? If so, that would be handled by the code I shared. (There would only be one record in the group, so max(finished) would be the finished date of that one success and min(started) would be the started date for the same record.

b) If current client = previous client and current statis = success and previous status = success and current Finished - current Started <= 1 Is this referring to the case of two successes in a row for the same client? If so, the same is true that I said for a).

 

For "Open" table, it would be

1)  If there was no "success" in the group. I think we got this. Great!

 

For "Exception" table, it would be

1)  If the difference between the finished date of the current "success" and the finished date of the last "success" was greater than 1 day. Then all the preceding failures (could be more than one) as well as the current success would be dumped into "Exception" table. This is the same as what I did in my program, except use finished date of previous success instead of start date of first consecutive failure. Correct?

s_lassen
Meteorite | Level 14

I think it can be done in a single data step, like this:

data pass open exception;
  do until(last.client or status='success');
    set original;
    by client;
	if first.client then
	  date0=started;
	end;
  if status='failed' then
	type='O';
  else if finished-date0>1 then 
	type='E';
  else
	type='P';
  date0=finished;
  retain date0;
  do until(last.client or status='success');
    set original;
    by client;
    select(type);
	  when('O') output open;
	  when('P') output pass;
	  otherwise output exception;
	  end;
	end;
  drop date0 type;
run;
john_mccall
SAS Employee

Hi.  A possible additional option is as follows.  I put the original data in sasuser.original.

 

Data Success Failed Exception;
set sasuser.Original;
If upcase(status)="SUCCESS" and started-finished=0 then output Success;
Else If upcase(status)="FAILED" and started-finished=0 then output Failed;
Else output Exception;
run;

 
proc print data=sasuser.original;
title "Original Table";
run;
proc print data=Work.Success;
title "Success Table";
run;
proc print data=Work.Failed;
title "Failled Table";
run;
proc print data=Work.Exception;
title "Exception Table";
run;
 

Original Table

Obs Email Status Started Finished
1 adfs02.ccnb.com failed 02/27/2022 02/27/2022
2 adfs02.ccnb.com success 02/28/2022 02/28/2022
3 devcsssql01.ccnb.com failed 02/27/2022 02/27/2022
4 devcsssql01.ccnb.com success 02/28/2022 02/28/2022
5 devdbasql01.ccnb.com failed 02/14/2022 02/14/2022
6 devdbasql01.ccnb.com failed 02/15/2022 02/15/2022
7 devdbasql01.ccnb.com failed 02/15/2022 02/15/2022
8 devdbasql01.ccnb.com failed 02/15/2022 02/15/2022
9 devdlakesql01.ccnb.com failed 02/08/2022 02/09/2022
10 devdlakesql01.ccnb.com success 02/09/2022 02/09/2022
11 devdlakesql01.ccnb.com failed 02/09/2022 02/09/2022
12 devdlakesql01.ccnb.com failed 02/10/2022 02/10/2022
13 devdlakesql01.ccnb.com success 02/10/2022 02/10/2022
14 devdlakesql01.ccnb.com failed 02/12/2022 02/13/2022
15 devdlakesql01.ccnb.com failed 02/13/2022 02/13/2022
16 devdlakesql01.ccnb.com failed 02/13/2022 02/14/2022
17 devdlakesql01.ccnb.com failed 02/14/2022 02/14/2022
18 devdlakesql01.ccnb.com success 02/14/2022 02/15/2022
19 devdlakesql01.ccnb.com failed 02/28/2022 02/28/2022
20 devspdb01.ccnb.com failed 02/28/2022 03/01/2022
21 prddlakesql01.ccnb.com failed 02/28/2022 02/28/2022

 


Success Table

Obs Email Status Started Finished
1 adfs02.ccnb.com success 02/28/2022 02/28/2022
2 devcsssql01.ccnb.com success 02/28/2022 02/28/2022
3 devdlakesql01.ccnb.com success 02/09/2022 02/09/2022
4 devdlakesql01.ccnb.com success 02/10/2022 02/10/2022

 


Failled Table

Obs Email Status Started Finished
1 adfs02.ccnb.com failed 02/27/2022 02/27/2022
2 devcsssql01.ccnb.com failed 02/27/2022 02/27/2022
3 devdbasql01.ccnb.com failed 02/14/2022 02/14/2022
4 devdbasql01.ccnb.com failed 02/15/2022 02/15/2022
5 devdbasql01.ccnb.com failed 02/15/2022 02/15/2022
6 devdbasql01.ccnb.com failed 02/15/2022 02/15/2022
7 devdlakesql01.ccnb.com failed 02/09/2022 02/09/2022
8 devdlakesql01.ccnb.com failed 02/10/2022 02/10/2022
9 devdlakesql01.ccnb.com failed 02/13/2022 02/13/2022
10 devdlakesql01.ccnb.com failed 02/14/2022 02/14/2022
11 devdlakesql01.ccnb.com failed 02/28/2022 02/28/2022
12 prddlakesql01.ccnb.com failed 02/28/2022 02/28/2022

 


Exception Table

Obs Email Status Started Finished
1 devdlakesql01.ccnb.com failed 02/08/2022 02/09/2022
2 devdlakesql01.ccnb.com failed 02/12/2022 02/13/2022
3 devdlakesql01.ccnb.com failed 02/13/2022 02/14/2022
4 devdlakesql01.ccnb.com success 02/14/2022 02/15/2022
5 devspdb01.ccnb.com failed 02/28/2022 03/01/2022

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1087 views
  • 1 like
  • 4 in conversation