DATA Step, Macro, Functions and more

Subgroup count required

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 79
Accepted Solution

Subgroup count required

Hi Expoets,

 

Here, I need to get the distinct Case count for the cases that has ateast one error levels(L1, L2 or L3) excluding NA and null values.

L1, L2 or L3 are the errors.
Further I want all the sub-grouped values for each CaseType and Serousness.

Tried By Group processing but didn't solve the purpose.

Getting this is quit critical for me so please suggest me a way so that I could achieve this.

-Thanks


Accepted Solutions
Solution
3 weeks ago
PROC Star
Posts: 1,847

Re: Subgroup count required

[ Edited ]
Posted in reply to Rahul_SAS

@Rahul_SAS Alright, sorry that i didn;t see your updates. It would help to get your requirement in full in go at the very top like @MarkWik suggested. It's kinda difficult to read through the thread back and forth. Thus the delays. 

Please see if this helps

 

data have;
infile cards truncover;
input (EmpName	Customers	ProjectCode	CaseID	CaseType 	Serousness	ErrorLevels) (:$30.);
cards;
Emp3	Cust1	1006680	1002	Case4	Serious	L1
Emp3	Cust1	1006680	1002	Case4	Serious	L1
Emp3	Cust1	1006680	1002	Case4	Serious	L2
Emp3	Cust1	1006680	1002	Case4	Serious	L2
Emp3	Cust1	1006680	1002	Case4	Serious	L3
Emp6	Cust1	1006680	1003	Case3	Serious	NULL
Emp1	Cust2	1004931	1005	Case6	Serious	NULL
Emp6	Cust1	1006680	1007	Case4	Serious	L2
Emp6	Cust1	1006680	1007	Case4	Serious	L2
Emp6	Cust1	1006680	1007	Case4	Serious	L2
Emp6	Cust1	1006680	1007	Case4	Serious	L2
Emp6	Cust2	1004931	1008	Case4	Non-Seri	L2
Emp5	Cust2	1004931	1008	Case4	Non-Seri	L2
Emp5	Cust2	1004931	1008	Case4	Non-Seri	L2
Emp6	Cust2	1004931	1008	Case4	Non-Seri	L2
Emp6	Cust2	1004931	1008	Case4	Non-Seri	L2
Emp5	Cust2	1004931	1008	Case4	Non-Seri	L2
Emp6	Cust2	1004931	1008	Case4	Non-Seri	NA
Emp5	Cust2	1004931	1008	Case4	Non-Seri	NA
Emp6	Cust1	1006680	1009	Case4	Non-Seri	NULL
Emp6	Cust1	1006680	1012	Case4	Non-Seri	NULL
Emp6	Cust1	1006680	1013	Case1	Non-Seri	NULL
Emp8	Cust1	1006680	1014	Case2	Serious	NULL
Emp7	Cust1	1006680	1015	Case2	Serious	L3
Emp6	Cust1	1006680	1017	Case4	NA	NULL
Emp4	Cust1	1006680	1017	Case4	NA	NULL
Emp5	Cust1	1006680	1018	Case4	Non-Seri	L2
Emp5	Cust1	1006680	1018	Case4	Non-Seri	L2
Emp5	Cust1	1006680	1018	Case4	Non-Seri	L3
Emp5	Cust1	1006680	1018	Case4	Non-Seri	NA
Emp5	Cust1	1006680	1019	Case1	Non-Seri	L1
Emp5	Cust1	1006680	1019	Case1	Non-Seri	L2
Emp5	Cust1	1006680	1019	Case1	Non-Seri	L3
Emp6	Cust1	1006680	1020	Case1	Non-Seri	L1
Emp2	Cust1	1006680	1020	Case1	Non-Seri	L1
Emp2	Cust1	1006680	1020	Case1	Non-Seri	L2
Emp6	Cust1	1006680	1020	Case1	Non-Seri	L2
Emp2	Cust1	1006680	1020	Case1	Non-Seri	L3
Emp6	Cust1	1006680	1020	Case1	Non-Seri	L3
Emp6	Cust1	1006680	1020	Case1	Non-Seri	L3
Emp2	Cust1	1006680	1020	Case1	Non-Seri	L3
Emp6	Cust1	1006680	1021	Case4	NA	NULL
Emp5	Cust1	1006680	1022	Case1	Non-Seri	NULL
Emp4	Cust1	1006680	1023	Case4	Non-Seri	NULL
Emp4	Cust1	1006680	1024	Case5	Non-Seri	L3
Emp2	Cust1	1006680	1025	Case5	Non-Seri	NULL
Emp5	Cust1	1006680	1026	Case1	Non-Seri	NULL
Emp4	Cust1	1006680	1027	Case4	Death/Li	NULL
Emp2	Cust1	1006680	1028	Case4	Serious	L3
Emp4	Cust1	1006680	1029	Case2	Serious	L3
Emp2	Cust1	1006680	1030	Case5	Serious	NA
Emp4	Cust1	1006680	1031	Case3	Non-Seri	NULL
Emp4	Cust1	1006680	1032	Case1	Non-Seri	L3
Emp4	Cust1	1006680	1033	Case3	Serious	NULL
Emp2	Cust1	1006680	1034	Case1	NA	NULL
Emp4	Cust1	1006680	1035	Case5	Serious	NULL
;

data tmp;
set have;
length grp $50;
if errorlevels in ('L1','L2','L3') then grp='ERROR';
else grp=errorlevels;
run;
proc sort data=tmp;
by caseid grp;
run;

data want;
n1=0;
do n=1 by 1  until(last.caseid);
set tmp;
by caseid grp;
if grp='ERROR' then n1+1;
end;
do until(last.caseid);
set tmp;
by caseid grp;
if grp='ERROR' then do;CaseCount=1/n1; case_with_errors=1/n1;end;
else case_with_errors=0;
if grp='NA' then CaseCount =0;
else if grp='NULL' then CaseCount=1;
CaseCount_sum+CaseCount;
case_with_errors_sum+case_with_errors;
output;
end;
drop grp n: ;
run;

 

Note: The above will require some tweak depending on the logic to consider variables  Seriousness<-> ErrorLevels to deal with NA and NULL besides ERROR values L series as I didn't dig deep enough to comprehend your dataset. But do let me know if you want me to further change

View solution in original post


All Replies
Frequent Contributor
Posts: 105

Re: Subgroup count required

[ Edited ]
Posted in reply to Rahul_SAS

@Rahul_SAS You can make it easy for responders  by providing your sample data in the form of a datastep and your expected OUTPUT sample for the input sample. This would help the community test their code to see if they match your expected output before they post the code. Thanks!

Frequent Contributor
Posts: 79

Re: Subgroup count required

Attached a new sample data.

 

I want two variables to be created-

1. Distinct CaseID Count (total should be 29)

2. Case Processed With atleast one error(L1,L2 or L3) and count should be 11

 

I don't know how will I get this but i need these numbers to use in SAS VA.

Occasional Contributor
Posts: 8

Re: Subgroup count required

Posted in reply to Rahul_SAS
proc sql;
create table work.want as
select distinct (select count(distinct CaseID) from work.data) as d_Case_ID
, (select count(distinct CaseID) from work.data where ErrorLevels in('L1','L2','L3')) as atl1_error_Case_ID
from work.data;
quit;

proc print data=want; run;

The new data set, work.want has two columns with the desired counts. 

Frequent Contributor
Posts: 79

Re: Subgroup count required

Posted in reply to JustABitOutside

thank you sir for replying.

 

the numbers are absolutely fine but I need these numbers to be distributed and populated at row level so that i will be able to use this output for reporting purpose in SAS VA

 

Required Final Output is Attached

PROC Star
Posts: 1,847

Re: Subgroup count required

Posted in reply to Rahul_SAS
data have;
infile cards truncover;
input (EmpName	Customers	ProjectCode	CaseID	CaseType 	Serousness	ErrorLevels) (:$30.);
cards;
Emp3	Costomer1	1006680	1000000002	Case4	Serious	L3
Emp6	Customer2	1004931	1000000008	Case4	Non-Serious	L2
Emp5	Costomer1	1006680	1000000019	Case1	Non-Serious	L3
Emp4	Costomer1	1006680	1000000027	Case4	Death/LifeThreatening	NULL
Emp2	Costomer1	1006680	1000000034	Case1	NA	NULL
Emp3	Costomer1	1006680	1000000002	Case4	Serious	L2
Emp5	Costomer1	1006680	1000000018	Case4	Non-Serious	L3
Emp3	Costomer1	1006680	1000000002	Case4	Serious	L1
Emp4	Costomer1	1006680	1000000023	Case4	Non-Serious	NULL
Emp5	Costomer1	1006680	1000000022	Case1	Non-Serious	NULL
Emp5	Costomer1	1006680	1000000019	Case1	Non-Serious	L2
Emp4	Costomer1	1006680	1000000029	Case2	Serious	L3
Emp5	Costomer1	1006680	1000000018	Case4	Non-Serious	L2
Emp6	Costomer1	1006680	1000000007	Case4	Serious	L2
Emp6	Costomer1	1006680	1000000007	Case4	Serious	L2
Emp5	Costomer1	1006680	1000000019	Case1	Non-Serious	L1
Emp4	Costomer1	1006680	1000000032	Case1	Non-Serious	L3
Emp2	Costomer1	1006680	1000000020	Case1	Non-Serious	L3
Emp5	Customer2	1004931	1000000008	Case4	Non-Serious	L2
Emp8	Costomer1	1006680	1000000014	Case2	Serious	NULL
Emp6	Costomer1	1006680	1000000017	Case4	NA	NULL
Emp3	Costomer1	1006680	1000000002	Case4	Serious	L1
Emp6	Costomer1	1006680	1000000003	Case3	Serious	NULL
Emp2	Costomer1	1006680	1000000028	Case4	Serious	L3
Emp4	Costomer1	1006680	1000000035	Case5	Serious	NULL
Emp6	Customer2	1004931	1000000008	Case4	Non-Serious	NA
Emp4	Costomer1	1006680	1000000017	Case4	NA	NULL
Emp6	Costomer1	1006680	1000000020	Case1	Non-Serious	L3
Emp4	Costomer1	1006680	1000000033	Case3	Serious	NULL
Emp5	Customer2	1004931	1000000008	Case4	Non-Serious	L2
Emp2	Costomer1	1006680	1000000030	Case5	Serious	NA
Emp5	Costomer1	1006680	1000000018	Case4	Non-Serious	L2
Emp7	Costomer1	1006680	1000000015	Case2	Serious	L3
Emp1	Customer2	1004931	1000000005	Case6	Serious	NULL
Emp6	Costomer1	1006680	1000000012	Case4	Non-Serious	NULL
Emp2	Costomer1	1006680	1000000025	Case5	Non-Serious	NULL
Emp2	Costomer1	1006680	1000000020	Case1	Non-Serious	L2
Emp6	Costomer1	1006680	1000000007	Case4	Serious	L2
Emp6	Costomer1	1006680	1000000020	Case1	Non-Serious	L1
Emp4	Costomer1	1006680	1000000024	Case5	Non-Serious	L3
Emp4	Costomer1	1006680	1000000031	Case3	Non-Serious	NULL
Emp5	Costomer1	1006680	1000000026	Case1	Non-Serious	NULL
Emp6	Customer2	1004931	1000000008	Case4	Non-Serious	L2
Emp6	Customer2	1004931	1000000008	Case4	Non-Serious	L2
Emp6	Costomer1	1006680	1000000013	Case1	Non-Serious	NULL
Emp6	Costomer1	1006680	1000000021	Case4	NA	NULL
Emp6	Costomer1	1006680	1000000020	Case1	Non-Serious	L3
Emp2	Costomer1	1006680	1000000020	Case1	Non-Serious	L3
Emp5	Customer2	1004931	1000000008	Case4	Non-Serious	L2
Emp6	Costomer1	1006680	1000000007	Case4	Serious	L2
Emp6	Costomer1	1006680	1000000020	Case1	Non-Serious	L2
Emp6	Costomer1	1006680	1000000009	Case4	Non-Serious	NULL
Emp5	Customer2	1004931	1000000008	Case4	Non-Serious	NA
Emp3	Costomer1	1006680	1000000002	Case4	Serious	L2
Emp5	Costomer1	1006680	1000000018	Case4	Non-Serious	NA
Emp2	Costomer1	1006680	1000000020	Case1	Non-Serious	L1
;
proc sql;
create table want as
select count(distinct CaseID) as d_Case_ID , sum(e) as atl1_error_Case_ID
from (select distinct CaseID , ErrorLevels in ('L1','L2','L3') as e from have);
quit;
Frequent Contributor
Posts: 79

Re: Subgroup count required

Posted in reply to novinosrin

no, please see attached Final Data.

Solution
3 weeks ago
PROC Star
Posts: 1,847

Re: Subgroup count required

[ Edited ]
Posted in reply to Rahul_SAS

@Rahul_SAS Alright, sorry that i didn;t see your updates. It would help to get your requirement in full in go at the very top like @MarkWik suggested. It's kinda difficult to read through the thread back and forth. Thus the delays. 

Please see if this helps

 

data have;
infile cards truncover;
input (EmpName	Customers	ProjectCode	CaseID	CaseType 	Serousness	ErrorLevels) (:$30.);
cards;
Emp3	Cust1	1006680	1002	Case4	Serious	L1
Emp3	Cust1	1006680	1002	Case4	Serious	L1
Emp3	Cust1	1006680	1002	Case4	Serious	L2
Emp3	Cust1	1006680	1002	Case4	Serious	L2
Emp3	Cust1	1006680	1002	Case4	Serious	L3
Emp6	Cust1	1006680	1003	Case3	Serious	NULL
Emp1	Cust2	1004931	1005	Case6	Serious	NULL
Emp6	Cust1	1006680	1007	Case4	Serious	L2
Emp6	Cust1	1006680	1007	Case4	Serious	L2
Emp6	Cust1	1006680	1007	Case4	Serious	L2
Emp6	Cust1	1006680	1007	Case4	Serious	L2
Emp6	Cust2	1004931	1008	Case4	Non-Seri	L2
Emp5	Cust2	1004931	1008	Case4	Non-Seri	L2
Emp5	Cust2	1004931	1008	Case4	Non-Seri	L2
Emp6	Cust2	1004931	1008	Case4	Non-Seri	L2
Emp6	Cust2	1004931	1008	Case4	Non-Seri	L2
Emp5	Cust2	1004931	1008	Case4	Non-Seri	L2
Emp6	Cust2	1004931	1008	Case4	Non-Seri	NA
Emp5	Cust2	1004931	1008	Case4	Non-Seri	NA
Emp6	Cust1	1006680	1009	Case4	Non-Seri	NULL
Emp6	Cust1	1006680	1012	Case4	Non-Seri	NULL
Emp6	Cust1	1006680	1013	Case1	Non-Seri	NULL
Emp8	Cust1	1006680	1014	Case2	Serious	NULL
Emp7	Cust1	1006680	1015	Case2	Serious	L3
Emp6	Cust1	1006680	1017	Case4	NA	NULL
Emp4	Cust1	1006680	1017	Case4	NA	NULL
Emp5	Cust1	1006680	1018	Case4	Non-Seri	L2
Emp5	Cust1	1006680	1018	Case4	Non-Seri	L2
Emp5	Cust1	1006680	1018	Case4	Non-Seri	L3
Emp5	Cust1	1006680	1018	Case4	Non-Seri	NA
Emp5	Cust1	1006680	1019	Case1	Non-Seri	L1
Emp5	Cust1	1006680	1019	Case1	Non-Seri	L2
Emp5	Cust1	1006680	1019	Case1	Non-Seri	L3
Emp6	Cust1	1006680	1020	Case1	Non-Seri	L1
Emp2	Cust1	1006680	1020	Case1	Non-Seri	L1
Emp2	Cust1	1006680	1020	Case1	Non-Seri	L2
Emp6	Cust1	1006680	1020	Case1	Non-Seri	L2
Emp2	Cust1	1006680	1020	Case1	Non-Seri	L3
Emp6	Cust1	1006680	1020	Case1	Non-Seri	L3
Emp6	Cust1	1006680	1020	Case1	Non-Seri	L3
Emp2	Cust1	1006680	1020	Case1	Non-Seri	L3
Emp6	Cust1	1006680	1021	Case4	NA	NULL
Emp5	Cust1	1006680	1022	Case1	Non-Seri	NULL
Emp4	Cust1	1006680	1023	Case4	Non-Seri	NULL
Emp4	Cust1	1006680	1024	Case5	Non-Seri	L3
Emp2	Cust1	1006680	1025	Case5	Non-Seri	NULL
Emp5	Cust1	1006680	1026	Case1	Non-Seri	NULL
Emp4	Cust1	1006680	1027	Case4	Death/Li	NULL
Emp2	Cust1	1006680	1028	Case4	Serious	L3
Emp4	Cust1	1006680	1029	Case2	Serious	L3
Emp2	Cust1	1006680	1030	Case5	Serious	NA
Emp4	Cust1	1006680	1031	Case3	Non-Seri	NULL
Emp4	Cust1	1006680	1032	Case1	Non-Seri	L3
Emp4	Cust1	1006680	1033	Case3	Serious	NULL
Emp2	Cust1	1006680	1034	Case1	NA	NULL
Emp4	Cust1	1006680	1035	Case5	Serious	NULL
;

data tmp;
set have;
length grp $50;
if errorlevels in ('L1','L2','L3') then grp='ERROR';
else grp=errorlevels;
run;
proc sort data=tmp;
by caseid grp;
run;

data want;
n1=0;
do n=1 by 1  until(last.caseid);
set tmp;
by caseid grp;
if grp='ERROR' then n1+1;
end;
do until(last.caseid);
set tmp;
by caseid grp;
if grp='ERROR' then do;CaseCount=1/n1; case_with_errors=1/n1;end;
else case_with_errors=0;
if grp='NA' then CaseCount =0;
else if grp='NULL' then CaseCount=1;
CaseCount_sum+CaseCount;
case_with_errors_sum+case_with_errors;
output;
end;
drop grp n: ;
run;

 

Note: The above will require some tweak depending on the logic to consider variables  Seriousness<-> ErrorLevels to deal with NA and NULL besides ERROR values L series as I didn't dig deep enough to comprehend your dataset. But do let me know if you want me to further change

Frequent Contributor
Posts: 79

Re: Subgroup count required

Posted in reply to novinosrin

Thank you 

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 159 views
  • 0 likes
  • 4 in conversation