If I am understanding your request correctly the code below should get you the results mentioned: PROC SQL;
CREATE TABLE WORK.Stag1 (DROP=Employee_Name Supervisor_Name CaseAccountableTeam) as
SELECT DISTINCT ProjectCode, Employee_Name, Supervisor_Name, CaseAccountableTeam, QCCompleteDate, SafetyCaseID, QCCompleteStatusName, ProcessingHub
FROM WORK.Sample_Data;
QUIT;
PROC SORT DATA=WORK.Stag1 NODUPKEY; BY _ALL_; RUN;
PROC SQL;
CREATE TABLE WORK.Stag2 AS
SELECT DISTINCT
ProjectCode
, SUM(CASE WHEN QCCompleteStatusName IN('QCCOMP') AND SafetyCaseID NOT IN('') THEN 1 ELSE 0 END) AS QCCOMP
, SUM(CASE WHEN QCCompleteStatusName IN('QCCOMPERR') AND SafetyCaseID NOT IN('') THEN 1 ELSE 0 END) AS QCCOMPERR
FROM WORK.Stag1
GROUP BY ProjectCode;
QUIT; In order to get the counts that you were looking for, I needed to remove these variables (Employee_Name, Supervisor_Name, CaseAccountableTeam) from the SQL Step performing the grouping counts. These had variations across the observations that if included in the grouping, would appear as a distinct observation and so would give you a count of 6 when you expected 3 and 4 when you expected 2. Without them, the counts align to you volume expectations. If those must be included in that step, then you will not get the counts you think you should, due to their variance. I wasn't sure exactly what your second request was looking to accomplish specifically. I understand the calculation logic, just not how you wanted that value determined for all of the grouped variables. if you add: , (CALCULATED QCCOMP/(CALCULATED QCCOMP + CALCULATED QCCOMPERR)) AS test to the select statement above, it will generate values for each of the rows generated there. Hopefully this is what you were looking to find.
... View more