How can I exclude only 1, 5 and 7 records..
data temp_copy;
infile cards dlm='|' truncover ;
input subDate :mmddyy10. TestID :$5. LogDept :$2. CurrentDept :$2. status :$8. Category $4.;
format subDate yymmdd10.;
cards;
10/21/2020|29545|54|99|Active|K12
10/21/2020|29545|54|99|Active|K6
10/21/2020|29545|54|99|Active|K9
10/30/2020|56545|45|99|Active|K12
10/30/2020|56545|45|99|InActive|K12
10/30/2020|56545|45|88|Active|K12
11/21/2020|49545|33|99|Active|K12
11/21/2020|49545|33|99|Active|K9
11/21/2020|99546|56|94|shipped|K6
11/21/2020|99546|56|94|shipped|K9
11/21/2020|99546|56|99|shipped|K9
;
proc sql;
create table test as
select distinct testid, subDate, CurrentDept, Status, category
from temp_copy
order by testid, subDate, CurrentDept, Status, category
;
quit;
data lastcategory;
set test;
by testid subDate CurrentDept Status category;
if last.CurrentDept;
if CurrentDept= '99' and Status="InActive" then delete;
run;
/*conditions to check..
If TestId is in 2 or more categories then count only in K6 & K9, because K12 is the parent and should be excluded; but if TestId is in only K12 category and in 2 different currentDept then keep both the records.
If CurrentDept='99' and status='InActive' then delete;*/
Hi @Stalk Assuming I kinda understand what you want-->
data temp_copy;
infile cards dlm='|' truncover ;
input subDate :mmddyy10. TestID :$5. LogDept :$2. CurrentDept :$2. status :$8. Category $4.;
format subDate yymmdd10.;
cards;
10/21/2020|29545|54|99|Active|K12
10/21/2020|29545|54|99|Active|K6
10/21/2020|29545|54|99|Active|K9
10/30/2020|56545|45|99|Active|K12
10/30/2020|56545|45|99|InActive|K12
10/30/2020|56545|45|88|Active|K12
11/21/2020|49545|33|99|Active|K12
11/21/2020|49545|33|99|Active|K9
11/21/2020|99546|56|94|shipped|K6
11/21/2020|99546|56|94|shipped|K9
11/21/2020|99546|56|99|shipped|K9
;
proc sql;
create table want(drop=_c:) as
select * ,count(distinct Category) as _c,count(distinct currentdept) as _c1
from temp_copy
where not(CurrentDept= '99' and Status="InActive")
group by testid
having _c>=2 and Category ne 'K12' or _c=1 and Category='K12' and _c1>=2;
quit;
@Stalk Don't worry about "NOTE: The query requires remerging summary statistics back with the original data."
I needed the SQL processor to actually take advantage of automatic merge back with the original in order to have an extra pass of the BY GROUP for the filter to subset the needed records. So that was on purpose.
Not sure, what you mean. You could perhaps post a clear sample of your HAVE and WANT explaining the logic in points.
Hi @Stalk Do you mean this change?
having _c>=2 and Category ne 'K12' or _c=1 and Category='K12';
full code->
proc sql;
create table want(drop=_c:) as
select * ,count(distinct Category) as _c
from temp_copy
where not(CurrentDept= '99' and Status="InActive")
group by testid
having _c>=2 and Category ne 'K12' or _c=1 and Category='K12';
quit;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.