BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Stalk
Pyrite | Level 9

How can I exclude only 1, 5 and 7 records..

 

data temp_copy;

infile cards dlm='|' truncover ;

input subDate :mmddyy10. TestID :$5. LogDept :$2. CurrentDept :$2. status :$8. Category $4.;

format subDate yymmdd10.;

cards;

 

10/21/2020|29545|54|99|Active|K12

10/21/2020|29545|54|99|Active|K6

10/21/2020|29545|54|99|Active|K9

10/30/2020|56545|45|99|Active|K12

10/30/2020|56545|45|99|InActive|K12

10/30/2020|56545|45|88|Active|K12

11/21/2020|49545|33|99|Active|K12

11/21/2020|49545|33|99|Active|K9

11/21/2020|99546|56|94|shipped|K6

11/21/2020|99546|56|94|shipped|K9

11/21/2020|99546|56|99|shipped|K9

;

 

proc sql;

create table test as

select distinct testid, subDate, CurrentDept, Status, category

from temp_copy

order by testid, subDate, CurrentDept, Status, category

 

;

quit;

data lastcategory;

set test;

by testid subDate CurrentDept Status category;

if last.CurrentDept;

if CurrentDept= '99' and Status="InActive" then delete;

run;

/*conditions to check..

If TestId is in 2 or more categories then count only in K6 & K9, because K12 is the parent and should be excluded; but if TestId is in only K12 category and in 2 different currentDept then keep both the records.

If CurrentDept='99' and status='InActive' then delete;*/

1 ACCEPTED SOLUTION

Accepted Solutions
Stalk
Pyrite | Level 9
Thank you novinosrin. You solved this complicated query in such a simple step. I spent at least 2 -3 hours with several steps. I tested this on one day's worth of data and looks perfect. Need to see if any of the data is falling into cracks as I get the note: NOTE: The query requires remerging summary statistics back with the original data.

I always use distinct function on one field to get the line list, but you have used distinct function on different fields and made this so simple. May be I need to update my SQL skills. Once again thank you so much.

View solution in original post

8 REPLIES 8
novinosrin
Tourmaline | Level 20

Hi @Stalk  Assuming I kinda understand what you want-->

 

data temp_copy;

infile cards dlm='|' truncover ;

input subDate :mmddyy10. TestID :$5. LogDept :$2. CurrentDept :$2. status :$8. Category $4.;

format subDate yymmdd10.;

cards;
10/21/2020|29545|54|99|Active|K12
10/21/2020|29545|54|99|Active|K6
10/21/2020|29545|54|99|Active|K9
10/30/2020|56545|45|99|Active|K12
10/30/2020|56545|45|99|InActive|K12
10/30/2020|56545|45|88|Active|K12
11/21/2020|49545|33|99|Active|K12
11/21/2020|49545|33|99|Active|K9
11/21/2020|99546|56|94|shipped|K6
11/21/2020|99546|56|94|shipped|K9
11/21/2020|99546|56|99|shipped|K9
;

proc sql;
 create table want(drop=_c:) as
 select * ,count(distinct Category) as _c,count(distinct currentdept) as _c1
 from temp_copy
 where not(CurrentDept= '99' and Status="InActive")
 group by testid
 having _c>=2 and Category ne 'K12' or  _c=1 and Category='K12' and _c1>=2;
quit;


Stalk
Pyrite | Level 9
Thank you novinosrin. You solved this complicated query in such a simple step. I spent at least 2 -3 hours with several steps. I tested this on one day's worth of data and looks perfect. Need to see if any of the data is falling into cracks as I get the note: NOTE: The query requires remerging summary statistics back with the original data.

I always use distinct function on one field to get the line list, but you have used distinct function on different fields and made this so simple. May be I need to update my SQL skills. Once again thank you so much.
novinosrin
Tourmaline | Level 20

@Stalk Don't worry about "NOTE: The query requires remerging summary statistics back with the original data."

 

I needed the SQL processor to actually take advantage of automatic merge back with the original in order to have an extra pass of the BY GROUP for the filter to subset the needed records. So that was on purpose.

Stalk
Pyrite | Level 9
Actually I NEED to keep/count the TestID for K12 category if they don't exist in any other categories. Currently if the count=1 for K12 they are all excluded.
novinosrin
Tourmaline | Level 20

Not sure, what you mean. You could perhaps post a clear sample of your HAVE and WANT explaining the logic in points.

Stalk
Pyrite | Level 9
I added 2 more cards where only one TestID exists for Category=K12 and I want to keep the below two records in addition to the above list.. ..Total my want table should have 10 records

11/25/2020|98548|75|99|shipped|K12
11/26/2020|98549|76|88|shipped|K12
novinosrin
Tourmaline | Level 20

Hi @Stalk   Do you mean this change?

 having _c>=2 and Category ne 'K12' or  _c=1 and Category='K12';

full code->

proc sql;
 create table want(drop=_c:) as
 select * ,count(distinct Category) as _c
 from temp_copy
 where not(CurrentDept= '99' and Status="InActive")
 group by testid
 having _c>=2 and Category ne 'K12' or  _c=1 and Category='K12';
quit;
Stalk
Pyrite | Level 9
yes, it worked. Thank you for your time and solution.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2308 views
  • 0 likes
  • 2 in conversation