BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
d0816
Quartz | Level 8

Hello,

I have a dataset which I want to spit into two dataset where One dataset will contain all rows for an IDs if one of the rows for an ID have " Counted" as a value for a variable COUNT. The second dataset would contain all the rest of the IDs.

For e.g Dataset: Have

IDCountGroup
1NAj
2NAF2
2NOT COUNTEDj
2NOT COUNTEDj
2NOT COUNTEDj
3NAC
4NOT COUNTEDF
4NAj
5NAj
5NOT COUNTEDj
5NOT COUNTEDj
6NAC
6NOT COUNTEDj
6NOT COUNTEDj
6NOT COUNTEDj
6NOT COUNTEDj
7NAC
7NOT COUNTEDj
8NAC
8NOT COUNTEDF
9NAC
9NOT COUNTEDj
9NOT COUNTEDj
9COUNTEDj

 

Final dataset would be two datasets:

Dataset : Want 1

IDCountGroup
9NAC
9NOT COUNTEDj
9NOT COUNTEDj
9COUNTEDj

 

Dataset : Want 2

IDCountGroup
1NAj
2NAF2
2NOT COUNTEDj
2NOT COUNTEDj
2NOT COUNTEDj
3NAC
4NOT COUNTEDF
4NAj
5NAj
5NOT COUNTEDj
5NOT COUNTEDj
6NAC
6NOT COUNTEDj
6NOT COUNTEDj
6NOT COUNTEDj
6NOT COUNTEDj
7NAC
7NOT COUNTEDj
8NAC
8NOT COUNTEDF

 Thank you.

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hello @d0816 

data one two;
merge have(in=a) have(in=b keep=id count rename=count=_count where=(_Count='COUNTED'));
by id;
if a and not b then output two;
else if a and b then output one;
drop _count;
run;

View solution in original post

6 REPLIES 6
novinosrin
Tourmaline | Level 20

Hello @d0816 

data one two;
merge have(in=a) have(in=b keep=id count rename=count=_count where=(_Count='COUNTED'));
by id;
if a and not b then output two;
else if a and b then output one;
drop _count;
run;

hashman
Ammonite | Level 13

@d0816:

This is a typical double DoW-loop job (assuming HAVE is ordered/grouped by ID):

data have ;                               
  input ID @4 Count $11. @17 Group $2. ;  
  cards ;                                 
1  NA           j                         
2  NA           F2                        
2  NOT COUNTED  j                         
2  NOT COUNTED  j                         
2  NOT COUNTED  j                         
3  NA           C                         
4  NOT COUNTED  F                         
4  NA           j                         
5  NA           j                         
5  NOT COUNTED  j                         
5  NOT COUNTED  j                         
6  NA           C                         
6  NOT COUNTED  j                         
6  NOT COUNTED  j                         
6  NOT COUNTED  j                         
6  NOT COUNTED  j                         
7  NA           C                         
7  NOT COUNTED  j                         
8  NA           C                         
8  NOT COUNTED  F                         
9  NA           C                         
9  NOT COUNTED  j                         
9  NOT COUNTED  j                         
9  COUNTED      j                         
run ;                                     
                                          
data want1 want2 ;                        
  do _n_ = 1 by 1 until (last.id) ;       
    set have ;                            
    by ID ;                               
    if count = "COUNTED" then _want1 = 1 ;
  end ;                                   
  do _n_ = 1 to _n_ ;                     
    set have ;                            
    if _want1 then output want1 ;         
    else           output want2 ;         
  end ;   
drop _: ; run ;

Kind regards

Paul D. 

novinosrin
Tourmaline | Level 20

Guru @hashman   I would have thought you would have posted this one

 

data one two;
if _n_=1 then do;
   dcl hash H (dataset:'have(where=(count="COUNTED"))') ;
   h.definekey  ("id") ;
   h.definedata ("id") ;
   h.definedone () ;
end;
set have;
by id;
if first.id then _iorc_=h.check();
if _iorc_=0 then output one;
else output two;
run;
d0816
Quartz | Level 8

@novinosrin  @hashman 

 

Thank you for the solutions. All three solutions worked. I am going to accept novinosrin's code as a solution as his came first.

 

Novinosrin, Would appreciate if you could explain how hash is working, or resources related to hash.

 

Thank you again. You guys rock.

novinosrin
Tourmaline | Level 20

Hi @d0816   I am humbled by your words. Well to tell you the honest truth, my hash skills are exclusively gained by reading @hashman (Guru I forgot Don's Pseudonym, otherwise I would mention him too. Sorry about that but do pass on my regards) authored book. 

 

Hashman aka Paul D is lord's incarnation with a hash boon and a gift to us(SAS community). I hope to become as good as him one day.

 

Anyways regarding explaining the entire concept will not help much other than the fact what's happening here is the ID that has counted is basically parked in a memory table called Hash . On a pass of a dataset, we basically check for matches with the counted ID in Hash table. The true ones are populated  in one and the rest in two which is clearly noticeable in the code.

 

PS

Please do get hold of the Book 

Data Management Solutions Using SAS® Hash Table Operations: A Business Intelligence Case Study 

By Paul Dorfman and Don Henderson

 

at the soonest if you want to speed up your learning. All the best!

hashman
Ammonite | Level 13

@novinosrin: Needless to say, thanks for the plug! 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 670 views
  • 5 likes
  • 3 in conversation