BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Kbug
Obsidian | Level 7

Hello, I am using SAS 9.4. I am trying to count merged data. One has ID and non relevant information, the other has the same ID and sample ID's which I want to match up. Some of the ID's have multiple and some have no sample ID's. I would like to count how many samples there are per ID and if it does not have a match put a 0 as the count. I can tell which ones do not have any match because they do not have a box ID.

 

Set 1: ID

 

ID      

123         

456        

789

101

102

 

Set 2: Sample ID with ID

ID              Sample ID   Box ID

123                ABC            A

123                DEF            B

456                GHI             B

456                JKL             F

456                MNO           L

456                PQR           J

101                STU            A

102                VWX          W

105                YZA            B

106                BCD           N

  

 

Set: WANT
ID                 sample ID    Box ID  Count

123                ABC                A       1

123                DEF                B       2

456                GHI                 B       1

456                JKL                 F        2

456                MNO               L        3

456                PQR               J         4

789                ------               ----       0

101                STU                A        1

102                VWX               W       1

 

I tried merging the two and then counting however it would just still keep 789 and treat it as if it had a Sample ID and then count it as 1 when it should be a 0. How would I go about this?

 

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

See this:

data have1;
input id $3.;
datalines;
123
456
789
101
102
;

data have2;
input id :$3. sample_id :$3. box_id :$1.;
datalines;
123 ABC A
123 DEF B
456 GHI B
456 JKL F
456 MNO L
456 PQR J
101 STU A
102 VWX W
105 YZA B
106 BCD N
;
run;

proc sort data=have1;
by id;
run;

proc sort data=have2;
by id;
run;

data want;
merge
  have1 (in=h1)
  have2 (in=h2)
;
by id;
if h1;
if first.id
then count = 0;
if h2
then count + 1;
else do;
  sample_id = '---';
  box_id = '-';
end;
run;

proc print data=want noobs;
run;

Result:

       sample_
id       id       box_id    count

101      STU        A         1  
102      VWX        W         1  
123      ABC        A         1  
123      DEF        B         2  
456      GHI        B         1  
456      JKL        F         2  
456      MNO        L         3  
456      PQR        J         4  
789      ---        -         0  

View solution in original post

2 REPLIES 2
Kurt_Bremser
Super User

See this:

data have1;
input id $3.;
datalines;
123
456
789
101
102
;

data have2;
input id :$3. sample_id :$3. box_id :$1.;
datalines;
123 ABC A
123 DEF B
456 GHI B
456 JKL F
456 MNO L
456 PQR J
101 STU A
102 VWX W
105 YZA B
106 BCD N
;
run;

proc sort data=have1;
by id;
run;

proc sort data=have2;
by id;
run;

data want;
merge
  have1 (in=h1)
  have2 (in=h2)
;
by id;
if h1;
if first.id
then count = 0;
if h2
then count + 1;
else do;
  sample_id = '---';
  box_id = '-';
end;
run;

proc print data=want noobs;
run;

Result:

       sample_
id       id       box_id    count

101      STU        A         1  
102      VWX        W         1  
123      ABC        A         1  
123      DEF        B         2  
456      GHI        B         1  
456      JKL        F         2  
456      MNO        L         3  
456      PQR        J         4  
789      ---        -         0  
novinosrin
Tourmaline | Level 20

Same idea as Kurt, just my share of fun

 

data have1;
input id $3.;
datalines;
123
456
789
101
102
;

data have2;
input id :$3. sample_id :$3. box_id :$1.;
datalines;
123 ABC A
123 DEF B
456 GHI B
456 JKL F
456 MNO L
456 PQR J
101 STU A
102 VWX W
105 YZA B
106 BCD N
;
run;

data want ;
if _n_=1 then do;
   if 0 then set have2;
   dcl hash H (dataset:'have1') ;
   h.definekey  ("id") ;
   h.definedata ("id") ;
   h.definedone () ;
   dcl hiter hi('h');
 end;
 do count=1 by 1 until(last.id);
  set have2 end=z;
  by id notsorted;
  if h.check()=0 then do;
   if last.id then h.remove();
  output;
  end;
 end;
 if z;
 do while(hi.next()=0);
  count=0;
  call missing(sample_id,box_id);
  output;
 end;
run;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 937 views
  • 2 likes
  • 3 in conversation