Hello, I am using SAS 9.4. I am trying to count merged data. One has ID and non relevant information, the other has the same ID and sample ID's which I want to match up. Some of the ID's have multiple and some have no sample ID's. I would like to count how many samples there are per ID and if it does not have a match put a 0 as the count. I can tell which ones do not have any match because they do not have a box ID.
Set 1: ID
ID
123
456
789
101
102
Set 2: Sample ID with ID
ID Sample ID Box ID
123 ABC A
123 DEF B
456 GHI B
456 JKL F
456 MNO L
456 PQR J
101 STU A
102 VWX W
105 YZA B
106 BCD N
Set: WANT
ID sample ID Box ID Count
123 ABC A 1
123 DEF B 2
456 GHI B 1
456 JKL F 2
456 MNO L 3
456 PQR J 4
789 ------ ---- 0
101 STU A 1
102 VWX W 1
I tried merging the two and then counting however it would just still keep 789 and treat it as if it had a Sample ID and then count it as 1 when it should be a 0. How would I go about this?
Thank you
See this:
data have1;
input id $3.;
datalines;
123
456
789
101
102
;
data have2;
input id :$3. sample_id :$3. box_id :$1.;
datalines;
123 ABC A
123 DEF B
456 GHI B
456 JKL F
456 MNO L
456 PQR J
101 STU A
102 VWX W
105 YZA B
106 BCD N
;
run;
proc sort data=have1;
by id;
run;
proc sort data=have2;
by id;
run;
data want;
merge
have1 (in=h1)
have2 (in=h2)
;
by id;
if h1;
if first.id
then count = 0;
if h2
then count + 1;
else do;
sample_id = '---';
box_id = '-';
end;
run;
proc print data=want noobs;
run;
Result:
sample_ id id box_id count 101 STU A 1 102 VWX W 1 123 ABC A 1 123 DEF B 2 456 GHI B 1 456 JKL F 2 456 MNO L 3 456 PQR J 4 789 --- - 0
See this:
data have1;
input id $3.;
datalines;
123
456
789
101
102
;
data have2;
input id :$3. sample_id :$3. box_id :$1.;
datalines;
123 ABC A
123 DEF B
456 GHI B
456 JKL F
456 MNO L
456 PQR J
101 STU A
102 VWX W
105 YZA B
106 BCD N
;
run;
proc sort data=have1;
by id;
run;
proc sort data=have2;
by id;
run;
data want;
merge
have1 (in=h1)
have2 (in=h2)
;
by id;
if h1;
if first.id
then count = 0;
if h2
then count + 1;
else do;
sample_id = '---';
box_id = '-';
end;
run;
proc print data=want noobs;
run;
Result:
sample_ id id box_id count 101 STU A 1 102 VWX W 1 123 ABC A 1 123 DEF B 2 456 GHI B 1 456 JKL F 2 456 MNO L 3 456 PQR J 4 789 --- - 0
Same idea as Kurt, just my share of fun
data have1;
input id $3.;
datalines;
123
456
789
101
102
;
data have2;
input id :$3. sample_id :$3. box_id :$1.;
datalines;
123 ABC A
123 DEF B
456 GHI B
456 JKL F
456 MNO L
456 PQR J
101 STU A
102 VWX W
105 YZA B
106 BCD N
;
run;
data want ;
if _n_=1 then do;
if 0 then set have2;
dcl hash H (dataset:'have1') ;
h.definekey ("id") ;
h.definedata ("id") ;
h.definedone () ;
dcl hiter hi('h');
end;
do count=1 by 1 until(last.id);
set have2 end=z;
by id notsorted;
if h.check()=0 then do;
if last.id then h.remove();
output;
end;
end;
if z;
do while(hi.next()=0);
count=0;
call missing(sample_id,box_id);
output;
end;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.