BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
genemroz
Quartz | Level 8
This is probably simple but....I don't know how to do it.
My dataset consists of Events recorded by multiple Observers.  I want a count of Events observed by both ObserverID 02 and 05.  See below...  Thanks in advance!
ENum ObserverID
1         0G
1          05
        02
2         04
2          05
2          02
3          0A
3           0P
        02
        0D
        0J
        02
        0D
        0M
        0H
        02
        0H
        0M
        02
7          0G
7          05
        02
        0A
        0P
        02
        0J
        0G
9          02
 
1 ACCEPTED SOLUTION

Accepted Solutions
RichardDeVen
Barite | Level 11

You can merge subsets of the same data set.

data want (keep=enum label='Enums observed by 02 and 05');
  merge 
    have (where=(ObserverID='02') in = _02)
    have (where=(ObserverID='05') in = _05)
    end=end
  ;
  by enum;

  _0205_flag = first.enum and _02 and _05;  * use first. in case the field data has repeated information;

  count + _0205_flag;

  if end then put 'NOTE: ' count ' enums observed by both 02 and 05';

  if _0205_flag;  * subsetting if;
run;

In SQL you can count a positive logical assertion over a group using SUM

proc sql;
  create table want as
  select enum from have
  group by enum
  having sum(ObserverId='02') > 0 and sum(ObserverId='05') > 0;

%put NOTE: &SQLOBS enums were dual observed by 02 and 05;

A NODUPKEY SORT + DATA step can find the enums that were dual observed

proc sort nodupkey data=have out=_0205 ;
  by enum ObserverId;
  where ObserverId in ('02' '05');
run;

data want;
  set _0205;
  by enum;
  if last.enum and not first.enum;
run;

 

View solution in original post

14 REPLIES 14
vkumbhakarna
Fluorite | Level 6

Try this! replace the text in bold with your dataset name. 
proc sql;
select count(distinct(enum)) into: count_event
from <datasetname>
where ObserverID in ('02','05')
;
quit;
%put count_event = &count_event;

yabwon
Onyx | Level 15
Hi,
using the example data your code returns 9 - but expected is 3. Requirement was "count of Events observed by both ObserverID 02 and 05" what is not equivalent to condition `where ObserverID in ('02','05')` (i.e. ObserverID = '02' OR ObserverID = '05').

All the best
Bart
_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



vkumbhakarna
Fluorite | Level 6

Thanks! Yes, I realized my mistake after reading the other responses. I misinterpreted the initial question. 

RichardDeVen
Barite | Level 11

You can merge subsets of the same data set.

data want (keep=enum label='Enums observed by 02 and 05');
  merge 
    have (where=(ObserverID='02') in = _02)
    have (where=(ObserverID='05') in = _05)
    end=end
  ;
  by enum;

  _0205_flag = first.enum and _02 and _05;  * use first. in case the field data has repeated information;

  count + _0205_flag;

  if end then put 'NOTE: ' count ' enums observed by both 02 and 05';

  if _0205_flag;  * subsetting if;
run;

In SQL you can count a positive logical assertion over a group using SUM

proc sql;
  create table want as
  select enum from have
  group by enum
  having sum(ObserverId='02') > 0 and sum(ObserverId='05') > 0;

%put NOTE: &SQLOBS enums were dual observed by 02 and 05;

A NODUPKEY SORT + DATA step can find the enums that were dual observed

proc sort nodupkey data=have out=_0205 ;
  by enum ObserverId;
  where ObserverId in ('02' '05');
run;

data want;
  set _0205;
  by enum;
  if last.enum and not first.enum;
run;

 

FreelanceReinh
Jade | Level 19

Sir @RichardDeVen,

 

I suspect the new forum software has jumbled a few lines of the DATA step with your MERGE approach: The "if end then put ..." must have been moved upwards from its original place after the sum statement. Now it would be executed prematurely if the last BY group contributed to the count.

RichardDeVen
Barite | Level 11
Good catch. I'll fix the answer. In order to have the END= based logging PUT in the step, there needs to be some more scaffolding before the subsetting IF.
andreas_lds
Jade | Level 19

You have not explained in which form the result is required in. If you want the number in a macro-variable, the proc sql suggested by @vkumbhakarna is the way to go. If you want it in a dataset, you could use proc summary:

proc summary data=have;
	where ObserverID in ('02', '05');
	output out=counted(drop=_type_ rename=(_freq_=count));
run;
yabwon
Onyx | Level 15

Hi,

 

assuming that ENum is integer and you are able to anticipate its range, following should do:

data have;
input ENum ObserverID $;
cards;

1         0G
1          05
1          02
2         04
2          05
2          02
3          0A
3           0P
3          02
4          0D
4          0J
4          02
5          0D
5          0M
5          0H
5          02
6          0H
6          0M
6          02
7          0G
7          05
7          02
8          0A
8          0P
8          02
9          0J
9          0G
9          02
;
run;

data want;
  array obs2[9] _temporary_;
  array obs5[9] _temporary_;

  do until (EOF);
    set have end = eof;
      where observerID in ('02', '05');
    select;
      when(observerID = '02') obs2[ENum] = 1;
      when(observerID = '05') obs5[ENum] = 1;
      otherwise;
    end;
  end;

  do ENum = 1 to 9;
    count + (obs2[ENum] and obs5[ENum]);
  end;
  output;
  keep count;
run;

 

if ENum isn't integer or you can't anticipate the range then following should do:

data want;
  if 0 then set have;
  declare hash obs2();
  obs2.defineKey("ENum");
  obs2.defineDone();
  declare hiter I2("obs2");

  declare hash obs5();
  obs5.defineKey("ENum");
  obs5.defineDone();

  do until (EOF);
    set have end = eof;
      where observerID in ('02', '05');
    select;
      when(observerID = '02') _N_ = obs2.add();
      when(observerID = '05') _N_ = obs5.add();
      otherwise;
    end;
  end;

  do while(I2.next() = 0);
    if obs5.check() = 0 then count + 1;
  end;
output;
keep count; stop; run;

or eventually:

proc sql;
  create table want as  
  select count(*) from  
    (
    select ENum from have where observerID = '02'
    intersect
    select ENum from have where observerID = '05' 
    )
  ;
quit;

 

All the best

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



RichardDeVen
Barite | Level 11

Hi Bart @yabwon  :

 

I would call your second hash way a 'hash merge'.  This was my take on it.

data want(keep=enum label='They have been watching you');
  retain enum .;

  declare hash _02(dataset:'have(where=(ObserverID = "02"))', ordered:'A');
  declare hash _05(dataset:'have(where=(ObserverID = "05"))', ordered:'A');

  _02.defineKey('enum'); _02.defineDone();
  _05.defineKey('enum'); _05.defineDone();

  declare hiter check ('_02');
  do while (check.next() eq 0);
    if _05.check()=0 then do;
      _0205_pair_count + 1;
      output;
    end;
  end;

  put 'NOTE: There are ' _0205_pair_count 'enums that were observed by 02 and 05';
run;
yabwon
Onyx | Level 15

Nice! I'm making a copy in my archive.

All the best

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



yabwon
Onyx | Level 15

@RichardDeVen 

 

Richard,

I've just realised that for this task you need no more than 1 hash table (or 1 array)

 

proc format;
  invalue primes
  '02'  = 2
  '05'  = 5
  other = 1
  ;
run;
data _null_; if 0 then set have; declare hash obs(hashexp:20); obs.defineKey("ENum"); obs.defineData("ENum","INT"); obs.defineDone(); do until (EOF); set have end = eof; where observerID in ('02', '05'); if obs.find() NE 0 then INT = input(observerID, primes.); else INT = INT * input(observerID, primes.); _N_ = obs.replace(); end; /* if INT is divisible by 10 then ENum was observed by 02 nad 05*/ obs.output(dataset:'want(where=( mod(INT, 10) = 0 ))'); stop; run; %let range = 9; data want; array obs[&range.] _temporary_; do ENum = 1 to &range.; obs[ENum] = 1; end; do until (EOF); set have end = eof; where observerID in ('02', '05'); obs[ENum] = obs[ENum] * input(observerID, primes.); end; do ENum = 1 to &range.; if mod(obs[ENum], 10) = 0 then output; end; keep ENum; run;

 

All the best

Bart

 

 

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



RichardDeVen
Barite | Level 11
Great use of INTERSECT !
genemroz
Quartz | Level 8

These solutions worked too.  The one using INTERSECT is really slick!

 

Thanks for your help!

 

Gene

vkumbhakarna
Fluorite | Level 6

Wow! i'm loving these forums. Get to learn something new every day.

Liked the use of SQLOBS. Per documentation, SQLOBS automatic macro variable is assigned a value after the SQL SELECT statement executes. PROC SQL produces six automatic macro variables (SQLOBS, SQLRC, SQLOOPS, SQLEXITCODE, SQLXRC, and SQLXMSG) to help you troubleshoot your program. For more information, see Using the PROC SQL Automatic Macro Variables.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 1345 views
  • 6 likes
  • 6 in conversation