DATA Step, Macro, Functions and more

Looking at observations within a certain group

Reply
Occasional Contributor
Posts: 12

Looking at observations within a certain group

Hi,

This is probably a pretty simple question, but I can't seem to figure it out. My dataset looks like the following:

Date Name Facility
1/2/09 John A
1/2/09 Gary B
1/2/09 John B
1/4/09 Lisa C
1/4/09 Lisa C
1/5/09 Gary A
1/5/09 Mary D
1/6/09 Jess B
1/6/09 Jess D
1/6/09 Jess B

I want to determine if a person visited a different facility on the same day. So for this example, John visited facility A and B on 1/2/09, so he would get flagged or outputted to another dataset. Same thing for Jess on 1/6/09.

How can I code this? I tried sorting by date and name, then using a lag function. But that didn't seem to work well if a person had visited multiple facilities on the same day.

Thanks for your help!
Respected Advisor
Posts: 3,887

Re: Looking at observations within a certain group

Sort by date and name and then use:

if not (first.name and last.name) then output; Message was edited by: Patrick
Occasional Contributor
Posts: 15

Re: Looking at observations within a certain group

Hi ,
The below shown code will do the trick...The code has been prepared with the example you have given..



/*use distinct to remove cases like Lisa ie. same day same facility more than once*/
proc sql;
create table test1 as
select distinct * from test
order by date, name, facility;
run;

data test2;
set test1;
format date1 mmddyy10.;
date1=lag(date);
name1=lag(name);
run;


proc sql;
create table test3 as
select *,
(case when (date=date1 and name=name1) then "Y" else "N" end) as Result
from test2;
run;

proc sql;
create table output as
select name, result from test3 where result='Y';
run;
Super Contributor
Posts: 291

Re: Looking at observations within a certain group

another possibilty (SAS always has several)

data test;
input Date mmddyy6. Name $ Facility $;
cards;
1/2/09 John A
1/2/09 Gary B
1/2/09 John B
1/4/09 Lisa C
1/4/09 Lisa C
1/5/09 Gary A
1/5/09 Mary D
1/6/09 Jess B
1/6/09 Jess D
1/6/09 Jess B
;
run;

proc sort data=test;
by date name facility;
run;

proc transpose data=test out=test2;
by date name;
var facility;
run;
SAS Super FREQ
Posts: 8,740

Re: Looking at observations within a certain group

...And, just for grins, here's a PROC REPORT approach that creates an output dataset using just one pass through the data and the output dataset has ONLY the dates/names where the number of visits was GE 2 -and- the number of facilities was GE 2.

WORK.REPOUT is created by PROC REPORT -- which calculates the TOTVISIT column and the FACCNT column on the report -- so those 2 columns are available for WHERE processing. With TOTVISIT and FACCNT used in WHERE processing on the OUT= option, only the report rows that meet the WHERE condition will be written to the output dataset. Also on the OUT= option is the DROP= data set option to drop the automatic variable _BREAK_ and the RENAME= option to rename the absolute column numbers to be the facility values that they represent.

cynthia

[pre]
data test;
infile datalines;
input Date : mmddyy6. Name $ Facility $;
return;
datalines;
1/2/09 John A
1/2/09 Gary B
1/2/09 John B
1/4/09 Lisa C
1/4/09 Lisa C
1/5/09 Gary A
1/5/09 Mary D
1/6/09 Jess B
1/6/09 Jess D
1/6/09 Jess B
;
run;

ods listing close;
ods html file='report.html' style=sasweb;

proc report data=test nowd
out=repout(where=(totvisit ge 2 and faccnt ge 2)
rename=(_c3_=A _c4_=B _c5_=C _c6_=D)
drop=_break_);
title 'Report Approach';
column date name facility n=totvisit faccnt;
define date / group f=date9.;
define name /group;
define facility / across;
define totvisit / 'Total Visits';
define faccnt / computed "Facility Count";
** ACROSS variables have absolute numbers in PROC REPORT;
** so _c3_ = Facility A, _c4_ = Facility B,;
** _c5_ = Facility C, _c6_ = Facility D;
compute faccnt;
** use the N function to get the count of non-missing facility values;
** when count is 1, means they visited 1 facility, etc.;
** if there are more than 4 facilities, then logic needs to change;
faccnt=n(_c3_, _c4_, _c5_, _c6_);
endcomp;
run;

proc print data=work.repout;
title 'WORK.REPOUT is Output dataset from PROC REPORT';
format date date9.;
run;

ods html close;

[/pre]
Ask a Question
Discussion stats
  • 4 replies
  • 165 views
  • 0 likes
  • 5 in conversation