Help using Base SAS procedures

Converting from visit-level to person-level data

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 82
Accepted Solution

Converting from visit-level to person-level data

Hello, I am analyzing a health care utilization survey, and I have a clinic visit dataset in which each observation represents one clinic visit.  For each observation, there is a data element for the person's unique identifier (PID) and the month of the visit (MONTH - takes on values of 1-12).  As a simplified example, say I had five people (PID 1-5).  The dataset would look like this (note that PID 3 had no visits that year and therefore has no observations in the clinic visit dataset):

PID            MONTH

1               3

2               5

4               6

4               1

5               12

I am trying to create a person-level dataset in which each observation represents one person.  The dataset would have information on the number of clinic visits in each month, e.g.:

PID   VIS_JAN   VIS_FEB  VIS_MAR  VIS_APR  VIS_MAY  VIS_JUN   VIS_JUL  VIS_AUG  VIS_SEP  VIS_OCT   VIS_NOV  VIS_DEC

1          0                 0               1              0             0               0             0              0              0               0               0             0

2          0                 0               0              0             1               0             0              0              0               0               0             0

3          0                 0               0              0             0               0             0              0              0               0               0             0

4          1                 0               0              0             0               1             0              0              0               0               0             0

5          0                 0               0              0             0               0             0              0              0               0               0             1

Any suggestions?  Thanks!


Accepted Solutions
Solution
‎07-05-2013 12:55 PM
Super User
Super User
Posts: 6,681

Re: Converting from visit-level to person-level data

You have to use your new dataset and tell summary that PID is now a CLASS variable.

proc summary data=have nway classdata=CLASSDATA;

  class pid month ;

  output out=counts ;

run;

View solution in original post


All Replies
Super Contributor
Posts: 333

Re: Converting from visit-level to person-level data

This is quick and dirty so some optimization could probably be done but hopefully this gets you going:

A couple of notes:

1) The card statements are just getting test data in. You need a table with all your ids present and all the months you want to use present so the sql can make the cartesian product between the ids and months.

2) Some of the sql steps can be combined, but I left them separate so it was easier to see what was happening.

infile cards dlm=',';
input pid month;
cards;

1, 3

2, 5

4, 6

4, 1

5, 12

1, 3

8, 5

4, 6

4, 2

5, 12

;

run;

data all;

infile cards ;
input pid;
cards;

1

2

4

3

4

5

5

6

7

8

8

9

2

10

;

run;

data mnthin;

infile cards;
input month;
cards;

1

2

3

4

4

4

5

5

6

7

8

8

9

10

11

12

;

run;

proc sql;

create table cnts as
select pid, month, count(*) as cnt
from in
group by pid, month
;
create table ids as
select distinct pid
from all
;
create table mnth as
select distinct month
from mnthin
;
create table cartesian as
select a.pid, b.month
from ids as a , mnth as b
;
create table out1 as
select a.pid, a.month, coalesce(b.cnt,0) as cnt
from cartesian as a left join cnts as b on a.pid =b.pid and a.month = b.month
order by pid, month
;

quit;

Hope this helps!

EJ

data out2;

set out1;
by pid month;

if first.pid then do;
vis_jan = 0; vis_feb = 0; vis_mar = 0; vis_apr = 0;
vis_may = 0; vis_jun = 0; vis_jul = 0; vis_aug = 0;
vis_sep = 0; vis_oct = 0; vis_nov = 0; vis_dec = 0;
end;

select (month);
when (1) vis_jan = cnt;
when (2) vis_feb = cnt;
when (3) vis_mar = cnt;
when (4) vis_apr = cnt;
when (5) vis_may = cnt;
when (6) vis_jun = cnt;
when (7) vis_jul = cnt;
when (8) vis_aug = cnt;
when (9) vis_sep = cnt;
when (10) vis_oct = cnt;
when (11) vis_nov = cnt;
when (12) vis_dec = cnt;
end;

retain vis: ;
drop cnt month;
if last.pid then output;

run;

Frequent Contributor
Posts: 82

Re: Converting from visit-level to person-level data

Thanks EJ.  I have never used SQL before and am having trouble deciphering the code. I should probably take some time to learn SQL in the near future. 

I'm wondering, though, if there is a non-SQL way to do this?

Super Contributor
Posts: 333

Re: Converting from visit-level to person-level data

The sql is doing the counts and creating a full set of ids months to merge the counts into. If I remember correctly a data step merge will not give you a cartesian product (all the ids months combinations) in a long format instead of the wide you desire.

You can repalce the sql count with proc freq and an output statement (with some additional cleanup)

You can replace the two distincts with proc sort  nodupkey

Im sure others can provide better data step suggestions.

EJ

Super User
Super User
Posts: 6,681

Re: Converting from visit-level to person-level data

Use PROC SUMMARY and then TRANSPOSE.  You can create a CLASSDATA table to insure that all months are represented.

data months;

do month=1 to 12; output; end;

run;

proc summary data=have nway classdata=months;

  by pid;

  class month ;

  output out=counts ;

run;

proc transpose data=counts out=want prefix=vis_ ;

  by pid;

  id month;

  var _freq_;

run;

Super User
Posts: 9,758

Re: Converting from visit-level to person-level data

data have;
input PID            MONTH;
cards;
1               3
2               5
3               6
4               1
5               12
;
run;
data want(keep=pid vis:) ;
 set have;
 array a{12} VIS_JAN   VIS_FEB  VIS_MAR  VIS_APR  VIS_MAY  VIS_JUN   VIS_JUL  VIS_AUG  VIS_SEP  VIS_OCT   VIS_NOV  VIS_DEC ;
 do i=1 to dim(a);
  if i=month then a{i}=1;
   else a{i}=0;
 end;
run;

Ksharp

Super Contributor
Posts: 333

Re: Converting from visit-level to person-level data

and

Correct me if Im wrong but I dont think either of your methods account for PIDs that are not in the source dataset. So if a full set is desired some would be missing.

Also believe Ksharp code just flags if a visit happens in a month and not the number of visits in that month, (again forgive if I missed something).

EJ

Super User
Super User
Posts: 6,681

Re: Converting from visit-level to person-level data

It will include all PID values in the input dataset, but If there is a source of PID values in another dataset then you can use that to generate a CLASSDATA dataset for PROC SUMMARY.  Just convert the proc summary code to treat PID as a CLASS variable instead of a BY variable.

Here is one way.

data classdata ;

  set PIDLIST (keep=pid );

  by pid;

  if first.pid;

  do month=1 to 12;

    output;

  end;

run;

Frequent Contributor
Posts: 82

Re: Converting from visit-level to person-level data

Thanks Tom.  I still am not getting any entries for PID 3, who is not in the original clinical visit dataset.  This is the code I am using:

data have;

input PID            MONTH;

cards;

1               3

2               5

4               6

4               1

5               12

;

run;

data PIDLIST;

do PID = 1 to 5; output; end;

run;

data classdata;

  set PIDLIST (keep = PID);

  by PID;

  if first.PID;

  do month=1 to 12;

    output;

  end;

run;

proc summary data=have nway classdata=months;

  by pid;

  class month ;

  output out=counts ;

run;

proc transpose data=counts out=want prefix=vis_ ;

  by pid;

  id month;

  var _freq_;

run;

Solution
‎07-05-2013 12:55 PM
Super User
Super User
Posts: 6,681

Re: Converting from visit-level to person-level data

You have to use your new dataset and tell summary that PID is now a CLASS variable.

proc summary data=have nway classdata=CLASSDATA;

  class pid month ;

  output out=counts ;

run;

Frequent Contributor
Posts: 82

Re: Converting from visit-level to person-level data

Thanks, Tom.  I appreciate your help.

Frequent Contributor
Posts: 82

Re: Converting from visit-level to person-level data

Thanks to Tom and KSharp for their input.  EJ is correct; neither of the solutions account for PIDs that are not in the source dataset. 

Super User
Posts: 9,758

Re: Converting from visit-level to person-level data

It looks like I don't understand OP totally.

data have;
input PID            MONTH;
cards;
1               3
2               5
4               6
4               1
5               12
;
run;
 

 

data PIDLIST;
do PID = 1 to 5; output; end;
run;

data have;
 merge have pidlist;
 by pid;
run;

data temp(keep=pid vis:) ;
 set have;
 array a{12} VIS_JAN   VIS_FEB  VIS_MAR  VIS_APR  VIS_MAY  VIS_JUN   VIS_JUL  VIS_AUG  VIS_SEP  VIS_OCT   VIS_NOV  VIS_DEC ;
 do i=1 to dim(a);
  if i=month then a{i}=1;
   else a{i}=0;
 end;
run;
proc summary data=temp;
 by pid;
 var vis:;
 output out=want(drop=_:) sum= ;
run;

Ksharp

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 347 views
  • 3 likes
  • 4 in conversation