Solved: Re: Create a count variable that accumulates events and accounts for n...

ANKH1 · Posted 02-02-2024 02:10 PM

In SAS I have two datasets.
Dataset1: age_event variable is the age at which each ID reported an event. One same ID can have one or more than one event. Not all IDs from the sample reported events. IDs that did not reported events were not included in this dataset.

data ds1;
input ID age_event;
datalines;
a1	32
b2	54

b2	67
c3	34
c3	45
c3	78
;
run;

Dataset2: All IDs of the sample are reported. This dataset contains one row per ID. variable "last_agerecorded" is the age at which each reported their last record for the whole study.

data main;
input ID last_agerecorded;
datalines;
a1	56
a2	67
b1	68
b2	72
b3	132
c2	121
c3	124
c4	58
d1	89
d2	95
e2	74
;
run;

We would like to create a count_event variable that counts the number of events per ID. However, if an ID reports more than one event, count_event variable will need to add the total of events up to the last "age_event" reported. See cases for ID=b2 and c3 below. Each event has one row and the "count_event" variable sums the number of event at the last age_event. If the ID did not report any event at all, then age_event should be equal to "last_agerecorded" from the main dataset and the count_event should be equal to zero.

This is the required output:

ID	age_event	count_event
a1	32	1
a2	67	0
b1	68	0
b2	54	1
b2	67	2
b3	132	0
c2	121	0
c3	34	1
c3	45	2
c3	78	3
c4	58	0
d1	89	0
d2	95	0
e2	74	0

How can we get the required output? Thanks!

mkeintz · Posted 02-03-2024 07:57 AM

Assuming the data are sorted by ID (main) or ID/age_event (DS1), you can use the interleaving process of SET with BY, as in:

data ds1;
input ID $ age_event;
datalines;
a1 32
b2 54
b2 67
c3 34
c3 45
c3 78
;
run;

data main;
input ID $ last_agerecorded;
datalines;
a1 56
a2 67
b1 68
b2 72
b3 132
c2 121
c3 124
c4 58
d1 89
d2 95
e2 74
;

data want;
  set main (rename=last_agerecorded=age_event) ds1 (in=inds1);
  by id;
  if first.id then count=0;
  count+inds1;
  if inds1=1 or last.id;
run;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

ChanceTGardener · Posted 02-02-2024 06:17 PM

data ds1;
input ID $ age_event;
datalines;
a1 32
b2 54
b2 67
c3 34
c3 45
c3 78
;
run;

data main;
input ID $ last_agerecorded;
datalines;
a1 56
a2 67
b1 68
b2 72
b3 132
c2 121
c3 124
c4 58
d1 89
d2 95
e2 74
;
run;

proc sort data=ds1;
 by ID age_event;
run;

data ds1;
 set ds1; 
  by ID;
   if first.ID then count_event = 0;
    count_event + 1;
run;

proc sql;
 create table want as 
  select ID
  ,	     age_event
  ,	     count_event
  from ds1
   outer union corresponding
  select ID
  ,	     last_agerecorded as age_event
  ,      0 as count_event
  from main
  where ID not in(select distinct ID from ds1)
 order by ID, count_event;
quit;

mkeintz · Posted 02-03-2024 07:57 AM

Assuming the data are sorted by ID (main) or ID/age_event (DS1), you can use the interleaving process of SET with BY, as in:

data ds1;
input ID $ age_event;
datalines;
a1 32
b2 54
b2 67
c3 34
c3 45
c3 78
;
run;

data main;
input ID $ last_agerecorded;
datalines;
a1 56
a2 67
b1 68
b2 72
b3 132
c2 121
c3 124
c4 58
d1 89
d2 95
e2 74
;

data want;
  set main (rename=last_agerecorded=age_event) ds1 (in=inds1);
  by id;
  if first.id then count=0;
  count+inds1;
  if inds1=1 or last.id;
run;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Create a count variable that accumulates events and accounts for no events

Re: Create a count variable that accumulates events and accounts for no events

Re: Create a count variable that accumulates events and accounts for no events

Re: Create a count variable that accumulates events and accounts for no events

Create a count variable that accumulates events and accounts for no events

Re: Create a count variable that accumulates events and accounts for no events

Re: Create a count variable that accumulates events and accounts for no events

Re: Create a count variable that accumulates events and accounts for no events

SAS Innovate 2025: Register Now

SAS Training: Just a Click Away