Hello,
Please, I have a data set in longitudinal format as show below. I want to create additional column CENST in which the values of CENS is shift one row up and the values after first none-zero value is missing for each subject.
data have;
input subjid age cens censt;
datalines;
SUBIJD |
age |
cens |
100-001 |
50 |
0 |
100-001 |
50 |
0 |
100-001 |
50 |
1 |
100-001 |
50 |
1 |
100-001 |
50 |
1 |
100-002 |
35 |
0 |
100-002 |
35 |
0 |
100-002 |
35 |
0 |
100-002 |
35 |
1 |
100-002 |
35 |
1 |
100-002 |
35 |
1 |
Data want;
SUBIJD |
age |
cens |
censt |
100-001 |
50 |
0 |
0 |
100-001 |
50 |
0 |
1 |
100-001 |
50 |
1 |
. |
100-001 |
50 |
1 |
. |
100-001 |
50 |
1 |
. |
100-002 |
35 |
0 |
0 |
100-002 |
35 |
0 |
0 |
100-002 |
35 |
0 |
1 |
100-002 |
35 |
1 |
. |
100-002 |
35 |
1 |
. |
100-002 |
35 |
1 |
. |
;run;
data have;
input subjid $ age cens ;
datalines;
100-001
50
0
100-001
50
0
100-001
50
1
100-001
50
1
100-001
50
1
100-002
35
0
100-002
35
0
100-002
35
0
100-002
35
1
100-002
35
1
100-002
35
1
;
data want;
do until(last.subjid);
set have;
by subjid cens notsorted;
censt=cens;
if not cens and last.cens then censt=1;
else if cens then call missing(censt);
output;
end;
run;
What if the first CENS for a given ID is already 1?
And please post WORKING data step code into a window opened with the "little running man"; only then will we be able to recreate your dataset without more work than a copy/paste and submit. Your code as posted will not run.
data have;
input subjid $ age cens ;
datalines;
100-001
50
0
100-001
50
0
100-001
50
1
100-001
50
1
100-001
50
1
100-002
35
0
100-002
35
0
100-002
35
0
100-002
35
1
100-002
35
1
100-002
35
1
;
data want;
do until(last.subjid);
set have;
by subjid cens notsorted;
censt=cens;
if not cens and last.cens then censt=1;
else if cens then call missing(censt);
output;
end;
run;
The is a typical set/by plus self-merge with offset task:
data want (drop=onefound);
retain onefound 'N';
set have (keep=subjid);
by subjid;
if first.subjid then onefound='N';
merge have
have (firstobs=2 keep=cens rename=(cens=censt));
if last.subjid=1 or onefound='Y' then call missing(censt);
if censt=1 then onefound='Y';
run;
Thanks!
CENS remains as it. CENST is obtained from CENS by shifting one record upward and then declaring missing all 1s after the first 1. One can think about it this way. Within each ID, data represent values of measurement for different visit. CENS reflects the point an event occurred. By shifting CENS one record up, the analysis can use values of other variables just before the event occured because the patient is at risk prior to the event captured by CENS.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.