Hi!
I have the following data
data have; input id dt test; datalines;
ABCDE 20100429 T1
ABCDE 20100429 T2
ABCDE 20090908 T1
ABCDE 20090908 T2
ABCDE 20210823 T1
ABCUK 20191008 T1
ABCUK 20191008 T2
ABCUK 20230723 T1
;
In this dataset based on the id, date, and test there are six records that are in 3 pairs and two records that are singles. I want to create a new variable doc_id
in which each unique pair will get a new id and and for singles the field will be blank. See below expected output.
id dt test doc_id
ABCDE 20100429 T1 ABCDE-1
ABCDE 20100429 T2 ABCDE-1
ABCDE 20090908 T1 ABCDE-2
ABCDE 20090908 T2 ABCDE-2
ABCDE 20210823 T1
ABCUK 20191008 T1 ABCUK-1
ABCUK 20191008 T2 ABCUK-1
ABCUK 20230723 T1
Thank you for your inputs!
data have;
input id $ dt:yymmdd10. test $;
format dt date11.;
datalines;
ABCDE 20100429 T1
ABCDE 20100429 T2
ABCDE 20090908 T1
ABCDE 20090908 T2
ABCDE 20210823 T1
ABCUK 20191008 T1
ABCUK 20191008 T2
ABCUK 20230723 T1
;
data want;
set have ;
by id dt notsorted;
retain NEW_ID;
if first.id and first.dt then NEW_ID=1;
else if not first.id and first.dt then NEW_ID+1;
if not (first.dt and last.dt) then doc_id= catx('-', id, new_id);
drop new_id;
run;
data have;
input id $ dt:yymmdd10. test $;
format dt date11.;
datalines;
ABCDE 20100429 T1
ABCDE 20100429 T2
ABCDE 20090908 T1
ABCDE 20090908 T2
ABCDE 20210823 T1
ABCUK 20191008 T1
ABCUK 20191008 T2
ABCUK 20230723 T1
;
data want;
set have ;
by id dt notsorted;
retain NEW_ID;
if first.id and first.dt then NEW_ID=1;
else if not first.id and first.dt then NEW_ID+1;
if not (first.dt and last.dt) then doc_id= catx('-', id, new_id);
drop new_id;
run;
data have;
input id $ dt :yymmdd10. test $;
format dt yymmddn8.;
datalines;
ABCDE 20100429 T1
ABCDE 20100429 T2
ABCDE 20090908 T1
ABCDE 20090908 T2
ABCDE 20210823 T1
ABCUK 20191008 T1
ABCUK 20191008 T2
ABCUK 20230723 T1
;
data temp;
set have;
by id dt notsorted;
if first.id then n=0;
if not (first.dt and last.dt) and first.dt then n+1;
run;
data want;
set temp;
by id dt notsorted;
doc_id=catx('-',id,n);
if first.dt and last.dt then call missing(doc_id);
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.