I have this longitudinal data:
id PD
01 0.5
01 1.0
01 1.5
01 0.9
01 1.6
01 1.1
02 1.5
02 1.0
02 1.5
02 1.5
02 1.7
03 1.2
03 1.5
03 1.0
03 1.5
03 0.5
03 1.7
03 1.2
04 0.2
04 0.5
04 0.6
04 0.5
04 0.5
04 0.7
04 0.2
For each id, the PD value fluctuates between 0 and 1.7.
I have these three categories: i) PD>1.15, ii) 0.9<PD<=1.15, iii) PD<=0.9.
I want to create a third variable, GroupPD, that takes one value per id according to how PD moves across the categories:
a) if PD fluctuates across all three categories, GroupPD will be "A"
b) if PD fluctuates between PD>1.15 and 0.9<PD<=1.15, GroupPD will be "B"
c) if PD fluctuates between PD>1.15 and PD<=0.9, GroupPD will be "C"
d) if PD fluctuates between 0.9<PD<=1.15 and PD<=0.9, GroupPD will be "D"
e) if PD stays within PD>1.15, GroupPD will be "E"
f) if PD stays within 0.9<PD<=1.15, GroupPD will be "F"
g) if PD stays within PD<=0.9, GroupPD will be "G"
Could someone please provide code to do this?
Expected output
id PD GroupPD
01 0.5 A
01 1.0 A
01 1.5 A
01 0.9 A
01 1.6 A
01 1.1 A
02 1.5 B
02 1.0 B
02 1.5 B
02 1.5 B
02 1.7 B
03 1.2 C
03 1.5 C
03 1.0 C
03 1.5 C
03 0.5 C
03 1.7 C
03 1.2 C
04 0.2 G
04 0.5 G
04 0.6 G
04 0.5 G
04 0.5 G
04 0.7 G
04 0.2 G
data have;
input id $ 1-2 pd;
datalines;
01 0.5
01 1.0
01 1.5
01 0.9
01 1.6
01 1.1
02 1.5
02 1.0
02 1.5
02 1.5
02 1.7
03 1.2
03 1.5
03 1.0
03 1.5
03 0.5
03 1.7
03 1.2
04 0.2
04 0.5
04 0.6
04 0.5
04 0.5
04 0.7
04 0.2
;
run;
proc print data=have;
run;
* sort data set if not already sorted by id;
data want(drop=low medium high pd);
    retain low medium high;
    set have;
    by id;
    * reset the per-category counters at the start of each id;
    if first.id then
        do;
            low=0;
            medium=0;
            high=0;
        end;
    * count how many PD values of this id fall in each category;
    if pd le 0.9 then low+1;
    else if 0.9 lt pd le 1.15 then medium+1;
    else if pd gt 1.15 then high+1;
    * on the last row of each id, assign the group and write one row per id;
    if last.id then
        do;
            * default A covers the case where all three counters are positive;
            groupPD='A';
            if (low=0 and medium gt 0 and high gt 0) then groupPD='B';
            else if (low gt 0 and medium=0 and high gt 0) then groupPD='C';
            else if (low gt 0 and medium gt 0 and high=0) then groupPD='D';
            else if (low=0 and medium=0 and high gt 0) then groupPD='E';
            else if (low=0 and medium gt 0 and high=0) then groupPD='F';
            else if (low gt 0 and medium=0 and high=0) then groupPD='G';
            output;
        end;
run;
proc print data=want;
run;
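Note that the step above writes one row per id (id and groupPD only), whereas the expected output in the question repeats GroupPD on every PD row. One possible way to get that layout is to merge the per-id result back onto the original rows. This is only a sketch: it assumes both have and want are sorted by id, and want_all is a hypothetical data set name that does not appear in the thread.

```sas
* Sketch only. Assumes HAVE and WANT are both sorted by id;
* and that WANT holds exactly one row per id with id and groupPD;
data want_all; /* hypothetical output name */
    merge have want;
    by id;
run;

proc print data=want_all;
run;
```

Because want has a single row per id, the merge carries that row's groupPD onto every matching row of have, so each id/PD pair ends up with its group alongside it.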
Please show us the expected output so that we can give a better response.
Thank you very much!!!