I have the following longitudinal data:
id PD
01 0.5
01 1.0
01 1.5
01 0.9
01 1.6
01 1.1
02 1.5
02 1.0
02 1.5
02 1.5
02 1.7
03 1.2
03 1.5
03 1.0
03 1.5
03 0.5
03 1.7
03 1.2
04 0.2
04 0.5
04 0.6
04 0.5
04 0.5
04 0.7
04 0.2
For each id, the PD value fluctuates between 0 and 1.7.
I have these three categories: i) PD>1.15, ii) 0.9<PD<=1.15, iii) PD<=0.9.
I want to create a third variable, GroupPD, that records which categories the PD values of each id fall into. Every row of an id gets the same GroupPD value:
a) if PD fluctuates across all three categories, GroupPD will be "A"
b) if PD fluctuates between PD>1.15 and 0.9<PD<=1.15, GroupPD will be "B"
c) if PD fluctuates between PD>1.15 and PD<=0.9, GroupPD will be "C"
d) if PD fluctuates between 0.9<PD<=1.15 and PD<=0.9, GroupPD will be "D"
e) if PD stays within PD>1.15, GroupPD will be "E"
f) if PD stays within 0.9<PD<=1.15, GroupPD will be "F"
g) if PD stays within PD<=0.9, GroupPD will be "G"
Could someone please provide code to do this?
Expected output
id PD GroupPD
01 0.5 A
01 1.0 A
01 1.5 A
01 0.9 A
01 1.6 A
01 1.1 A
02 1.5 B
02 1.0 B
02 1.5 B
02 1.5 B
02 1.7 B
03 1.2 C
03 1.5 C
03 1.0 C
03 1.5 C
03 0.5 C
03 1.7 C
03 1.2 C
04 0.2 G
04 0.5 G
04 0.6 G
04 0.5 G
04 0.5 G
04 0.7 G
04 0.2 G
data have;
input id $ 1-2 pd;
datalines;
01 0.5
01 1.0
01 1.5
01 0.9
01 1.6
01 1.1
02 1.5
02 1.0
02 1.5
02 1.5
02 1.7
03 1.2
03 1.5
03 1.0
03 1.5
03 0.5
03 1.7
03 1.2
04 0.2
04 0.5
04 0.6
04 0.5
04 0.5
04 0.7
04 0.2
;
run;
proc print data=have;
run;
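* BY-group processing below requires HAVE to be sorted by id (run PROC SORT first if it is not);
data want(drop=low medium high pd);
    * Counters for the three PD bands, carried across the rows of each id;
    retain low medium high;
    set have;
    by id;

    * Reset the counters at the first row of each id;
    if first.id then do;
        low=0;
        medium=0;
        high=0;
    end;

    * Count the row in its band (a missing pd compares below 0.9 and would count as low);
    if pd le 0.9 then low+1;
    else if 0.9 lt pd le 1.15 then medium+1;
    else if pd gt 1.15 then high+1;

    * At the last row of each id, turn the band counts into a letter and write one row per id;
    if last.id then do;
        groupPD='A';  * default: all three bands were visited;
        if (low=0 and medium gt 0 and high gt 0) then groupPD='B';
        else if (low gt 0 and medium=0 and high gt 0) then groupPD='C';
        else if (low gt 0 and medium gt 0 and high=0) then groupPD='D';
        else if (low=0 and medium=0 and high gt 0) then groupPD='E';
        else if (low=0 and medium gt 0 and high=0) then groupPD='F';
        else if (low gt 0 and medium=0 and high=0) then groupPD='G';
        output;
    end;
run;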
proc print data=want;
run;
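The step above writes one row per id and drops pd, whereas the expected output in the question repeats GroupPD on every original row. A minimal sketch of one way to get that layout, assuming HAVE and WANT exist exactly as created above and are both sorted by id (the data set name want_all is made up here for illustration):

```sas
* Assumption: HAVE (id, pd) and WANT (id, groupPD) were created by the steps above;
* want_all is a hypothetical name for the row-level result;
data want_all;
    * One-to-many match-merge: groupPD from WANT is carried onto every HAVE row with the same id;
    merge have want;
    by id;
run;

proc print data=want_all noobs;
run;
```

With the cut-offs as coded, ids 01 and 04 come out as A and G. Id 03 contains the value 1.0, which falls in the 0.9<PD<=1.15 band, so it has values in all three bands and is classified as A rather than the C shown in the question.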
Please show us the expected output so we can give a better response.
Thank you very much!!!