I have a dataset that is structured like this:
Person Demographic1 Age Diagnosis1 SecondaryDiagnosis SecondaryDiagnosisPosition
Bob Male 45 424.0 250.1 2
Bob Male 45 424.0 V89.1 3
Bob Male 45 424.0 715.0 4
Jane Female 22 492.11 V25 2
Roger Male 10 720.1
etc etc...
I would like to create a dataset like this:
Person Demographic1 Age Diagnosis1 diagnosis2 diagnosis3 diagnosis4.....etc
Bob Male 45 424.0 250.1 V89.1 715.0
Jane Female 22 492.11 V25
Roger Male 10 720.1
etc...
My first thought was to create a new variable for each secondary diagnosis position (so... diagnosis2, diagnosis3, etc) and have it store the associated diagnosis code. I used this code:
data x; set data y;
diagnosis3 = .;
IF(secondarydiagnosis) =3 then diagnosis3= secondarydiagnosis;
run;
I tried this, but it didn't quite work- it errors out because of the v-codes (character data) inherent in ICD-9 coding. It seems like when I created the new variable, it was automatically created as a numeric variable, when it should've been a character variable. I've tried to switch it to a character, but when I do that, I end up deleting my entire dataset.
data x;
diagnosis2_char = put(diagnosis2, 6.)
drop diagnosis2;
rename diagnosis2_char = diagnosis2;
run;
Primary question: How should I be creating my new variables to get around this issue with the V-codes without deleting my dataset?
Secondary question: I'm not sure that I've chosen the most efficient way to get from the dataset structure I have to the dataset structure that I want- so if there are comments on that, it would be appreciated as well- but my primary problem is the character/ numeric situation.
Thank you.
Try TRANSPOSE:
data have;
infile datalines missover;
input Person $ Demographic1 $ Age Diagnosis1 $ SecondaryDiagnosis $ SecondaryDiagnosisPosition;
datalines;
Bob Male 45 424.0 250.1 2
Bob Male 45 424.0 V89.1 3
Bob Male 45 424.0 715.0 4
Jane Female 22 492.11 V25 2
Roger Male 10 720.1
;
proc transpose data=have out=want(drop=_name_) prefix=Diagnosis;
by Person Demographic1 Age Diagnosis1 notsorted;
var SecondaryDiagnosis;
id SecondaryDiagnosisPosition;
run;
PG
Try TRANSPOSE:
data have;
infile datalines missover;
input Person $ Demographic1 $ Age Diagnosis1 $ SecondaryDiagnosis $ SecondaryDiagnosisPosition;
datalines;
Bob Male 45 424.0 250.1 2
Bob Male 45 424.0 V89.1 3
Bob Male 45 424.0 715.0 4
Jane Female 22 492.11 V25 2
Roger Male 10 720.1
;
proc transpose data=have out=want(drop=_name_) prefix=Diagnosis;
by Person Demographic1 Age Diagnosis1 notsorted;
var SecondaryDiagnosis;
id SecondaryDiagnosisPosition;
run;
PG
Thanks for the reply-- I need to research TRANSPOSE (never used it before). I will try this out- thank you!
PROC TRANSPOSE is designed to do this.
Your first attempt created a numeric variable because the first reference to it was when you assigned a numeric value to it. (diagnosis3 = .;)
Your second attempt does not have any data since you forgot to include a SET statement. So it will create a dataset with only one observation.
You were on the right track to a brute force solution. One trick to make it easier is to nest the SET statement inside of a DO loop.
data want;
do until (last.person);
set have;
by person;
if secondarydiagnosisposition =2 then diagnosis2= secondarydiagnosis;
if secondarydiagnosisposition =3 then diagnosis3= secondarydiagnosis;
if secondarydiagnosisposition =4 then diagnosis4= secondarydiagnosis;
if secondarydiagnosisposition =5 then diagnosis5= secondarydiagnosis;
if secondarydiagnosisposition =6 then diagnosis6= secondarydiagnosis;
if secondarydiagnosisposition =7 then diagnosis7= secondarydiagnosis;
if secondarydiagnosisposition =8 then diagnosis8= secondarydiagnosis;
if secondarydiagnosisposition =9 then diagnosis9= secondarydiagnosis;
end;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.