Hi All,
I have a dataset with 24 variables, but I would only like to transpose 3 of them while keeping the others in the same orientation. The dataset describes a physical exam by body part and whether the exam is normal/abnormal and if abnormal the diagnosis is indicated. The datasets looks as follows:
TrialID Center Patient Visit ExamDt Body Exam Diag Sex Race Consent ConsentDt Wt RandomDt PerformStat Trt StopReas TrtGive Age TrtDt CtrPt TrialCtrPt
Body has the following categories: head, eye, ear, nose, throat, neck, lymph node, breast, heart, abdomen, lung/thorax, genitourinary, extremities, musculoskeletal, skin, neurologic, additional
Each patient has an enty for each body part so there are numerous rows with the same patient.
Is there a way to only transpose body, exam and diag while keeping the other variables in the same orientation? I have tried, but it seems that only the variable in the code below and the other variables will be dropped in the transposed dataset.
proc transpose data=medhistory out=transmedhist(drop=_:) prefix=exam_;
var exam;
by TrialCtrPt;
id body;
run;
If I've interpreted what you want correctly, this will come close. I'm assuming that you want multiple body category examinations and diagnoses per patient. I'm also assuming that trialctrpt is the patient id, and patient is some other way of identifying them. Don't worry too much about the first three steps - it's just my attempt at mocking up some test data.
Rather than trying to do it all in one transpose, which then requires lots of complicated arrays and post-processing, I went over the patient/body/exam/diag data three times, and then rejoined back with the original dataset.
data medhistory(drop=body exam diag)
patient(keep=trialctrpt body exam diag);
length TrialID Center Patient Visit ExamDt 8;
length Body exam diag $ 20;
length Sex Race Consent ConsentDt Wt RandomDt PerformStat Trt StopReas TrtGive Age TrtDt CtrPt TrialCtrPt 8;
array body_a[17] $ 20 _temporary_ ('head', 'eye', 'ear', 'nose', 'throat', 'neck', 'lymph node', 'breast', 'heart', 'abdomen',
'lung/thorax', 'genitourinary', 'extremities', 'musculoskeletal', 'skin', 'neurologic',
'additional');
call missing(of _all_);
do trialctrpt = 1 to 100;
do i = 1 to int(ranuni(225465114) * 3) + 1;
body = body_a[int(ranuni(225465114) * dim(body_a)) + 1];
exam = ifc(ranuni(225465114) > .1, 'normal', 'abnormal');
diag = ifc(exam = 'abnormal', 'indicated', ' ');
output patient; /* Multiple observations per patient (in this example, up to 3) */
end;
output medhistory; /* Single observations per patient */
end;
stop;
keep TrialID Center Patient Visit ExamDt Body exam diag
sex Race Consent ConsentDt Wt RandomDt PerformStat Trt StopReas TrtGive Age TrtDt CtrPt TrialCtrPt ;
run;
/*
Clean data to stop multiple body entries per patient
*/
proc sort data=patient noequals;
by trialctrpt body exam;
run;
data patient;
set patient;
by trialctrpt body;
if first.body;
run;
/*
Get the three different transpositions
*/
proc transpose data=patient out=transbody(drop=_name_) prefix=body_;
by trialctrpt;
var body;
run;
proc transpose data=patient out=transexam(drop=_name_) prefix=exam_;
by trialctrpt;
var exam;
run;
proc transpose data=patient out=transdiag(drop=_name_) prefix=diag_;
by trialctrpt;
var diag;
run;
/*
Join back with the patient records.
*/
data medhistory;
set medhistory;
set transbody;
set transexam;
set transdiag;
run;
If I've interpreted what you want correctly, this will come close. I'm assuming that you want multiple body category examinations and diagnoses per patient. I'm also assuming that trialctrpt is the patient id, and patient is some other way of identifying them. Don't worry too much about the first three steps - it's just my attempt at mocking up some test data.
Rather than trying to do it all in one transpose, which then requires lots of complicated arrays and post-processing, I went over the patient/body/exam/diag data three times, and then rejoined back with the original dataset.
data medhistory(drop=body exam diag)
patient(keep=trialctrpt body exam diag);
length TrialID Center Patient Visit ExamDt 8;
length Body exam diag $ 20;
length Sex Race Consent ConsentDt Wt RandomDt PerformStat Trt StopReas TrtGive Age TrtDt CtrPt TrialCtrPt 8;
array body_a[17] $ 20 _temporary_ ('head', 'eye', 'ear', 'nose', 'throat', 'neck', 'lymph node', 'breast', 'heart', 'abdomen',
'lung/thorax', 'genitourinary', 'extremities', 'musculoskeletal', 'skin', 'neurologic',
'additional');
call missing(of _all_);
do trialctrpt = 1 to 100;
do i = 1 to int(ranuni(225465114) * 3) + 1;
body = body_a[int(ranuni(225465114) * dim(body_a)) + 1];
exam = ifc(ranuni(225465114) > .1, 'normal', 'abnormal');
diag = ifc(exam = 'abnormal', 'indicated', ' ');
output patient; /* Multiple observations per patient (in this example, up to 3) */
end;
output medhistory; /* Single observations per patient */
end;
stop;
keep TrialID Center Patient Visit ExamDt Body exam diag
sex Race Consent ConsentDt Wt RandomDt PerformStat Trt StopReas TrtGive Age TrtDt CtrPt TrialCtrPt ;
run;
/*
Clean data to stop multiple body entries per patient
*/
proc sort data=patient noequals;
by trialctrpt body exam;
run;
data patient;
set patient;
by trialctrpt body;
if first.body;
run;
/*
Get the three different transpositions
*/
proc transpose data=patient out=transbody(drop=_name_) prefix=body_;
by trialctrpt;
var body;
run;
proc transpose data=patient out=transexam(drop=_name_) prefix=exam_;
by trialctrpt;
var exam;
run;
proc transpose data=patient out=transdiag(drop=_name_) prefix=diag_;
by trialctrpt;
var diag;
run;
/*
Join back with the patient records.
*/
data medhistory;
set medhistory;
set transbody;
set transexam;
set transdiag;
run;
Are you going from a long to wide - adding/creating variables for final data set?
Or wide to long - reducing the number of variables?
It really helps to clarify problems with a few rows of example input or starting data and the desired result for that output.
Sorry about that. An example of the dataset is:
TrialID Center Patient Visit ExamDt Body Exam Diag Sex Race Consent ConsentDt Wt
abc 1 1 1 2/3/1990 head normal F W Yes 2/1/1990 115
abc 1 1 1 2/3/1990 neck normal F W Yes 2/1/1990 115
abc 1 1 1 2/3/1990 heart abnormal htn F W Yes 2/1/1990 115
abc 2 1 1 2/3/1990 heart abnormal htn F B Yes 2/3/1990 110
RandomDt PerformStat Trt StopReas TrtGive Age TrtDt CtrPt TrialCtrPt
2/1/1990 1 x PD x 25 2/3/1990 1/1 abc/1/1
2/1/1990 1 x PD x 25 2/3/1990 1/1 abc/1/1
2/1/1990 1 x PD x 25 2/3/1990 1/1 abc/1/1
2/3/1990 1 x PD x 30 2/3/1990 2/1 abc/2/1
I would like the data to be the same for all the variables except body, exam and diag which I would think would be column headings of exam_head, exam_neck, etc and the same for diag (diag_head, diag_neck, etc.).
Hopefully that helps clarify what I am looking for.
Thank you for your help.
Could you use the instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to create that data as a data step that you could paste here as text? Then we would have a data set we could test code against.
And what value would Body_head, body_neck, have?
Please find the code below to generate the data
data mock;
infile datalines dsd;
input trialid $ centre pt visit examdt mmddyy. body $ exam $ diag $ sex $ race $ consent $ consentdt mmddyy. wt randomdt mmddyy. performstat trt $ stopreas $ trtgive $ age
trtdt mmddyy. ctrpt $ trialctrpt $ ;
datalines;
abc,1,1,2/3/1990,head,normal, ,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,eye,normal, ,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,ear,normal, ,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,nose,normal, ,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,throat,abnormal,lump,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,neck,abnormal,enlarged,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,lymphnode,abnormal,large,,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,breast,abnormal,warm,,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,heart,normal, ,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,abdomen,normal, ,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,lung/thorax,abnormal,dyspnea,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,genitourinary,normal, ,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,extremities,normal, ,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,musculoskeletal,normal, ,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,skin,abnormal,discolored,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,1,1,2/3/1990,neurologic,normal, ,F,W,Yes,2/1/1990,115,2/1/1990,1,x,PD,x,25,2/3/1990,1/1,abc/1/1
abc,2,1,2/3/1990,head,normal, ,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,eye,normal, ,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,ear,normal, ,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,nose,normal, ,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,throat,abnormal,lump,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,neck,abnormal,enlarged,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,lymphnode,abnormal,large,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,breast,abnormal,warm,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,heart,normal, ,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,abdomen,normal, ,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,lung/thorax,abnormal,dyspnea,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,genitourinary,normal, ,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,extremities,normal, ,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,musculoskeletal,normal, ,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,skin,abnormal,discolored,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,2,1,2/3/1990,neurologic,normal, ,F,B,Yes,2/3/1990,110,2/3/1990,1,x,PD,x,30,2/3/1990,2/1,abc/2/1
abc,3,1,2/3/1990,head,normal, ,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,eye,normal, ,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,ear,normal, ,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,nose,normal, ,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,throat,abnormal,lump,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,neck,abnormal,enlarged,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,lymphnode,abnormal,large,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,breast,abnormal,warm,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,heart,normal, ,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,abdomen,normal, ,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,lung/thorax,abnormal,dyspnea,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,genitourinary,normal, ,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,extremities,normal, ,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,musculoskeletal,normal, ,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,skin,abnormal,discolored, ,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,3,1,2/3/1990,neurologic,normal, ,M,W,Yes,2/3/1990,140,2/3/1990,1,x,death,x,30,2/3/1990,3/1,abc/3/1
abc,4,1,2/3/1990,head,normal, ,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,eye,normal, ,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,ear,normal, ,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,nose,normal, ,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,throat,abnormal,lump,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,neck,abnormal,enlarged,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,lymphnode,abnormal,large,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,breast,abnormal,warm,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,heart,normal, ,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,abdomen,normal, ,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,lung/thorax,abnormal,dyspnea,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,genitourinary,normal, ,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,extremities,normal, ,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,musculoskeletal,normal, ,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,skin,abnormal,discolored,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
abc,4,1,2/3/1990,neurologic,normal, ,M,B,Yes,2/3/1990,165,2/3/1990,1,x,death,x,45,2/3/1990,4/1,abc/4/1
;
run;
Thanks for your help.
Does that code correctly run for you when you copy and paste into an editor and run?
I get multple blocks similar to:
NOTE: Invalid data for visit in line 14 9-16. NOTE: Invalid data for examdt in line 14 18-23. NOTE: Invalid data for consentdt in line 14 49-54. NOTE: Invalid data for wt in line 14 55-60. NOTE: Invalid data for randomdt in line 14 62-67. NOTE: Invalid data for age in line 14 83-85. NOTE: Invalid data for trtdt in line 14 87-92.
Your examdt is the 5th variable on the input statement but the values are the 4th. So the exam date is being read into visit (unsuccessfully in general).
Here is a stab at what I think you may be asking for minus what ever would go into Body_ etc variables.
Note I am only using part of your example data as 1) I didn't want to spend too much time getting a workable data set, 2) adding the other variables is just adding to them to the select statement in the proc sql
The BY statements all need to have the same variables, in the same order and should be the minimum needed to identify the specific bit you want.
data have; informat TrialID $5. ExamDt mmddyy10. Body $10. Exam $10. Diag $10. Sex $1. Race $1. Consent $5. ConsentDt mmddyy10. ; format examdt consentdt mmddyy10.; input TrialID Center Patient Visit ExamDt Body Exam Diag Sex Race Consent ConsentDt Wt ; datalines; abc 1 1 1 2/3/1990 head normal . F W Yes 2/1/1990 115 abc 1 1 1 2/3/1990 neck normal . F W Yes 2/1/1990 115 abc 1 1 1 2/3/1990 heart abnormal htn F W Yes 2/1/1990 115 abc 2 1 1 2/3/1990 heart abnormal htn F B Yes 2/3/1990 110 ; run; proc sort data=have; by TrialID Center Patient Visit ExamDt; run; proc transpose data=have (drop=diag) out=transexam (drop=_name_) prefix=exam_; by TrialID Center Patient Visit ExamDt; var exam; id body; run; proc transpose data=have (drop=exam) out=transdiag (drop=_name_) prefix=diag_; by TrialID Center Patient Visit ExamDt; var diag; id body; run; proc sql; create table temp as select distinct TrialID,Center,Patient,Visit,ExamDt,Sex,Race,Consent,ConsentDt,Wt from have order by TrialID,Center,Patient,Visit,ExamDt; quit; data want; merge temp transexam transdiag; by TrialID Center Patient Visit ExamDt; run;
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.