SAS 9.4
I have a data set with 200+ variables, 64 are score variables at 3 different time points, baseline, 3 months, 12 month, (e.g. score1_baseline - score64_baseline, score1_3mnth - score64_3mnth). I need to convert this from a wide data set to a long set with 1 variable indicating the 3 timepoints. This would lead to each observation having 3 records as a long versus the 1 record as a wide.
Is there a good process for converting 64 variables at one time from wide to long without losing any of the other data? Thank you
DELETED.
I would go the whole way and have a separate observation for each score and timepoint.
Transpose all variables, then, in a follow-up data step, extract the score number and timepoint from _NAME_.
A quick example with 2 scores that automatically expands to 64 on its own:
data have;
input
id $
score1_baseline score1_3mnth score1_12mnth
score2_baseline score2_3mnth score2_12mnth
;
datalines;
A 1 2 3 4 5 6
B 2 3 4 5 6 7
;
proc transpose
data=have
out=long (rename=(col1=score))
;
by id;
var score:;
run;
data want;
set long;
scorenum = input(substr(scan(_name_,1,"_"),6),2.);
if scan(_name_,2,"_") = "baseline"
then timepoint = 0;
else timepoint = input(compress(scan(_name_,-1,"_"),"mnth"),2.);
drop _name_;
run;
Thank you this is a good start. I attempted to modify the code as my variables were slightly different than described. As such I have a couple questions:
1. Is it possible to scan for a phrase in the variable name instead of scanning the position of the name? I noticed a couple had the "baseline" indicator in the 3rd word position or not had a long name than 32 characters so it was shortened to "baseli"
2. The same applies to the 3 and 12 months where it would be helpful to search for "3m" or "12m"
3. When I ran this I got the score and the timepoint but not the variable name and without the variable name I cannot tell which score the number is associated with. Does this make sense?
proc sort data = have;
by pt_study_deidentified;
run;
proc transpose
data=have
out=long (rename=(col1=score));
by pt_study_deidentified;
var back_pain_vas___1Baseline leg_pain_vas1___1Baseline odi_score_pain_intensity___1Base odi_score_personal_care___1Basel
odi_score_lifting___1Baseline odi_score_walking___1Baseline odi_score_sitting___1Baseline odi_score_standing___1Baseline
odi_score_sleeping___1Baseline odi_score_sex_life___1Baseline odi_score_social_life___1Baselin odi_score_travelling___1Baseline
total_odi_sum_score___1Baseline ODI_percent_baseline mobility___1Baseline self_care___1Baseline usual_activities___1Baseline
pain_discomfort___1Baseline anxiety_depression___1Baseline eq_vas___1Baseline actual_qaly_score___1Baseline promis29_pfa11___1Baseline
promis29_pfa21___1Baseline promis29_pfa23___1Baseline promis29_pfa53___1Baseline promis29_edanx01___1Baseline promis29_edanx40___1Baseline
promis29_edanx41___1Baseline promis29_edanx53___1Baseline promis29_eddep04___1Baseline promis29_eddep06___1Baseline
promis29_eddep29___1Baseline promis29_eddep41___1Baseline promis29_hi7___1Baseline promis29_an3___1Baseline promis29_fatexp41___1Baseline
promis29_fatexp40___1Baseline promis29_sleep109___1Baseline promis29_sleep116___1Baseline promis29_sleep20___1Baseline
promis29_sleep44___1Baseline promis29_srpper11_caps___1Baseli promis29_srpper18_caps___1Baseli promis29_srpper23_caps___1Baseli
promis29_srpper46_caps___1Baseli promis29_painin9___1Baseline promis29_painin22___1Baseline promis29_painin31___1Baseline
promis29_painin34___1Baseline promis29_global07___1Baseline pf_raw___1Baseline pf_t_score___1Baseline anxiety_raw___1Baseline
anxiety_t_score___1Baseline depression_raw___1Baseline depression_t_score___1Baseline fatigue_raw___1Baseline fatigue_t_score___1Baseline
sd_raw___1Baseline sd_t_score___1Baseline social_raw___1Baseline social_t_score___1Baseline pain_raw___1Baseline pain_t_score___1Baseline
back_pain_vas___3month leg_pain_vas1___3month odi_score_pain_intensity___3mont odi_score_personal_care___3month odi_score_lifting___3month
odi_score_walking___3month odi_score_sitting___3month odi_score_standing___3month odi_score_sleeping___3month odi_score_sex_life___3month
odi_score_social_life___3month odi_score_travelling___3month total_odi_sum_score___3month ODI_percent_3months mobility___3month
self_care___3month usual_activities___3month pain_discomfort___3month anxiety_depression___3month eq_vas___3month actual_qaly_score___3month
promis29_pfa11___3month promis29_pfa21___3month promis29_pfa23___3month promis29_pfa53___3month promis29_edanx01___3month
promis29_edanx40___3month promis29_edanx41___3month promis29_edanx53___3month promis29_eddep04___3month promis29_eddep06___3month
promis29_eddep29___3month promis29_eddep41___3month promis29_hi7___3month promis29_an3___3month promis29_fatexp41___3month
promis29_fatexp40___3month promis29_sleep109___3month promis29_sleep116___3month promis29_sleep20___3month promis29_sleep44___3month
promis29_srpper11_caps___3month promis29_srpper18_caps___3month promis29_srpper23_caps___3month promis29_srpper46_caps___3month
promis29_painin9___3month promis29_painin22___3month promis29_painin31___3month promis29_painin34___3month promis29_global07___3month
pf_raw___3month pf_t_score___3month anxiety_raw___3month anxiety_t_score___3month depression_raw___3month depression_t_score___3month
fatigue_raw___3month fatigue_t_score___3month sd_raw___3month sd_t_score___3month social_raw___3month social_t_score___3month
pain_raw___3month pain_t_score___3month
back_pain_vas__12month leg_pain_vas1__12month odi_score_pain_intensity__12mont odi_score_personal_care__12month odi_score_lifting__12month
odi_score_walking__12month odi_score_sitting__12month odi_score_standing__12month odi_score_sleeping__12month odi_score_sex_life__12month
odi_score_social_life__12month odi_score_travelling__12month total_odi_sum_score__12month ODI_percent_12months mobility__12month
self_care__12month usual_activities__12month pain_discomfort__12month anxiety_depression__12month eq_vas__12month
actual_qaly_score__12month promis29_pfa11__12month promis29_pfa21__12month promis29_pfa23__12month promis29_pfa53__12month
promis29_edanx01__12month promis29_edanx40__12month promis29_edanx41__12month promis29_edanx53__12month promis29_eddep04__12month
promis29_eddep06__12month promis29_eddep29__12month promis29_eddep41__12month promis29_hi7__12month promis29_an3__12month
promis29_fatexp41__12month promis29_fatexp40__12month promis29_sleep109__12month promis29_sleep116__12month promis29_sleep20__12month
promis29_sleep44__12month promis29_srpper11_caps__12month promis29_srpper18_caps__12month promis29_srpper23_caps__12month
promis29_srpper46_caps__12month promis29_painin9__12month promis29_painin22__12month promis29_painin31__12month promis29_painin34__12month
promis29_global07__12month pf_raw__12month pf_t_score__12month anxiety_raw__12month anxiety_t_score__12month depression_raw__12month
depression_t_score__12month fatigue_raw__12month fatigue_t_score__12month sd_raw__12month sd_t_score__12month social_raw__12month
social_t_score__12month pain_raw__12month pain_t_score__12month;
run;
data want;
set long;
scorenum = input(substr(scan(_name_,1,"_"),6),2.);
if scan(_name_,2,"_") = "baseline"
then timepoint = 0;
else timepoint = input(compress(scan(_name_,-1,"_"),"month"),2.);
drop _name_;
run;
Ideally, I the final product would look like below but I am not sure if this is possible given the data. Thank you
data want;
input
pt_study_deidentified timepoint $ back_pain_vas leg_pain_vas1;
datalines;
1 baseline 2 3
1 3month 3 4
1 12month 5 5
2 baseline 2 3
2 3month 3 4
2 12month 5 5
;
First of all, you do not have score numbers throughout (e.g. pain_t_score).
But you can extract everything before a sequence of two underlines, and get the time point from the rest:
data long;
input _name_ $32.;
datalines;
back_pain_vas___1Baseline
leg_pain_vas1___1Baseline
back_pain_vas___3month
leg_pain_vas1___3month
back_pain_vas__12month
leg_pain_vas1__12month
;
data want;
set long;
i = index(_name_,"__");
study = substr(_name_,1,i-1);
select (upcase(substr(_name_,i+2,3)));
when ("_1B") timepoint = 1;
when ("_3M") timepoint = 3;
when ("12M") timepoint = 12;
otherwise; /* insert alarm code here if needed */
end;
drop i;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.