I have a data set in wide format, and want to convert to long format. Each subject has (2+32*36) variables as follow:
subjid, visit_num, 32 set of the following 36 variables
DTAMT1, DTAMO1 DTA1_BOP_DF DTA1_BOP_DL DTA1_BOP_F DTA1_BOP_L DTA1_BOP_MF DTA1_BOP_ML DTA1_CAL_DF DTA1_CAL_DL DTA1_CAL_F DTA1_CAL_L DTA1_CAL_MF DTA1_CAL_ML DTA1_PDD_DF DTA1_PDD_DL DTA1_PDD_F DTA1_PDD_L DTA1_PDD_MF DTA1_PDD_ML DTA1_REC_DF DTA1_REC_DL DTA1_REC_F DTA1_REC_L DTA1_REC_MF DTA1_REC_ML DTA1_PI_DF DTA1_PI_DL DTA1_PI_F DTA1_PI_L DTA1_PI_MF DTA1_PI_ML DTA1_GI_DF DTA1_GI_F DTA1_GI_L DTA1_GI_MF
.....
DTAMT32, DTAMO32 DTA32_BOP_DF DTA32_BOP_DL DTA32_BOP_F DTA32_BOP_L DTA32_BOP_MF DTA32_BOP_ML DTA32_CAL_DF DTA32_CAL_DL DTA32_CAL_F DTA32_CAL_L DTA32_CAL_MF DTA32_CAL_ML DTA32_PDD_DF DTA32_PDD_DL DTA32_PDD_F DTA32_PDD_L DTA32_PDD_MF DTA32_PDD_ML DTA32_REC_DF DTA32_REC_DL DTA32_REC_F DTA32_REC_L DTA32_REC_MF DTA32_REC_ML DTA32_PI_DF DTA32_PI_DL DTA32_PI_F DTA32_PI_L DTA32_PI_MF DTA32_PI_ML DTA32_GI_DF DTA32_GI_F DTA32_GI_L DTA32_GI_MF
The number after DTA is tooth_number (1,2, ..... 32), the letters in the end (DF, DL, F, L, MF and ML) are tooth_sites (total 6), and the letters in the middle are measure variables (total 6). Each subject has 32 teeth and each tooth has 6 sites, and eash site has 6 measures.
I want to convert the wide format data to long format data, as shown in the following picture.
With the following code, I got something close, but with some issues:
data test;
set mydata;
array dta dta:;
do over dta;
var_name=vname(dta);
if index(var_name, '_') > 0 then do;
_tooth=scan(var_name, 1, '_');
tooth_num=substr(_tooth, 4);
_var=scan(var_name, 2, '_');
tooth_site=scan(var_name, 3, '_');
if _var='BOP' then BOP=dta;output;
if _var='CAL' then CAL=dta;output;
if _var='REC' then REC=dta;output;
if _var='PDD' then PDD=dta;output;
if _var='PI' then PI=dta;output;
if _var='GI' then GI=dta;output;
end;
if index(var_name, 'DTAMT') > 0 then DTAMT=dta;
if index(var_name, 'DTAMO') > 0 then DTAMO=dta;
end;
keep Subjid visit_num tooth_num tooth_site DTAMT DTAMO BOP CAL PDD REC PI GI;
run;
probably the problem is due to the output after each if statement. Can anyone give some suggestions on my code or other methods? Thank you in advance.
... View more