Dear All,
I have two survey datasets collected several months apart with the exact same questions. Therefore the column names of two datasets are partially the same, except in the first one it looks like baseline_question1, baseline_question2.....
and in the second it is six_month_question1, six_month_question2,........
There are nearly a hundred questions. I would like to transform them into:
Question1, Question2, Question3..... Time
Baseline
.......
Six_month
........
Therefore I need to remove the prefix of column names from both datasets, and combine them later.
Can anyone please suggest a way to do this?
Thank you in advance.
Because the names end in numeric suffixes should be able to rename very easily.
data want ;
length time $20 ;
set baseline (in=in1 rename=(baseline_question1-baseline_question20=question1-question20))
month6 (in=in2 rename=(six_month_question1-six_month_question20=question1-question20))
;
if in1 then time='Baseline';
if in2 then time='Six Months';
run;
Thanks for the suggestion.
They do actually not end in numeric suffixes. I just put "question1", "question2", as an example for the simplicity. Sorry for the misleading.
In the dataset, for instance, the names are " baseline_overallsafetyofChart", "baseline_indication", and so on.
So the only common part is the the prefix "baseline" for the first dataset, and "six_month" for the second dataset.
Thanks and regards
Then, I think your easiest solution would be to use the tranwrd word function, using proc sql, to create a macro variable based on dictionary.columns, the does the recode in your datastep.
data base;
input baseline_overallsafetyofChart baseline_indication $;
cards;
1 a
11 aa
;
run;
data six;
input six_overallsafetyofChart six_indication $;
cards;
2 b
22 bb
;
run;
proc transpose data=base(obs=1) out=temp1;
var _all_;
run;
proc transpose data=six(obs=1) out=temp2;
var _all_;
run;
data _null_;
set temp1 end=last ;
call symputx(cats('base',strip(_n_)),_name_);
if last then call symputx('nbase',_n_);
run;
data _null_;
set temp2 end=last ;
call symputx(cats('six',strip(_n_)),_name_);
if last then call symputx('nsix',_n_);
run;
%put _user_;
%macro rename;
data new_base;
set base;
rename %do i=1 %to &nbase;
&&base&i=%scan(&&base&i,-1,'_')
%end;;
run;
data new_six;
set six;
rename %do i=1 %to &nsix;
&&six&i=%scan(&&six&i,-1,'_')
%end;;
run;
%mend;
%rename;
data both;
length time $ 15;
set new_base (in=a)
new_six(in=b);
if a then time='Baseline';
if b then time='Six Months';
run;
proc print;run;
overallsafetyof
Obs time Chart indication
1 Baseline 1 a
2 Baseline 11 aa
3 Six Months 2 b
4 Six Months 22 bb
Linlin
How about:
data base; input baseline_overallsafetyofChart baseline_indication $; cards; 1 a 11 aa ; run; data six; input six_month_overallsafetyofChart six_month_indication $; cards; 2 b 22 bb ; run; proc sql noprint ; select catx('=',name,scan(name,-1,'_')) into : base separated by ' ' from dictionary.columns where libname='WORK' and memname='BASE' and name like 'baseline~_%' escape '~'; select catx('=',name,scan(name,-1,'_')) into : six separated by ' ' from dictionary.columns where libname='WORK' and memname='SIX' and name like 'six~_month~_%' escape '~'; quit; %put &base &six; proc datasets library=work nolist; modify base; rename &base ; modify six; rename &six; quit;
Ksharp
Thanks for all the replies.
I will test both suggested way soon.
Regards
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.