BookmarkSubscribeRSS Feed
Fred_Gavin
Calcite | Level 5

Dear All,

I have two survey datasets collected several months apart with the exact same questions. Therefore the column names of two datasets are partially the same, except in the first one it looks like baseline_question1, baseline_question2.....

and in the second it is six_month_question1, six_month_question2,........

There are nearly a hundred questions. I would like to transform them into:

Question1, Question2, Question3..... Time

                                                             Baseline

                                                              .......

                                                            Six_month

                                                             ........

Therefore I need to remove the prefix of column names from both datasets, and combine them later.

Can anyone please suggest a way to do this?

Thank you in advance.

6 REPLIES 6
Tom
Super User Tom
Super User

Because the names end in numeric suffixes should be able to rename very easily.

data want ;

   length time $20 ;

   set baseline (in=in1 rename=(baseline_question1-baseline_question20=question1-question20))

       month6 (in=in2 rename=(six_month_question1-six_month_question20=question1-question20))

   ;

   if in1 then time='Baseline';

   if in2 then time='Six Months';

run;

  

Fred_Gavin
Calcite | Level 5

Thanks for the suggestion.

They do actually not end in numeric suffixes. I just put "question1", "question2", as an example for the simplicity. Sorry for the misleading.

In the dataset, for instance, the names are " baseline_overallsafetyofChart", "baseline_indication", and so on.

So the only common part is the the prefix "baseline" for the first dataset, and "six_month" for the second dataset.

Thanks and regards

art297
Opal | Level 21

Then, I think your easiest solution would be to use the tranwrd word function, using proc sql, to create a macro variable based on dictionary.columns, the does the recode in your datastep.

Linlin
Lapis Lazuli | Level 10

data base;

input baseline_overallsafetyofChart baseline_indication $;

cards;

1 a

11 aa

;

run;

data six;

input six_overallsafetyofChart six_indication $;

cards;

2 b

22 bb

;

run;

proc transpose data=base(obs=1) out=temp1;

var _all_;

run;

proc transpose data=six(obs=1) out=temp2;

var _all_;

run;

data _null_;

  set temp1 end=last ;

  call symputx(cats('base',strip(_n_)),_name_);

  if last then call symputx('nbase',_n_);

  run;

data _null_;

  set temp2 end=last ;

  call symputx(cats('six',strip(_n_)),_name_);

  if last then call symputx('nsix',_n_);

  run;

  %put _user_;

%macro rename;

data new_base;

  set base;

rename %do i=1 %to &nbase;

     &&base&i=%scan(&&base&i,-1,'_')

          %end;;

  run;

  data new_six;

  set six;

rename %do i=1 %to &nsix;

     &&six&i=%scan(&&six&i,-1,'_')

          %end;;

  run;

%mend;

%rename;

data both;

  length time $ 15;

  set new_base (in=a)

       new_six(in=b);

   if a then time='Baseline';

   if b then time='Six Months';

run;

proc print;run;

                                                                             overallsafetyof

                                                  Obs       time            Chart         indication

                                                   1     Baseline                1              a

                                                   2     Baseline              11              aa

                                                   3     Six Months            2              b

                                                   4     Six Months           22              bb

Linlin

Ksharp
Super User

How about:

data base;
input baseline_overallsafetyofChart baseline_indication $;
cards;
1 a
11 aa
;
run;

data six;
input six_month_overallsafetyofChart six_month_indication $;
cards;
2 b
22 bb
;
run;

proc sql noprint ;
 select catx('=',name,scan(name,-1,'_')) into : base separated by ' '
  from dictionary.columns 
   where libname='WORK' and memname='BASE' and name like 'baseline~_%' escape '~';

 select catx('=',name,scan(name,-1,'_')) into : six separated by ' '
  from dictionary.columns 
   where libname='WORK' and memname='SIX' and name like 'six~_month~_%' escape '~';

quit;
%put &base &six;

proc datasets library=work nolist;
 modify base;
  rename &base ;
 modify six;
  rename &six;
quit;

Ksharp

Fred_Gavin
Calcite | Level 5

Thanks for all the replies.

I will test both suggested way soon.

Regards

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 4872 views
  • 0 likes
  • 5 in conversation