BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
daradanye
Obsidian | Level 7

Hi,

 

I have a question about variable reordering.  I have one dataset that is called a and another dataset called b. The variables in dataset a are exactly the same as dataset B but the ordering is different.  I would like to reorder the variables in dataset A based on the order in dataset b.  

 

I know that retain can be used to reorder the variables. But I have many variables, using retain to order it one by one is not possible.

 

Thanks in advance for the help!

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

Are you exporting the data from sas dataset A to another format? Then maybe it makes the most sense to use @Kurt_Bremser's suggestion of  a hard-coded PUT statement.

 

If you really must make a copy of dataset A, with vars ordered as in B, but with var lengths from A, then you can use the metadata resources found via the "dictionary" facility in PROC SQL  (see Accessing SAS Information By Using DICTIONARY Tables)  :

 

data a ;
  length age 4 sex $10 name $40 ;
  set sashelp.class;
run;
data b;
  set sashelp.class;
run;

proc sql ;
  select catx(' ',dictb.name,_alength) into :LENGTHS_FROM_A_ORDERED_BY_B separated by '   ' 
  from
    (select name,varnum
      from dictionary.columns    where libname='WORK' and memname='B'  )
      as dictb

  inner join

    (select name,
      case when type='char' then cats('$',length)
           else cats(length)
      end 
      as _alength
      from dictionary.columns    where libname='WORK' and memname='A'  )
      as dicta

  on upcase(dicta.name)=upcase(dictb.name)
  order by dictb.varnum 
  ;
quit ;
%put &=lengths_from_a_ordered_by_b ;

data new_a;
  length &lengths_from_a_ordered_by_b ;
  set a;
run;

The PROC SQL above builds the arguments for a LENGTH statement with variables ordered by dataset B, but assigning lengths from A.  Those list of arguments get pasted into a macro variable, which is subsequently used in a LENGTH statement.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

9 REPLIES 9
mkeintz
PROC Star

You can't reorder variables within a dataset.  It will have to be copied:

 

data new_a;
  set b (obs=0)  a;
run;

 

The order of  vars in B will be honored, but the values will be from A.

 

Any variables in B, but not A, will be in NEW_A, but will be missing from first through last observation.  Any vars in A but not B will be to the right of the vars from B.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
daradanye
Obsidian | Level 7

Thanks!  One thing I forgot to mention is that: the length of some string variables are different in a and b (the string length is longer in a). For this code, I assume that the length of variables in B will be kept. Is there any way to keep the length of variables in dataset a but follow the order in dataset b?  Thanks.

ballardw
Super User

@daradanye wrote:

Thanks!  One thing I forgot to mention is that: the length of some string variables are different in a and b (the string length is longer in a). For this code, I assume that the length of variables in B will be kept. Is there any way to keep the length of variables in dataset a but follow the order in dataset b?  Thanks.


Not really. Your initial statement is not true: " The variables in dataset a are exactly the same as dataset B but the ordering is different." 

My first thought reading that was "and what about the LENGTHS of variables as that is often not considered".

 

Question: WHY is the order of the variables in the data set important? SAS doesn't care. None of the analysis or reporting procedures care about the order of the variables in a set (order of variables on statements in syntax perhaps but not the data set).

 

The only time it should be any concern is to send the data to a different data system and you should take control of that process to make sure it works properly.

daradanye
Obsidian | Level 7

I need to send the dataset to my colleague. With proper order, it is easier for him to handle. He is handling the dataset in different software.

Kurt_Bremser
Super User

Then you determine the order in the PUT statement of your export DATA step.

 


@daradanye wrote:

I need to send the dataset to my colleague. With proper order, it is easier for him to handle. He is handling the dataset in different software.


 

andreas_lds
Jade | Level 19

Another solution, just a minor variation of the code posted by @mkeintz :

 

data work.c;
   if 0 then set work.a;
   set work.b;
run;
daradanye
Obsidian | Level 7

Thanks!  One thing I forgot to mention is that: the length of some string variables are different in a and b (the string length is longer in a). For this code, I assume that the length of variables in B will be kept. Is there any way to keep the length of variables in dataset a but follow the order in dataset b?  Thanks.

mkeintz
PROC Star

Are you exporting the data from sas dataset A to another format? Then maybe it makes the most sense to use @Kurt_Bremser's suggestion of  a hard-coded PUT statement.

 

If you really must make a copy of dataset A, with vars ordered as in B, but with var lengths from A, then you can use the metadata resources found via the "dictionary" facility in PROC SQL  (see Accessing SAS Information By Using DICTIONARY Tables)  :

 

data a ;
  length age 4 sex $10 name $40 ;
  set sashelp.class;
run;
data b;
  set sashelp.class;
run;

proc sql ;
  select catx(' ',dictb.name,_alength) into :LENGTHS_FROM_A_ORDERED_BY_B separated by '   ' 
  from
    (select name,varnum
      from dictionary.columns    where libname='WORK' and memname='B'  )
      as dictb

  inner join

    (select name,
      case when type='char' then cats('$',length)
           else cats(length)
      end 
      as _alength
      from dictionary.columns    where libname='WORK' and memname='A'  )
      as dicta

  on upcase(dicta.name)=upcase(dictb.name)
  order by dictb.varnum 
  ;
quit ;
%put &=lengths_from_a_ordered_by_b ;

data new_a;
  length &lengths_from_a_ordered_by_b ;
  set a;
run;

The PROC SQL above builds the arguments for a LENGTH statement with variables ordered by dataset B, but assigning lengths from A.  Those list of arguments get pasted into a macro variable, which is subsequently used in a LENGTH statement.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Patrick
Opal | Level 21

A SAS SQL UNION CORR ALL will create a table where the variable length is the maximum of any contributing source table and the variable order is taken from the first source table.

You could use code similar to below example.

data work.have1;
   length var2 $20 var1 $15;
   var2='aaa';
   var1='bbb';
   output;
run;

data work.have2;
   length var1 $5 var2 $50;
   var1='yyy';
   var2='xxx';
   output;
run;

proc sql;
  create table want as
  select * from work.have1(obs=0)
  union corr all
  select * from work.have2
  ;
quit;

proc print data=want;
run;
proc contents data=want order=varnum;
quit;

Patrick_0-1694740222997.png

Patrick_1-1694740237645.png

 

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1098 views
  • 5 likes
  • 6 in conversation