I have a multiple datasets with many columns. I am attempting to go through each one and order the columns and add in any columns stored in a macro variable that aren't in the current table. See below for example.
Here is what I have.
data have;
input var7 4. var3 4. var2 4. var5 4. var1 4.;
datalines;
492 234 230 921 394
843 230 102 748 239
;
Here is what I want.
data want;
input var1 4. var2 4. var3 4. var4 4. var5 4. var6 4. var7 4.;
datalines;
394 230 234 921 492
239 102 230 748 843
;
I have the names of the columns stored in a macro variable that is structured like below.
proc sql;
select name
into :my_column_order separated by " "
from DICTIONARY.COLUMNS
where UPCASE(LIBNAME) = "WORK"
and UPCASE(MEMNAME) = "WANT"
order by name;
quit;
%put &my_column_order;
Is there a way to use my have dataset and &my_column_order to add in and correctly order missing columns and format them the same as the other columns?
data have;
input var7 4. var3 4. var2 4. var5 4. var1 4.;
datalines;
492 234 230 921 394
843 230 102 748 239
;
data want;
input var1 4. var2 4. var3 4. var4 4. var5 4. var6 4. var7 4.;
datalines;
394 230 234 921 492
239 102 230 748 843
;
proc sql;
select name
into :my_column_order separated by " "
from DICTIONARY.COLUMNS
where UPCASE(LIBNAME) = "WORK"
and UPCASE(MEMNAME) = "WANT"
order by name;
quit;
%put &my_column_order;
data want_final;
retain &my_column_order.;
set have;
array vars [*] 4. &my_column_order.;
run;
data have;
input var7 4. var3 4. var2 4. var5 4. var1 4.;
datalines;
492 234 230 921 394
843 230 102 748 239
;
data want;
input var1 4. var2 4. var3 4. var4 4. var5 4. var6 4. var7 4.;
datalines;
394 230 234 921 492
239 102 230 748 843
;
proc sql;
select name
into :my_column_order separated by " "
from DICTIONARY.COLUMNS
where UPCASE(LIBNAME) = "WORK"
and UPCASE(MEMNAME) = "WANT"
order by name;
quit;
%put &my_column_order;
data want_final;
retain &my_column_order.;
set have;
array vars [*] 4. &my_column_order.;
run;
Your example implies that the variables are all numeric. Is this in fact the case? If not you may have quite a bit more work, especially if any of the variables are of different data types in different data sets with the same name. Since you did not pull the FORMAT information from that Have data set, I don't think so.
And why? If you intend to combine the data then in a data step or Proc append the order of the variables in the first data set listed will be the resulting order and sets the format and length of the variables.
Is a macro variable really needed here? I don't think so:
data have;
input var7 4. var3 4. var2 4. var5 4. var1 4.;
datalines;
492 234 230 921 394
843 230 102 748 239
;
data have0;
input var1-var7;
run;
data want;
set have0 have;
run;
Or even:
data have;
input var7 4. var3 4. var2 4. var5 4. var1 4.;
datalines;
492 234 230 921 394
843 230 102 748 239
;
data want;
array v var1-var7;
set have;
run;
Bart
I don't have a data set with a list of all the variables. The macro variable pulling from want was to demonstrate what the variable looks like. I.e., I don't have my want data set and I can't list all my variables out.
Also simple:
data want;
retain &my_column.;
set have;
run;
We need to use the array method to create the additional variables:
data want;
array x{*} &my_column_order.;
set have;
run;
But this will make all variables numeric. If you also have character variables, this will fail.
So we need to know more about your real issue, and the real resources you have available.
Post the code that creates the macro variable, and/or post the macro variable itself in a code box (</>).
@A_SAS_Man wrote:
I don't have a data set with a list of all the variables. The macro variable pulling from want was to demonstrate what the variable looks like. I.e., I don't have my want data set and I can't list all my variables out.
Providing code/information that is not actually available, such as your Proc SQL obfuscates the entire problem.
You specifically included Formats in your request. So, where do you create this macro variable? And where do you want to assign formats?
Since you say that you have "multiple data set" perhaps modify that code to use: "and member in ("FIRSTSET" "SECONDSET" "OTHERDATASET") , listing all your actual data set names and Select distinct name. Then you would have list of all variables in all the data sets to play with.
Can be done in one easy, simple step:
data want_final;
set
want (obs=0)
have
;
run;
If your task is to re-order variables already in a SAS dataset, then you have lots of working solutions available.
But if you are really starting out with raw data that you wish to re-order, then you don't need an extra DATA step, just a minor change in the first DATA step:
data want;
length var1-var7 8;
input var7 4. var3 4. var2 4. var5 4. var1 4. ... other vars ... ;
datalines;
....
run;
If you have a macro variable with the space delimited names that must exist then use that to generate code that will insure those variable names exist.
For example you might have one of these values for MY_ORDER_VARIABLE.
%let my_order_variable=v1 v2 v3 v3 v4 v5 v6 v7;
%let my_order_variable=v1-v7;
%let my_order_variable=v7 v3 v6-v4 v1 v2;
So that you could use it in a data step like this:
data want;
retain &my_order_variable;
set have;
run;
Now if you need some of those variables to be of type CHAR instead of type NUM then you need more information than just the names of the variables. Specifically you need to know the LENGTH. So if for example V6 should be a character variable that can hold up to 20 bytes and the other 6 variables are numeric you might make the macro variable hold something like this:
%let my_order_variable=v1 v2 v3 v3 v4 v5 8 v6 $20 v7 8;
Which you could then use with a LENGTH statement to define the variables and their order.
data want;
length &my_order_variable;
set have;
run;
And if any of the variables require a special format to be attached to them, such as DATE or DATETIME values, then you need even more information than just the variable's type and storage length.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.