I have to process different datasets and add a column to it. But, this new column gets added to the beginning of the datasets. I want to move this to the end of all columns in each dataset. I know that length, attrib, retain, format, informat can be used to reorder the columns as required, but they require to mention the columns names, but the problem here is that I don't know what other columns are there in each dataset since each dataset has different columns and each of them have 25+ columns.
Please, advise how to reorder in this case and provide some sample code, if possible.
The Length statement is for the new variables!
if (0) then SET <originalDataSet>; /* Will place all pre-existing variables in the PDV */
Any other variables created after that statement, will appear at the end of the output data set. For illustration, run the code below.
data class; set sashelp.class(keep=name age sex obs=5); run; proc contents data=class varnum; run; data class; if (0) then set class; Length newV1 newV2 newV3 3; retain newV1 newV2 newV3 0; set class; run; proc contents data=class varnum; run;
Thank you and I have run this sample code and it is working. Since, I have two new char variables ch1 and ch2 and the data set test has 25 columns. Now, I have tried the following:
if (0) then set test;
length ch1 ch2 $12.;
retain ch1 ch2 $12.;
It's not working. Getting error:
ERROR 200-322: The symbol is not recognized and will be ignored.
Can you please advise how to work with new char variables?
The error message you posted would not be generated by that code (unless there are some invisible characters there, perhaps TAB or non-breaking spaces).
But if you just want to add two new variables then just add the two new variables. There is no need to do anything strange.
data test; set test; length ch1 ch2 $12; run;
If the request is to MOVE existing variables to the end then use DROP= and KEEP= dataset options.
data test; set test(drop=ch1 ch2) test(keep=ch1 ch2); run;
If you want to MOVE them and also change the lengths use two SET statements so you can add the LENGTH statement in between.
data test; set test(drop=ch1 ch2); length ch1 ch2 $12; set test(keep=ch1 ch2); run;
PS Lengths can only be integers so there is no need to include a decimal point.
If you are one of those people that needlessly attach $ formats to character variables (or you had them attached for you by SAS generated code) then you might also want to remove any format attach to those moved variables by adding this statement to the end of your data step.
format ch1 ch2 ;
The problem with attaching the $ format to a character variable is when the width of the format is different from the length of the variable it can cause problems. If the width is less than the length then when you print the values they look truncated. It the width is larger then you might accidentally try to assign too long a value to the variable and then it will actually be truncated.
My opinion is that there is rarely, if ever, a compelling reason to move columns to the right or move them to the left. Most PROCs and DATA steps really don't care where in the data set the column exists.
However, when you output the data set somehow for humans to look at, then the column order matters, but using PROC PRINT, PROC TABULATE, PROC REPORT you have complete control over the ordering of columns in a much more intuitive way, and this is much easier to program as well. For example, here's a way to print the SASHELP.CLASS data set with columns in different order:
proc print data=sashelp.class; var age sex height weight name; run;
Note that I did not have to move NAME to the end of the data set using DATA step commands, and yet it will print as the last column.
So @Moksha , please re-consider the need to move columns around within a data set, and please consider using other tools to make columns appear in different places as shown above, this will make your programming much easier. It is a very rare situation where you really need to have columns in a specific order within a data set.
I know that with proc print using var will reorder as mentioned. But, I have requirement to have the newly added columns are at end. Is it a limitation of SAS that there is no other alternative ways to reorder the columns in case we don't the column names and we have many number of columns which is not possible to list all columns. But, we have to list the columns names in all the ways that are available right now with SAS, but it's not possible to list all column names as we don't know the column names and the number of columns is huge.
I have requirement to have the newly added columns are at end
Well, if someone has told you that specific columns HAVE TO BE on the right, I think this person has a misunderstanding which causes you to do unnecessary work. You might consider (politely) pushing back on this requirement.
Do you know the name of the new column ahead of time or need to determine it dynamically?
As Tom mentioned, this should be the default behaviour so sharing how you are creating this variable might be helpful. Could be that a small change in code could accomplish what you need without further modifications.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.