Hi,
I am wondering if there is any way to sort variables by type? For example, all numeric variables are appear first, and then comes categorical variables. The only thing I think that MAY work is using proc sql, but I am not too familiar with that. Any help or suggestions would be greatly appreciated. Thank you!
If I correctly understand what you are trying to do, then the following would work:
proc sql; select name into :reorder separated by ' ' from dictionary.columns where libname=upcase('sashelp') and memname=upcase('class') order by type descending ; quit; data want; retain &reorder.; set sashelp.class; run;
Art, CEO, AnalystFinder.com
there is a way, but to what purpose? In most SAS procedures you can use syntax to run on all numeric (or character) variables, regardless of how they are ordered. What is the ultimate problem that you are trying to solve?
@Rick_SAS I am using arrays to help clean very large datasets, and from what I know, arrays in SAS can only hold variables of the same type. The macro we have right now requires us to "alternate" once there's a change in variable type. So, we're looking to see if it's at all possible to group variables of the same type together.
@nehalsanghvi Yup, just arranged by type.
I suggest you look into using the _NUMERIC_ and _CHARACTER_ keywords when you define the arrays.
For example
array NumerVars[*] _NUMERIC_;
array CharVars[*] _CHARACTER_;
A basic principle in efficient data analysis is to not rearrange the data unless necessary. If you say more about what sort of "data cleaning" you are interested in, we can provide specific suggestions.
For example, if you want to drop observations for which any variable is missing and also want to drop observations for which any numerical variable has a negative value, you can say:
data Want;
set sashelp.Heart;
array NumerVars[*] _NUMERIC_;
array CharVars[*] _CHARACTER_;
if ^CMISS(of NumerVars[*], of CharVars[*]) & /* omit any missing */
min(of NumerVars[*]) > 0; /* omit any negative values */
run;
@corkee wrote:
@Rick_SAS The macros we have take a "range" of consecutive variables (e.g., firstvar--lastvar). Not sure if I understand how to use your logic here.
You should provide some more details. Arrays can be defined with lists such that when the variables have the same stem you just indicate either the start and end: Thisvar1 - ThisVar10 or ThisVar: with the : indicating to assign all variables whose names start with ThisVar to the array. Neither of these require the variables to be in any specific order/location in the data vector. If your macro is not using the array -- range construct (array somearray thisvar -- anothervarname;) explicitly then you may be misunderstanding the requirement.
If your macro requires adjacent variables by order in the data vector then 1) they are so fragile they need to be rewritten or 2) the concept behind the data structure needs to be addressed. What you are describing (need to re order variables in the data vector) seems to indicate that you have an existing data structure that keeps adding variables (at the end) of the data. Which is indicative of a poorly designed approach.
Take a look at metadata available in dictionary.columns. You can use this to pick up only the columns of a single datatype:
proc sql;
select name into :numvarlist separated by ' '
from dictionary.columns
where libname='WORK'
and memname='YOURTBL'
and type='num';
quit;
Rearranging the order of the PDV putting numerics before character variables
inspired by
https://goo.gl/B3yGv9
https://communities.sas.com/t5/Base-SAS-Programming/Sorting-variables-by-type/m-p/337829
HAVE
====
Up to 40 obs from sashelp.class total obs=19
Obs NAME SEX AGE HEIGHT WEIGHT
1 Alfred M 14 69.0 112.5
2 Alice F 13 56.5 84.0
3 Barbara F 13 65.3 98.0
4 Carol F 14 62.8 102.5
5 Henry M 14 63.5 102.5
6 James M 12 57.3 83.0
...
WANT
====
Up to 40 obs from want total obs=19
Obs AGE HEIGHT WEIGHT NAME SEX
1 14 69.0 112.5 Alfred M
2 13 56.5 84.0 Alice F
3 13 65.3 98.0 Barbara F
4 14 62.8 102.5 Carol F
5 14 63.5 102.5 Henry M
6 12 57.3 83.0 James M
WORKING CODE
============
DOSUBL
array num _numeric_;
chrs=catx(' ',chrs,vname(chr[i]));
call symputx("retnum",nums);
retain &retnum;
FULL SOLUTION
=============
%symdel retnum; * just in case it exists;
data want;
* get meta data;
if _n_ =0 then do;
%let rc=%sysfunc(dosubl('
data _null_;
set sashelp.class(obs=1);
array num _numeric_;
length nums $4096;
do i=1 to dim(num);
nums=catx(' ',nums,vname(num[i]));
end;
call symputx("retnum",nums);
run;quit;
'));
end;
retain &retnum;
set sashelp.class;
run;quit;
run;quit;
If I correctly understand what you are trying to do, then the following would work:
proc sql; select name into :reorder separated by ' ' from dictionary.columns where libname=upcase('sashelp') and memname=upcase('class') order by type descending ; quit; data want; retain &reorder.; set sashelp.class; run;
Art, CEO, AnalystFinder.com
@art297 This did the trick. However, is there a way to sort the variables, after sorting them by type, in alphabetical order? Thank you so much.
proc sql;
select name into :varlist separated by ' '
from dictionary.columns
where libname='WORK'
and memname='YOURTBL'
order by type, name;
quit;
Add the name column to the order by statement
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.