Have an output table from a proc transpose with about 1000 records, 20 variables.
Data contained is a "1" or "." (null).
Would like to concatenate all variable names with a "1' value into a new, separate field, (building an array of column names and using catx, I suppose), on each row so I have a list in this new, separate field (in the same row, separated by a ",") of "field1, field7, field10, field 14", etc. - IF those fields on that row have a "1" value. Of the twenty variables, the most I've seen is five with a "1" value, so it's not a long string. 90% of the rows have only one occurrence that is not null.
This is similar to the Excel index/match function which returns the column header in a matrix of data (not the cell address such as A, B,, AB, AC, AD, etc.), if that helps.
Thanx.
I have a suspicion that there are better ways to arrange this output than "field1,field4,field8,field10". Please tell us what you plan to do next once you have this new field created. What analysis or chart or table are you going to produce next? Help us help you, let us know where you plan to go once you have this new field created, we may have better suggestions than "field1,field4,field8,field10". (Letting us know what you are going to do next is always a good thing to do in every new problem you post, and it is something I have asked you to provide at least once before).
I am giving it to my supervisor - because he asked me to..
There is no "next step".
I agree with @PaigeMiller that there's probably a better approach to this, but if that's what you want, you can achieve this like this:
data want;
set have;
array f {*} fieldvars:;
length fields $1000 ;
fields="";
do i=1 to dim(f);
if f[i] then fields=catx(',',fields,vname(f[i]));
end;
run;
This is great.
Part of what I used before on a much smaller list of variables:
DATA BASE;
Length TESTS $3000;
SET DATA1;
TESTS=catx(',',of APPLES--ZUCHINI);
So I don't need to assign the range of variables specifically?
"array f {*} fieldvars:;" will grab all variables, without declaring the array with the name of var1, and finishing the array with the name of the last var? Would work on any data set, (ir)regardless of variable names?
@Jumboshrimps wrote:
This is great.
Part of what I used before on a much smaller list of variables:
DATA BASE;
Length TESTS $3000;
SET DATA1;
TESTS=catx(',',of APPLES--ZUCHINI);
So I don't need to assign the range of variables specifically?
"array f {*} fieldvars:;" will grab all variables, without declaring the array with the name of var1, and finishing the array with the name of the last var? Would work on any data set, (ir)regardless of variable names?
That CATX() function call is doing something different. It is just making a string listing all of the non-empty values of that list of variables. It does not look for the variables that are EMPTY like you asked for. It does not store the NAME of the variable like you asked for.
Also in that code you are using a positional variable list, the double hyphen, which means take the variables that are located in the dataset being created from the first variable named to the last one named. So this is one of the rare situations where you need to know what order the variables are in.
You can use a positional variable list in an array definition if you want.
So something like:
ARRAY F APPLES--ZUCHINI ;
Note: There is no need to include the [*]. The * for the number of elements means that SAS needs to count. But that is what it does when you don't mention the number of elements at all anyway. And you only need to tell SAS in advance how many variables are in the array when you are NOT telling it which variables to use. Or if you want it to index the array differently than the default of starting from 1 to the array dimension. Examples where you need to define the array dimension is when using the _TEMPORARY_ keyword to make a temporary array (which does not use variable names) and because you want SAS to automatically create the variables by appending the array index number to the end of the array name.
Running this code generates a data set of all variables and an additional column "i", with the value "1" for each ob. This is where I got stuck in a previous attempt, I don't have the code from that job, so I can't remember how I resolved it.
Also, where in the code is it including only those with a value of "1", and/or excluding nulls?
Should I add an IF f[i] IS NOT NULL THEN
if f[i] then fields=catx(',',fields,vname(f[i]));
Thanx.
@Jumboshrimps wrote:
Running this code generates a data set of all variables and an additional column "i", with the value "1" for each ob. This is where I got stuck in a previous attempt, I don't have the code from that job, so I can't remember how I resolved it.
Also, where in the code is it including only those with a value of "1", and/or excluding nulls?
Should I add an IF f[i] IS NOT NULL THEN
if f[i] then fields=catx(',',fields,vname(f[i]));
Thanx.
If I has 1 after that DO loop then there weren't any variables in the array! That because a normal DO loop like : DO I=1 to 5; .... END; will stop when I becomes larger then the upper bound. So if I ends you as 1 then there were zero variables in the array. You should see a message in the log to that effect.
WARNING: Defining an array with zero elements.
To your question the answer is NO.
First because the syntax is wrong.
IF f[i] IS NOT NULL THEN ...
Will not work in SAS code. If you do need to test if something is MISSING (what SQL means by the keyword NULL) in SAS code then use the MISSING() function.
Second because the test is already covered by the existing test when the variables are NUMERIC.
if f[i] then fields=catx(',',fields,vname(f[i]));
SAS will treat a number that is missing or zero as FALSE and any other value as TRUE.
Now if the variables are CHARACTER then you will want to change the test in the IF statement to reflect that fact. So perhaps something like
if left(f[i]) = '1' then fields=catx(',',fields,vname(f[i]));
Because otherwise SAS will first try to convert the character value into a number and then test if the number means TRUE or FALSE. You will get a message in the LOG when SAS does that type of automatic type conversion.
1759 data test; 1760 array x [2] $2 ; 1761 do i=1 to dim(x); 1762 if x[i] then put x[i]= 'TRUE'; 1763 else put x[i]= 'FALSE'; 1764 end; 1765 run; NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column). 1762:8 x1= FALSE x2= FALSE NOTE: The data set WORK.TEST has 1 observations and 3 variables.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.