Hi everyone. I need help.
Thank you in advance!
I have 100 columns/variables (var1-var 100).
Var1 Var 2 Var 3...... Var100
1 3 1
2 8 2
3 7 8
I need to stack all the columns into one variable (let us call it, newcol).
It should look like this.
newcol
1
2
3
3
8
2
7
.
.
.
1
8
How do stack them at the same time making sure all the missing data are not included?
Also, I noticed some of the variables are characters, others are numeric. How do I make all of them numeric using a single code?
Thank you very much for your help!
Thanks to @PGStats for providing usable test-data.
If you don't have any other variables in the source-dataset, the following data-step + proc sort seem so solve the problem:
data work.step;
set have;
array nums _numeric_;
array chars _character_;
length varname $ 32 newcol 8;
do i = 1 to dim(nums);
if not missing(nums[i]) then do;
newcol = nums[i];
varname = vname(nums[i]);
output;
end;
end;
do i = 1 to dim(chars);
if not missing(chars[i]) then do;
newcol = input(chars[i], best.);
varname = vname(chars[i]);
output;
end;
end;
keep varname newcol;
run;
proc sort data=step out=want(drop=varname);
by varname;
run;
It may seem odd, but this method came to mind: go back to type character for all vars, using a temporary file, like this:
data have;
input Var1 Var2 $ Var3 Var4;
datalines;
1 3 . 1
2 8 2 .
3 . 7 8
;
filename chars temp;
data _null_;
set have;
file chars dsd;
put (_all_) (:);
run;
data temp;
infile chars dsd;
input var1-var4;
run;
proc transpose data=temp out=trans(drop=_name_);
var var1-var4;
run;
data want;
set trans;
array c col: ;
do i = 1 to dim(c);
newCol = c{i};
if not missing(newCol) then output;
end;
keep newCol;
run;
Another possibility is to use a macro:
%macro stack(inds,nvars);
%local dsid i varnum;
%let dsid=%sysfunc(open(&inds));
set &inds;
%do i=1 %to &nvars;
%let varnum=%sysfunc(varnum(&dsid,var&i));
%if %sysfunc(vartype(&dsid,&varnum))=N %then %do;
if not missing(var&i) then do;
newcol=var&i;
output;
end;
%end;
%else %do;
if not missing(var&i) then do;
newcol=input(var&i,best32.);
output;
end;
%end;
%end;
%let dsid=%sysfunc(close(&dsid));
drop var1-var&nvars;
%mend;
data want;
%stack(have,100);
run;
The parameters to the macro are the name of the input dataset, and the number of variables.
What if some of the character variables contain expressions that won't translate into a numeric, such as "N/A" or "53 mg"? Would you like to discard those values?
data have;
input Var1 Var2 Var3 Var4;
datalines;
1 3 . 1
2 8 2 .
3 . 7 8
;
data temp;
set have;
array v{*} var:;
do n=1 to dim(v);
value=v{n};if not missing(value) then output;
end;
keep n value;
run;
proc sort data=temp out=want;
by n;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.