Hi I am trying to pull out some information from few columns and their variable names.,
Data Have;
input ID A B C D E;
cards;
101 1 1 . . 1
102 . . 1 . .
103 1 1 . . .
Run;
I want to bring out only those variable names in a newly created three columns.
like
ID A B C D E X Y Z
101 1 1 . . 1 ; A B E
102 . . 1 . . C
103 1 1 . . . A B
Hello,
Here's one script that gives what you ask.
data final;
set have;
array cols(*) _NUMERIC_;
array new (3) $ x y z;
J=1;
do i=LBOUND(cols) to HBOUND(cols);
if vname(cols(i)) ne "ID" and cols(i) ne . then do;
new(j)=vname(cols(i));
j=j+1;
end;
end;
drop i j;
run;
Best regards
Hello,
Here's one script that gives what you ask.
data final;
set have;
array cols(*) _NUMERIC_;
array new (3) $ x y z;
J=1;
do i=LBOUND(cols) to HBOUND(cols);
if vname(cols(i)) ne "ID" and cols(i) ne . then do;
new(j)=vname(cols(i));
j=j+1;
end;
end;
drop i j;
run;
Best regards
Try this, Beate
DATA Have;
INPUT ID A B C D E;
CARDS;
101 1 1 . . 1
102 . . 1 . .
103 1 1 . . .
RUN;
/* collect contents of your data set and use it to generate a macro variable that has all names different from ID in your data set
macro variable varlist has all names of variables in your data set, macro variable nvar has the number of variables in your data set
note there are other ways of doing the same thing, e.g., PROC SQL ...
*/
PROC CONTENTS DATA=have NOPRINT OUT=cntnts (KEEP=name WHERE=(UPCASE(name) NE 'ID'));
RUN;
DATA _NULL_;
LENGTH varlist $ 512; /* need to make sure to make long enough to accommodate the length of all your variable names strung together */
RETAIN varlist " " nvar 0;
SET cntnts END=lastobs;
nvar + 1;
varlist = TRIM(LEFT(varlist))||" "||COMPRESS(name);
IF lastobs THEN DO; CALL SYMPUT('varlist',TRIM(LEFT(varlist))); CALL SYMPUT('nvar',COMPRESS(nvar)); END;
RUN;
/* need to use a macro as %do are not allowed outside of macros;
I am not sure what is important the variable being 1 or the variable being missing;
the code checks for the variable being missing, need to change as you need it */
%MACRO quick;
DATA havenew;
SET have;
ARRAY _newcols {&nvar} $ 1 col1-col&nvar; /* might have to change length here to the maximum possible length of your variable names */
j=0;
%DO i=1 %TO &nvar;
%LET thisvar=%SCAN(&varlist,&i);
IF MISSING(&thisvar) EQ 0 THEN DO;
j=j+1;
_newcols{j}="&thisvar";
END;
%END;
DROP j;
RUN;
%MEND quick;
%quick;
OPTIONS NOCENTER;
PROC PRINT NOOBS;
RUN;
ID A B C D E col1 col2 col3 col4 col5
101 1 1 . . 1 A B E
102 . . 1 . . C
103 1 1 . . . A B
Here's a solution without array:
Data Have;
input ID A B C D E;
cards;
101 1 1 . . 1
102 . . 1 . .
103 1 1 . . .
;
Run;
proc transpose data=have out=tran(where=(not missing(col1)));by id;
proc transpose data=tran out=tran2(drop=_NAME_ _LABEL_) prefix=col_ ;by id;var _name_;
data want;
merge have tran2;
by id;
run;
I changed the last part, I think this makes more sense:
data want;
merge have(keep=id) tran2;
by id;
run;
If you want to limit the number of columns generated to just 3 then perhaps that is an input parameter?
How do you know which variables to scan? Perhaps that is another input parameter.
* create example data ;
data have;
input ID $ VAR1-VAR5;
cards;
101 1 1 . . 1
102 . . 1 . .
103 1 1 . . .
;
run;
* set parameters ;
%let maxout=3 ;
%let varlist = _numeric_ ;
* process the file ;
data want ;
set have ;
array num &varlist ;
array col (&maxout) $32 ;
do _i=1 to dim(num) until (_n=&maxout);
if num(_i) then do ;
_n=sum(_n,1);
col(_n)=vname(num(_i));
end;
end;
drop _i _n ;
run;
Obs | ID | VAR1 | VAR2 | VAR3 | VAR4 | VAR5 | col1 | col2 | col3 |
1 | 101 | 1 | 1 | . | . | 1 | VAR1 | VAR2 | VAR5 |
2 | 102 | . | . | 1 | . | . | VAR3 | ||
3 | 103 | 1 | 1 | . | . | . | VAR1 | VAR2 |
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.