Hi
I want to remove certain symbols from this list of variables ST2--bsitosrat, and also to rename them into bm1--bm44, so I write this to code. Problem is array a will not run because it states ST2--bsitosrat contains both numeric and character variables. Can you suggest a way to do this? The dataset keep renewing and the numeric or character status of the list change over time which to me is the challenge. thanks
array a[*]
ST2 Tchol LPA LPA_mass HCY CYST_C eGFR bsitos bsitosrat;
array b[*]
bm1 bm2 bm3 bm4 bm5 bm6 bm7. bm8. bm9;
do i = 1 to dim(b);
a[i]=compress(a[i], "<>");
if a[i] in ("See Notes", "Rej") then a[i]=.;
b[i]=1*a[i];
end;
Numeric variables won't contain the symbols that you are trying to remove, so you don't even need to run this code on the numeric variables; just run it on the character variables.
the list of variables is actually much larger and I am numeric vs character status changes week by week
Then you can use PROC CONTENTS to determine which variables are numeric (denoted as TYPE=1 by PROC CONTENTS) and which variables are character (denoted as TYPE=2 by PROC CONTENTS).
proc contents data=have out=_contents_ noprint;
run;
proc sql noprint;
select distinct name into :charvars separated by ' ' from _contents_ where type=2;
quit;
data want;
set have;
array a &varnames;
array b bm1-bm&sqlobs;
do i=1 to &sqlobs;
/* some stuff here */
end;
run;
I don't know if this is an official "maxim" or not, but PROC CONTENTS is a programmer’s best friend.
@PaigeMiller wrote:
I don't know if this is an official "maxim" or not, but PROC CONTENTS is a programmer’s best friend.
Actually, proc contents is part of Maxim 3 (know your data) 😉
"Thanks but the problem is the dataset changes all the time " - and that is indeed the sole root of your problem, thus fixing that should be highest priority. Unless you like refactoring each and every run, in which case just do it manually.
Some sample data would help test
As you can't define numeric and character variables in same array and special symbol are not expected in numeric
you can:
1) create a new dataset and rename variable using proc dataset.
2) define array of character variables only:
data new;
set have;
array a _character_;
do i=1 to dim(a);
a(i) = compress(a(i), '<>');
if a(i) in ("See Notes" , "Rej") then a(i) = . ;
end;
run;
proc datasets lib=work;
modify new;
rename
'ST2' = 'BM1'
'TCHOL' = 'BM2'
'LPA' = 'BM3'
.....
;
quit;
@rykwong wrote:
The dataset keep renewing and the numeric or character status of the list change over time which to me is the challenge.
Translation: I'm wading hip-deep in a pile of excrements, and it STINKS!
Fix the import process, so you get consistent attributes.
DO NOT USE EXCEL, period. If data has to come from an Excel source, save it to a csv file and read that with a data step.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.