Simple Question? I think...

Accepted Solution Solved
Reply
Super Contributor
Posts: 418
Accepted Solution

Simple Question? I think...

Hello everyone. This is probably a super simple question, but I actually don't know how to do it without using macro variables, and I'm almost certain I've done this a few years ago..

Anyways, I hvae a dataset that multiple variables (numeric and character) and i'd like to make a matrix of 1 and 0's for when the variables are missing data. However i'd like that matrix to append to the same dataset, and each new variables name to start with MISS_..

For example, if my data was like below.

Key      Var

1            .

2            5

3            .

I'd want

Key      Var     MISS_Var

1            .               1

2            5              0

3            .               1

I know I can overwrite the sas datavariables with 1 and 0's, and then change their names to all start with MISS_ in a macro variable, and then merge them together, but this seems like something that should be very easy using arrays.

It has to be dynamic so that it can work with any number of variables whose names I don't know in advance!

Thanks so much for your help!

Brandon


Accepted Solutions
Solution
‎11-13-2013 11:04 AM
PROC Star
Posts: 7,358

Re: Simple Question? I think...

I haven't kept up with this thread thus I might have missed something but, in my opinion, a slight variation of 's code appears to be an easy way to achieve what you want.  e.g.:

data have;

  infile cards truncover;

  input  loannumber numvar charvar $;

  cards;

1 . A

2 5

3 . D

;

proc sql noprint;

create table varnames as select name from sashelp.vcolumn

  where libname='WORK' and

        memname='HAVE' and

        upcase(name) ne 'LOANNUMBER';

quit;

filename code temp;

data _null_;

   file code noprint;

   set varnames;

   put 'if missing(' name ') then MISS_' name '=1;'

       'else MISS_' name '=0;';

run;

data want;

   set have;

   %include code;

run;

View solution in original post


All Replies
Super User
Posts: 17,780

Re: Simple Question? I think...

I don't think you can do it without macro variables, but you can get the variable names using SQL:

proc sql noprint;

select name into :varlist separated by ' '

from sashelp.vcolumn

where libname="SASHELP" and memname="CLASS" and name ne 'NAME' and type="num";

select "MISS_"||name into :rename_list separated by ' '

from sashelp.vcolumn

where libname="SASHELP" and memname="CLASS" and name ne 'NAME' and type="num";

quit;

%put &varlist.;

%put &rename_list.;

Super Contributor
Posts: 418

Re: Simple Question? I think...

yeah I didn't think so.

what I"m thinking is to write the code to replace every variable with a 1 or 0 for missing values, and then re-name every variable through macro coding (except loannumber, which is the key). Then I can merge the data onto itself to create an appended matrix of "missing" values.

Well if it's true that you can't do it without macro variables at least I knew more than I thought I did haha!

Trusted Advisor
Posts: 1,128

Re: Simple Question? I think...

Please try the below code, i took the macro variables code of Reeza and modified it a bit and implemented in creating the new dataset.

data have;

    input key var;

cards;

1 .

2 5

3 .

;

proc sql noprint;

select name,"MISS_"||name into :varlist separated by ' ' , :rename_list    separated by ' '

from sashelp.vcolumn

where libname="WORK" and memname="HAVE" and name ne 'key' and type="num";

quit;

%put &varlist.;

%put &rename_list.;

data want;

    set have;

    array cha(*) &rename_list.;

    array val(*) &varlist.;

    do i = 1 to dim(cha);

    cha(i)=nmiss(val(i));

    end;

    drop i;

run;

Thanks,

Jagadish

Thanks,
Jag
Super User
Super User
Posts: 6,498

Re: Simple Question? I think...

Converting that values to 0/1 is simple application of array processing.

The problem is creating the new names.  You need some type of code generation to do that unless your variables are a sequence with numeric suffixes.

data want ;

  set have ;

   array orig var1-var100 ;

   array new miss_var1-miss_var100 ;

   do over orig;

      new = missing(orig);

   end;

run;

Super User
Posts: 5,079

Re: Simple Question? I think...

This problem might be easier if you eliminate the macro language and write the necessary code to a file.  For example:

proc sql noprint;

create table num_names as select name from sashelp.vcolumn

  where libname='WORK' and memname='HAVE' and type='num' and upcase(name) ne 'KEY';

quit;

filename code 'path to some file';

data _null_;

   file code noprint;

   set num_names;

   put 'miss_' 

         name

         '= ('

         name

        '<= .Z);'

        ;

run;

data want;

   set have;

   %include code;

run;

If I recall, you don't need UPCASE for MEMNAME and LIBNAME, but it's worth double-checking.

Good luck.

Super Contributor
Posts: 418

Re: Simple Question? I think...

Yeah unfortunately I can't have numeric suffixes. I was actually thinking of a solution extremely similar to what
Jagadishkatam has posted, however I would have to do it seperate for each of the character and numeric variables (They are mixed).

And getting them in the correct order using this method is a bit tricky...

Else I would just replace each variable with itself into a dataset and merge, but I don't really like that Idea since it involes the extra step of sorting and merging data.

Okay, I will give some of these a shot and let everyone know how things turn out!

Brandon

Super User
Posts: 17,780

Re: Simple Question? I think...

Use the vcolumn table with the order variable to reorder them after the merge.

So, repeat the solution for character.

Select Variable order into macro variable from original table.

Use that macrovariable in a datastep or proc sql to reorder

Super Contributor
Posts: 418

Re: Simple Question? I think...

I did want to point out I actually need to do this across multiple variable types. Aka both Numeric and character values need to be converted to the 1-0 syntax (and maintain their same order, however renamed to MISS_ beforehand).

SO half of Jagadishkatam's answer, with the added step for character variables...

Solution
‎11-13-2013 11:04 AM
PROC Star
Posts: 7,358

Re: Simple Question? I think...

I haven't kept up with this thread thus I might have missed something but, in my opinion, a slight variation of 's code appears to be an easy way to achieve what you want.  e.g.:

data have;

  infile cards truncover;

  input  loannumber numvar charvar $;

  cards;

1 . A

2 5

3 . D

;

proc sql noprint;

create table varnames as select name from sashelp.vcolumn

  where libname='WORK' and

        memname='HAVE' and

        upcase(name) ne 'LOANNUMBER';

quit;

filename code temp;

data _null_;

   file code noprint;

   set varnames;

   put 'if missing(' name ') then MISS_' name '=1;'

       'else MISS_' name '=0;';

run;

data want;

   set have;

   %include code;

run;

Super Contributor
Posts: 418

Re: Simple Question? I think...

Hiya Reeza. Sorry I was actually aware how to get it done using both methods with arrays (I have posted the 'solution' below). I was just actually pointing out to people that I needed it dynamic for both types of variables if they had a way to do it without the array coding (or with array without macros!). Thanks for pointing that out tho!!

Hey Arthur, your code works perfectly!! I am going to mark you as the official correct answer since your code does exactly what I needed. On a further notice, could you tell me any restrictions to your method? Aka does it stop working once you get > 500 variables, etc? (lenght restrictions in string, etc?) I am actually not sure what you are doing, I've never seen that done before. IT looks like you are writing a file out with a bunch of strings and then somehow running it in your code.

On a similar note I was also able to get the same results using the following code modified from above answers.

data have;

infile datalines dsd dlm=',';

input key var var2 $ var3 $ var4;

cards;

1,.,hey,lets go,

2,5, , to the,1

3,.,ho, ,1

;

run;

proc sql noprint;

select name, "MISS_'||name into :numvarlist separated by ' ',:numrename_list separated by ' '

from sashelp.vcolumn

where libname="WORK" and memname="HAVE" and upcase(name) ne 'KEY' and type="num";

quit;

proc sql noprint;

select name, "MISS_'||name into :charvarlist separated by ' ',:charrename_list separated by ' '

from sashelp.vcolumn

where libname="WORK" and memname="HAVE" and upcase(name) ne 'KEY' and type="char";

quit;

proc sort data=sashelp.vcolumn out=ordered(where=(libname="WORK" and memname="HAVE" and upcase(name) ne 'KEY'));

by varnum;

run;

proc sql noprint;

select name, "MISS_'||name into Smiley Surprisedrderlist separated by ' ',Smiley Surprisedrderrename separated by ' '

from ordered;

quit;

data want;

retain key &orderlist &orderrename;

set have;

array char(*) &numrename_list;

array val(*) &numvarlist;

array valchar(*) &charrename_list;

array values1(*) &charvarlist;

do i=1 to dim(cha);

if missing(val(i)) then cha(i)=1;

else cha(i)=0;

end;

do p=1 to dim(valchar);

if missing(values1(p)) then valchar(p)=1;

else valchar(p)=0;

end;

drop i p;

run;

PROC Star
Posts: 7,358

Re: Simple Question? I think...

The method suggested by Astounding, and stolen by me, is only limited (I think) in the initial creation of a macro variable containing your variable names.  Macro variables can only contain up to 32,767 characters.

However, even if all of your variable names are 32 characters long, a macro variable containing a space separated list of those names would only consume 16,500 characters.

The code, itself, doesn't confront any other limitation (that I'm aware of), as it is simply doing all of the work for you by writing the if then else statements for every variable, and then including all of those statements by %include(ing) them in a data step.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 363 views
  • 6 likes
  • 6 in conversation