SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Check if Multiple Variables exist in a SAS Dataset

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 14
Accepted Solution

Check if Multiple Variables exist in a SAS Dataset

I have a dataset that I want to perform checks on:

1st Check: determine if all wanted variables exist in the dataset

2nd Check: determine if there are any extra/unwanted variables in the dataset other than the wanted variables

Let's say I am using this data below:

Data test1;

input id score1 score2 score3 score4 score5 score6;

cards;

24 100 97 99 100 85 85

28 98 87 98 100 44 90

run;

I want to check if id, score1 score2 score3 and score4 exist in the dataset (let's assume I am pulling this dataset from a database and checking to see if someone created the correct dataset with those wanted variables only).

My desire is to perform those two checks above and either output a dataset called check_var_problem1, if not all the wanted variables exist, and check_var_problem2, if there exists more than the wanted variables in the dataset. If a dataset cannot be created with the problems, then a note would be acceptable in the LOG that says something like (Check1): "score2 is not in this dataset" or (Check2): "This dataset contains score5, score6 in the dataset".

I tried using a loop to check for the missing variables, as well as creating macro variables.  See my code below:

%let var1 = id;

%let var2 = score1;

%let var3 = score2;

%let var4 = score3;

%let var5 = score4;

data check_var_problem1;

dsid = open('test1');

do i = 1 to 5;

check=varnum(dsid, 'var&i');

if check = 0 then put 'var&1 does not exist';

end;

run;

Any help is appreciated.

Thanks,


Accepted Solutions
Solution
‎11-28-2014 11:31 AM
Super User
Posts: 10,018

Re: Check if Multiple Variables exist in a SAS Dataset

Posted in reply to jprosenbaum8908
Data test1;
input id score1 score2 score3 score4 score5 score6;
cards;
24 100 97 99 100 85 85
28 98 87 98 100 44 90
run;

%macro check(libname= ,tname= ,vars= );
data have;
 list="&vars";
 do i=1 to countw(list);
  var=scan(list,i);output;
 end;
 keep var;
run;

proc sql noprint;
create table a as
 select upcase(var) as var from have
 except
 select upcase(name) from dictionary.columns where libname="%upcase(&libname)" and memname="%upcase(&tname)";

create table b as
 select upcase(name) as var from dictionary.columns where libname="%upcase(&libname)" and memname="%upcase(&tname)"
 except
 select upcase(var) from have ;

select count(*) into : a from a;
select count(*) into : b from b;

select var into : alist separated by ' ' from a;
select var into : blist separated by ' ' from b;

quit;

%if &a eq 0 %then %put WARNING: All the variable is included. ;
 %else      %put WARNING: &alist variables are not included. ;

%if &b eq 0 %then %put WARNING: there are not extra variables. ;
 %else      %put WARNING: &blist variables are extra variables. ;
%mend check;

%check(libname=work ,tname=test1 ,vars= id score1 score2 score3 score4)



Xia Keshan

View solution in original post


All Replies
Super User
Posts: 19,768

Re: Check if Multiple Variables exist in a SAS Dataset

Posted in reply to jprosenbaum8908

Perhaps consider checking the metadata on the dataset instead.

Look into SASHELP.VCOLUMN or dictionary.column table that SAS has.

Then if you have your data structure defined in a dataset you can do a SQL join or Datastep Merge and easily identify which variables are present/absent.

Occasional Contributor
Posts: 14

Re: Check if Multiple Variables exist in a SAS Dataset

I need something to be displayed in the log for a quick verification. Thanks, but I don't think that is what I am looking for as an answer.

Super User
Posts: 5,495

Re: Check if Multiple Variables exist in a SAS Dataset

Posted in reply to jprosenbaum8908

A few items to note ...

First, single quotes suppresses all macro-related activity.  All single quotes must become double quotes.

Second, var&i is looking for a macro variable named &i.  You don't have one, nor do you need one.  Instead of %LET statements, you could include these statements in a DATA step:

var1 = 'id';

var2 = 'score1';

...

var5 = 'score4';

Then use DATA step processing:

array vars {5} var1-var5;

Inside the loop:

check = varnum(dsid, vars{i});

I'd suggest you get the code working 100% for a single data set without any macro language involved.  Then consider how to use macro language in the solution.

Super User
Posts: 19,768

Re: Check if Multiple Variables exist in a SAS Dataset

Posted in reply to jprosenbaum8908

This is a long way.

I'm sure you could do it simpler with some call vnext functions and such in a single data step but I don't know how that would work.

data class;

set sashelp.class;

drop sex age;

weight2=weight*2;

run;

%macro compare_data(base=, compare=);

proc sql noprint;

  create table base_in as

  select name

  from sashelp.vcolumn

  where libname=upper(scan("&base", 1, "."))

  and memname=upper(scan("&base", 2, "."))

  order by varnum;

quit;

proc sql noprint;

  create table compare_in as

  select name

  from sashelp.vcolumn

  where libname=upper(scan("&compare", 1, "."))

  and memname=upper(scan("&compare", 2, "."))

  order by varnum;

quit;

proc sql;

  create table comparison as

  select a.name as base_var, b.name as compare_var,

  case when missing(a.name) then catx("-", "Comparison dataset has extra variable", b.name)

  when missing(b.name) then catx("-", "Comparison dataset is missing variable", a.name)

  when a.name=b.name then "Variable in both datasets"

  else "CHECKME"

  end as comparison

  from base_in as a

  full join compare_in as b

  on a.name=b.name;

quit;

data _null_;

set comparison;

if comparison not eq "Variable in both datasets"

  then put "ERROR:" comparison;

run;

%mend;

%compare_data(base=sashelp.class, compare=work.class);

Occasional Contributor
Posts: 8

Re: Check if Multiple Variables exist in a SAS Dataset

Posted in reply to jprosenbaum8908

The method I always use is based on proc contents. It's almost similar to the method Reeza is describing, but less coding.

You run a proc contents on the test1 data-set. The keep variables you can use not only to check the varname but also if type, length or order matches.

proc contents data=test1 out=meta_test1 (keep=name type length varnum);

run;

Then you create a template file (or use the one that's registered in the metadata);

data meta_temp;

     attrib      Id     length=8      format=8.

                    score1 length=8 format=8.

                    etc...

     stop;

run;

Then the rest is the same as the proc sql; create table comparison from Reeza's code.

If you want to check of order, type, length, just add them to the join.

Super Contributor
Posts: 308

Re: Check if Multiple Variables exist in a SAS Dataset

Posted in reply to jprosenbaum8908

Hello,

%let var1 = Age;
%let var2 = Sex;
%let var3 = blabla;

/*determines variable of database*/
proc sql;
create table varb as select name from sashelp.vcolumn
where memname='CLASS';
quit;

/*write the variables to search for in a database*/
%macro a;

%let k=1;

data tosearch;
length vartosearch $ 20;
%do %while (%symexist(var&k));
vartosearch="&&var&k";
output;
%let k=%eval(&k+1);
%end;
run;

%mend a;

%a

/*determines the 2 wanted databases*/
proc sql;
create table check_var_problem2 as
select name from varb
except
select vartosearch from tosearch;
create table check_var_problem1 as
select vartosearch from tosearch
except
select name from varb;
quit;

PROC Star
Posts: 7,467

Re: Check if Multiple Variables exist in a SAS Dataset

Posted in reply to jprosenbaum8908

I think that something like the following would accomplish what you want:

data test1;

  input id score1 x score3 y score5 score6;

  cards;

24 100 97 99 100 85 85

28 98 87 98 100 44 90

run;

%macro check(filenm,vars);

  data check (drop=&vars.);

    set &filenm. (obs=1);

    _error_=0;

  run;

  proc sql noprint;

    select name

      into :vnames separated by " "

         from dictionary.columns

           where libname="WORK" and

                 memname="CHECK"

     ;

  quit;

  %let nvar=&sqlobs.;

  %if nvar=0 %then %put "There were no extra variables";

  %else %then %put "Unexpected vars in file: &vnames.";

%mend check;

%check(test1,id score1 score2 score3 score4)

Occasional Contributor
Posts: 14

Re: Check if Multiple Variables exist in a SAS Dataset

Thanks all for your help. Xia Keshan's code was truly helpful.  I will put this as Answered.

Happy Holidays!

SAS Employee
Posts: 23

Re: Check if Multiple Variables exist in a SAS Dataset

Posted in reply to jprosenbaum8908

Why not use Proc Compare?

proc compare data=Dataset1 compare=Dataset2 nosummary listvar;

run;

Solution
‎11-28-2014 11:31 AM
Super User
Posts: 10,018

Re: Check if Multiple Variables exist in a SAS Dataset

Posted in reply to jprosenbaum8908
Data test1;
input id score1 score2 score3 score4 score5 score6;
cards;
24 100 97 99 100 85 85
28 98 87 98 100 44 90
run;

%macro check(libname= ,tname= ,vars= );
data have;
 list="&vars";
 do i=1 to countw(list);
  var=scan(list,i);output;
 end;
 keep var;
run;

proc sql noprint;
create table a as
 select upcase(var) as var from have
 except
 select upcase(name) from dictionary.columns where libname="%upcase(&libname)" and memname="%upcase(&tname)";

create table b as
 select upcase(name) as var from dictionary.columns where libname="%upcase(&libname)" and memname="%upcase(&tname)"
 except
 select upcase(var) from have ;

select count(*) into : a from a;
select count(*) into : b from b;

select var into : alist separated by ' ' from a;
select var into : blist separated by ' ' from b;

quit;

%if &a eq 0 %then %put WARNING: All the variable is included. ;
 %else      %put WARNING: &alist variables are not included. ;

%if &b eq 0 %then %put WARNING: there are not extra variables. ;
 %else      %put WARNING: &blist variables are extra variables. ;
%mend check;

%check(libname=work ,tname=test1 ,vars= id score1 score2 score3 score4)



Xia Keshan

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 5124 views
  • 8 likes
  • 8 in conversation