BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
cosmid
Lapis Lazuli | Level 10

Hi,

 

If I have 50 datasets in a directory, all with the same format, for example, 

dataset1:
ID x y

1    a  apple

2   b  banana

3   c  orange

 

dataset2-dataset50 all have the same variables, ID, x, and y. How do I find datasets that have x = c? Let's assume out of all 50 datasets, only dataset1 and dataset 15 have records with x = c. Is there a way to look through all datasets and check for a value of particular variable and return the name of the datasets?

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You will have to read the data to see if the values are there.

If the datasets all have the same variables then you can read them all in one data step.  

You can use the INDSNAME= option of the SET statement to know which dataset the current observation comes from.

 

So use a data step like this to read all of the data and keep the names of the datasets that have observations that match your criteria.

data want;
  length dsname indsname $41 ;
  set dataset1-dataset50 indsname=indsname ;
  where x='c';
  dsname=indsname;
  if dsname ne lag(dsname) then output;
  keep dsname;
run;

The only trouble you might have is getting the list of names into the SET statement if the names are not as simple as that.  In that case you might want to get the list into a macro variable.  

 

For example if you want all of the datasets in the libref MYLIB that have a variable named X then the code might look like this instead:

proc sql noprint;
  select distinct catx('.',libname,memname)
    into :memlist separated by ' '
 from dictionary.columns 
 where libname='MYLIB' and upcase(name)='X'
;
quit;
data want;
  length dsname indsname $41 ;
  set &memlist indsname=indsname ;
  where x='c';
  dsname=indsname;
  if dsname ne lag(dsname) then output;
  keep dsname;
run;

 If the datasets might have conflicts for some other variable types (Variables other than X,  for example say Y is character in DATASET1 and numeric in DATASET2) then you will want to add a KEEP= dataset option to each dataset.

View solution in original post

1 REPLY 1
Tom
Super User Tom
Super User

You will have to read the data to see if the values are there.

If the datasets all have the same variables then you can read them all in one data step.  

You can use the INDSNAME= option of the SET statement to know which dataset the current observation comes from.

 

So use a data step like this to read all of the data and keep the names of the datasets that have observations that match your criteria.

data want;
  length dsname indsname $41 ;
  set dataset1-dataset50 indsname=indsname ;
  where x='c';
  dsname=indsname;
  if dsname ne lag(dsname) then output;
  keep dsname;
run;

The only trouble you might have is getting the list of names into the SET statement if the names are not as simple as that.  In that case you might want to get the list into a macro variable.  

 

For example if you want all of the datasets in the libref MYLIB that have a variable named X then the code might look like this instead:

proc sql noprint;
  select distinct catx('.',libname,memname)
    into :memlist separated by ' '
 from dictionary.columns 
 where libname='MYLIB' and upcase(name)='X'
;
quit;
data want;
  length dsname indsname $41 ;
  set &memlist indsname=indsname ;
  where x='c';
  dsname=indsname;
  if dsname ne lag(dsname) then output;
  keep dsname;
run;

 If the datasets might have conflicts for some other variable types (Variables other than X,  for example say Y is character in DATASET1 and numeric in DATASET2) then you will want to add a KEEP= dataset option to each dataset.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 1 reply
  • 341 views
  • 4 likes
  • 2 in conversation