BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
cosmid
Lapis Lazuli | Level 10

Hi,

 

If I have 50 datasets in a directory, all with the same format, for example, 

dataset1:
ID x y

1    a  apple

2   b  banana

3   c  orange

 

dataset2-dataset50 all have the same variables, ID, x, and y. How do I find datasets that have x = c? Let's assume out of all 50 datasets, only dataset1 and dataset 15 have records with x = c. Is there a way to look through all datasets and check for a value of particular variable and return the name of the datasets?

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You will have to read the data to see if the values are there.

If the datasets all have the same variables then you can read them all in one data step.  

You can use the INDSNAME= option of the SET statement to know which dataset the current observation comes from.

 

So use a data step like this to read all of the data and keep the names of the datasets that have observations that match your criteria.

data want;
  length dsname indsname $41 ;
  set dataset1-dataset50 indsname=indsname ;
  where x='c';
  dsname=indsname;
  if dsname ne lag(dsname) then output;
  keep dsname;
run;

The only trouble you might have is getting the list of names into the SET statement if the names are not as simple as that.  In that case you might want to get the list into a macro variable.  

 

For example if you want all of the datasets in the libref MYLIB that have a variable named X then the code might look like this instead:

proc sql noprint;
  select distinct catx('.',libname,memname)
    into :memlist separated by ' '
 from dictionary.columns 
 where libname='MYLIB' and upcase(name)='X'
;
quit;
data want;
  length dsname indsname $41 ;
  set &memlist indsname=indsname ;
  where x='c';
  dsname=indsname;
  if dsname ne lag(dsname) then output;
  keep dsname;
run;

 If the datasets might have conflicts for some other variable types (Variables other than X,  for example say Y is character in DATASET1 and numeric in DATASET2) then you will want to add a KEEP= dataset option to each dataset.

View solution in original post

1 REPLY 1
Tom
Super User Tom
Super User

You will have to read the data to see if the values are there.

If the datasets all have the same variables then you can read them all in one data step.  

You can use the INDSNAME= option of the SET statement to know which dataset the current observation comes from.

 

So use a data step like this to read all of the data and keep the names of the datasets that have observations that match your criteria.

data want;
  length dsname indsname $41 ;
  set dataset1-dataset50 indsname=indsname ;
  where x='c';
  dsname=indsname;
  if dsname ne lag(dsname) then output;
  keep dsname;
run;

The only trouble you might have is getting the list of names into the SET statement if the names are not as simple as that.  In that case you might want to get the list into a macro variable.  

 

For example if you want all of the datasets in the libref MYLIB that have a variable named X then the code might look like this instead:

proc sql noprint;
  select distinct catx('.',libname,memname)
    into :memlist separated by ' '
 from dictionary.columns 
 where libname='MYLIB' and upcase(name)='X'
;
quit;
data want;
  length dsname indsname $41 ;
  set &memlist indsname=indsname ;
  where x='c';
  dsname=indsname;
  if dsname ne lag(dsname) then output;
  keep dsname;
run;

 If the datasets might have conflicts for some other variable types (Variables other than X,  for example say Y is character in DATASET1 and numeric in DATASET2) then you will want to add a KEEP= dataset option to each dataset.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 1 reply
  • 150 views
  • 4 likes
  • 2 in conversation