BookmarkSubscribeRSS Feed
CHL0320
Obsidian | Level 7

Hi Community,

 

I would like to see if there are variables in the dataset that have a birthday or dob information in them. The input datasets are in CSV and there are more than 100K files.  The following code seems fine if the input filename has no space or special character in it. However, there are many original filenames with space or special characters as the example below, fij k-1, so the code cannot work. I need help to tackle it as the original names in CSV don't allow to alter. 

Thank you for your help.

 

%Macro CSV(filename);
options obs=5;
options validvarname=any;
options validmemname=extend;
proc import out=&filename
datafile="&dir\&filename..csv"
dbms=csv;
run;

proc sql;
create table keep as
select memname, name ,format
from dictionary.columns
where libname='WORK' and
(
(index(upcase(name),'BIRTH')>0) or
(index(upcase(name),'DOB')>0))

AND

((index(format, 'MMDDYY')>0) or
(index(format, 'DATE')>0))


;
quit;

data list;
set keep;
Directoryname="&dir";
rename memname=Filename name=Variable;
run;

%Mend CSV;

%let dir=C:\abc\ef g
;

%CSV(fij k-1)

 

3 REPLIES 3
ballardw
Super User

Search the forum for reading lists of files using operating system tools such as the DIR (windows) or LS command (Unix/Linux) so that you have a complete path/filename as one variable.

 

Some :

https://communities.sas.com/t5/SAS-Programming/macro-for-load-in-multiple-files-csv/m-p/763795

https://communities.sas.com/t5/SAS-Programming/Help-merging-multiple-CSV-files-to-a-dataset/m-p/1848...

 

 

Or if there are some rules for building the names...

 

If all the files (or possibly even most) have the same structure there are ways to read them using wildcards in the names.

 

BTW:

Doesn't work is awful vague.

Are there errors in the log?: Post the code and log in a code box opened with the "</>" to maintain formatting of error messages.

No output? Post any log in a code box.

Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon or attached as text to show exactly what you have and that we can test code against.

 

Typically debugging macros involves setting Options MPRINT; and possibly SYMBOLGEN and MLOGIC depending on the type of code generation failure you are having.

Reeza
Super User

I wouldn't bother with this portion:

 

((index(format, 'MMDDYY')>0) or
(index(format, 'DATE')>0))

If you're using PROC IMPORT to import the data it could be character or datetime because you're not specifying variable types. Do you only want SAS dates? If so, look at the FMTINFO() function instead to get all date formats. 

 


@CHL0320 wrote:

Hi Community,

 

I would like to see if there are variables in the dataset that have a birthday or dob information in them. The input datasets are in CSV and there are more than 100K files.  The following code seems fine if the input filename has no space or special character in it. However, there are many original filenames with space or special characters as the example below, fij k-1, so the code cannot work. I need help to tackle it as the original names in CSV don't allow to alter. 

Thank you for your help.

 

%Macro CSV(filename);
options obs=5;
options validvarname=any;
options validmemname=extend;
proc import out=&filename
datafile="&dir\&filename..csv"
dbms=csv;
run;

proc sql;
create table keep as
select memname, name ,format
from dictionary.columns
where libname='WORK' and
(
(index(upcase(name),'BIRTH')>0) or
(index(upcase(name),'DOB')>0))

AND

((index(format, 'MMDDYY')>0) or
(index(format, 'DATE')>0))


;
quit;

data list;
set keep;
Directoryname="&dir";
rename memname=Filename name=Variable;
run;

%Mend CSV;

%let dir=C:\abc\ef g
;

%CSV(fij k-1)

 


 

Kurt_Bremser
Super User

I would go about this differently.

  • Find files with a .csv extension. Might be simple or a little complex if you need to search multiple directories.
  • Read the header line of each file
  • Scan it for names indicating a date

I would not rely at all on the formats set by the guessing of PROC IMPORT.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1124 views
  • 2 likes
  • 4 in conversation