BookmarkSubscribeRSS Feed
CHL0320
Obsidian | Level 7

Hi Community,

 

I would like to see if there are variables in the dataset that have a birthday or dob information in them. The input datasets are in CSV and there are more than 100K files.  The following code seems fine if the input filename has no space or special character in it. However, there are many original filenames with space or special characters as the example below, fij k-1, so the code cannot work. I need help to tackle it as the original names in CSV don't allow to alter. 

Thank you for your help.

 

%Macro CSV(filename);
options obs=5;
options validvarname=any;
options validmemname=extend;
proc import out=&filename
datafile="&dir\&filename..csv"
dbms=csv;
run;

proc sql;
create table keep as
select memname, name ,format
from dictionary.columns
where libname='WORK' and
(
(index(upcase(name),'BIRTH')>0) or
(index(upcase(name),'DOB')>0))

AND

((index(format, 'MMDDYY')>0) or
(index(format, 'DATE')>0))


;
quit;

data list;
set keep;
Directoryname="&dir";
rename memname=Filename name=Variable;
run;

%Mend CSV;

%let dir=C:\abc\ef g
;

%CSV(fij k-1)

 

3 REPLIES 3
ballardw
Super User

Search the forum for reading lists of files using operating system tools such as the DIR (windows) or LS command (Unix/Linux) so that you have a complete path/filename as one variable.

 

Some :

https://communities.sas.com/t5/SAS-Programming/macro-for-load-in-multiple-files-csv/m-p/763795

https://communities.sas.com/t5/SAS-Programming/Help-merging-multiple-CSV-files-to-a-dataset/m-p/1848...

 

 

Or if there are some rules for building the names...

 

If all the files (or possibly even most) have the same structure there are ways to read them using wildcards in the names.

 

BTW:

Doesn't work is awful vague.

Are there errors in the log?: Post the code and log in a code box opened with the "</>" to maintain formatting of error messages.

No output? Post any log in a code box.

Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon or attached as text to show exactly what you have and that we can test code against.

 

Typically debugging macros involves setting Options MPRINT; and possibly SYMBOLGEN and MLOGIC depending on the type of code generation failure you are having.

Reeza
Super User

I wouldn't bother with this portion:

 

((index(format, 'MMDDYY')>0) or
(index(format, 'DATE')>0))

If you're using PROC IMPORT to import the data it could be character or datetime because you're not specifying variable types. Do you only want SAS dates? If so, look at the FMTINFO() function instead to get all date formats. 

 


@CHL0320 wrote:

Hi Community,

 

I would like to see if there are variables in the dataset that have a birthday or dob information in them. The input datasets are in CSV and there are more than 100K files.  The following code seems fine if the input filename has no space or special character in it. However, there are many original filenames with space or special characters as the example below, fij k-1, so the code cannot work. I need help to tackle it as the original names in CSV don't allow to alter. 

Thank you for your help.

 

%Macro CSV(filename);
options obs=5;
options validvarname=any;
options validmemname=extend;
proc import out=&filename
datafile="&dir\&filename..csv"
dbms=csv;
run;

proc sql;
create table keep as
select memname, name ,format
from dictionary.columns
where libname='WORK' and
(
(index(upcase(name),'BIRTH')>0) or
(index(upcase(name),'DOB')>0))

AND

((index(format, 'MMDDYY')>0) or
(index(format, 'DATE')>0))


;
quit;

data list;
set keep;
Directoryname="&dir";
rename memname=Filename name=Variable;
run;

%Mend CSV;

%let dir=C:\abc\ef g
;

%CSV(fij k-1)

 


 

Kurt_Bremser
Super User

I would go about this differently.

  • Find files with a .csv extension. Might be simple or a little complex if you need to search multiple directories.
  • Read the header line of each file
  • Scan it for names indicating a date

I would not rely at all on the formats set by the guessing of PROC IMPORT.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 854 views
  • 2 likes
  • 4 in conversation