BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Chelsea8
Calcite | Level 5
Spoiler
 
I am pulling 11 csv files over from my local c: and the first row is not becoming the sas column names. The first field name is 'sep=' and the following field names are 'var1, var2, var3....etc'. I've tried using 'getnames' and the column names are still sep=, VAR1, VAR2...ect. Also the log shows informat as sep=n, VAR1, VAR2...etc. I've attached a test file as an example of my data. I'm not sure how to get SAS to grab the first row and make them column names. 
 Part of the log: 
informat "sep="N $12. ;
678 informat VAR2 $94. ;
679 informat VAR3 $16. ;
680 informat VAR4 $11. ;
681 informat VAR5 $11. ;
682 informat VAR6 $11. ;
683 informat VAR7 $11. ;
 
 
Import code: 
 
%macro drive(dir,ext); 
   %local cnt filrf rc did memcnt name;
 
   %let cnt=0;
   %let filrf=mydir;
 
   %let rc=%sysfunc(filename(filrf,&dir));
   %let did=%sysfunc(dopen(&filrf));
 
   %if &did ne 0 %then %do;
      %let memcnt=%sysfunc(dnum(&did));
      %do i=1 %to &memcnt;
         %let name=%qscan(%qsysfunc(dread(&did,&i)),-1,.);
         %if %qupcase(%qsysfunc(dread(&did,&i))) ne %qupcase(&name) %then %do;
            %if %superq(ext) = %superq(name) %then %do;
               %let cnt=%eval(&cnt+1);
               %put %qsysfunc(dread(&did,&i));
               proc import datafile="&dir\%qsysfunc(dread(&did,&i))" out=dsn&cnt
                  dbms=csv replace;
getnames=yes;
               run;
            %end;
         %end;
       %end;
    %end;
  %else %put &dir cannot be opened.;
 
  %let rc=%sysfunc(dclose(&did));
%mend drive;
 
%drive(C:\Users\Chelsea.H\Desktop\QSS_2022,csv);
1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Perhaps this gives you an alternate:

data example;
   fname='C:\Users\Chelsea.H\Desktop\QRR_2022\2022_*.csv';
   length QHP $ 4.;
   qhp = scan(fname,-1,'\');
run;

Scan doesn't mind negatives as long you don't exceed the number of substrings separated by the provided delimiter (at which point you get missing values). -1 asks for the last "word" where words in this string are separated by the \ character only.

View solution in original post

7 REPLIES 7
Tom
Super User Tom
Super User

The file you showed does not have a variable named SEP=.

Did you try to read in your SAS program as if it was a CSV file?  That might have SEP= in it.

 

1    data _null_;
2      infile 'c:\downloads\testsas.csv' obs=5;
3      input;
4      list;
5    run;

NOTE: A byte-order mark in the file "c:\downloads\testsas.csv" (for fileref "#LN00102") indicates that the data is encoded in
      "utf-8".  This encoding will be used to process the file.
NOTE: The infile 'c:\downloads\testsas.csv' is:
      (system-specific pathname),
      (system-specific file attributes)

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
1         Identifier,HierarchyComponent,ComponentScore,CutPoint1,CutPoint2,CutPoint3,CutPoint4,Rating,MeasureS
     101  core,RawValue,Denominator,EligiblePopulation,Percentile5th,Percentile10th,Percentile25th,Percentile5
     201  0th,Percentile75th,Percentile90th,Percentile95th,Mean,StdDeviation,Min,Max,MeasureBenchmark 291
2         GLOBAL,Global,56.87,60,70,80,90,2,,,,,,,,,,,,,,,, 49
3         S1,SI: Clinical Quality Management,14.987,60,70,80,90,3,,,,,,,,,,,,,,,, 71
4         S1M49,M: Asthma Medication Ratio,,,,,,,64.333,0.99,354,654,0.688,0.72,0.768,0.845,0.892,0.915,0.945,
     101  0.827,0.084,0.57,1,0.925 124
5         S1M49a,MI: Asthma Medication Ratio (5-11),,,,,,,,0.75,8,8,,,,,,,,,,,, 69
NOTE: 5 records were read from the infile (system-specific pathname).
      The minimum record length was 49.
      The maximum record length was 291.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

 

Note that if you are reading many CSV files that all have the same variables in the same order then just read them all in one data step.  You can use the FILENAME= option on the INFILE statement to create a variable with the name of the current file being read.

 

So if all of the files look like the one you shared the program might look like this:

data fromcsv;
  infile 'C:\Users\Chelsea.H\Desktop\QSS_*.csv' dlm=',' dsd truncover filename=fname;
  length Identifier $6 HierarchyComponent $35 ComponentScore 8 CutPoint1 8
    CutPoint2 8 CutPoint3 8 CutPoint4 8 Rating 8 MeasureScore 8 RawValue 8
    Denominator 8 EligiblePopulation 8 Percentile5th 8 Percentile10th 8
    Percentile25th 8 Percentile50th 8 Percentile75th 8 Percentile90th 8
    Percentile95th 8 Mean 8 StdDeviation 8 Min 8 Max 8 MeasureBenchmark 8
  ;
  input @;
  if fname ne lag(fname) then delete;
  input Identifier -- MeasureBenchmark ;
run;
Chelsea8
Calcite | Level 5

This works great. Thank you. 

I am unable to grab the file name to create a column however. I need the 4 characters to the left of underscrore here: \2022_ (where the * is). When I input positive number it brings back error: (substr function call has too many arguments). When it is negative I either get blank or 'User' with smaller negative numbers. 

 

data fromcsv;
infile 'C:\Users\Chelsea.H\Desktop\QRR_2022\2022_*.csv' dlm=',' dsd truncover filename=fname;
length Identifier $6 HierarchyComponent $35 ComponentScore 8 CutPoint1 8
CutPoint2 8 CutPoint3 8 CutPoint4 8 Rating 8 MeasureScore 8 RawValue 8
Denominator 8 EligiblePopulation 8 Percentile5th 8 Percentile10th 8
Percentile25th 8 Percentile50th 8 Percentile75th 8 Percentile90th 8
Percentile95th 8 Mean 8 StdDeviation 8 Min 8 Max 8 MeasureBenchmark 8
;
input @;
if fname ne lag(fname) then delete;
input Identifier -- MeasureBenchmark ;
QHP= substr(fname,length(fname)38,4);
run;

ballardw
Super User

Perhaps this gives you an alternate:

data example;
   fname='C:\Users\Chelsea.H\Desktop\QRR_2022\2022_*.csv';
   length QHP $ 4.;
   qhp = scan(fname,-1,'\');
run;

Scan doesn't mind negatives as long you don't exceed the number of substrings separated by the provided delimiter (at which point you get missing values). -1 asks for the last "word" where words in this string are separated by the \ character only.

Chelsea8
Calcite | Level 5
Thank you! This got me almost there. Someone from another thread suggested adding VALIDVARNAME=ANY which got the name to come over.
Tom
Super User Tom
Super User

Using SCAN() is much easier.  Since you seem to want the first four characters of the actual name of the file (ignoring the path) you could just set the length of variable to $4 and then SAS will truncate it to 4 bytes since there is no place to store the other characters.

data fromcsv;
infile 'C:\Users\Chelsea.H\Desktop\QRR_2022\2022_*.csv' dlm=',' dsd truncover filename=fname;
length QHP $4;
length Identifier $6 HierarchyComponent $35 ComponentScore 8 CutPoint1 8
CutPoint2 8 CutPoint3 8 CutPoint4 8 Rating 8 MeasureScore 8 RawValue 8
Denominator 8 EligiblePopulation 8 Percentile5th 8 Percentile10th 8
Percentile25th 8 Percentile50th 8 Percentile75th 8 Percentile90th 8
Percentile95th 8 Mean 8 StdDeviation 8 Min 8 Max 8 MeasureBenchmark 8
;
input @;
if fname ne lag(fname) then delete;
input Identifier -- MeasureBenchmark ;
QHP= scan(fname,-1,'/\') ;
run;
Chelsea8
Calcite | Level 5

I've tried the follow and their results: -1 (user), 1 (c:), 2 (user), 3 (error improper order), 4 (blank), 5 (blank), -5 (blank)

I know that it should grab right to left when negative, but I don't see that happening, as -1 or -5 doesn't work. Totally stumped. 

 

data fromcsv;
infile 'C:\Users\Chelsea.Ha-Co\Desktop\QRR_2022\2022_*.csv' dlm=',' dsd truncover filename=fname;
length QHP $4;
length Identifier $6 HierarchyComponent $35 ComponentScore 8 CutPoint1 8
CutPoint2 8 CutPoint3 8 CutPoint4 8 Rating 8 MeasureScore 8 RawValue 8
Denominator 8 EligiblePopulation 8 Percentile5th 8 Percentile10th 8
Percentile25th 8 Percentile50th 8 Percentile75th 8 Percentile90th 8
Percentile95th 8 Mean 8 StdDeviation 8 Min 8 Max 8 MeasureBenchmark 8;
input @;
if fname ne lag(fname) then delete;
input Identifier -- MeasureBenchmark ;
QHP= scan(fname,3,'/\');
run;

Patrick
Opal | Level 21

@Chelsea8 Define the length for fname explicitly to avoid any risk of truncation. Then use scan() with -1

 

data fromcsv;
length fname $1000 QHP $100;
infile 'C:\Users\Chelsea.Ha-Co\Desktop\QRR_2022\2022_*.csv' dlm=',' dsd truncover filename=fname;
....
QHP= scan(fname,-1,'\');
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1382 views
  • 0 likes
  • 4 in conversation