Perhaps this gives you an alternate:
data example; fname='C:\Users\Chelsea.H\Desktop\QRR_2022\2022_*.csv'; length QHP $ 4.; qhp = scan(fname,-1,'\'); run;
Scan doesn't mind negatives as long you don't exceed the number of substrings separated by the provided delimiter (at which point you get missing values). -1 asks for the last "word" where words in this string are separated by the \ character only.
The file you showed does not have a variable named SEP=.
Did you try to read in your SAS program as if it was a CSV file? That might have SEP= in it.
1 data _null_; 2 infile 'c:\downloads\testsas.csv' obs=5; 3 input; 4 list; 5 run; NOTE: A byte-order mark in the file "c:\downloads\testsas.csv" (for fileref "#LN00102") indicates that the data is encoded in "utf-8". This encoding will be used to process the file. NOTE: The infile 'c:\downloads\testsas.csv' is: (system-specific pathname), (system-specific file attributes) RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 1 Identifier,HierarchyComponent,ComponentScore,CutPoint1,CutPoint2,CutPoint3,CutPoint4,Rating,MeasureS 101 core,RawValue,Denominator,EligiblePopulation,Percentile5th,Percentile10th,Percentile25th,Percentile5 201 0th,Percentile75th,Percentile90th,Percentile95th,Mean,StdDeviation,Min,Max,MeasureBenchmark 291 2 GLOBAL,Global,56.87,60,70,80,90,2,,,,,,,,,,,,,,,, 49 3 S1,SI: Clinical Quality Management,14.987,60,70,80,90,3,,,,,,,,,,,,,,,, 71 4 S1M49,M: Asthma Medication Ratio,,,,,,,64.333,0.99,354,654,0.688,0.72,0.768,0.845,0.892,0.915,0.945, 101 0.827,0.084,0.57,1,0.925 124 5 S1M49a,MI: Asthma Medication Ratio (5-11),,,,,,,,0.75,8,8,,,,,,,,,,,, 69 NOTE: 5 records were read from the infile (system-specific pathname). The minimum record length was 49. The maximum record length was 291. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.01 seconds
Note that if you are reading many CSV files that all have the same variables in the same order then just read them all in one data step. You can use the FILENAME= option on the INFILE statement to create a variable with the name of the current file being read.
So if all of the files look like the one you shared the program might look like this:
data fromcsv;
infile 'C:\Users\Chelsea.H\Desktop\QSS_*.csv' dlm=',' dsd truncover filename=fname;
length Identifier $6 HierarchyComponent $35 ComponentScore 8 CutPoint1 8
CutPoint2 8 CutPoint3 8 CutPoint4 8 Rating 8 MeasureScore 8 RawValue 8
Denominator 8 EligiblePopulation 8 Percentile5th 8 Percentile10th 8
Percentile25th 8 Percentile50th 8 Percentile75th 8 Percentile90th 8
Percentile95th 8 Mean 8 StdDeviation 8 Min 8 Max 8 MeasureBenchmark 8
;
input @;
if fname ne lag(fname) then delete;
input Identifier -- MeasureBenchmark ;
run;
This works great. Thank you.
I am unable to grab the file name to create a column however. I need the 4 characters to the left of underscrore here: \2022_ (where the * is). When I input positive number it brings back error: (substr function call has too many arguments). When it is negative I either get blank or 'User' with smaller negative numbers.
data fromcsv;
infile 'C:\Users\Chelsea.H\Desktop\QRR_2022\2022_*.csv' dlm=',' dsd truncover filename=fname;
length Identifier $6 HierarchyComponent $35 ComponentScore 8 CutPoint1 8
CutPoint2 8 CutPoint3 8 CutPoint4 8 Rating 8 MeasureScore 8 RawValue 8
Denominator 8 EligiblePopulation 8 Percentile5th 8 Percentile10th 8
Percentile25th 8 Percentile50th 8 Percentile75th 8 Percentile90th 8
Percentile95th 8 Mean 8 StdDeviation 8 Min 8 Max 8 MeasureBenchmark 8
;
input @;
if fname ne lag(fname) then delete;
input Identifier -- MeasureBenchmark ;
QHP= substr(fname,length(fname)38,4);
run;
Perhaps this gives you an alternate:
data example; fname='C:\Users\Chelsea.H\Desktop\QRR_2022\2022_*.csv'; length QHP $ 4.; qhp = scan(fname,-1,'\'); run;
Scan doesn't mind negatives as long you don't exceed the number of substrings separated by the provided delimiter (at which point you get missing values). -1 asks for the last "word" where words in this string are separated by the \ character only.
Using SCAN() is much easier. Since you seem to want the first four characters of the actual name of the file (ignoring the path) you could just set the length of variable to $4 and then SAS will truncate it to 4 bytes since there is no place to store the other characters.
data fromcsv;
infile 'C:\Users\Chelsea.H\Desktop\QRR_2022\2022_*.csv' dlm=',' dsd truncover filename=fname;
length QHP $4;
length Identifier $6 HierarchyComponent $35 ComponentScore 8 CutPoint1 8
CutPoint2 8 CutPoint3 8 CutPoint4 8 Rating 8 MeasureScore 8 RawValue 8
Denominator 8 EligiblePopulation 8 Percentile5th 8 Percentile10th 8
Percentile25th 8 Percentile50th 8 Percentile75th 8 Percentile90th 8
Percentile95th 8 Mean 8 StdDeviation 8 Min 8 Max 8 MeasureBenchmark 8
;
input @;
if fname ne lag(fname) then delete;
input Identifier -- MeasureBenchmark ;
QHP= scan(fname,-1,'/\') ;
run;
I've tried the follow and their results: -1 (user), 1 (c:), 2 (user), 3 (error improper order), 4 (blank), 5 (blank), -5 (blank)
I know that it should grab right to left when negative, but I don't see that happening, as -1 or -5 doesn't work. Totally stumped.
data fromcsv;
infile 'C:\Users\Chelsea.Ha-Co\Desktop\QRR_2022\2022_*.csv' dlm=',' dsd truncover filename=fname;
length QHP $4;
length Identifier $6 HierarchyComponent $35 ComponentScore 8 CutPoint1 8
CutPoint2 8 CutPoint3 8 CutPoint4 8 Rating 8 MeasureScore 8 RawValue 8
Denominator 8 EligiblePopulation 8 Percentile5th 8 Percentile10th 8
Percentile25th 8 Percentile50th 8 Percentile75th 8 Percentile90th 8
Percentile95th 8 Mean 8 StdDeviation 8 Min 8 Max 8 MeasureBenchmark 8;
input @;
if fname ne lag(fname) then delete;
input Identifier -- MeasureBenchmark ;
QHP= scan(fname,3,'/\');
run;
@Chelsea8 Define the length for fname explicitly to avoid any risk of truncation. Then use scan() with -1
data fromcsv;
length fname $1000 QHP $100;
infile 'C:\Users\Chelsea.Ha-Co\Desktop\QRR_2022\2022_*.csv' dlm=',' dsd truncover filename=fname;
....
QHP= scan(fname,-1,'\');
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.