Hi,
I am trying to import the attached "Sample File.csv" however I am unable to import it successfully. I noticed that there are line feed values and possible carriage return values hence unable to import the data.
Please help in import this .csv file successfully.
filename test 'C:\temp\Sample File.csv' termstr=CRLF;
proc import datafile=test
out=sample
dbms=csv
replace;
getnames=yes;
guessingrows=max;
run;
Please note after searching in google for the solution I have posted this query as I didn't any get solution.
I have tried the below two different codes however still I am unable to import it successfully hence please help me.
FILENAME REFFILE 'C:\temp\Sample File.csv' TERMSTR=LF;
PROC IMPORT DATAFILE=REFFILE
DBMS=CSV
OUT=WORK.TEST;
GETNAMES=YES;
RUN;
DATA WORK.TEST;
LENGTH
PATIENTNUMBER $ 13
SEQ 8
TERM $ 5
TESTSPEC_ $ 200
LN $ 1000
NL $ 990 ;
FORMAT
PATIENTNUMBER $CHAR13.
SEQ BEST1.
TERM $CHAR5.
TESTSPEC_ $CHAR200.
LN $CHAR1000.
NL $CHAR990. ;
INFORMAT
PATIENTNUMBER $CHAR13.
SEQ BEST1.
TERM $CHAR5.
TESTSPEC_ $CHAR200.
LN $CHAR1000.
NL $CHAR990. ;
INFILE 'C:\temp\Sample File.csv'
LRECL=32767 TERMSTR=LF DLM=',' MISSOVER DSD FIRSTOBS = 2 PAD;
INPUT
PATIENTNUMBER : $CHAR13.
SEQ : ?? BEST1.
TERM : $CHAR5.
TESTSPEC_ : $CHAR200.
LN : $CHAR1000.
NL : $CHAR990. ;
run;
You need to first fix the file to not have embedded end of line markers in the middle of the values of fields. You should be able to do that by just counting the number of quotes and removing any end of line markers that are inside quotes (when the number of quotes seen is odd).
Example:
%let dl=C:\Downloads;
filename fixed temp;
data _null_;
infile "&dl/Sample File.csv" end=eof ;
file fixed ;
input;
nq+countc(_infile_,'"');
nq=mod(nq,2);
put _infile_ @;
if eof or not nq then put;
else put '|' @;
run;
data test (compress=yes);
infile fixed dsd truncover firstobs=2;
input PATIENTNUMBER :$20. SEQ TERM :$20. (TESTSPEC_ LN NL) (:$2500.);
run;
proc freq ;
tables seq term;
run;
The FREQ Procedure Cumulative Cumulative SEQ Frequency Percent Frequency Percent -------------------------------------------------------- 1 32 32.00 32 32.00 2 20 20.00 52 52.00 3 15 15.00 67 67.00 4 12 12.00 79 79.00 5 8 8.00 87 87.00 6 5 5.00 92 92.00 7 4 4.00 96 96.00 8 2 2.00 98 98.00 9 2 2.00 100 100.00 Cumulative Cumulative TERM Frequency Percent Frequency Percent ---------------------------------------------------------- AE 59 59.00 59 59.00 OTHER 41 41.00 100 100.00
Here is.
Did you just pick up the first 100 records ?
options compress=yes; data temp; infile 'c:\temp\Sample File.csv' termstr=crlf length=len firstobs=2 lrecl=32767; input temp $varying4000. len; if prxmatch('/^X\d+,|^\d+-\d+,/',temp) then group+1; run; data temp1; length x $ 10000; do until(last.group); set temp; by group; x=cats(x,temp); end; run; data want; set temp1; PATIENTNUMBER=scan(x,1,',','qm'); SEQ=scan(x,2,',','qm'); TERM=dequote(scan(x,3,',','qm')); TESTSPEC_LN=dequote(scan(x,4,',','qm')); NL=dequote(scan(x,5,',','qm')); z= dequote(scan(x,6,',','qm')); drop x temp; run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.