So I have a .csv file that has what I call hard-returns in it. I'm posting an example below in HAVE. From the example below, I need it to read in 3 records instead of 4. The 2nd record, the Field should read, "Example Test Example Return"
I've tried variations of TERMSTR=CR, SCANOVER, FLOWEVER, etc. I would love any potential help. Thank you!
HAVE:
Field1,Field2,Field3
X,Example Test,100
X,"Example Test
Example Return",200
X,Example Test,300
CODE
DATA NEWFILE;
INFILE TEMP DLM= "," DSD LRECL=32767 FIRSTOBS=2;
INPUT
Field1 :$10.
Field2 :$10.
Field3 :$10.
;
RUN;
Clean it up first. This program will take the data and then remove the hard return and then create a file without it. Then read in that file instead.
filename old 'original.csv';
filename new 'fixed.csv';
data _null_ ;
if eof then put 'NOTE: Records read=' newn 'Records with missing quotes=' missq ;
infile old lrecl=10000 end=eof ;
file new lrecl=10000;
nq=0;
do until (mod(nq,2)=0 or eof );
input;
newn+1;
nq = nq + countc(_infile_,'"');
put _infile_ @;
if mod(nq,2) then do;
missq+1;
put '|' @;
end;
end;
put;
run;
@daadorno wrote:
So I have a .csv file that has what I call hard-returns in it. I'm posting an example below in HAVE. From the example below, I need it to read in 3 records instead of 4. The 2nd record, the Field should read, "Example Test Example Return"
I've tried variations of TERMSTR=CR, SCANOVER, FLOWEVER, etc. I would love any potential help. Thank you!
HAVE:
Field1,Field2,Field3
X,Example Test,100
X,"Example Test
Example Return",200
X,Example Test,300
CODE
DATA NEWFILE;
INFILE TEMP DLM= "," DSD LRECL=32767 FIRSTOBS=2;
INPUT
Field1 :$10.
Field2 :$10.
Field3 :$10.;
RUN;
Clean it up first. This program will take the data and then remove the hard return and then create a file without it. Then read in that file instead.
filename old 'original.csv';
filename new 'fixed.csv';
data _null_ ;
if eof then put 'NOTE: Records read=' newn 'Records with missing quotes=' missq ;
infile old lrecl=10000 end=eof ;
file new lrecl=10000;
nq=0;
do until (mod(nq,2)=0 or eof );
input;
newn+1;
nq = nq + countc(_infile_,'"');
put _infile_ @;
if mod(nq,2) then do;
missq+1;
put '|' @;
end;
end;
put;
run;
@daadorno wrote:
So I have a .csv file that has what I call hard-returns in it. I'm posting an example below in HAVE. From the example below, I need it to read in 3 records instead of 4. The 2nd record, the Field should read, "Example Test Example Return"
I've tried variations of TERMSTR=CR, SCANOVER, FLOWEVER, etc. I would love any potential help. Thank you!
HAVE:
Field1,Field2,Field3
X,Example Test,100
X,"Example Test
Example Return",200
X,Example Test,300
CODE
DATA NEWFILE;
INFILE TEMP DLM= "," DSD LRECL=32767 FIRSTOBS=2;
INPUT
Field1 :$10.
Field2 :$10.
Field3 :$10.;
RUN;
Also see the fifth entry of my footnotes and upvote it.
Hello @daadorno Not necessarily recommending, but I wanted some fun
data want;
if _n_=1 then
do;
dcl hash H ();
h.definekey ("_f");
h.definedata ("field1", "field2");
h.definedone ();
end;
INFILE cards DSD truncover;
array field(3)$100;
if _f=1 then
do;
input _dummy1 :$100. field3 :$100.;
_rc=h.find();
field2=catx(' ',field2,_dummy1);
h.clear();
call missing(_f);
end;
else
do;
input field(*);
_k=cmiss(of field(*));
if _k>0 then do;
_f=1;
_rc=h.add();
end;
end;
if not _f then output;
retain _f;
drop _:;
cards;
X,Example Test,100
X,"Example Test
Example Return",200
X,Example Test,300
;
run;
data want; infile cards; input; length temp $ 200; retain temp; temp=cats(temp,_infile_); if countw(temp,',','qm')=3 then do; Field1=dequote(scan(temp,1,',','m')); Field2=dequote(scan(temp,2,',','m')); Field3=dequote(scan(temp,3,',','m')); output; call missing(temp); end; drop temp; cards; X,Example Test,100 X,"Example Test Example Return",200 X,Example Test,300 ;
Very elegant, sharp and innovative. Class act. I didn't know about 'q' modifier. I just read the documentation. The left to right and right to left is still confusing. May have to look at practical example than words to understand. Brilliant indeed.
Btw, would the M modifier take care of trailing blanks too without a T modifier or would the countw count an extra word for a trailing blank without a T modifier? Not perhaps relevant to this thread but just to understand modifiers well
Nov,
Ha. Actually I steal this idea from PGStat . Now You could see I took many advantage from all of you .
It is good for you to stick with sas communities to sharp your sas programming skill .
For your last question:
I think "would the countw count an extra word for a trailing blank without a T modifier" .You can test it on your own.
Everyone thank you for the responses. I am trying them all to see which will work best for me and my team. This is a HUGE help to me and saves me hours of work. Thank you again.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.