Hi @ajulio4
I am sorry, but it is working perfectly well for me as shown below.
The file you sent me (newfile.txt) has 10 lines. The delimiter count and the first 40 bytes of each line are shown here:
dlm_count=251 line_abbrev=XXX;366XXXX;H;20210112;17:32:22;17004;36
dlm_count=31 line_abbrev=XXX;366XXXX;H;20210121;12:08:46;17004;36
dlm_count=1 line_abbrev=(Axxxxxxxxxxxxx, xxxxxxxxxvation, toit,
dlm_count=219 line_abbrev=(rafraichissement, réagencement des espa
dlm_count=251 line_abbrev=XXX;366XXXX;H;20210118;10:40:27;;XXX;XXX
dlm_count=251 line_abbrev=XXX;366XXXX;H;20210118;10:40:28;;XXX;XXX
dlm_count=251 line_abbrev=XXX;366XXXX;H;20210118;10:40:47;;XXX;XXX
dlm_count=251 line_abbrev=XXX;366XXXX;H;20210118;10:40:48;;XXX;XXX
dlm_count=251 line_abbrev=XXX;366XXXX;H;20210118;10:40:51;;XXX;XXX
dlm_count=251 line_abbrev=XXX;366XXXX;H;20210118;10:40:55;;XXX;XXX
The first record is complete with 251 delimiters.
The second record is split over 3 lines with 31+1+219 delimiters, totalling 251.
The remaining 6 records are all complete with 251 delimiters.
So the input file contains 8 complete records.
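A minimal sketch of how a listing like the one above can be produced. The exact step used for the listing is not shown in this post, so treat this as an assumption; the variable names dlm_count and line_abbrev are taken from the output, and the path is the one used in the log further down.

```sas
* Sketch: list the delimiter count and the first 40 bytes of each line;
filename in 'c:\temp\newfile.txt';
data _null_;
  infile in;
  input;
  * A 40-byte character variable truncates each line on assignment;
  length line_abbrev $40;
  dlm_count = count(_infile_, ';');
  line_abbrev = _infile_;
  put dlm_count= line_abbrev=;
run;
```

Lines whose dlm_count differs from 251 are the pieces of a split record, and consecutive pieces should add up to 251.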
* Preprocess the file;
filename in 'c:\temp\newfile.txt';
filename temp 'c:\temp\csvtest_collapsed.csv';
%let expected_delimiters = 251;
data _null_;
  infile in;
  file temp;
  retain csum;
  input;
  * Count the delimiters on the current physical line;
  c = count(_infile_,';');
  if c = &expected_delimiters then put _infile_;
  else do;
    * Partial record - write it without a line break and accumulate the count;
    put _infile_ @;
    csum = sum(csum,c);
    if csum = &expected_delimiters then do;
      * Record complete - end the output line and reset the counter;
      put;
      csum = 0;
    end;
  end;
run;
NOTE: The infile IN is:
Filename=c:\temp\newfile.txt,
RECFM=V,LRECL=32767,File Size (bytes)=3763,
Last Modified=01. december 2022 19:05:57,
Create Time=01. december 2022 19:05:56
NOTE: The file TEMP is:
Filename=c:\temp\csvtest_collapsed.csv,
RECFM=V,LRECL=32767,File Size (bytes)=0,
Last Modified=01. december 2022 19:44:36,
Create Time=30. november 2022 22:25:34
NOTE: 10 records were read from the infile IN.
The minimum record length was 94.
The maximum record length was 466.
NOTE: 8 records were written to the file TEMP.
The minimum record length was 434.
The maximum record length was 537.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
The collapsed file has, as expected, 8 records, because lines 2-4 of the input are joined into one line.
I cannot figure out why you would expect 9 records?
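If you want to double-check the collapsed file before importing it, a sketch like the following counts the lines that do not have the expected number of delimiters. It is not part of the log in this post; the counters n_total and n_bad are names made up for the illustration, and the path and delimiter count are the ones used in the preprocessing step.

```sas
* Sketch: confirm every line of the collapsed file has the expected delimiter count;
filename temp 'c:\temp\csvtest_collapsed.csv';
%let expected_delimiters = 251;
data _null_;
  infile temp end=last;
  input;
  * Sum statements retain their values across iterations;
  n_total + 1;
  if count(_infile_, ';') ne &expected_delimiters then do;
    n_bad + 1;
    put 'Unexpected delimiter count on line ' _n_;
  end;
  * Report the totals after the last line has been read;
  if last then put n_total= n_bad=;
run;
```

With the file discussed here I would expect this to report 8 lines in total and none with an unexpected count.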
Then proceed with import:
* Import the preprocessed file;
proc import datafile=temp out=want dbms=csv replace;
  delimiter=';';
  getnames=no;
run;
...
NOTE: The infile TEMP is:
Filename=c:\temp\csvtest_collapsed.csv,
RECFM=V,LRECL=32767,File Size (bytes)=3759,
Last Modified=01. december 2022 19:44:36,
Create Time=30. november 2022 22:25:34
NOTE: 8 records were read from the infile TEMP.
The minimum record length was 434.
The maximum record length was 537.
NOTE: The data set WORK.WANT has 8 observations and 252 variables.
NOTE: DATA statement used (Total process time):
real time 0.44 seconds
cpu time 0.39 seconds
8 rows created in WORK.WANT from TEMP.
NOTE: WORK.WANT data set was successfully created.
NOTE: The data set WORK.WANT has 8 observations and 252 variables.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.58 seconds
cpu time 0.46 seconds
This is also as expected: 8 observations and 252 variables (251 delimiters separate 252 fields).
I cannot see any errors in this. If you run the same code on the same file and get a different result, please post the complete log from both steps.