Hi,
Can someone please explain why part of my records disappears after import?
The scenario is, I import a CSV file thru SAS studio, then for some reason, 2 of the records lost a portion of their data. This 2 row is important for me to continue onto the next step.
This is the strip image of the 2 records after the import;
This is the original view of the record from excel(csv.);
Both the strips have some differences because i hide some of the variables, in SAS studio, the missing portion are MODIFIED(date) ,CREATED(date) and PHONETYPE.
The following is the code I use on the import;
FILENAME REFFILE '/folders/myfolders/Folder1/CH_P02_PHONEDETAIL.csv';
PROC IMPORT DATAFILE=REFFILE
DBMS=CSV
OUT=WORK.CH_P02_PHONEDETAIL1;
GETNAMES=YES;
guessingrows= max;
OPTIONS DATESTYLE=dmy;
RUN;
PROC CONTENTS DATA=WORK.CH_P02_PHONEDETAIL1; RUN;
//This part is used because the import include some really strange records that doesn't exist in the original content//
DATA CH_P02_PHONEDETAIL;
SET CH_P02_PHONEDETAIL1;
IF CH_PHPARTYID= '0-000"' THEN DELETE;
IF PHONEID='' THEN DELETE;
RUN;QUIT;
//THIS PART IS PURELY FOR REFERENCE! //
//This is why I did the second step, but I am not sure if it actually affects my original problem so I post it here as well.//
Thank you!!
@Wken1122 wrote:
In the original source csv, there are only 10411 rows.
That is your problem. The SAS log is showing that it read 10,414 lines from the file. If you think there should be three fewer observations than that then the file must have included some line breaks in the middle of some of the records.
That would also explain the strange values you are seeing for some values as SAS will read the line after the extra line break as a new line and so read the values into the wrong columns.
There are multiple answers on this site for ways to use a SAS program to clean up the file so that it can be read properly.
If you are lucky the extra line breaks are different than the normal end of line character in your file and so you can add the TERMSTR= option to let SAS know this and it will work. If you are not lucky then you will need to do something to fix. You could just open the file in a text editor (even the SAS program editor will work if the lines are not too long) and remove the extra line breaks.
Post your log from the PROC IMPORT only.
In the original source csv, there are only 10411 rows.
change MISSOVER to TRUNCOVER in the code from the log and resubmit that and see if it works.
If there's no error in the log and that doesn't work, can you determine which row is the issue and work backwards from there?
How do I change the MISSOVER?
@Wken1122 wrote:
How do I change the MISSOVER?
Copy the code, CTRL+C, Paste it into the editor, CTRL+V and change the word MISSOVER to TRUNCOVER.
You can hold down ALT key while selecting the code to drop the line numbers.
But you literally type it out....
@Wken1122 wrote:
How do I change the MISSOVER?
You cannot since this code is generated by PROC IMPORT. I think the suggestion was to take the code that PROC IMPORT generated and save it as a new program that you could run. You could then change the option in the INFILE statement.
But that will not fix the problem since the code generated by PROC IMPORT works fine using the MISSOVER option since it only uses list mode INPUT statement.
Also the problem is not that SAS is not reading some values from the line. The problem is that your source file has extra line breaks in the middle of one or more data lines.
@Wken1122 wrote:
In the original source csv, there are only 10411 rows.
That is your problem. The SAS log is showing that it read 10,414 lines from the file. If you think there should be three fewer observations than that then the file must have included some line breaks in the middle of some of the records.
That would also explain the strange values you are seeing for some values as SAS will read the line after the extra line break as a new line and so read the values into the wrong columns.
There are multiple answers on this site for ways to use a SAS program to clean up the file so that it can be read properly.
If you are lucky the extra line breaks are different than the normal end of line character in your file and so you can add the TERMSTR= option to let SAS know this and it will work. If you are not lucky then you will need to do something to fix. You could just open the file in a text editor (even the SAS program editor will work if the lines are not too long) and remove the extra line breaks.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.