BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Wken1122
Obsidian | Level 7

Hi,
Can someone please explain why part of my records disappears after import?
The scenario is, I import a CSV file thru SAS studio, then for some reason, 2 of the records lost a portion of their data. This 2 row is important for me to continue onto the next step.

This is the strip image of the 2 records after the import;
1.PNG 

This is the original view of the record from excel(csv.);
2.PNG
Both the strips have some differences because i hide some of the variables, in SAS studio, the missing portion are MODIFIED(date) ,CREATED(date) and PHONETYPE.


The following is the code I use on the import;

FILENAME REFFILE '/folders/myfolders/Folder1/CH_P02_PHONEDETAIL.csv';

PROC IMPORT DATAFILE=REFFILE
DBMS=CSV
OUT=WORK.CH_P02_PHONEDETAIL1;
GETNAMES=YES;
guessingrows= max;
OPTIONS DATESTYLE=dmy;
RUN;

PROC CONTENTS DATA=WORK.CH_P02_PHONEDETAIL1; RUN;

//This part is used because the import include some really strange records that doesn't exist in the original content//

DATA CH_P02_PHONEDETAIL;
SET CH_P02_PHONEDETAIL1;
IF CH_PHPARTYID= '0-000"' THEN DELETE;
IF PHONEID='' THEN DELETE;
RUN;QUIT;

//THIS PART IS PURELY FOR REFERENCE! //
//This is why I did the second step, but I am not sure if it actually affects my original problem so I post it here as well.//

3.PNG

Thank you!!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

@Wken1122 wrote:

In the original source csv, there are only 10411 rows.


That is your problem. The SAS log is showing that it read 10,414 lines from the file. If you think there should be three fewer observations than that then the file must have included some line breaks in the middle of some of the records.

 

That would also explain the strange values you are seeing for some values as SAS will read the line after the extra line break as a new line and so read the values into the wrong columns.

 

There are multiple answers on this site for ways to use a SAS program to clean up the file so that it can be read properly.

If you are lucky the extra line breaks are different than the normal end of line character in your file and so you can add the TERMSTR= option to let SAS know this and it will work. If you are not lucky then you will need to do something to fix.  You could just open the file in a text editor (even the SAS program editor will work if the lines are not too long) and remove the extra line breaks. 

View solution in original post

8 REPLIES 8
Reeza
Super User

Post your log from the PROC IMPORT only.

Wken1122
Obsidian | Level 7
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
61
62 %web_drop_table(WORK.IMPORT3);
63
64
65 FILENAME REFFILE '/folders/myfolders/Folder1/CH_P02_PHONEDETAIL.csv';
66
67 PROC IMPORT DATAFILE=REFFILE
68 DBMS=CSV
69 OUT=WORK.CH_P02_PHONEDETAIL;
70 GETNAMES=YES;
71 guessingrows= max;
72 OPTIONS DATESTYLE=dmy;
73 RUN;
 
NOTE: Unable to open parameter catalog: SASUSER.PARMS.PARMS.SLIST in update mode. Temporary parameter values will be saved to
WORK.PARMS.PARMS.SLIST.
74 /**********************************************************************
75 * PRODUCT: SAS
76 * VERSION: 9.4
77 * CREATOR: External File Interface
78 * DATE: 13OCT17
79 * DESC: Generated SAS Datastep Code
80 * TEMPLATE SOURCE: (None Specified.)
81 ***********************************************************************/
82 data WORK.CH_P02_PHONEDETAIL ;
83 %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
84 infile REFFILE delimiter = ',' MISSOVER DSD firstobs=2 ;
85 informat CH_PHPARTYID $81. ;
86 informat PHONEID $21. ;
87 informat SOURCESYSTEMCLIENT $5. ;
88 informat COUNTRYCODE $5. ;
89 informat AREACODE $6. ;
90 informat DIALNUMBER $22. ;
91 informat EXT $6. ;
92 informat PRIMARYIND $3. ;
93 informat SOURCECREATED anydtdtm40. ;
94 informat MODIFIED anydtdtm40. ;
95 informat CREATED anydtdtm40. ;
96 informat PHONETYPE $9. ;
97 format CH_PHPARTYID $81. ;
98 format PHONEID $21. ;
99 format SOURCESYSTEMCLIENT $5. ;
100 format COUNTRYCODE $5. ;
101 format AREACODE $6. ;
102 format DIALNUMBER $22. ;
103 format EXT $6. ;
104 format PRIMARYIND $3. ;
105 format SOURCECREATED datetime. ;
106 format MODIFIED datetime. ;
107 format CREATED datetime. ;
108 format PHONETYPE $9. ;
109 input
110 CH_PHPARTYID $
111 PHONEID $
112 SOURCESYSTEMCLIENT $
113 COUNTRYCODE $
114 AREACODE $
115 DIALNUMBER $
116 EXT $
117 PRIMARYIND $
118 SOURCECREATED
119 MODIFIED
120 CREATED
121 PHONETYPE $
122 ;
123 if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
124 run;
 
NOTE: The infile REFFILE is:
Filename=/folders/myfolders/Folder1/CH_P02_PHONEDETAIL.csv,
Owner Name=root,Group Name=vboxsf,
Access Permission=-rwxrwx---,
Last Modified=10Oct2017:09:35:27,
File Size (bytes)=1438221
 
NOTE: 10414 records were read from the infile REFFILE.
The minimum record length was 54.
The maximum record length was 146.
NOTE: The data set WORK.CH_P02_PHONEDETAIL has 10414 observations and 12 variables.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.05 seconds
 
 
10414 rows created in WORK.CH_P02_PHONEDETAIL from REFFILE.
 
 
 
NOTE: WORK.CH_P02_PHONEDETAIL data set was successfully created.
NOTE: The data set WORK.CH_P02_PHONEDETAIL has 10414 observations and 12 variables.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 5.93 seconds
cpu time 5.37 seconds
Wken1122
Obsidian | Level 7

In the original source csv, there are only 10411 rows.

Reeza
Super User

change MISSOVER to TRUNCOVER in the code from the log and resubmit that and see if it works.

 

If there's no error in the log and that doesn't work, can you determine which row is the issue and work backwards from there?

 

Wken1122
Obsidian | Level 7

How do I change the MISSOVER?

Reeza
Super User

@Wken1122 wrote:

How do I change the MISSOVER?


Copy the code, CTRL+C, Paste it into the editor, CTRL+V and change the word MISSOVER to TRUNCOVER. 

 

You can hold down ALT key while selecting the code to drop the line numbers. 

 

But you literally type it out....

Tom
Super User Tom
Super User

@Wken1122 wrote:

How do I change the MISSOVER?


You cannot since this code is generated by PROC IMPORT. I think the suggestion was to take the code that PROC IMPORT generated and save it as a new program that you could run. You could then change the option in the INFILE statement.

 

But that will not fix the problem since the code generated by PROC IMPORT works fine using the MISSOVER option since it only uses list mode INPUT statement.

 

Also the problem is not that SAS is not reading some values from the line. The problem is that your source file has extra line breaks in the middle of one or more data lines. 

Tom
Super User Tom
Super User

@Wken1122 wrote:

In the original source csv, there are only 10411 rows.


That is your problem. The SAS log is showing that it read 10,414 lines from the file. If you think there should be three fewer observations than that then the file must have included some line breaks in the middle of some of the records.

 

That would also explain the strange values you are seeing for some values as SAS will read the line after the extra line break as a new line and so read the values into the wrong columns.

 

There are multiple answers on this site for ways to use a SAS program to clean up the file so that it can be read properly.

If you are lucky the extra line breaks are different than the normal end of line character in your file and so you can add the TERMSTR= option to let SAS know this and it will work. If you are not lucky then you will need to do something to fix.  You could just open the file in a text editor (even the SAS program editor will work if the lines are not too long) and remove the extra line breaks. 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 3987 views
  • 3 likes
  • 3 in conversation