BookmarkSubscribeRSS Feed
sas_student1
Quartz | Level 8

I have a CSV file that is being extracted using API (see table 1 below). I am importing that data into SAS so I can clean it up. As I am using proc  I realize that some cells have two values that looks like are separated by a tab (see column flat_value first=John last=Doe). So when I import the data wherever the values are separated by a tab they show up in a new line and under the incorrect column (see table 2). 

 

Table1:

PIDfieldlabelflat_valueclient_id
1121697600Hosp TypeClubhouseA
2121697753Reporting Period6/1/2022A
3121698840Confirm Hosp TypeOneonta Teen CenterA
4121718909What is the name?first = John
last = Doe
A
5121718930Email AddressJohn.doe@aol.comA
6127510515Does you have bilingual staff?NoA
7121838918Total Visits0A
8121838577Unique Individuals Served0A

 

 

Table 2:

PIDfieldlabelflat_valueclient_id
1121697600Hosp TypeClubhouseA
2121697753Reporting Period6/1/2022A
3121698840Confirm Hosp TypeOneonta Teen CenterA
4121718909What is the name?"first = John 
last = Doe"A   
5121718930Email AddressJohn.doe@aol.comA
6127510515Does you have bilingual staff?NoA
7121838918Total Visits0A
8121838577Unique Individuals Served0A

 

This is the code being used.

PROC IMPORT 
OUT= want
DATAFILE= "C:\have.csv" 
            DBMS=csv;
     GETNAMES=YES;
RUN;

Suggestions on how to deal with the possible tab or space in one cell can be handled. I would want it to either (1) stay in the same cell but may have have a separator, or (2) have its own column or row if possible. 

 

Thanks!

 

 

 

 

2 REPLIES 2
Tom
Super User Tom
Super User

Not a TAB.  A TAB would not cause that problem.  Instead it looks like someone inserted end of line characters into the cell.

 

If you are extra lucky the character is a single LF or CR and the actual lines end with a CR and LF.  You can then use the TERMSTR=CRLF option on the INFILE statement (or if you want to try letting SAS GUESS how to read it using PROC IMPORT on the FILENAME statement).

filename csv 'c:\have.csv' termstr=crlf;
PROC IMPORT 
  OUT= want replace
  DATAFILE= csv
  DBMS=csv
;
RUN;

If you are only a little bit lucky the values with the end of line characters in them a quoted.  Then you can use something like this %replace_crlf() macro to first pre-process the file and replace those embedded end of line characters with something else so SAS can parse the file into the right number of records.

 

And if you are not lucky at all then open the file with an actual text editor and remove the line breaks by hand.

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 361 views
  • 0 likes
  • 3 in conversation