I have a CSV file that is being extracted using API (see table 1 below). I am importing that data into SAS so I can clean it up. As I am using proc I realize that some cells have two values that looks like are separated by a tab (see column flat_value first=John last=Doe). So when I import the data wherever the values are separated by a tab they show up in a new line and under the incorrect column (see table 2).
Table1:
PID | field | label | flat_value | client_id |
1 | 121697600 | Hosp Type | Clubhouse | A |
2 | 121697753 | Reporting Period | 6/1/2022 | A |
3 | 121698840 | Confirm Hosp Type | Oneonta Teen Center | A |
4 | 121718909 | What is the name? | first = John last = Doe | A |
5 | 121718930 | Email Address | John.doe@aol.com | A |
6 | 127510515 | Does you have bilingual staff? | No | A |
7 | 121838918 | Total Visits | 0 | A |
8 | 121838577 | Unique Individuals Served | 0 | A |
Table 2:
PID | field | label | flat_value | client_id |
1 | 121697600 | Hosp Type | Clubhouse | A |
2 | 121697753 | Reporting Period | 6/1/2022 | A |
3 | 121698840 | Confirm Hosp Type | Oneonta Teen Center | A |
4 | 121718909 | What is the name? | "first = John | |
last = Doe" | A | |||
5 | 121718930 | Email Address | John.doe@aol.com | A |
6 | 127510515 | Does you have bilingual staff? | No | A |
7 | 121838918 | Total Visits | 0 | A |
8 | 121838577 | Unique Individuals Served | 0 | A |
This is the code being used.
PROC IMPORT
OUT= want
DATAFILE= "C:\have.csv"
DBMS=csv;
GETNAMES=YES;
RUN;
Suggestions on how to deal with the possible tab or space in one cell can be handled. I would want it to either (1) stay in the same cell but may have have a separator, or (2) have its own column or row if possible.
Thanks!
Not a TAB. A TAB would not cause that problem. Instead it looks like someone inserted end of line characters into the cell.
If you are extra lucky the character is a single LF or CR and the actual lines end with a CR and LF. You can then use the TERMSTR=CRLF option on the INFILE statement (or if you want to try letting SAS GUESS how to read it using PROC IMPORT on the FILENAME statement).
filename csv 'c:\have.csv' termstr=crlf;
PROC IMPORT
OUT= want replace
DATAFILE= csv
DBMS=csv
;
RUN;
If you are only a little bit lucky the values with the end of line characters in them a quoted. Then you can use something like this %replace_crlf() macro to first pre-process the file and replace those embedded end of line characters with something else so SAS can parse the file into the right number of records.
And if you are not lucky at all then open the file with an actual text editor and remove the line breaks by hand.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.