eas
Calcite | Level 5

I have data from a national survey database that I received as a txt file. The data come from written surveys that were electronically scanned, so they are a bit messy. The variables are separated by the delimiter |. Unfortunately, several string/free-text variables throughout the file contain a | where the scanner misread a similar-looking character such as / or a lower-case L, so when I read the data in, SAS breaks these variables into multiple variables. Besides using DSD, which doesn't work in this case, how can I tell SAS to ignore these erroneous delimiters when reading in my data? The values before and after the stray delimiter are not consistent across cases, so there is nothing to anchor an @'character' pointer or a find-and-replace command to. In case it helps: the string variables that cause the problem have varying lengths, and cases break across lines, so I must use FLOWOVER.

 

 

Here is an example of what my data looks like:

 

Var 1  Var 2                           Var 3        Var 4

12345|123 App|e Tree Lane|20051231|E1

67981|5th grade|6th grade|20091231|F2

 

5 REPLIES
Reeza
Super User

Go back to your tool, and see if you have the option of specifying a different delimiter and/or creating quotes around variables that are text.  

 

There may be ways but it definitely won't be clean or easy.

eas
Calcite | Level 5

Thank you for your response! Do you mean the tool that scanned in the surveys originally? If so, I do not have that option. I received the txt files from another organization and have no way of re-requesting the data in a different format. Any other suggestions are appreciated!

PGStats
Opal | Level 21

Could you build a representative sample of the cases that might occur in your data so that we can experiment with different approaches?

PG
RW9
Diamond | Level 26

If you cannot go back and fix the problem at the source, then everything you do from here on will be guesswork and therefore at risk of being wrong. Sure, we can post code with complicated formulas that try to calculate the position of the text and where to read from, but unless these cover every possible scenario you still run the risk of being wrong. Simply put, you cannot have a delimited file whose data contains the delimiter and is not quoted. If you export from Excel to CSV, for example, and a field contains the delimiter, then that field is written with " " around it to mark where it starts and ends. The application that created this data must have something like that.
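
To illustrate the quoting point, here is a minimal sketch of how the sample rows would read if the free-text field had been quoted. The quoted rows are hypothetical (the real file is not quoted), and the variable lengths are assumptions chosen to fit the two sample rows.

```sas
/* Hypothetical: the thread's sample rows with var2 wrapped in quotes. */
data quoted;
  /* DSD treats a delimiter inside quotes as data and strips the quotes */
  infile datalines dlm='|' dsd truncover;
  /* Lengths are assumptions sized for the sample rows */
  length var1 $5 var2 $40 var3 $8 var4 $2;
  input var1 var2 var3 var4;
  datalines;
12345|"123 App|e Tree Lane"|20051231|E1
67981|"5th grade|6th grade"|20091231|F2
;
run;
```

With the quotes in place, var2 keeps its embedded | and the remaining fields stay in their columns, which is why DSD alone cannot help when the quotes are missing.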

Ksharp
Super User

If the stray delimiter only occurs in VAR2, try this:

 

 

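data have;
/* Read each record as one string so the stray | can be handled by hand */
input x $80.;
/* var1 is the first |-delimited token */
var1=scan(x,1,'|');
/* p1,l1: start position and length of the first token */
call scan(x,1,p1,l1,'|');
/* p2,l2: start position and length of the second-to-last token */
call scan(x,-2,p2,l2,'|');
/* var2 is everything between those two tokens, embedded | included */
var2=substr(x,p1+l1+1,p2-p1-l1-2);
/* var3 and var4 are counted from the right, so extra | in var2 do not shift them */
var3=scan(x,-2,'|');
var4=scan(x,-1,'|');
drop p1 p2 l1 l2;
cards;
12345|123 App|e Tree Lane|20051231|E1
67981|5th grade|6th grade|20091231|F2
;
run;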
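
Before relying on the assumption that only VAR2 is affected, it may help to see how many records carry extra delimiters. The following is a rough sketch, not part of the code above: it counts the | characters on each line and compares the count with the three delimiters a clean four-variable record should have. The names n_pipes and extra are made up for illustration, and the check assumes one case per line, which the original poster says is not always true in the real file.

```sas
data check;
  infile datalines truncover;
  input x $80.;
  /* COUNTC counts every occurrence of | in the record */
  n_pipes = countc(x, '|');
  /* 4 variables -> 3 delimiters expected; anything above that is a stray | */
  extra = n_pipes - 3;
  datalines;
12345|123 App|e Tree Lane|20051231|E1
67981|5th grade|6th grade|20091231|F2
;
run;

proc print data=check;
  var x n_pipes extra;
run;
```

Both sample rows contain four | characters, so extra is 1 for each. A count alone cannot say which variable the stray delimiter belongs to, so records with extra greater than zero would still need to be inspected.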