
Delimiters in my string variables

New Contributor eas
Posts: 2

I have data from a national survey database that I received as a txt file. The data is from written surveys that were electronically scanned in, so it is a bit messy. The variables are separated by the delimiter |. Unfortunately, several string/free-text variables throughout the data file contain a | where a similar-looking character such as / or lower-case L should be, so when I read the data in, SAS splits these variables into multiple variables. Besides using DSD, which doesn't work in this case, how can I tell SAS to ignore these erroneous delimiters when reading in my data? The preceding and following values are not consistent across cases, so there is nothing to anchor an @character pointer or a find-and-replace command to. In case it helps, the string variables that pose a problem have varying lengths, and the data breaks across cases, so I must use FLOWOVER.

Here is an example of what my data looks like:

Var 1 Var 2               Var 3    Var 4
12345|123 App|e Tree Lane|20051231|E1
67981|5th grade|6th grade|20091231|F2

Super User
Posts: 19,875

Re: Delimiters in my string variables

Go back to your tool and see whether you have the option of specifying a different delimiter and/or putting quotes around the text variables.

There may be other ways, but they definitely won't be clean or easy.

New Contributor eas
Posts: 2

Re: Delimiters in my string variables

Thank you for your response! Do you mean the tool that scanned in the surveys originally? If so, I do not have that option. I received the txt files from another organization and have no way of re-requesting the data in a different format. Any other suggestions are appreciated!

Respected Advisor
Posts: 4,935

Re: Delimiters in my string variables

Could you build a representative sample of the cases that might occur in your data so that we can experiment with different approaches?

PG
Super User
Posts: 7,997

Re: Delimiters in my string variables

If you cannot go back and fix the problem at the source, then everything you do from there on will be guesswork and hence at risk of being wrong. Sure, we can post code with complicated formulas that try to calculate the position of the text and where to read from, but unless these cover every possible scenario you still run the risk of being wrong. Simply put, you cannot have a delimited file whose data contains delimiters and is not quoted. If you export from Excel to CSV, for example, and a field contains the delimiter, then the data element is surrounded by " " to mark where it starts and ends. The application that created this data must have something like that.
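To illustrate the point about quoting, here is a minimal sketch, assuming the supplier had wrapped the free-text field in double quotes. The quoted records below are hypothetical (the sample values from the question with quotes added by hand), and the variable lengths are guesses. With DSD on the INFILE statement, SAS should treat a delimiter inside quotes as data and strip the quotes from the value.

```sas
/* Hypothetical input: the sample records with VAR2 quoted. */
data quoted;
  infile datalines dlm='|' dsd truncover;
  length var1 $5 var2 $40 var3 $8 var4 $2;  /* lengths are assumptions */
  input var1 var2 var3 var4;                /* list input; DSD honours the quotes */
  datalines;
12345|"123 App|e Tree Lane"|20051231|E1
67981|"5th grade|6th grade"|20091231|F2
;
run;

proc print data=quoted noobs;
run;
```

This is why DSD alone does not help with the file in the question: without quotes there is nothing to tell SAS which | characters are data.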

Super User
Posts: 10,046

Re: Delimiters in my string variables

If the stray delimiter only ever appears in VAR2, try this:

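data have;
/* Assumes VAR1, VAR3 and VAR4 are never empty: by default SCAN */
/* treats consecutive delimiters as a single delimiter.         */
input x $80.;                        /* read the whole record as one string */
var1=scan(x,1,'|');                  /* first field */
call scan(x,1,p1,l1,'|');            /* p1,l1: position and length of first field */
call scan(x,-2,p2,l2,'|');           /* p2,l2: same for the second-to-last field */
var2=substr(x,p1+l1+1,p2-p1-l1-2);   /* everything in between, stray | included */
var3=scan(x,-2,'|');                 /* second-to-last field */
var4=scan(x,-1,'|');                 /* last field */
drop p1 p2 l1 l2;
cards;
12345|123 App|e Tree Lane|20051231|E1
67981|5th grade|6th grade|20091231|F2
;
run;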
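
Before relying on the approach above, it may help to see how many records are affected. The following is only a sketch: it assumes a clean record has exactly 3 delimiters (4 fields) and that each record sits on a single line, which the original poster says is not always the case. The third data line is a made-up clean record added for contrast, and the names check, n_pipes and extra are illustrative.

```sas
/* Count the | characters per record and flag records with extras. */
data check;
  infile datalines truncover;
  input x $80.;
  n_pipes = countc(x, '|');   /* number of | characters in the record */
  extra   = n_pipes - 3;      /* assumption: 3 delimiters expected per record */
  datalines;
12345|123 App|e Tree Lane|20051231|E1
67981|5th grade|6th grade|20091231|F2
11111|1 Main Street|20101231|G3
;
run;

/* Distribution of extra delimiters across records. */
proc freq data=check;
  tables extra;
run;
```

Records with extra greater than 1 have more than one stray delimiter, and a negative value would point to a record split across lines. Both are worth inspecting by hand.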
 