BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
A_Swoosh
Quartz | Level 8

Hello, I'm attempting to read in raw text files. The data is in text file format with pipe delimiters and the below is a subset example with not all variables listed:

Category|ID|PIN|First_Name|Middle_Name|Last_Name|BusName|Business_Ind|

I encounter situations in the middle of my data as follows: 

SP220|12345|123456789|John|D|Doe|This is | A | Fake Business|N|

From above, I can see that 'This is A Fake Business' should be the business name but since the delimiter is the same format as the rest, it keeps moving on to the next variable.

 

Without manually making these changes, is there a way to correct this issue in SAS? Also, the next variable is likely to be a single letter (e.g. "N") after business name if that is any help. 

 

My code is as follows:

	    data raw_&file.0    ;
		    %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
		    infile  "&prov.\&provfile." delimiter = &dlm. dsd  missover  lrecl=32767 firstobs=1 ;
		       informat v1 -v200 $256. ;
		       format v1 -v200 $256. ;
		    input v1 -v200 @;
		run;
1 ACCEPTED SOLUTION

Accepted Solutions
data_null__
Jade | Level 19

@A_Swoosh wrote:

Here are some lines from my text file:

Category|ID|PIN|First_Name|Middle_Name|Last_Name|BusName|Business_Ind
SP220|686|1003|John|R|Smith|ABC-Health Hospital|N
SK220|786|1633|Steven|R|Smith|Health Clinic R US|N
SV220|886|1653|Adam|R|Smith|HTP Business Depot|N
SP220|12345|123456789|John|D|Doe|This is | A | Fake Business|N
SE220|986|1513|Jane|R|Smith|Non-profit Center Campus|N
SR220|286|11113|Jeremy|R|Smith|Heart & Health Center|N

 


 

INFILE magic.

 

100  data test;
101     infile cards dsd dlm='|';
102     input @;
103     _infile_ = transtrn(_infile_,' | ',' ');
104     input (v1-v8) (:$64.);
105     list;
106     cards;

RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
107        Category|ID|PIN|First_Name|Middle_Name|Last_Name|BusName|Business_Ind
108        SP220|686|1003|John|R|Smith|ABC-Health Hospital|N
109        SK220|786|1633|Steven|R|Smith|Health Clinic R US|N
110        SV220|886|1653|Adam|R|Smith|HTP Business Depot|N
111        SP220|12345|123456789|John|D|Doe|This is A Fake Business|N 76
112        SE220|986|1513|Jane|R|Smith|Non-profit Center Campus|N 76
113        SR220|286|11113|Jeremy|R|Smith|Heart & Health Center|N 76
NOTE: The data set WORK.TEST has 7 observations and 8 variables.

Capture.PNG

View solution in original post

3 REPLIES 3
Reeza
Super User
Typically if the delimiter is also embedded in text the general protocol is to quote the entire field. Is that not happening in your file? Can you post some records from the exact text file - do not use Excel as an intermediary - it will not share the data correctly.
A_Swoosh
Quartz | Level 8

Here are some lines from my text file:

Category|ID|PIN|First_Name|Middle_Name|Last_Name|BusName|Business_Ind
SP220|686|1003|John|R|Smith|ABC-Health Hospital|N|
SK220|786|1633|Steven|R|Smith|Health Clinic R US|N|
SV220|886|1653|Adam|R|Smith|HTP Business Depot|N|
SP220|12345|123456789|John|D|Doe|This is | A | Fake Business|N|
SE220|986|1513|Jane|R|Smith|Non-profit Center Campus|N|
SR220|286|11113|Jeremy|R|Smith|Heart & Health Center|N|

 

data_null__
Jade | Level 19

@A_Swoosh wrote:

Here are some lines from my text file:

Category|ID|PIN|First_Name|Middle_Name|Last_Name|BusName|Business_Ind
SP220|686|1003|John|R|Smith|ABC-Health Hospital|N
SK220|786|1633|Steven|R|Smith|Health Clinic R US|N
SV220|886|1653|Adam|R|Smith|HTP Business Depot|N
SP220|12345|123456789|John|D|Doe|This is | A | Fake Business|N
SE220|986|1513|Jane|R|Smith|Non-profit Center Campus|N
SR220|286|11113|Jeremy|R|Smith|Heart & Health Center|N

 


 

INFILE magic.

 

100  data test;
101     infile cards dsd dlm='|';
102     input @;
103     _infile_ = transtrn(_infile_,' | ',' ');
104     input (v1-v8) (:$64.);
105     list;
106     cards;

RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
107        Category|ID|PIN|First_Name|Middle_Name|Last_Name|BusName|Business_Ind
108        SP220|686|1003|John|R|Smith|ABC-Health Hospital|N
109        SK220|786|1633|Steven|R|Smith|Health Clinic R US|N
110        SV220|886|1653|Adam|R|Smith|HTP Business Depot|N
111        SP220|12345|123456789|John|D|Doe|This is A Fake Business|N 76
112        SE220|986|1513|Jane|R|Smith|Non-profit Center Campus|N 76
113        SR220|286|11113|Jeremy|R|Smith|Heart & Health Center|N 76
NOTE: The data set WORK.TEST has 7 observations and 8 variables.

Capture.PNG

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 857 views
  • 0 likes
  • 3 in conversation