SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Import space delimited text if column values contain spaces

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 8
Accepted Solution

Import space delimited text if column values contain spaces

I have a beginners question about importing.

I have the following txt file:

Obs road paint bright

1 Center St. 1 43

2 Broadway 1 46

3 Main St. 1 47

4 Main St. 3 54

5 Elm St. 1 55

6 Station Rd. 1 56

7 Center St. 1 59

8 Center St. 4 61

9 Main St. 3 62

10 Center St. 4 62

11 Park Dr. 3 63

12 Main St. 2 64

13 Park Dr. 1 64

14 Broadway 4 64

15 Broadway 2 64

16 Broadway 3 65

17 Station Rd. 3 67

18 Station Rd. 3 67

19 Elm St. 3 68

20 Beech St. 4 71

21 Elm St. 4 72

22 Beech St. 2 75

23 Beech St. 4 75

24 Beech St. 2 76

25 Park Dr. 4 77

26 Elm St. 2 79

27 Station Rd. 2 79

28 Park Dr. 2 84


I tried to import it and as a control I saved the code in a file and the code is:

PROC IMPORT OUT= WORK.B_ROADS

            DATAFILE= "F:\SAS Unterlagen\Statistical Business\B_roads.txt"

            DBMS=DLM REPLACE;

     DELIMITER='20'x;

     GETNAMES=YES;

     DATAROW=2;

RUN;

Unfortunately the column road contains values which contain spaces. So the resulting table in SAS is different, because it adds a new column and a new variable. Like this:

Obs road      paint bright VAR5

1     Center     St.     1     43

and so on.....

So the problem is that I somehow have to tell SAS that the variable road contains spaces. How can I achieve this?

Thanks a lot for your help in advance.


Accepted Solutions
Solution
‎04-08-2014 04:21 AM
Super User
Super User
Posts: 7,401

Re: Import space delimited text if column values contain spaces

Hi,

Why not import that whole text as one long text string.  Then process it in a datastep.  So (note I wrote this very quickly and didn't think about optimizing or better methods of getting the data, just got it working:

filename mine "S:\Temp\Rob\test.txt";
data have;
  attrib buff format=$2000.;
  infile mine dsd;
  input buff;
run;
data want (drop=tmp);
  set have;
  obs=input(strip(substr(buff,1,index(buff," "))),best.);
  buff=strip(substr(buff,index(buff," ")));
  tmp=index(strip(reverse(buff))," ")-1;
  bright=input(strip(substr(buff,length(strip(buff))-tmp)),best.);
  buff=substr(strip(buff),1,length(strip(buff))-tmp);

run;

View solution in original post


All Replies
Super User
Posts: 10,497

Re: Import space delimited text if column values contain spaces

Whatever creates the text file really should quote qualify the data in which case the simple answer would be the DSD option on an infile statement in  data step program modified from code generated by proc import.

Unfortunately your example data doesn't show the quote qualification. By chance do they appear in the actual file and were removed for the example?

Or perhaps the data is fixed column? If you look at the data using the SAS program editor or Notepad with Courier or fixed width font do the columns align?

Occasional Contributor
Posts: 8

Re: Import space delimited text if column values contain spaces

Well unfortunately it's not fixed columns! So the problem is that the data is not evenly distributed among the columns. I am not sure, but I think there is no "quote qualification". I attached the txt file so you can see that there is no additional information and the problem is that the column can't be imported with a fixed length, because the column has somehow no fixed length.

Attachment
Solution
‎04-08-2014 04:21 AM
Super User
Super User
Posts: 7,401

Re: Import space delimited text if column values contain spaces

Hi,

Why not import that whole text as one long text string.  Then process it in a datastep.  So (note I wrote this very quickly and didn't think about optimizing or better methods of getting the data, just got it working:

filename mine "S:\Temp\Rob\test.txt";
data have;
  attrib buff format=$2000.;
  infile mine dsd;
  input buff;
run;
data want (drop=tmp);
  set have;
  obs=input(strip(substr(buff,1,index(buff," "))),best.);
  buff=strip(substr(buff,index(buff," ")));
  tmp=index(strip(reverse(buff))," ")-1;
  bright=input(strip(substr(buff,length(strip(buff))-tmp)),best.);
  buff=substr(strip(buff),1,length(strip(buff))-tmp);

run;

Respected Advisor
Posts: 3,777

Re: Import space delimited text if column values contain spaces

So what you need to know to read ROAD is the length of the field.  SAS has $VARYING to read it once you know the length.  You can calculate that by finding the starting column of PAINT, you always know where paint is relative to the end of line at SCAN -2.

filename FT15F001 temp;
data paint;
   infile FT15F001 firstobs=2 column=c;
   input obs @;
   call scan(_infile_,-2,s,l);
   l=s-c-1;
  
input road $varying32. l paint bright;
   drop s l;
   parmcards;
Obs road paint bright
1 Center St. 1 43
2 Broadway 1 46
3 Main St. 1 47
4 Main St. 3 54
5 Elm St. 1 55
6 Station Rd. 1 56
7 Center St. 1 59
8 Center St. 4 61
9 Main St. 3 62
10 Center St. 4 62
11 Park Dr. 3 63
12 Main St. 2 64
13 Park Dr. 1 64
14 Broadway 4 64
15 Broadway 2 64
16 Broadway 3 65
17 Station Rd. 3 67
18 Station Rd. 3 67
19 Elm St. 3 68
20 Beech St. 4 71
21 Elm St. 4 72
22 Beech St. 2 75
23 Beech St. 4 75
24 Beech St. 2 76
25 Park Dr. 4 77
26 Elm St. 2 79
27 Station Rd. 2 79
28 Park Dr. 2 84
;;;;
   run;
proc print;
  
run;
Contributor
Posts: 74

Re: Import space delimited text if column values contain spaces

great . request you to pls explain  the same

the use of column c and

call scan(_infile_,-2,s,l);
   l=s-c-1;


Regards

Tarun Kumar

Respected Advisor
Posts: 3,777

Re: Import space delimited text if column values contain spaces

C is the position of the input statement column pointer after reading OBS and is the starting column of ROAD.  See INFILE COLUMN=C;

CALL SCAN tells you where it does not return the word like the SCAN function.

S is the starting position of PAINT

so the length of ROAD is

s-c-1;

Occasional Contributor
Posts: 8

Re: Import space delimited text if column values contain spaces

Ok, these are all great solutions, thanks a lot for your help!

Contributor
Posts: 29

Re: Import space delimited text if column values contain spaces

I dont think you can import the data you have provide using proc import procedure as there are spaces between road variable.

You can use input and infile statement to import this data correctly only if the data is evenly distributed among the columns in txt file.

example:

input obs 1-1 road $ 3-12  paint 15-16  bright 17-3;

cards;

5 Elm St.         1 55

6 Station Rd.   1 56

;

Frequent Contributor
Posts: 106

Re: Import space delimited text if column values contain spaces

data want(drop=line);

infile datalines truncover;

input line $1000.;

obs=scan(line,1,' ');

road =substr(substr(line,length(scan(line,1,' '))+1),1,index(substr(line,length(scan(line,1,' '))+1),scan(line,-2,' '))-1);

paint=scan(line,-2,' ');

bright=scan(line,-1,' ');

datalines;

1 Center St. 1 43

2 Broadway 1 46

3 Main St. 1 47

4 Main St. 3 54

5 Elm St. 1 55

6 Station Rd. 1 56

7 Center St. 1 59

8 Center St. 4 61

9 Main St. 3 62

10 Center St. 4 62

11 Park Dr. 3 63

12 Main St. 2 64

13 Park Dr. 1 64

14 Broadway 4 64

15 Broadway 2 64

16 Broadway 3 65

17 Station Rd. 3 67

18 Station Rd. 3 67

19 Elm St. 3 68

20 Beech St. 4 71

21 Elm St. 4 72

22 Beech St. 2 75

23 Beech St. 4 75

24 Beech St. 2 76

25 Park Dr. 4 77

26 Elm St. 2 79

27 Station Rd. 2 79

28 Park Dr. 2 84

;

run;

Super User
Posts: 9,676

Re: Import space delimited text if column values contain spaces

I would like to use perl regular expression.

data want;
 infile 'c:\temp\B_roads.txt' truncover firstobs=2;
 input;
pid=prxparse('/(\d+)\s+(\D+)\s+(\d+)\s+(\d+)/');
if prxmatch(pid,_infile_) then do;
 call prxposn(pid,1,p,l);
 obs=substr(_infile_,p,l);
 call prxposn(pid,2,p,l);
 road=substr(_infile_,p,l);
 call prxposn(pid,3,p,l);
 paint=substr(_infile_,p,l);
 call prxposn(pid,4,p,l);
 bright=substr(_infile_,p,l);
end;
drop p l pid;
run;


Ksharp

Valued Guide
Posts: 3,208

Re: Import space delimited text if column values contain spaces

@tarun for the call scan explanation, see: SAS(R) 9.4 Functions and CALL Routines: Reference, Second Edition (call scan)

---->-- ja karman --<-----
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 1069 views
  • 15 likes
  • 9 in conversation