BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
NCEU
Calcite | Level 5

I have a beginners question about importing.

I have the following txt file:

Obs road paint bright

1 Center St. 1 43

2 Broadway 1 46

3 Main St. 1 47

4 Main St. 3 54

5 Elm St. 1 55

6 Station Rd. 1 56

7 Center St. 1 59

8 Center St. 4 61

9 Main St. 3 62

10 Center St. 4 62

11 Park Dr. 3 63

12 Main St. 2 64

13 Park Dr. 1 64

14 Broadway 4 64

15 Broadway 2 64

16 Broadway 3 65

17 Station Rd. 3 67

18 Station Rd. 3 67

19 Elm St. 3 68

20 Beech St. 4 71

21 Elm St. 4 72

22 Beech St. 2 75

23 Beech St. 4 75

24 Beech St. 2 76

25 Park Dr. 4 77

26 Elm St. 2 79

27 Station Rd. 2 79

28 Park Dr. 2 84


I tried to import it and as a control I saved the code in a file and the code is:

PROC IMPORT OUT= WORK.B_ROADS

            DATAFILE= "F:\SAS Unterlagen\Statistical Business\B_roads.txt"

            DBMS=DLM REPLACE;

     DELIMITER='20'x;

     GETNAMES=YES;

     DATAROW=2;

RUN;

Unfortunately the column road contains values which contain spaces. So the resulting table in SAS is different, because it adds a new column and a new variable. Like this:

Obs road      paint bright VAR5

1     Center     St.     1     43

and so on.....

So the problem is that I somehow have to tell SAS that the variable road contains spaces. How can I achieve this?

Thanks a lot for your help in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Why not import that whole text as one long text string.  Then process it in a datastep.  So (note I wrote this very quickly and didn't think about optimizing or better methods of getting the data, just got it working:

filename mine "S:\Temp\Rob\test.txt";
data have;
  attrib buff format=$2000.;
  infile mine dsd;
  input buff;
run;
data want (drop=tmp);
  set have;
  obs=input(strip(substr(buff,1,index(buff," "))),best.);
  buff=strip(substr(buff,index(buff," ")));
  tmp=index(strip(reverse(buff))," ")-1;
  bright=input(strip(substr(buff,length(strip(buff))-tmp)),best.);
  buff=substr(strip(buff),1,length(strip(buff))-tmp);

run;

View solution in original post

11 REPLIES 11
ballardw
Super User

Whatever creates the text file really should quote qualify the data in which case the simple answer would be the DSD option on an infile statement in  data step program modified from code generated by proc import.

Unfortunately your example data doesn't show the quote qualification. By chance do they appear in the actual file and were removed for the example?

Or perhaps the data is fixed column? If you look at the data using the SAS program editor or Notepad with Courier or fixed width font do the columns align?

NCEU
Calcite | Level 5

Well unfortunately it's not fixed columns! So the problem is that the data is not evenly distributed among the columns. I am not sure, but I think there is no "quote qualification". I attached the txt file so you can see that there is no additional information and the problem is that the column can't be imported with a fixed length, because the column has somehow no fixed length.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Why not import that whole text as one long text string.  Then process it in a datastep.  So (note I wrote this very quickly and didn't think about optimizing or better methods of getting the data, just got it working:

filename mine "S:\Temp\Rob\test.txt";
data have;
  attrib buff format=$2000.;
  infile mine dsd;
  input buff;
run;
data want (drop=tmp);
  set have;
  obs=input(strip(substr(buff,1,index(buff," "))),best.);
  buff=strip(substr(buff,index(buff," ")));
  tmp=index(strip(reverse(buff))," ")-1;
  bright=input(strip(substr(buff,length(strip(buff))-tmp)),best.);
  buff=substr(strip(buff),1,length(strip(buff))-tmp);

run;

data_null__
Jade | Level 19

So what you need to know to read ROAD is the length of the field.  SAS has $VARYING to read it once you know the length.  You can calculate that by finding the starting column of PAINT, you always know where paint is relative to the end of line at SCAN -2.

filename FT15F001 temp;
data paint;
   infile FT15F001 firstobs=2 column=c;
   input obs @;
   call scan(_infile_,-2,s,l);
   l=s-c-1;
  
input road $varying32. l paint bright;
   drop s l;
   parmcards;
Obs road paint bright
1 Center St. 1 43
2 Broadway 1 46
3 Main St. 1 47
4 Main St. 3 54
5 Elm St. 1 55
6 Station Rd. 1 56
7 Center St. 1 59
8 Center St. 4 61
9 Main St. 3 62
10 Center St. 4 62
11 Park Dr. 3 63
12 Main St. 2 64
13 Park Dr. 1 64
14 Broadway 4 64
15 Broadway 2 64
16 Broadway 3 65
17 Station Rd. 3 67
18 Station Rd. 3 67
19 Elm St. 3 68
20 Beech St. 4 71
21 Elm St. 4 72
22 Beech St. 2 75
23 Beech St. 4 75
24 Beech St. 2 76
25 Park Dr. 4 77
26 Elm St. 2 79
27 Station Rd. 2 79
28 Park Dr. 2 84
;;;;
   run;
proc print;
  
run;
TarunKumar
Pyrite | Level 9

great . request you to pls explain  the same

the use of column c and

call scan(_infile_,-2,s,l);
   l=s-c-1;


Regards

Tarun Kumar

data_null__
Jade | Level 19

C is the position of the input statement column pointer after reading OBS and is the starting column of ROAD.  See INFILE COLUMN=C;

CALL SCAN tells you where it does not return the word like the SCAN function.

S is the starting position of PAINT

so the length of ROAD is

s-c-1;

NCEU
Calcite | Level 5

Ok, these are all great solutions, thanks a lot for your help!

Chrishi
Calcite | Level 5

I dont think you can import the data you have provide using proc import procedure as there are spaces between road variable.

You can use input and infile statement to import this data correctly only if the data is evenly distributed among the columns in txt file.

example:

input obs 1-1 road $ 3-12  paint 15-16  bright 17-3;

cards;

5 Elm St.         1 55

6 Station Rd.   1 56

;

pradeepalankar
Obsidian | Level 7

data want(drop=line);

infile datalines truncover;

input line $1000.;

obs=scan(line,1,' ');

road =substr(substr(line,length(scan(line,1,' '))+1),1,index(substr(line,length(scan(line,1,' '))+1),scan(line,-2,' '))-1);

paint=scan(line,-2,' ');

bright=scan(line,-1,' ');

datalines;

1 Center St. 1 43

2 Broadway 1 46

3 Main St. 1 47

4 Main St. 3 54

5 Elm St. 1 55

6 Station Rd. 1 56

7 Center St. 1 59

8 Center St. 4 61

9 Main St. 3 62

10 Center St. 4 62

11 Park Dr. 3 63

12 Main St. 2 64

13 Park Dr. 1 64

14 Broadway 4 64

15 Broadway 2 64

16 Broadway 3 65

17 Station Rd. 3 67

18 Station Rd. 3 67

19 Elm St. 3 68

20 Beech St. 4 71

21 Elm St. 4 72

22 Beech St. 2 75

23 Beech St. 4 75

24 Beech St. 2 76

25 Park Dr. 4 77

26 Elm St. 2 79

27 Station Rd. 2 79

28 Park Dr. 2 84

;

run;

Ksharp
Super User

I would like to use perl regular expression.

data want;
 infile 'c:\temp\B_roads.txt' truncover firstobs=2;
 input;
pid=prxparse('/(\d+)\s+(\D+)\s+(\d+)\s+(\d+)/');
if prxmatch(pid,_infile_) then do;
 call prxposn(pid,1,p,l);
 obs=substr(_infile_,p,l);
 call prxposn(pid,2,p,l);
 road=substr(_infile_,p,l);
 call prxposn(pid,3,p,l);
 paint=substr(_infile_,p,l);
 call prxposn(pid,4,p,l);
 bright=substr(_infile_,p,l);
end;
drop p l pid;
run;


Ksharp

jakarman
Barite | Level 11

@tarun for the call scan explanation, see: SAS(R) 9.4 Functions and CALL Routines: Reference, Second Edition (call scan)

---->-- ja karman --<-----

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 11 replies
  • 11756 views
  • 15 likes
  • 9 in conversation