I have a beginners question about importing.
I have the following txt file:
Obs road paint bright
1 Center St. 1 43
2 Broadway 1 46
3 Main St. 1 47
4 Main St. 3 54
5 Elm St. 1 55
6 Station Rd. 1 56
7 Center St. 1 59
8 Center St. 4 61
9 Main St. 3 62
10 Center St. 4 62
11 Park Dr. 3 63
12 Main St. 2 64
13 Park Dr. 1 64
14 Broadway 4 64
15 Broadway 2 64
16 Broadway 3 65
17 Station Rd. 3 67
18 Station Rd. 3 67
19 Elm St. 3 68
20 Beech St. 4 71
21 Elm St. 4 72
22 Beech St. 2 75
23 Beech St. 4 75
24 Beech St. 2 76
25 Park Dr. 4 77
26 Elm St. 2 79
27 Station Rd. 2 79
28 Park Dr. 2 84
I tried to import it and as a control I saved the code in a file and the code is:
PROC IMPORT OUT= WORK.B_ROADS
DATAFILE= "F:\SAS Unterlagen\Statistical Business\B_roads.txt"
DBMS=DLM REPLACE;
DELIMITER='20'x;
GETNAMES=YES;
DATAROW=2;
RUN;
Unfortunately the column road contains values which contain spaces. So the resulting table in SAS is different, because it adds a new column and a new variable. Like this:
Obs road paint bright VAR5
1 Center St. 1 43
and so on.....
So the problem is that I somehow have to tell SAS that the variable road contains spaces. How can I achieve this?
Thanks a lot for your help in advance.
Hi,
Why not import that whole text as one long text string. Then process it in a datastep. So (note I wrote this very quickly and didn't think about optimizing or better methods of getting the data, just got it working:
filename mine "S:\Temp\Rob\test.txt";
data have;
attrib buff format=$2000.;
infile mine dsd;
input buff;
run;
data want (drop=tmp);
set have;
obs=input(strip(substr(buff,1,index(buff," "))),best.);
buff=strip(substr(buff,index(buff," ")));
tmp=index(strip(reverse(buff))," ")-1;
bright=input(strip(substr(buff,length(strip(buff))-tmp)),best.);
buff=substr(strip(buff),1,length(strip(buff))-tmp);
run;
Whatever creates the text file really should quote qualify the data in which case the simple answer would be the DSD option on an infile statement in data step program modified from code generated by proc import.
Unfortunately your example data doesn't show the quote qualification. By chance do they appear in the actual file and were removed for the example?
Or perhaps the data is fixed column? If you look at the data using the SAS program editor or Notepad with Courier or fixed width font do the columns align?
Well unfortunately it's not fixed columns! So the problem is that the data is not evenly distributed among the columns. I am not sure, but I think there is no "quote qualification". I attached the txt file so you can see that there is no additional information and the problem is that the column can't be imported with a fixed length, because the column has somehow no fixed length.
Hi,
Why not import that whole text as one long text string. Then process it in a datastep. So (note I wrote this very quickly and didn't think about optimizing or better methods of getting the data, just got it working:
filename mine "S:\Temp\Rob\test.txt";
data have;
attrib buff format=$2000.;
infile mine dsd;
input buff;
run;
data want (drop=tmp);
set have;
obs=input(strip(substr(buff,1,index(buff," "))),best.);
buff=strip(substr(buff,index(buff," ")));
tmp=index(strip(reverse(buff))," ")-1;
bright=input(strip(substr(buff,length(strip(buff))-tmp)),best.);
buff=substr(strip(buff),1,length(strip(buff))-tmp);
run;
So what you need to know to read ROAD is the length of the field. SAS has $VARYING to read it once you know the length. You can calculate that by finding the starting column of PAINT, you always know where paint is relative to the end of line at SCAN -2.
great . request you to pls explain the same
the use of column c and
call scan(_infile_,-2,s,l);
l=s-c-1;
Regards
Tarun Kumar
C is the position of the input statement column pointer after reading OBS and is the starting column of ROAD. See INFILE COLUMN=C;
CALL SCAN tells you where it does not return the word like the SCAN function.
S is the starting position of PAINT
so the length of ROAD is
s-c-1;
Ok, these are all great solutions, thanks a lot for your help!
I dont think you can import the data you have provide using proc import procedure as there are spaces between road variable.
You can use input and infile statement to import this data correctly only if the data is evenly distributed among the columns in txt file.
example:
input obs 1-1 road $ 3-12 paint 15-16 bright 17-3;
cards;
5 Elm St. 1 55
6 Station Rd. 1 56
;
data want(drop=line);
infile datalines truncover;
input line $1000.;
obs=scan(line,1,' ');
road =substr(substr(line,length(scan(line,1,' '))+1),1,index(substr(line,length(scan(line,1,' '))+1),scan(line,-2,' '))-1);
paint=scan(line,-2,' ');
bright=scan(line,-1,' ');
datalines;
1 Center St. 1 43
2 Broadway 1 46
3 Main St. 1 47
4 Main St. 3 54
5 Elm St. 1 55
6 Station Rd. 1 56
7 Center St. 1 59
8 Center St. 4 61
9 Main St. 3 62
10 Center St. 4 62
11 Park Dr. 3 63
12 Main St. 2 64
13 Park Dr. 1 64
14 Broadway 4 64
15 Broadway 2 64
16 Broadway 3 65
17 Station Rd. 3 67
18 Station Rd. 3 67
19 Elm St. 3 68
20 Beech St. 4 71
21 Elm St. 4 72
22 Beech St. 2 75
23 Beech St. 4 75
24 Beech St. 2 76
25 Park Dr. 4 77
26 Elm St. 2 79
27 Station Rd. 2 79
28 Park Dr. 2 84
;
run;
I would like to use perl regular expression.
data want; infile 'c:\temp\B_roads.txt' truncover firstobs=2; input; pid=prxparse('/(\d+)\s+(\D+)\s+(\d+)\s+(\d+)/'); if prxmatch(pid,_infile_) then do; call prxposn(pid,1,p,l); obs=substr(_infile_,p,l); call prxposn(pid,2,p,l); road=substr(_infile_,p,l); call prxposn(pid,3,p,l); paint=substr(_infile_,p,l); call prxposn(pid,4,p,l); bright=substr(_infile_,p,l); end; drop p l pid; run;
Ksharp
@tarun for the call scan explanation, see: SAS(R) 9.4 Functions and CALL Routines: Reference, Second Edition (call scan)
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.