BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
NKormanik
Barite | Level 11

Part of dataset, as a regular text file:

i_20304   >= 1.64214 AND < 4.42166   105    0.7048
i_20304   >= 1.64214 AND < 4.81874    17    1.0000
i_20304   >= 1.64214 AND < 4.81874   179    0.5475

Code attempted  to create SAS dataset:

data sas_1.Variable_Values;
infile "C:\0_SAS_3\Variable Values2.txt" ;
input
Indicator $ Range $ Number Pct_Yes ;
run;

The issue is in getting the second variable into SAS.

 

>= 1.64214 AND < 4.42166

 

I tried putting that variable in quotes. 

 

">= 1.64214 AND < 4.42166"

 

But SAS then split when it ran into the first space character.

 

Any guidance greatly appreciated.

 

Nicholas Kormanik

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Your example makes it look like the data is in fixed columns.

----+----1----+----2----+----3----+----4----+----5
i_20304   >= 1.64214 AND < 4.42166   105    0.7048
i_20304   >= 1.64214 AND < 4.81874    17    1.0000
i_20304   >= 1.64214 AND < 4.81874   179    0.5475
----+----1----+----2----+----3----+----4----+----5

So tell SAS that in your INPUT statement.

input Indicator $ 1-9 Range $ 11-35 Number Pct_Yes ;

View solution in original post

13 REPLIES 13
ed_sas_member
Meteorite | Level 14

Hi @NKormanik 

 

Maybe you could try to use the DSD option in the INFILE statement, as well as the DLM option to specify the delimiter (space (" ")? tabulation ("09"x)?).

The DSD option specifies that when data values are enclosed in quotation marks (as you did), delimiters within the value are treated as character data.

 

Hope this helps.

Best,

Astounding
PROC Star
It looks like your data contains multiple blanks between fields. Take advantage of that:

input indicator $ range $ & number pct_yes;

The & tells SAS to look for two consecutive blanks to mark the end of a variable.
NKormanik
Barite | Level 11

The above suggestions just do not make sense to me.

 

The 'delimiters' in a standard text file (actually output from a SAS job) are spaces.  Lots of spaces.

 

The output included what is shown in the original question.

 

I want to now get it (said output, or portion thereof) into a SAS database.

 

The question is how to get the following into the second variable:

 

>= 1.64214 AND < 4.42166

 

Everything I've tried seems to choke SAS.  How can I make this seemingly basic task more understandable to SAS?

 

PS -- there are 2 million such lines to get in, some smaller, some larger.

 

 

 

 

ballardw
Super User

@NKormanik wrote:

The above suggestions just do not make sense to me.

 

The 'delimiters' in a standard text file (actually output from a SAS job) are spaces.  Lots of spaces.


May be it would be a good idea to share details of the SAS job creating that text. Generally creating output to read back in is poorer than using the options to create a data set directly. I know I have done such in SAS 5.18 and SAS 6 because there wasn't a good option for some things. But that is has changed a lot in the past 20+ years.

Plus most of the SAS text output is fixed column, not space delimited. Space delimited would generally be ONE space not "many". Fixed column however often is generated by "padding" values with "many spaces" get the columns to align.

Tom
Super User Tom
Super User

Your example makes it look like the data is in fixed columns.

----+----1----+----2----+----3----+----4----+----5
i_20304   >= 1.64214 AND < 4.42166   105    0.7048
i_20304   >= 1.64214 AND < 4.81874    17    1.0000
i_20304   >= 1.64214 AND < 4.81874   179    0.5475
----+----1----+----2----+----3----+----4----+----5

So tell SAS that in your INPUT statement.

input Indicator $ 1-9 Range $ 11-35 Number Pct_Yes ;
Shmuel
Garnet | Level 18

In case there are always all columns then you can use next code:

data test;
  infile datalines;
  length indicator $7  range $25;
  input row $50.;
  row = compbl(row);
  indicator = scan(row,1);
  do i=2 to 5; range=catx(' ',range,scan(row,i)); end;
  number = scan(row,7,' ');
  pct = scan(row,8,' ');
  drop i row;
datalines;
i_20304   >= 1.64214 AND < 4.42166   105    0.7048
i_20304   >= 1.64214 AND < 4.81874    17    1.0000
i_20304   >= 1.64214 AND < 4.81874   179    0.5475
; run;
DavePrinsloo
Pyrite | Level 9

When reading in text that is not in a fixed format, i prefer to use the automatic variable _infile_ and then I have more control with the wealth of string handling functions.  eg.

data sasodde;
length infile_copy $1024;
infile ..... end=eof;
input ;
infile_copy=_infile_;

/* now use scan, substrn, and other string functions to parse infile_copy */


run;
NKormanik
Barite | Level 11
If one goes this route, will leading and trailing spaces be automatically deleted?

Why 1-9 and 11-35 as opposed to 1-9 and 10-35?

No need for specifying columns for Number and Pct_Yes?
Tom
Super User Tom
Super User

@NKormanik wrote:
If one goes this route, will leading and trailing spaces be automatically deleted?

Why 1-9 and 11-35 as opposed to 1-9 and 10-35?

No need for specifying columns for Number and Pct_Yes?

You will need to look at the file to decide whether fixed columns work. Make sure to look at the data using a fixed space font and not a proportional font.  I think that SAS will ignore leading spaces in a field, at least that is how it normally reads strings. If you want it to treat the leading spaces as significant you have to use the $CHAR informat.

You should probably also figure out the columns for the other fields.  It doesn't matter for the example lines because none of the lines have only blanks in those fields and none of the fields actually abut directly on each other.  Without the column numbers it will revert to list mode input and hunt for the next non-blank text to read.  

NKormanik
Barite | Level 11
An issue with fixed-column statement: As one might imagine, if the total number of rows in the file exceeds 2 million, it's taxing to page down and examine every page to make sure each and every line is exactly lined up as expected.
Tom
Super User Tom
Super User

Are you asking how to type the data into the text file in a way that will be easy for SAS to read? 

If so then you should use fixed columns for the data or use a delimited file format.  IN the delimited format any value that contains the character you used for the delimiter should be quoted.  Any values with actual double quote characters should also be quoted and the existing quotes doubled up.

 

Or are you trying to read an existing text file? 

If so then post a few more example lines, as the ones you posted are easy to read using fixed columns for each field.

NKormanik
Barite | Level 11
This is a response that I can understand. I so appreciate that.

My hunch is, fixed columns is probably the way to go. So I need to read up on fixed columns.

Thanks!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 2621 views
  • 7 likes
  • 7 in conversation