Part of dataset, as a regular text file:
i_20304   >= 1.64214 AND < 4.42166   105    0.7048
i_20304   >= 1.64214 AND < 4.81874    17    1.0000
i_20304   >= 1.64214 AND < 4.81874   179    0.5475
Code attempted to create SAS dataset:
data sas_1.Variable_Values;
infile "C:\0_SAS_3\Variable Values2.txt" ;
input
Indicator $ Range $ Number Pct_Yes ;
run;
The issue is in getting the second variable into SAS.
>= 1.64214 AND < 4.42166
I tried putting that variable in quotes.
">= 1.64214 AND < 4.42166"
But SAS then split when it ran into the first space character.
Any guidance greatly appreciated.
Nicholas Kormanik
Your example makes it look like the data is in fixed columns.
----+----1----+----2----+----3----+----4----+----5 i_20304 >= 1.64214 AND < 4.42166 105 0.7048 i_20304 >= 1.64214 AND < 4.81874 17 1.0000 i_20304 >= 1.64214 AND < 4.81874 179 0.5475 ----+----1----+----2----+----3----+----4----+----5
So tell SAS that in your INPUT statement.
input Indicator $ 1-9 Range $ 11-35 Number Pct_Yes ;Hi @NKormanik
Maybe you could try to use the DSD option in the INFILE statement, as well as the DLM option to specify the delimiter (space (" ")? tabulation ("09"x)?).
The DSD option specifies that when data values are enclosed in quotation marks (as you did), delimiters within the value are treated as character data.
Hope this helps.
Best,
The above suggestions just do not make sense to me.
The 'delimiters' in a standard text file (actually output from a SAS job) are spaces. Lots of spaces.
The output included what is shown in the original question.
I want to now get it (said output, or portion thereof) into a SAS database.
The question is how to get the following into the second variable:
>= 1.64214 AND < 4.42166
Everything I've tried seems to choke SAS. How can I make this seemingly basic task more understandable to SAS?
PS -- there are 2 million such lines to get in, some smaller, some larger.
If it doesn't make sense to you, here's the documentation. That might help it make sense.
@NKormanik wrote:
The above suggestions just do not make sense to me.
The 'delimiters' in a standard text file (actually output from a SAS job) are spaces. Lots of spaces.
May be it would be a good idea to share details of the SAS job creating that text. Generally creating output to read back in is poorer than using the options to create a data set directly. I know I have done such in SAS 5.18 and SAS 6 because there wasn't a good option for some things. But that is has changed a lot in the past 20+ years.
Plus most of the SAS text output is fixed column, not space delimited. Space delimited would generally be ONE space not "many". Fixed column however often is generated by "padding" values with "many spaces" get the columns to align.
Your example makes it look like the data is in fixed columns.
----+----1----+----2----+----3----+----4----+----5 i_20304 >= 1.64214 AND < 4.42166 105 0.7048 i_20304 >= 1.64214 AND < 4.81874 17 1.0000 i_20304 >= 1.64214 AND < 4.81874 179 0.5475 ----+----1----+----2----+----3----+----4----+----5
So tell SAS that in your INPUT statement.
input Indicator $ 1-9 Range $ 11-35 Number Pct_Yes ;In case there are always all columns then you can use next code:
data test;
  infile datalines;
  length indicator $7  range $25;
  input row $50.;
  row = compbl(row);
  indicator = scan(row,1);
  do i=2 to 5; range=catx(' ',range,scan(row,i)); end;
  number = scan(row,7,' ');
  pct = scan(row,8,' ');
  drop i row;
datalines;
i_20304   >= 1.64214 AND < 4.42166   105    0.7048
i_20304   >= 1.64214 AND < 4.81874    17    1.0000
i_20304   >= 1.64214 AND < 4.81874   179    0.5475
; run;
When reading in text that is not in a fixed format, i prefer to use the automatic variable _infile_ and then I have more control with the wealth of string handling functions. eg.
data sasodde;
length infile_copy $1024;
infile ..... end=eof;
input ;
infile_copy=_infile_;
/* now use scan, substrn, and other string functions to parse infile_copy */
run;
@NKormanik wrote:
If one goes this route, will leading and trailing spaces be automatically deleted?
Why 1-9 and 11-35 as opposed to 1-9 and 10-35?
No need for specifying columns for Number and Pct_Yes?
You will need to look at the file to decide whether fixed columns work. Make sure to look at the data using a fixed space font and not a proportional font. I think that SAS will ignore leading spaces in a field, at least that is how it normally reads strings. If you want it to treat the leading spaces as significant you have to use the $CHAR informat.
You should probably also figure out the columns for the other fields. It doesn't matter for the example lines because none of the lines have only blanks in those fields and none of the fields actually abut directly on each other. Without the column numbers it will revert to list mode input and hunt for the next non-blank text to read.
Are you asking how to type the data into the text file in a way that will be easy for SAS to read?
If so then you should use fixed columns for the data or use a delimited file format. IN the delimited format any value that contains the character you used for the delimiter should be quoted. Any values with actual double quote characters should also be quoted and the existing quotes doubled up.
Or are you trying to read an existing text file?
If so then post a few more example lines, as the ones you posted are easy to read using fixed columns for each field.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
