Solved: Re: creating a dataset from the external file which doesnt have any de...

SasFellow331 · Posted 09-08-2014 07:39 PM

hello,

i am having an issue creating a data-set imported from an external file with no delimiters (not even space). I need to create it within sas with added spaces between the variables. Here is what i have so far { data _null_;

set xxxx.class;

file 'C:\Users\xxxxx\Desktop\test9.txt';

put (name sex age height weight) (+(-1));

run; }

i need to get from that to a data-set with delimiters (specifically space between variables). I appreciate any help.

{this raw data is completely made up.]

SasFellow331 · Posted 09-21-2014 10:04 AM

i finally came across the answer.

data a;

infile 'C:\Users\___\Desktop\test1.txt' truncover;

input info $30.;

name = put(reverse(substr(reverse(info),findc(reverse(info),'F''M')+1)),$8.);

sex = put(substr(info, length(name) +1, 1),$1.);

age = input(substr(info, anydigit(info),2),2.);

ht=substr(info, sum(length(name),4), 4);

if ht>=100 then ht=substr(ht,1,2);

height=input(ht,4.);

weight= input(substr(info, sum(length(name),4,length(ht))),5.);

drop info ht;

run;

View solution in original post

Patrick · Posted 09-08-2014 08:11 PM

Can you define ANY logic to be used which defines how to split up the string from source and map SAS variables against?

SasFellow331 · Posted 09-08-2014 09:12 PM

The raw data I had provided was my starting off point, and i need to recreate it through both data export and import statements. The the export logic is as follows (which i had previously stated)

data <dataset>;

set <source data>;

file 'name and location of the external file';

put <variables list>;

run;

as for the import statement, it should follow this format;

data <dataset>;

infile 'name and location of the external file';

input <variables list>;

run;

SASKiwi · Posted 09-08-2014 10:07 PM

Using your made up example, you can figure out the boundary between sex and age because age starts with the first number in the record. If you assume sex is one character long, you can then work out where the name boundary is, which is the position of the first number minus 2.

However there is no way of working out the boundaries between the numeric fields age height and weight unless you make assumptions about how long each of these is. If age height and weight are always the same length then you can find the boundaries.

If they are of varying length it is impossible to do this. So if your real data consists of a series of numbers of varying length (between records) with no delimiters then there is no way you can find the boundaries unless as Patrick says there are "other" logic or rules you can apply.

Tom · Posted 09-08-2014 10:00 PM

If your text file does not have delimiters you will not be able to read it into multiple variables unless you have some rules for how to split the lines into variables. SAS can very easily read a file without delimiters if the variables appear in fixed columns.

For example:

data _null_;

set sashelp.class ;

put (name sex age height weight)( $7. $1. 2. 2. 5.1);

run;

Alfred M1469112.5

Alice F1357 84.0

BarbaraF1365 98.0

Carol F1463102.5

Henry M1464102.5

James M1257 83.0

Jane F1260 84.5

Janet F1563112.5

JeffreyM1363 84.0

John M1259 99.5

Joyce F1151 50.5

Judy F1464 90.0

Louise F1256 77.0

Mary F1567112.0

Philip M1672150.0

Robert M1265128.0

Ronald M1567133.0

Thomas M1158 85.0

WilliamM1567112.0

PGStats · Posted 09-08-2014 10:47 PM

As Tom pointed out, this problem doesn't have a general solution unless you use fixed columns. If you have at least some minimal formatting rules you might be able to parse variable length data lines with regular expressions, using a separate capture buffer for each field :

data _null_;

set sashelp.class;

file "&sasforum.\datasets\sasFellow331.txt";

format age z2. height 4.1 weight 5.1; /* Use fixed point formats */

put (name sex age height weight) (+(-1));

run;

data want;

if not prxID then prxID + prxparse("/(\w+)([MF])(\d\d)(\d+\.\d)(\d+\.\d)/");

infile "&sasforum.\datasets\sasFellow331.txt" truncover;

input txt $100.;

length name $8 sex $1;

if prxmatch(prxID, txt) then do;

call prxposn(prxID, 1, position, length);

name = substr(txt, position, length);

call prxposn(prxID, 2, position, length);

sex = substr(txt, position, length);

call prxposn(prxID, 3, position, length);

age = input(substr(txt, position, length),best.);

call prxposn(prxID, 4, position, length);

height = input(substr(txt, position, length),best.);

call prxposn(prxID, 5, position, length);

weight = input(substr(txt, position, length),best.);

output;

end;

drop prxID txt position length;

run;

proc print data=want noobs; run;

PG

SasFellow331 · Posted 09-21-2014 10:04 AM

i finally came across the answer.

data a;

infile 'C:\Users\___\Desktop\test1.txt' truncover;

input info $30.;

name = put(reverse(substr(reverse(info),findc(reverse(info),'F''M')+1)),$8.);

sex = put(substr(info, length(name) +1, 1),$1.);

age = input(substr(info, anydigit(info),2),2.);

ht=substr(info, sum(length(name),4), 4);

if ht>=100 then ht=substr(ht,1,2);

height=input(ht,4.);

weight= input(substr(info, sum(length(name),4,length(ht))),5.);

drop info ht;

run;

creating a dataset from the external file which doesnt have any delimiters

Re: creating a dataset from the external file which doesnt have any delimiters

Re: creating a dataset from the external file which doesnt have any delimiters

Re: creating a dataset from the external file which doesnt have any delimiters

Re: creating a dataset from the external file which doesnt have any delimiters

Re: creating a dataset from the external file which doesnt have any delimiters

Re: creating a dataset from the external file which doesnt have any delimiters

Re: creating a dataset from the external file which doesnt have any delimiters

Catch up on SAS Innovate 2026

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away