Hi Everyone,
I have a text file similar to the one shown in (1) below. It does not have any delimiters, and it contains the variables name, sex, age, weight, and height. Could you kindly help with how this can be imported into a SAS dataset as shown in (2)?
1)
External text file:
JanetF1562.5112.5
MaryF1566.5112
CarolF1462.8102.5
JudyF1464.390
2)
Dataset table after importing into SAS should look like:
Janet F 15 62.5 112.5
Mary F 15 66.5 112
Carol F 14 62.8 102.5
Judy F 14 64.3 90
Thank you so much,
Regards,
ramm
PRX to the rescue! Assuming that age always has two digits and weight always has a single decimal:
data test;
    length str $100 name $24 sex $1;
    /* Compile the pattern once; the sum statement implicitly retains prx1 */
    if not prx1 then prx1 + prxparse("/(\D+)(M|F)(\d\d)(\d+\.\d)(\S+)/");
    input str;
    if prxmatch(prx1, str) then do;
        /* Each capture buffer holds one field of the record */
        name   = prxposn(prx1, 1, str);
        sex    = prxposn(prx1, 2, str);
        age    = input(prxposn(prx1, 3, str), best.);
        weight = input(prxposn(prx1, 4, str), best.);
        height = input(prxposn(prx1, 5, str), best.);
    end;
    drop prx1;
    datalines;
JanetF1562.5112.5
MaryF1566.5112
CarolF1462.8102.5
JudyF1464.390
;
proc print data=test noobs; run;
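Since the question is about an external text file rather than in-stream data, the same step could be pointed at a file with an INFILE statement. This is only a sketch under the same assumptions as above (two-digit age, one decimal in the fourth field); the file name `class.txt` is a placeholder, not something from the original post.

```sas
/* Sketch: same parsing logic, reading from an external file. */
/* "class.txt" is a hypothetical path; replace it with the real location. */
data test;
    infile "class.txt" truncover;   /* truncover: short lines do not spill into the next record */
    length str $100 name $24 sex $1;
    retain prx1;
    /* Compile the pattern once, on the first iteration */
    if _n_ = 1 then prx1 = prxparse("/(\D+)(M|F)(\d\d)(\d+\.\d)(\S+)/");
    input str $100.;                /* read the whole record into str */
    if prxmatch(prx1, str) then do;
        name   = prxposn(prx1, 1, str);
        sex    = prxposn(prx1, 2, str);
        age    = input(prxposn(prx1, 3, str), best.);
        weight = input(prxposn(prx1, 4, str), best.);
        height = input(prxposn(prx1, 5, str), best.);
    end;
    drop prx1;
run;

proc print data=test noobs; run;
```

The explicit `retain` plus `_n_ = 1` test does the same job as the sum-statement idiom above; either form compiles the pattern only once.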
You are welcome. Here is a description of the pattern, to get you started:
"/(\D+)(M|F)(\d\d)(\d+\.\d)(\S+)/"
One or more non-digits, followed by
M or F, followed by
two digits, followed by
one or more digits, followed by a period, followed by a single digit, followed by
one or more non-space characters
Each pair of parentheses defines a capture buffer (to be extracted with the function prxposn).
Good luck!
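To see how the pattern described above carves up a record, a small throwaway step can write each capture buffer to the log. This is just an illustrative sketch using one of the sample lines from the question; the variable names `buf` and `i` are made up for the demo.

```sas
/* Sketch: print each capture buffer for one sample record to the log */
data _null_;
    length str $100 buf $24;
    str  = "JudyF1464.390";
    prx1 = prxparse("/(\D+)(M|F)(\d\d)(\d+\.\d)(\S+)/");
    if prxmatch(prx1, str) then do i = 1 to 5;   /* the pattern has 5 buffers */
        buf = prxposn(prx1, i, str);
        put i= buf=;
    end;
run;

/* Log output:
   i=1 buf=Judy
   i=2 buf=F
   i=3 buf=14
   i=4 buf=64.3
   i=5 buf=90
*/
```

Note that `\D+` first grabs "JudyF" and then gives back the final "F" so that `(M|F)` can match; that backtracking is what separates the name from the sex code.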
More importantly, why do you have a file where you have to guess which parts are data? You can work around it by assuming fixed widths, as in the example given above by @PGStats. However, what if those assumptions do not hold for all of the data, or the layout changes? Personally, I would go back to the vendor and ask them to provide a robust, easy-to-use file. Correcting the issue at the source will save you time in the long run.
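Until the source file is fixed, one way to at least notice when the assumptions break is to route records that do not match the pattern to a separate dataset. The sketch below assumes the same pattern as earlier in the thread; the dataset name `bad_lines` and the third input line (with no decimal in the fourth field) are made up for illustration.

```sas
/* Sketch: keep parsed rows in TEST, send non-matching records to BAD_LINES */
data test(drop=str prx1) bad_lines(keep=str);
    infile datalines truncover;
    length str $100 name $24 sex $1;
    retain prx1;
    if _n_ = 1 then prx1 = prxparse("/(\D+)(M|F)(\d\d)(\d+\.\d)(\S+)/");
    input str $100.;
    if prxmatch(prx1, str) then do;
        name   = prxposn(prx1, 1, str);
        sex    = prxposn(prx1, 2, str);
        age    = input(prxposn(prx1, 3, str), best.);
        weight = input(prxposn(prx1, 4, str), best.);
        height = input(prxposn(prx1, 5, str), best.);
        output test;
    end;
    else output bad_lines;   /* record did not fit the assumed layout */
    datalines;
JanetF1562.5112.5
MaryF1566.5112
AlfredM1469112.5
;

proc print data=bad_lines noobs; run;
```

This only catches records that fail to match. A record can still match and be split in the wrong place (for example, a one-digit age would borrow a digit from the next field), so range checks on the parsed values would be a sensible addition.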