A friend of mine asked me a question that I could not answer. She's collecting hospital data in a Notepad file (argh!!!!) and wants to read it into a SAS data set.
I've attached records from two hospitals (data were made up just to show what's needed): Cons.txt. Below is how her data set of 13 observations should look like.
Site ID Code fts Heal pc
BTP04 1 xys H9 322 11
BTP04 2 v3bwwwr15 m2 95 5
BTP04 3 qd23 H4 228 3
BTP04 4 FRt5 m3 142 3
BTP04 5 p002Ft H4 361 9
BTP04 6 NNNz13 d1 48 1
RRR02 1 qus H7 204 11
RRR02 2 w752 m3 95 5
RRR02 3 rsss PT 228 3
RRR02 4 4562 m3 142 3
RRR02 5 4862 H4 361 9
RRR02 6 bVf6RW d1 122 1
RRR02 7 YyPpk8 M3 100 12
Note that the variable "Site" is captured from the first line of data for each hospital.
Thank you very much!
Andrew
So are you saying you want help reading data that looks like the lines in your attached file?
BTP04 Patients Mags 1=23 2=15 3=9 4=6 5=4 ___________________________________________________________________________________ ID Code fts |LOS|BH |pcE |pcEP|pcLP|pcSR|Spd |X| CONSm|Heal|pc| 1 xys H9| | | | | | | || | | | 322|11| 2 v3bwwwr15 m2| | | | 3 | 3 | 3 | || 3 | | 3| 95| 5| 3 qd23 H4| 3 | 2 | 3 | | 1 | 1 | 3 || 21 | | 2 1 3| 228| 3| 4 FRt5 m3| 2 | 3 | 2 | 1 | | | 1 || 23 | | 2 2 1| 142| 3| 5 p002Ft H4| | | | | | | || | | | 361| 9| 6 NNNz13 d1| 1 | 1 | 1 | 2 | 2 | 2 | 2 || 37 | | 3 1 | 48| 1| RRR02 Patients Mags 1=23 2=15 3=9 4=6 5=4 ___________________________________________________________________________________ ID Code fts |LOS|BH |pcE |pcEP|pcLP|pcSR|Spd||X| CONSm|Heal|pc| 1 qus H7| | | | | | | || || | 204|11| 2 w752 m3| | | | 3 | 3 | 3 | || 3 || 3| 95| 5| 3 rsss PT| 3 | 2 | 3 | | 1 | 1 | 3 || 21 || 2 1 3| 228| 3| 4 4562 m3| 2 | 3 | 2 | 1 | | | 1 || 23 || 2 2 1| 142| 3| 5 4862 H4| | | | | | | || || | 361| 9| 6 bVf6RW d1| 1 | 1 | 1 | 2 | 2 | 2 | 2 || 37 || 3 4 | 122| 1| 7 YyPpk8 M3| 2 | 3 | 2 | 1 | 1 | 2 | 3 || 922 || 2 2 3| 100|12|
That looks like a pretty standard report file. I am not sure why there seem to be more columns of data than names in the header row, but hopefully you know what that means.
To figure out what kind of line you are on so you know how to read the line take advantage of the trailing @ on the INPUT statement.
To capture the information in the "by" line between the blocks
BTP04 Patients Mags 1=23 2=15 3=9 4=6 5=4
take advantage of the RETAIN statement.
Something like this should work:
data want;
infile text dlm=' |' truncover;
length site $8 id 8;
retain site;
input id ?? @;
if _infile_ = ' ' or _infile_ in: ('_' 'ID') then delete;
if missing(id) then do;
input @1 site ;
delete;
end;
input code $ fts $;
heal = input(scan(_infile_,-2,'|'),??32.);
pc = input(scan(_infile_,-1,'|'),??32.);
run;
Results:
Obs site id code fts heal pc 1 BTP04 1 xys H9 322 11 2 BTP04 2 v3bwwwr1 m2 95 5 3 BTP04 3 qd23 H4 228 3 4 BTP04 4 FRt5 m3 142 3 5 BTP04 5 p002Ft H4 361 9 6 BTP04 6 NNNz13 d1 48 1 7 RRR02 1 qus H7 204 11 8 RRR02 2 w752 m3 95 5 9 RRR02 3 rsss PT 228 3 10 RRR02 4 4562 m3 142 3 11 RRR02 5 4862 H4 361 9 12 RRR02 6 bVf6RW d1 122 1 13 RRR02 7 YyPpk8 M3 100 12
So are you saying you want help reading data that looks like the lines in your attached file?
BTP04 Patients Mags 1=23 2=15 3=9 4=6 5=4 ___________________________________________________________________________________ ID Code fts |LOS|BH |pcE |pcEP|pcLP|pcSR|Spd |X| CONSm|Heal|pc| 1 xys H9| | | | | | | || | | | 322|11| 2 v3bwwwr15 m2| | | | 3 | 3 | 3 | || 3 | | 3| 95| 5| 3 qd23 H4| 3 | 2 | 3 | | 1 | 1 | 3 || 21 | | 2 1 3| 228| 3| 4 FRt5 m3| 2 | 3 | 2 | 1 | | | 1 || 23 | | 2 2 1| 142| 3| 5 p002Ft H4| | | | | | | || | | | 361| 9| 6 NNNz13 d1| 1 | 1 | 1 | 2 | 2 | 2 | 2 || 37 | | 3 1 | 48| 1| RRR02 Patients Mags 1=23 2=15 3=9 4=6 5=4 ___________________________________________________________________________________ ID Code fts |LOS|BH |pcE |pcEP|pcLP|pcSR|Spd||X| CONSm|Heal|pc| 1 qus H7| | | | | | | || || | 204|11| 2 w752 m3| | | | 3 | 3 | 3 | || 3 || 3| 95| 5| 3 rsss PT| 3 | 2 | 3 | | 1 | 1 | 3 || 21 || 2 1 3| 228| 3| 4 4562 m3| 2 | 3 | 2 | 1 | | | 1 || 23 || 2 2 1| 142| 3| 5 4862 H4| | | | | | | || || | 361| 9| 6 bVf6RW d1| 1 | 1 | 1 | 2 | 2 | 2 | 2 || 37 || 3 4 | 122| 1| 7 YyPpk8 M3| 2 | 3 | 2 | 1 | 1 | 2 | 3 || 922 || 2 2 3| 100|12|
That looks like a pretty standard report file. I am not sure why there seem to be more columns of data than names in the header row, but hopefully you know what that means.
To figure out what kind of line you are on so you know how to read the line take advantage of the trailing @ on the INPUT statement.
To capture the information in the "by" line between the blocks
BTP04 Patients Mags 1=23 2=15 3=9 4=6 5=4
take advantage of the RETAIN statement.
Something like this should work:
data want;
infile text dlm=' |' truncover;
length site $8 id 8;
retain site;
input id ?? @;
if _infile_ = ' ' or _infile_ in: ('_' 'ID') then delete;
if missing(id) then do;
input @1 site ;
delete;
end;
input code $ fts $;
heal = input(scan(_infile_,-2,'|'),??32.);
pc = input(scan(_infile_,-1,'|'),??32.);
run;
Results:
Obs site id code fts heal pc 1 BTP04 1 xys H9 322 11 2 BTP04 2 v3bwwwr1 m2 95 5 3 BTP04 3 qd23 H4 228 3 4 BTP04 4 FRt5 m3 142 3 5 BTP04 5 p002Ft H4 361 9 6 BTP04 6 NNNz13 d1 48 1 7 RRR02 1 qus H7 204 11 8 RRR02 2 w752 m3 95 5 9 RRR02 3 rsss PT 228 3 10 RRR02 4 4562 m3 142 3 11 RRR02 5 4862 H4 361 9 12 RRR02 6 bVf6RW d1 122 1 13 RRR02 7 YyPpk8 M3 100 12
I spoke with my colleague and she said that your code worked like a charm. Thank you,
Here is what I got.
data want;
infile "C:\Users\xiakeshan\Documents\Downloads\Consx.txt" encoding='utf8' termstr=crlf expandtabs;
input;
length site id code fts heal pc$ 80;
retain site;
if find(_infile_,'patients','it') then site=scan(_infile_,1,' ');
if prxmatch('/^\s*\d+/',_infile_) then do;
id=scan(_infile_,1,' ');
code=scan(_infile_,2,' ');
fts=scan(_infile_,3,' |');
heal=scan(_infile_,-2,'|');
pc=scan(_infile_,-1,'|');
output;
end;
run;
I know I'm late to the party, but I always like show example of NAMED-INPUT and PARMCARDS
options validvarname=any;
filename FT15F001 temp;
data test;
retain;
length dlm $1;
infile FT15F001 dlm=dlm stopover;
input @;
if _infile_ eq ' ' or _infile_ eq: '_' then delete;
if find(_infile_,'Patients') then do;
dlm = ' ';
input site:$5. @'Mags' ('1'n-'5'n)(=);
delete;
end;
else if _infile_ eq: ' ' then do;
_infile_ = tranwrd(_infile_,'||','| |');
dlm=' ';
input id code:$12. @;
dlm='|';
input fts:$2. (dum1 LOS BH pcE pcEP pcLP pcSR dum2 spd X)(:) CONSm:$8. (Heal pc)(:);
end;
else delete;
rename '1'n-'5'n=mag1-mag5;
parmcards4;
BTP04 Patients Mags 1=23 2=15 3=9 4=6 5=4
___________________________________________________________________________________
ID Code fts |LOS|BH |pcE |pcEP|pcLP|pcSR|Spd |X| CONSm|Heal|pc|
1 xys H9| | | | | | | || | | | 322|11|
2 v3bwwwr15 m2| | | | 3 | 3 | 3 | || 3 | | 3| 95| 5|
3 qd23 H4| 3 | 2 | 3 | | 1 | 1 | 3 || 21 | | 2 1 3| 228| 3|
4 FRt5 m3| 2 | 3 | 2 | 1 | | | 1 || 23 | | 2 2 1| 142| 3|
5 p002Ft H4| | | | | | | || | | | 361| 9|
6 NNNz13 d1| 1 | 1 | 1 | 2 | 2 | 2 | 2 || 37 | | 3 1 | 48| 1|
RRR02 Patients Mags 1=23 2=15 3=9 4=6 5=4
___________________________________________________________________________________
ID Code fts |LOS|BH |pcE |pcEP|pcLP|pcSR|Spd||X| CONSm|Heal|pc|
1 qus H7| | | | | | | || || | 204|11|
2 w752 m3| | | | 3 | 3 | 3 | || 3 || 3| 95| 5|
3 rsss PT| 3 | 2 | 3 | | 1 | 1 | 3 || 21 || 2 1 3| 228| 3|
4 4562 m3| 2 | 3 | 2 | 1 | | | 1 || 23 || 2 2 1| 142| 3|
5 4862 H4| | | | | | | || || | 361| 9|
6 bVf6RW d1| 1 | 1 | 1 | 2 | 2 | 2 | 2 || 37 || 3 4 | 122| 1|
7 YyPpk8 M3| 2 | 3 | 2 | 1 | 1 | 2 | 3 || 922 || 2 2 3| 100|12|
;;;;
run;
proc contents varnum;
proc print;
run;
/*
Obs site mag1 mag2 mag3 mag4 mag5 id code fts dum1 LOS BH pcE pcEP pcLP pcSR dum2 spd X CONSm Heal pc
1 BTP04 23 15 9 6 4 1 xys H9 . . . . . . . . . . 322 11
2 BTP04 23 15 9 6 4 2 v3bwwwr15 m2 . . . 3 3 3 . . 3 . 3 95 5
3 BTP04 23 15 9 6 4 3 qd23 H4 3 2 3 . 1 1 3 . 21 . 2 1 3 228 3
4 BTP04 23 15 9 6 4 4 FRt5 m3 2 3 2 1 . . 1 . 23 . 2 2 1 142 3
5 BTP04 23 15 9 6 4 5 p002Ft H4 . . . . . . . . . . 361 9
6 BTP04 23 15 9 6 4 6 NNNz13 d1 1 1 1 2 2 2 2 . 37 . 3 1 48 1
7 RRR02 23 15 9 6 4 1 qus H7 . . . . . . . . . . 204 11
8 RRR02 23 15 9 6 4 2 w752 m3 . . . 3 3 3 . . 3 . 3 95 5
9 RRR02 23 15 9 6 4 3 rsss PT 3 2 3 . 1 1 3 . 21 . 2 1 3 228 3
10 RRR02 23 15 9 6 4 4 4562 m3 2 3 2 1 . . 1 . 23 . 2 2 1 142 3
11 RRR02 23 15 9 6 4 5 4862 H4 . . . . . . . . . . 361 9
12 RRR02 23 15 9 6 4 6 bVf6RW d1 1 1 1 2 2 2 2 . 37 . 3 4 122 1
13 RRR02 23 15 9 6 4 7 YyPpk8 M3 2 3 2 1 1 2 3 . 922 . 2 2 3 100 12
*/
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.