Hello. Can anyone help me import text data (see attached "have") and create a new data set (see attached "want")?
Thank you.
Emma
Is your source data really in a Word doc, or is this just how you've posted the data? If it's not in a Word doc, then please post the data exactly in the format you have it. If it is in a Word doc, then the first step would be to save the doc as a text file and use that text file as the source.
Is the data you're showing us just a sample, or is this all the data you're dealing with? If it's all the data, then I'd use a text editor and manually change things to create a friendlier structure for reading into SAS. I feel that would be faster than writing code for the current structure (IF there isn't more actual data).
IF there is more actual data: could there be missing values, or will all data elements ("variables") always have a value?
Is this a one-time process or something you need to do multiple times?
1. Convert docx to txt - I posted a VBS script a few days ago to convert RTF to PDF. It would need minor modifications to convert docx to txt.
2. Parse txt file - seems manageable to me, but I would need to test.
FYI - if any of this data is confidential, you shouldn’t post it here. Assuming this is your actual data, if you can run the first step and post the txt file, we can help with the second step. If this is a one-time issue, you can also open the docx file, select Save As, save it as txt, and then upload a sample here.
Thank you. It would be a one-time process (one document contains multiple pages of data); see the attached text version.
I made up data similar to the original, so there is no confidential info.
Thank you!
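Once you've converted the .docx to a .txt file and saved it in a location accessible to the SAS server, code like the following should do the job.
/* replace <path/filename.txt> with the actual location of the text file */
filename src "<path/filename.txt>" lrecl=256;
data want(drop=_:);
  length
    Source $11
    Id $8
    Name $80
    Visit $15
  ;
  format Birthday date9. Date date9.;
  /* col=_col exposes the current column pointer position */
  infile src scanover truncover col=_col;
  /* load the line into the input buffer and hold it */
  input @;
  /* skip empty source lines */
  if missing(_infile_) then delete;
  /* source and id */
  if find(_infile_,'Master File:')>0 then
    do;
      source='Master File';
      input @1 @'Master File:' +1 ID $8. @;
    end;
  else
  if find(_infile_,'DDDI File')>0 then
    do;
      source='DDDI File';
      input @1 @'DDDI File:' +1 ID $8. @;
    end;
  /* name: the text between 'Name:' and 'VIS:' */
  /* re-reading the colon leaves the pointer just after 'Name:' */
  input @_col @'Name:' +(-1) _dummy $1. @;
  _start=_col;
  /* reading the character before 'VIS:' leaves the pointer on the 'V' */
  input @_col @'VIS:' +(-5) _dummy $1. @;
  _stop=_col;
  _fwidth=_stop - _start;
  input @_start Name $varying80. _fwidth @;
  /* collapse repeated blanks and drop any leading blank */
  name=strip(compbl(name));
  /* visit: the text between 'VIS: ' and 'Birthdate:' */
  _start=_col+5;
  input @_col @'Birthdate:' +(-11) _dummy $1. @;
  _stop=_col;
  _fwidth=_stop - _start;
  input @_start visit $varying15. _fwidth @;
  /* Birthdate & Date: the next colon after the birthdate is taken to end the Date label */
  input @'Birthdate:' Birthday :mmddyy10. @':' Date :mmddyy10.;
run;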
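To see the marker-scanning technique in isolation, here is a minimal sketch that runs on two made-up in-stream lines rather than the real file. The data set name demo and the sample lines are assumptions for illustration only, not the poster's data. The pattern is the same as in the step above: locate a marker with @'string', capture the column through col=, then read the gap between two markers with $varying.

```sas
/* Hypothetical sample lines; not the poster's data */
data demo(drop=_:);
  length Name $80 Visit $15;
  infile datalines truncover col=_col;

  /* Find 'Name:' and re-read the colon so the pointer rests just after it */
  input @'Name:' +(-1) _dummy $1. @;
  _start = _col;

  /* Find 'VIS:', step back and read the character before the 'V',
     which leaves the pointer on the 'V' itself */
  input @'VIS:' +(-5) _dummy $1. @;
  _stop = _col;

  /* Everything between the two markers is the name */
  _fwidth = _stop - _start;
  input @_start Name $varying80. _fwidth @;
  Name = strip(compbl(Name));

  /* Jump past 'VIS:' and read the visit with list input */
  input @'VIS:' Visit :$15.;
  datalines;
Name: Doe, Jane   VIS: Baseline
Name: Smith, John Q   VIS: Week4
;
run;

proc print data=demo noobs;
run;
```

The sketch assumes both markers appear on every line and in this order. If 'VIS:' could come before 'Name:' or be missing, _fwidth would not be a usable width, so a check such as find(_infile_,'VIS:')>0 before the reads would be a sensible addition.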
Thank you so much! It works so nicely.
I wonder if you could help when the text file contains some extra lines at the end of some observations (attached). Thank you again.
Answer removed as it won't resolve the challenge. Answer addressing the challenge given here.
So cool. Thank you very much!