Emma8
Quartz | Level 8

Hello. Can anyone help me import the text data (see attached "have") and create a new data set (see attached "want")?

Thank you.

Emma



Accepted Solution
Patrick
Opal | Level 21

@Emma8 wrote:

Thank you. It would be a one-time task (one document contains multiple pages of data); see the attached text version.

I made it up to resemble the original data, so there is no confidential info.

Thank you!

 


Once you've converted the .docx to a .txt and saved it in a location accessible to the SAS server, code like the following should do the job.

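filename src "<path/filename.txt>" lrecl=256;  /* replace the placeholder with your actual path */

data want(drop=_:);
  length
    Source $11
    Id $8
    Name $80
    Visit $15
    ;
  format Birthday date9. Date date9.;
  /* scanover: if an @'string' search fails on the current line, SAS moves on to the next line */
  /* col=: _col holds the current column of the input pointer */
  infile src scanover truncover col=_col;
  /* load the line into the input buffer and hold it */
  input @;
  /* skip empty source lines */
  if missing(_infile_) then delete;

  /* source and ID: the ID starts one blank after the label */
  if find(_infile_,'Master File:')>0 then
    do;
      Source='Master File';
      input @1 @'Master File:' +1 Id $8. @;
    end;
  else 
  if find(_infile_,'DDDI File:')>0 then
    do;
      Source='DDDI File';
      input @1 @'DDDI File:' +1 Id $8. @;
    end;

  /* name: the text between 'Name:' and 'VIS:' */
  input @_col @'Name:' +(-1) _dummy $1. @;
  _start=_col;
  input @_col @'VIS:' +(-5) _dummy $1. @;
  _stop=_col;
  _fwidth=_stop - _start;
  input @_start Name $varying80. _fwidth @;
  Name=compbl(Name);

  /* visit: the text between 'VIS:' and 'Birthdate:' */
  /* the pointer sits on the V of 'VIS:'; +5 skips the label and one blank */
  _start=_col+5;
  input @_col @'Birthdate:' +(-11) _dummy $1. @;
  _stop=_col;
  _fwidth=_stop - _start;
  input @_start Visit $varying15. _fwidth @;

  /* Birthdate & Date: Date is read after the next colon that follows the birthdate */
  input @'Birthdate:' Birthday :mmddyy10. @':' Date :mmddyy10.;

run;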
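
To see the label-scanning technique in isolation, here is a minimal self-contained sketch. The sample record, the fileref demo, and the data set name demo_want are all made up for illustration; the layout of the real file is an assumption based only on the labels used in the code above. It writes one line to a temporary file, then reads the text between 'Name:' and 'VIS:' and the birthdate the same way.

```sas
/* Hypothetical sample record: the layout is assumed, not taken from the real file */
filename demo temp;

data _null_;
  file demo;
  put 'Master File: A1234567 Name: John  Smith VIS: Initial Birthdate: 01/15/1980 Date: 03/02/2021';
run;

data demo_want(drop=_:);
  length Name $80;
  format Birthday date9.;
  infile demo truncover col=_col;

  /* jump past 'Name:', step back one column and read the colon as a dummy */
  /* so that _col ends up on the first column after the label              */
  input @'Name:' +(-1) _dummy $1. @;
  _start=_col;

  /* jump past 'VIS:', step back 5 columns and read the blank before the label */
  /* so that _col ends up on the V of 'VIS:'                                   */
  input @'VIS:' +(-5) _dummy $1. @;
  _stop=_col;

  /* read exactly the columns between the two labels */
  _fwidth=_stop - _start;
  input @_start Name $varying80. _fwidth @;
  Name=strip(compbl(Name));

  /* the colon modifier reads the next non-blank token with the informat */
  input @'Birthdate:' Birthday :mmddyy10.;
run;

proc print data=demo_want;
run;
```

Running the sketch against a few lines copied from the real text file is a quick way to check whether the labels and spacing match before processing all pages.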

 


8 REPLIES
Patrick
Opal | Level 21

Is your source data really in a Word doc, or is this just how you've posted the data? If it's not in a Word doc, then please post the data exactly in the format you have it. If it is in a Word doc, then the first step would be to save the doc as a text file and use that text file as the source.

 

Is the data you're showing us just a sample or is this all the data you're dealing with? If it's all the data then I'd use a text editor and manually change things to create a friendlier structure for reading into SAS. I feel that would be faster than writing the code for the current structure (IF there isn't more actual data).

 

IF there is more actual data: Could there be missing values, or will all data elements ("variables") always have a value?

Emma8
Quartz | Level 8
Thanks.
The Word doc is the data file I received, so it is a Word document. This is just example data; the actual data runs to over 300 pages.
Reeza
Super User

Is this a one time process or something you need to do multiple times?

 

1. Convert docx to txt - I posted a vbs script a few days ago to convert RTF to PDF. That would need minor modifications to convert docx to txt

2. Parse txt file - seems manageable to me - but would need to test

 

FYI - if any of this data is confidential you shouldn't post it here. Assuming this is your actual data, if you can run the first step and post the txt file, we can help with the second step. If this is a one-time task, you can also open the docx file, select Save As, save it as txt, and then upload a sample here.

Emma8
Quartz | Level 8

Thank you. It would be a one-time task (one document contains multiple pages of data); see the attached text version.

I made it up to resemble the original data, so there is no confidential info.

Thank you!

 


Emma8
Quartz | Level 8

Thank you so much! It works so nicely.

 

I wonder if you could help with a case where the text file contains some extra lines at the end of some observations (attached). Thank you again.

Patrick
Opal | Level 21

Answer removed as it won't resolve the challenge. An answer addressing the challenge is given here.

Emma8
Quartz | Level 8

So cool. Thank you very much!

 
