DATA Step, Macro, Functions and more

Need help in data import using infile

Reply
Occasional Contributor
Posts: 5

Need help in data import using infile

Hi, I am a beginner of SAS now.

Now I have a data which contains extremely long character variables. Which means the length of these variables are greater than 8. Meanwhile, there are blank and comma within this varable. What's worse, within the same column, there are different format of date.

I wonder how I should do. Actually I have an idea to use macro variables to track length of each variable and then use @ to locate where I should start to read for next variable. However once I have defined the length of several variables, I do not know the order that SAS read them.

I really appreciate your help.

Thank you very much.

SAS Super FREQ
Posts: 8,868

Re: Need help in data import using infile

Hi:

  My advice is not to jump on SAS Macro variables as the solution to your problem. I am not even sure I understand what your input data looks like. While your verbal description is OK, it is not clear. You said that you have "a data which contains extremely long character variables" but that doesn't convey whether you have a SAS data set, a raw ASCII text file of data or some other form of data (RDBMS, Excel, Microsoft Access), that you are trying to read/import.

  Usually when you ask as question about reading data, it is useful to post a sample of the data and the code that you have tried to use. You say that you are importing -- how are you doing the import? With code? What code? Proc IMPORT? Data step program, Enterprise Guide task?

  It would help everybody help you if you could post some of your data and provide information about where the data is coming from. Also provide an example of the code that you've tried to use. And, let us know the ultimate goal (you want a SAS data set so you can forecast something or you want a SAS dataset so you can run PROC GLM or you want a SAS dataset for a frequency report).

cynthia

Occasional Contributor
Posts: 5

Re: Need help in data import using infile

Posted in reply to Cynthia_sas

Thank you SOOOOOO much for your advice. I am posting the data now.

Sorry to confuse you, this is my first question in the community.:smileyblush:

Super Contributor
Super Contributor
Posts: 3,174

Re: Need help in data import using infile

Consider that there is nothing unusual (SAS input processing wise) about very long SAS CHARACTER variables - it's all about the LENGTH statement / variable definitions.

And with stated condition for different formatted dates in one column, one time-honored approach then is to use DATA / INFILE / INPUT processing to read each column as $CHARn.  and then scrub the date-related / date-time-related columns using assignment statement(s) and the INPUT(....) function, as well as possibly "flexible" INFORMATs for "ANY" date / datetime format handling, though use those with care and with validation logic.

Scott Barry

SBBWorks, Inc.

Occasional Contributor
Posts: 5

Re: Need help in data import using infile

Thank you I will try.

Occasional Contributor
Posts: 5

Re: Need help in data import using infile

I am so sorry that the data is confidential. I could only keep the form of the data. I am not sure whether you could accept it.

Actually I found a way to solve my problem yesterday to some extend. That means I use excel to change the format of character variables. For example let "Diagnosis" become "DSmiley Very Happyiagnosis". Then I will know when to read this variable.

However, when testing my code, I found although there is "N:" before "Complication" variable of the 5th observation, sas does not read it into sas data set.

Below is my code:

***Reading from Text file;

data PatientInfo;

      length Diagnosi $ 100

          /*Procedur $ 10000*/

          Anesthet $ 20

  Complica $ 100

  Notes $ 50

  ;

      infile 'F:\Patient Info.txt' delimiter = '09'x MISSOVER truncover DSD lrecl=32767 firstobs=2 obs=10;

      input

                PatienID

                DOB 

         DOS

                @'D:' Diagnosi $

  /*@'P:' Procedur $*/

                @'A:' Anesthet $

                PACUIn

                PACUOut

                Discharg

                @'C:' Complica $

                @'N:' Notes $

     ;

run;

proc print data=PatientInfo;

run;

Attachment
Super User
Super User
Posts: 7,077

Re: Need help in data import using infile

To fix the length just set the variables as long as you need them to be. Set them way too big to begin with and you can adjust them later.

The DSD option should handle the embedded delimiters.  Your sample file looks to have TAB as the delimiter.

You might have an issue with the embedded new lines.  See if using TERMSTR=CRLF fixes that.  If not you can search this site and find examples of programs to pre-process the file to remove the embedded new lines so that file can be read as one record per observation.

options compress=yes;

data PatientInfo;

length

    PatientID 8

    DOB  8

    DOS 8

    Diagnosis $1000

    Procedure $1000

    Anesthetic $1000

    Complication $1000

    Notes $1000

  ;

  infile 'F:\Patient Info.txt' dsd delimiter = '09'x truncover lrecl=32767

    firstobs=2 TERMSTR=CRLF obs=5

  ;

  informat DOB DOS anydtdte.;

  format DOB DOS YYMMDD10.;

  input PatientID -- Notes ;

run;

proc sql noprint ;

  select max(length(diagnosis))

       , max(length(procedure))

       , max(length(anesthetic))

       , max(length(complication))

       , max(length(notes))

    into :l_diagnosis

       , :l_procedure

       , :l_anesthetic

       , :l_complication

       , :l_notes

    from PatientInfo

   ;

quit;

Occasional Contributor
Posts: 5

Re: Need help in data import using infile

Oh you solved me another question. Since I only set some of variables in length statement, the order of variables in the output dataset is not the same as the original one.

Super Contributor
Super Contributor
Posts: 3,174

Re: Need help in data import using infile

Why is the SAS-internal variable order of importance?  Also, the DSD parameter purposefully addresses the condition where you have consecutive delimiters with intention of assigning a blank/null value to the corresponding SAS variable.

Scott Barry

SBBWorks, Inc.

Ask a Question
Discussion stats
  • 8 replies
  • 414 views
  • 3 likes
  • 4 in conversation