DATA Step, Macro, Functions and more

How can I use a DATA step to import data set with varying number of variables?

Accepted Solution Solved
Reply
Contributor
Posts: 33
Accepted Solution

How can I use a DATA step to import data set with varying number of variables?

Good morning,

 

I am a SAS Enterprise Guide 7.1 user with a little Base SAS programming knowledge. I have a monthly data set which sometimes contains 37 variables and sometimes contains 38. The first 37 variables are always the same and formatted the same way. The 38th variable, when it appears, always comes at the very end and is not needed. I would like to write a program using a DATA step to import the data, so that I can use the same program each month and just change the filename. I would like the program to work regardless of whether I have 37 or 38 variables that month. This may be a very basic question, but what do I need to include in the program so that it executes successfully whether the 38th variable appears that month or not?

 

Thanks in advance for any help!

 

Here are excerpts (shortened versions) from the code I use for a report with 37 variables and for 38 variables:

 

37 Variables Version:

 

DATA <LIBRARY>.<FILENAME>;

     LENGTH

          <yada yada field list>;

     FORMAT

         <yada yada field list>;

     INFORMAT

         <yada yada field list>;

     INFILE '/sas/saswork/<path>'

         LRECL=448

         ENCODING="LATIN1"

         TERMSTR=CRLF

         DLM='|'

         MISSOVER

         DSD ;

     INPUT

         <yada yada field list> ;

     RUN;

 

38 variables version (differences in red font):

 

DATA <LIBRARY>.<FILENAME>;

     LENGTH

         <yada yada field list>;

     DROP

            F38 ;

     FORMAT

         <yada yada field list>;

     INFORMAT

         <yada yada field list>;

     INFILE '/sas/saswork/<path>'

         LRECL=452

         ENCODING="LATIN1"

         TERMSTR=CRLF

         DLM='|'

         MISSOVER

         DSD ;

     INPUT

     <yada yada field list>

         F38              : $1. ;

     RUN;


Accepted Solutions
Solution
a week ago
Respected Advisor
Posts: 3,832

Re: How can I use a DATA step to import data set with varying number of variables?

Posted in reply to brookeewhite1

@brookeewhite1 wrote:

Good morning,

 

I am a SAS Enterprise Guide 7.1 user with a little Base SAS programming knowledge. I have a monthly data set which sometimes contains 37 variables and sometimes contains 38. The first 37 variables are always the same and formatted the same way. The 38th variable, when it appears, always comes at the very end and is not needed. I would like to write a program using a DATA step to import the data, so that I can use the same program each month and just change the filename. I would like the program to work regardless of whether I have 37 or 38 variables that month. This may be a very basic question, but what do I need to include in the program so that it executes successfully whether the 38th variable appears that month or not?

 

Thanks in advance for any help!

 

Here are excerpts (shortened versions) from the code I use for a report with 37 variables and for 38 variables:

 

37 Variables Version:

 

DATA <LIBRARY>.<FILENAME>;

     LENGTH

          <yada yada field list>;

     FORMAT

         <yada yada field list>;

     INFORMAT

         <yada yada field list>;

     INFILE '/sas/saswork/<path>'

         LRECL=448

         ENCODING="LATIN1"

         TERMSTR=CRLF

         DLM='|'

         MISSOVER

         DSD ;

     INPUT

         <yada yada field list> ;

     RUN;

 

38 variables version (differences in red font):

 

DATA <LIBRARY>.<FILENAME>;

     LENGTH

         <yada yada field list>;

     DROP

            F38 ;

     FORMAT

         <yada yada field list>;

     INFORMAT

         <yada yada field list>;

     INFILE '/sas/saswork/<path>'

         LRECL=452

         ENCODING="LATIN1"

         TERMSTR=CRLF

         DLM='|'

         MISSOVER

         DSD ;

     INPUT

     <yada yada field list>

         F38              : $1. ;

     RUN;


 

With your existing INFILE option MISSOVER you can read 38 variables for either the 37 variable file or the 38 variable file.  What do you want to do with the 38th variable you have DROP F38 so it is unclear "to me" what your ultimate goal is?

 

If you only want to read 37 variable regardless of the existence of a 38th field the 37 variable program will work for both.

View solution in original post


All Replies
Super User
Posts: 22,844

Re: How can I use a DATA step to import data set with varying number of variables?

Posted in reply to brookeewhite1
If the extra variable is always at the end then include it always and TRUNCOVER and if it's not in the data it will be missing, but the rest of the data is fine. Then you can add a check after to check if its all missing and drop the column if it is. If the field is in the middle you can't use this approach Smiley Sad
Solution
a week ago
Respected Advisor
Posts: 3,832

Re: How can I use a DATA step to import data set with varying number of variables?

Posted in reply to brookeewhite1

@brookeewhite1 wrote:

Good morning,

 

I am a SAS Enterprise Guide 7.1 user with a little Base SAS programming knowledge. I have a monthly data set which sometimes contains 37 variables and sometimes contains 38. The first 37 variables are always the same and formatted the same way. The 38th variable, when it appears, always comes at the very end and is not needed. I would like to write a program using a DATA step to import the data, so that I can use the same program each month and just change the filename. I would like the program to work regardless of whether I have 37 or 38 variables that month. This may be a very basic question, but what do I need to include in the program so that it executes successfully whether the 38th variable appears that month or not?

 

Thanks in advance for any help!

 

Here are excerpts (shortened versions) from the code I use for a report with 37 variables and for 38 variables:

 

37 Variables Version:

 

DATA <LIBRARY>.<FILENAME>;

     LENGTH

          <yada yada field list>;

     FORMAT

         <yada yada field list>;

     INFORMAT

         <yada yada field list>;

     INFILE '/sas/saswork/<path>'

         LRECL=448

         ENCODING="LATIN1"

         TERMSTR=CRLF

         DLM='|'

         MISSOVER

         DSD ;

     INPUT

         <yada yada field list> ;

     RUN;

 

38 variables version (differences in red font):

 

DATA <LIBRARY>.<FILENAME>;

     LENGTH

         <yada yada field list>;

     DROP

            F38 ;

     FORMAT

         <yada yada field list>;

     INFORMAT

         <yada yada field list>;

     INFILE '/sas/saswork/<path>'

         LRECL=452

         ENCODING="LATIN1"

         TERMSTR=CRLF

         DLM='|'

         MISSOVER

         DSD ;

     INPUT

     <yada yada field list>

         F38              : $1. ;

     RUN;


 

With your existing INFILE option MISSOVER you can read 38 variables for either the 37 variable file or the 38 variable file.  What do you want to do with the 38th variable you have DROP F38 so it is unclear "to me" what your ultimate goal is?

 

If you only want to read 37 variable regardless of the existence of a 38th field the 37 variable program will work for both.

Contributor
Posts: 33

Re: How can I use a DATA step to import data set with varying number of variables?

[ Edited ]
Posted in reply to data_null__

Thank you, data_null_,

 

Your solution was correct. I previously thought I had to modify the code because I was getting an error message, but I later determined something else entirely had caused the error. The code example used for 37 variables worked fine for a set with 38 variables also.

 

Thanks so much!

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 98 views
  • 1 like
  • 3 in conversation