Help using Base SAS procedures

Importing quarterly data into SAS

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 9
Accepted Solution

Importing quarterly data into SAS

Hello,

I have data from each quarter of a year in a fixed width text file. I am using the import function and then after selecting required variables, exporting it as an excel ( tried different formats too) file so that I can combine all four quarters of data for a particular year. But, when I tried to merge them using SAS its not working because for many variables SAS is reading them differently, either as numeric or character value. Each quarter has 300 plus variables and around 80,000 rows. How can I use SAS to read variable similarly for all quarters so that I can merge them? Thank you.


Accepted Solutions
Solution
‎04-30-2018 03:20 PM
Super User
Posts: 10,217

Re: Importing quarterly data into SAS

DO NOT use proc import repeatedly. Once proc import has imported a text file, you find a data step in the log. Copy that, and use it for all similar files. Proc import is a guessing procedure that will deliver different structures for different content.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code

View solution in original post


All Replies
Solution
‎04-30-2018 03:20 PM
Super User
Posts: 10,217

Re: Importing quarterly data into SAS

DO NOT use proc import repeatedly. Once proc import has imported a text file, you find a data step in the log. Copy that, and use it for all similar files. Proc import is a guessing procedure that will deliver different structures for different content.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Occasional Contributor
Posts: 9

Re: Importing quarterly data into SAS

Posted in reply to KurtBremser

Hi Kurt,

I have another question. Which one is better to use in data steps, 'Missover' or 'Truncover'? Proc import data log has missover but when I checked for the missover vs truncover comparison, I felt latter is better to use. Any advice? Thank you

Mine is a healthcare data. Attached is the link to the data source.

https://www.dshs.texas.gov/THCIC/Hospitals/Download.shtm

 

Super User
Posts: 10,217

Re: Importing quarterly data into SAS

Truncover is better, hands down. Missover is more or less legacy.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Occasional Contributor
Posts: 9

Re: Importing quarterly data into SAS

Posted in reply to KurtBremser

Thank you

Super User
Posts: 13,523

Re: Importing quarterly data into SAS

I would be very surprised to find that Proc Import read any fixed column file correctly as data can appear such as:

1234567234567 

and be the values for 4 or 5 variables AND include implied decimals.

 

You should have a document that describes what columns represent each variable and whether it is expected to be character, numeric or a date/time/datetime variable. Likely it also describes the expected appearance for such things as dates and whether special codes are used to indicate missing data (old school values like -99 or -999 or 9999999 ).

The use that document to write a data step to read one of the files. Once you have the code to read one file you change the input file name and the output data set to read each.

Generic example:

data firstquarter;
   infile "path/filename"   ;
   input
      var1  1-8
      var2 $ 9-15
      var3  16-17
      var4  18-25
   ;
run;

reads four variables var1, var3 and var4 are numeric, var2 is character (the $). The numbers are the start/end columns of a variable and should be in your documentation. If a variable only takes a single position you use Varx  234 for example to read a single digit from column 234.

 

Dates and times likely will involve specifying the INFORMAT to read them correctly and assign appropriate formats for display.

 

In some cases the documentation is actually amenable to copy and paste into the Input statement if it reads something like

Variable name    start column end column    type

identifier             1                     20                 character

value                  21                   25                 numeric

 

if you can paste this into a spreadsheet with the values in the correct columns you can actual write a formula to build

"indentifier $  1 - 20"

"value   21 - 25"

using the type information.

If you have special code values in variables that should be numeric but look like N/A for "not available" or similar you will get missing (usually appropriate) but if the values are 9999999 to indicate missing or such you may either want to create custom informats to use.

 

 

Then either use a data step or proc append to combine the data in the order you want.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 185 views
  • 1 like
  • 3 in conversation