BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
GARYV
Fluorite | Level 6

Hello,

I have data from each quarter of a year in a fixed width text file. I am using the import function and then after selecting required variables, exporting it as an excel ( tried different formats too) file so that I can combine all four quarters of data for a particular year. But, when I tried to merge them using SAS its not working because for many variables SAS is reading them differently, either as numeric or character value. Each quarter has 300 plus variables and around 80,000 rows. How can I use SAS to read variable similarly for all quarters so that I can merge them? Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

DO NOT use proc import repeatedly. Once proc import has imported a text file, you find a data step in the log. Copy that, and use it for all similar files. Proc import is a guessing procedure that will deliver different structures for different content.

View solution in original post

5 REPLIES 5
Kurt_Bremser
Super User

DO NOT use proc import repeatedly. Once proc import has imported a text file, you find a data step in the log. Copy that, and use it for all similar files. Proc import is a guessing procedure that will deliver different structures for different content.

GARYV
Fluorite | Level 6

Hi Kurt,

I have another question. Which one is better to use in data steps, 'Missover' or 'Truncover'? Proc import data log has missover but when I checked for the missover vs truncover comparison, I felt latter is better to use. Any advice? Thank you

Mine is a healthcare data. Attached is the link to the data source.

https://www.dshs.texas.gov/THCIC/Hospitals/Download.shtm

 

GARYV
Fluorite | Level 6

Thank you

ballardw
Super User

I would be very surprised to find that Proc Import read any fixed column file correctly as data can appear such as:

1234567234567 

and be the values for 4 or 5 variables AND include implied decimals.

 

You should have a document that describes what columns represent each variable and whether it is expected to be character, numeric or a date/time/datetime variable. Likely it also describes the expected appearance for such things as dates and whether special codes are used to indicate missing data (old school values like -99 or -999 or 9999999 ).

The use that document to write a data step to read one of the files. Once you have the code to read one file you change the input file name and the output data set to read each.

Generic example:

data firstquarter;
   infile "path/filename"   ;
   input
      var1  1-8
      var2 $ 9-15
      var3  16-17
      var4  18-25
   ;
run;

reads four variables var1, var3 and var4 are numeric, var2 is character (the $). The numbers are the start/end columns of a variable and should be in your documentation. If a variable only takes a single position you use Varx  234 for example to read a single digit from column 234.

 

Dates and times likely will involve specifying the INFORMAT to read them correctly and assign appropriate formats for display.

 

In some cases the documentation is actually amenable to copy and paste into the Input statement if it reads something like

Variable name    start column end column    type

identifier             1                     20                 character

value                  21                   25                 numeric

 

if you can paste this into a spreadsheet with the values in the correct columns you can actual write a formula to build

"indentifier $  1 - 20"

"value   21 - 25"

using the type information.

If you have special code values in variables that should be numeric but look like N/A for "not available" or similar you will get missing (usually appropriate) but if the values are 9999999 to indicate missing or such you may either want to create custom informats to use.

 

 

Then either use a data step or proc append to combine the data in the order you want.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1049 views
  • 1 like
  • 3 in conversation