BookmarkSubscribeRSS Feed
ZRick
Obsidian | Level 7

When reading in the txt / csv file, we have to use input and length to specify the name of var, length, char or numeric etc.

Is there a process to create a sas data set dynamically without using input statement?

Often times using input statement is pain, slight change in the data, sas data set can not be created.

The way I can think of is the to read the first row and break it using ',' as delimiter. This way we get the name of variables.

But how do I set the length, char or numeric, those two things?

9 REPLIES 9
Tom
Super User Tom
Super User

I think you are looking for the PROC IMPORT.

art297
Opal | Level 21

I agree with Tom!  But, since you ask, yes, you CAN write code that does all of that.  Plus you give me an opportunity to plug a paper that I will be presenting at this year's SGF.  Take a look at the code and paper at:

http://www.sascommunity.org/wiki/Copy_and_Paste_Almost_Anything

It uses a datastep to accomplish those tasks for data that is in one's clipboard.  It wouldn't take much to modify the code to work on data that exists in a file.

ZRick
Obsidian | Level 7

Really appreciate the code Art297, I have to study your code little bit.

If I just want to read in csv/txt, what parameters do I need to modify?

ballardw
Super User

Depending on what your "slight change in the data, sas data set can not be created" may be you may still have problems. Changed variable names being the worst followed by the guessing routines in PROC IMPORT having a variable as numeric in one set and text in another and then the different lengths of string variables.

When I worked with a client that couldn't provide data in the same format consistently they got hit with high code maintenance charges. This involved things like order of variables, changing names and changing formats for date fields. If this is inside your own organization some pressure as to documenting standards and using them may be in order.

If the only issue you are actually seeing is the length of text fields I suggest either making noticeably longer text length assignments or more characters in the informat instead of "just enough". I often use PROC IMPORT to generate the base code to read one file and then examine the informats, variable names and formats assigned as well as the input statement. I often add 10 to 50% to the length of text variables and specify use of my variable names.

Proc import has difficultly using long strings as variable names. If the first 32 characters repeat so you may get variables with names like 'THIS_WAS_A_REALLY___LONG___COLUMN_HEADER', 'THIS_WAS_A_REALLY___LONG___COLUMN_HEADE1' 'THIS_WAS_A_REALLY___LONG___COLUMN_HEADE3'

art297
Opal | Level 21

Rick,

If your input file is tab delimited, the easiest way to use the code would not be to modify it but, rather, simply first read the file and write it out to your system's clipboard and then run the paper's code.

e.g., I just wrote a tab-delimited file called "c:\art\testfile.txt".  Before running the code described in the paper, I ran:

filename clippy clipbrd;

data _null_;

  file clippy;

  infile "c:\art\testfile.txt" lrecl=32767;

  input;

  put _infile_;

run;

filename clippy clear;

/*then I ran the code shown in the paper */

ZRick
Obsidian | Level 7

Hello Art297, would you shed some light on those nested scans? arrays and %filarray?

Your code is so powerful, which is beyond my comprehension, sorry about my foolishness.

art297
Opal | Level 21

Rick,  If you look at the powerpoint you'll see that the code was designed to do more than the typical proc import.

It was designed to import data from normal files as well as files that need to be transposed and/or are from forms rather than tables.

One of the macros was designed to transpose a file, and the other was to set up SAS processing of the macro variables to make the program easier to use.

If you review the code, carefully, you will discover that it is trying to accomplish the very things you asked about in your original post.

I'd be glad to answer any questions you have, but it would take an entire book to fully explan all of the code.

Tom
Super User Tom
Super User

The INPUT statement is actually extremely powerful and flexible and I find that it makes SAS a very useful tool for exploring unknown file formats.

If your file is delimited then you can quickly read it in a check the values.

For example if the number of variables changes (but you know they should all be numeric) then you can just read in some overly large number than analyze the dataset to see where they stop.

data sample;

   infile 'myfile' dsd truncover firstobs=2;

  input x1-x100 ;

run;

proc means data=sample; run;

If the file is totally unknown then read in a series of character strings and look at the results.

data sample;

   infile 'myfile' dsd truncover firstobs=2;

   length x1-x100 $200;

   input x1-x100;

run;

Ksharp
Super User

An alternative way is to use some functions about txt file, Such as FOPEN FREAD ..............

Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1055 views
  • 4 likes
  • 5 in conversation