How to output one variable from many variables while reading .csv?

New Contributor
Posts: 2

Since I'm new to SAS, I need help with the following requirement.

I need SAS code that writes only one variable to the output while reading a .csv file that contains many variables. Because the source file is a .csv, I am finding it hard to work out which SAS statements or functions to use for this scenario.

Example: I have 100 variables with about 1000 observations in the .csv. In my output, I need only the values that appear in the 88th variable.

Super Contributor
Posts: 297

Re: How to output one variable from many variables while reading .csv?

Hi CatchRam,

So let me get this right.  You have 100 variables and there is only data in one of these variables for each observation?  In some cases it could be variable 88 but in others it could be variable 1?  You want a new variable called X that contains whatever appears in the populated variable?

Is that correct or will the value always be located in variable 88?

Regards,

Scott

Super Contributor
Posts: 644

Re: How to output one variable from many variables while reading .csv?

If you use PROC IMPORT or some other method to import the text file, follow it with this data step:

Data want ;

     Set have (keep = <variable88>) ;

Run ;

Otherwise, if you are importing with a data step:

Data want (Keep = <variable88>) ;

     infile ... etc

Richard

Super Contributor
Posts: 282

Re: How to output one variable from many variables while reading .csv?

Hi,

If you are *not* interested in reading the other data into program variables then you could try the scan function in a data step, e.g.:

data want;

  input;

  myvar=scan(_infile_,3,',');

  datalines;

aaa,bbb,ccc,ddd,eee

fff,ggg,hhh,iii,jjj

kkk,lll,mmm,nnn,ooo

;

For more information on the scan function see:

SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition

Regards,

Amir.

Valued Guide
Posts: 765

Re: How to output one variable from many variables while reading .csv?

Hi ... you might want to add a LENGTH statement for the variable MYVAR since the result of the SCAN function is a character variable with a length of 200. 
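
A minimal sketch of that suggestion, reusing the earlier SCAN example. The width $20 is only an assumption for illustration; choose a length that fits the longest value you expect in that field.

```sas
data want;
  /* Declare MYVAR before SCAN assigns it, so it does not default to $200. */
  /* $20 is an assumed width; size it to your own data.                   */
  length myvar $20;
  input;
  /* Pull the third comma-delimited field from the raw input line. */
  myvar = scan(_infile_, 3, ',');
  datalines;
aaa,bbb,ccc,ddd,eee
fff,ggg,hhh,iii,jjj
kkk,lll,mmm,nnn,ooo
;
```

The LENGTH statement has to come before the assignment, because SAS fixes a variable's length the first time it sees the variable in the step.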

Super Contributor
Posts: 282

Re: How to output one variable from many variables while reading .csv?

Hi,

Agreed; I thought of that when I checked the solution, but then went ahead and forgot to post it.

Regards,

Amir.

N/A
Posts: 1

Re: How to output one variable from many variables while reading .csv?

Hi,

You can use a DATA step with an INFILE statement that specifies the DLM=',' and DSD options. In the DATA step, use the KEEP= data set option to keep only the 88th variable.

For example, with this input file:

Name,Country,Area

A,India,1000

B,China,1500

So you can write the code like:

Data temp (keep=Country);

Infile '<your location of the file>' DLM=',' DSD firstobs=2; /* firstobs=2 skips the header row */

Input Name $ Country $ Area;

Run;

Hope this will help you.

Cheers,

Shrawan

Valued Guide
Posts: 765

Re: How to output one variable from many variables while reading .csv?

Hi.  The DSD option sets the delimiter as a comma so you can leave out the DLM option.

Super User
Posts: 6,500

Re: How to output one variable from many variables while reading .csv?

You will probably get better advice if you put the question in context.  In general it is not necessary to tease out a particular variable from a dataset for processing in SAS, as most procedures let you specify which variables to use in your analysis.

As for reading CSV files into an actual data set, the main issue is that CSV files have no way of providing the metadata needed to define their contents.  There are two main methods for reading them. You can use PROC IMPORT and let SAS try to guess what types of variables are in the file, or you can write your own DATA step to read the file.

Assuming that your data is a nice set of consistently defined variables, I find it much easier and more appropriate to use a data step.  For example, if your data is 100 columns of numbers then you could do something like:

data mydata ;

  infile 'mydata.csv' dsd dlm=',' truncover lrecl=30000 firstobs=2;

  input var1-var100 ;

run;

If you actually decide it is useful to only keep the single variable that is in the 88th column then add a KEEP statement.
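
As a rough sketch, the same step with a KEEP statement added might look like this. It assumes, as in the example above, that every column is numeric and that the file has a header row to skip.

```sas
data mydata ;
  /* DSD already implies a comma delimiter, so DLM=',' is optional here. */
  infile 'mydata.csv' dsd truncover lrecl=30000 firstobs=2;
  /* All 100 columns are still read so the 88th lands in VAR88. */
  input var1-var100 ;
  /* Only VAR88 is written to the output data set. */
  keep var88 ;
run;
```

The INPUT statement still has to list the columns up to the one you want; KEEP only controls which variables are written to MYDATA.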

If what you need to do is to OUTPUT this column (to what? in what format?) then perhaps you just want to use a PROC like REPORT or PRINT.  Or perhaps another data step?

data _null_;

set mydata;

file 'mynewdata.csv' dsd lrecl=30000;

put var88;

run;

You could even include the FILE and PUT statement in the first data step and write the new text file as you read the old one.
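
A sketch of that combined step, using the same file names and the same all-numeric assumption as the examples above:

```sas
data mydata ;
  /* Read the original CSV, skipping the header row. */
  infile 'mydata.csv' dsd truncover lrecl=30000 firstobs=2;
  input var1-var100 ;
  /* Write the 88th column to the new text file as each record is read. */
  file 'mynewdata.csv' dsd lrecl=30000;
  put var88;
run;
```

If you do not need the SAS data set at all, changing the first line to data _null_; writes only the text file.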
