Need Efficient solution for Reading a raw file conditionally?

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 137
Accepted Solution

Need Efficient solution for Reading a raw file conditionally?

Hi SAS Folks,

I need an efficient solution(robust performance) in reading a very large(over 5 million records) comma separated file conditionally.  The CSV file looks like the following:

Id,date, name, address. post_code......etc totally 32 variables.

The condition is that values start from the 2nd observation with firstobs=2 as straight forward reading as one would expect, however I first need to read values of two variables from the header row and retain them across all obs.

The logic in my mind is something like:

data want;

if first obs 1 then infile 'mycsvfile' obs=1;

input var1 informat. var2 informat.

else if 2nd obs to EOF then infile 'same csv file'  and input all variables with informats

So in essence var1 and var2 will have to read in the beginning and it's value needs to be retained across all iterations until end of file. 

Your help with the right logic would really mean a lot.

Cheers,

Charlotte


Accepted Solutions
Solution
‎06-17-2015 02:59 PM
Esteemed Advisor
Posts: 5,011

Re: Need Efficient solution for Reading a raw file conditionally?

Here's one approach:

data want;

infile "mycsvfile" dsd;

if _n_=1 then do;

   input var1 informat. var2 informat.;

   delete;

run;

retain var1 var2;

input id date name address ...;

run;

I'm assuming you have all the other pieces you would need ... lengths/informats spelled out appropriately.

Good luck.

View solution in original post


All Replies
Grand Advisor
Posts: 10,258

Re: Need Efficient solution for Reading a raw file conditionally?

One way is to set a flag variable:

data want;

     informat var1 informat. var2 informat. ;

     retain flag . var1 var2;

     if flag=. then input var1 var2;

     flag=1; /* after the first read;

     input <the rest of the stuff>;

     drop flag;

run;

Other would be to create two datasets, one only ready line 1 for your var1 and 2 the other starting at obs2 then combine the two

Valued Guide
Posts: 856

Re: Need Efficient solution for Reading a raw file conditionally?

Not sure this is what you are looking for but I do not know how to perform what you are asking while the file is being pulled into SAS.  Hopefully someone else comes up with something better.

data have;

infile cards dsd;

input year code bep_org_code   bep_pgm_ele_code  tran_pgm_ele_code   end_date$ ;

cards;

2009,10,45,34,234,null

2010,15,45,34,234,current

2010,15,45,23,234,null

2010,15,45,34,234,current

2010,15,45,23,234,current

2010,15,45,13,234,null

;

run;

data want(rename=(year2=year code2=code));

set have;

retain year2 code2;

if _N_ = 1 then do;

    year2 = year;

    code2 = code;

end;

if _N_ > 1 then do;   

    year = year2;

    code = code2;

end;

drop year code;

run;

Solution
‎06-17-2015 02:59 PM
Esteemed Advisor
Posts: 5,011

Re: Need Efficient solution for Reading a raw file conditionally?

Here's one approach:

data want;

infile "mycsvfile" dsd;

if _n_=1 then do;

   input var1 informat. var2 informat.;

   delete;

run;

retain var1 var2;

input id date name address ...;

run;

I'm assuming you have all the other pieces you would need ... lengths/informats spelled out appropriately.

Good luck.

Frequent Contributor
Posts: 137

Re: Need Efficient solution for Reading a raw file conditionally?

Hi , very neat logic. Great! and Thank you!

Frequent Contributor
Posts: 137

Re: Need Efficient solution for Reading a raw file conditionally?

Hi , Sorry for the bother, in the code, may i seek a small clarification and tweak that's commented

data want;

infile "mycsvfile" dsd;

if _n_=1 then do;

   input var1 informat. var2 informat.;/* If i want the read the value of Var5, how to skip the 3rd and 4th value and read the value of Var5 correctly?Coz my understanding is clear to read var1 and var2 being the first 2 values and a third value to read happens to be the 5th in the buffer*/

   delete;

run;

retain var1 var2;

input id date name address ...;

run;

Esteemed Advisor
Posts: 5,011

Re: Need Efficient solution for Reading a raw file conditionally?

Charlotte,

This should do it:

length dummyvar $ 1;

drop dummyvar;

input var1 informat. var2 informat. dummyvar dummyvar var5 informat.;

The INPUT statement will keep reading till it finds a delimiter (not stopping just because it read one character of DUMMYVAR).

Good luck.

Frequent Contributor
Posts: 137

Re: Need Efficient solution for Reading a raw file conditionally?

Thank you Sir, I should have guessed it. My apologies for bothering again. Have a nice day

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 271 views
  • 3 likes
  • 4 in conversation