SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

How to ignore leading zeros or read data from first non-zero value in the data set/raw file for every unique observation....

Accepted Solution Solved
Reply
Contributor
Posts: 37
Accepted Solution

How to ignore leading zeros or read data from first non-zero value in the data set/raw file for every unique observation....

name date value
aJan-110
aFeb-110
aMar-110
aMay-110
aJun-110
aJul-113
aAug-1115
aOct-1122
aNov-1121
aDec-1120
aFeb-120
aMar-1221
aApr-120
bJan-110
bFeb-110
bMar-110
bMay-1122
bJun-1123
bJul-1155
bAug-11565
bOct-11566
bNov-1120
bDec-1120
bFeb-1220
bMar-1211
bApr-120

Experts,

I am trying to read the data starting from first non-zero value for every "Name"  here reading for "a" should start from July-11 and for "b" it should start from May-11.

Can we do this by manipulating while reading raw file or we need have the data in sas dataset format and then only we use datasteps and procs to do task.

Thanks.


Accepted Solutions
Solution
‎12-29-2014 12:43 AM
Trusted Advisor
Posts: 1,137

Re: How to ignore leading zeros or read data from first non-zero value in the data set/raw file for every unique observation....

Posted in reply to HarshadMadhamshettiwar

Sorry, did you try removing the value variable from drop statement as below

data have;

input name$     date$     value;

cards;

a    Jan-11    0

a    Feb-11    0

a    Mar-11    0

a    May-11    0

a    Jun-11    0

a    Jul-11    3

a    Aug-11    15

a    Oct-11    22

a    Nov-11    21

a    Dec-11    20

a    Feb-12    0

a    Mar-12    21

a    Apr-12    0

b    Jan-11    0

b    Feb-11    0

b    Mar-11    0

b    May-11    22

b    Jun-11    23

b    Jul-11    55

b    Aug-11    565

b    Oct-11    566

b    Nov-11    20

b    Dec-11    20

b    Feb-12    20

b    Mar-12    11

b    Apr-12    0

;

data want;

set have;

retain flag;

by name ;

if first.name then flag=.;

if value ne 0 then flag=value;

if flag ne .;

drop flag;

run;

output:

Thanks,

Jag

Thanks,
Jag

View solution in original post


All Replies
Respected Advisor
Posts: 3,156

Re: How to ignore leading zeros or read data from first non-zero value in the data set/raw file for every unique observation....

Posted in reply to HarshadMadhamshettiwar

Here is a possible way:

data have;

     infile cards dlm='09'x;

     input (name     date) (:$8.)    value;

     retain n;

     if name ne lag(name) then

           n=constant('big');

     if value ne 0 then

           n=_n_;

     if _n_ >= n then

           output;

     cards;

a    Jan-11     0

a    Feb-11     0

a    Mar-11     0

a    May-11     0

a    Jun-11     0

a    Jul-11     3

a    Aug-11     15

a    Oct-11     22

a    Nov-11     21

a    Dec-11     20

a    Feb-12     0

a    Mar-12     21

a    Apr-12     0

b    Jan-11     0

b    Feb-11     0

b    Mar-11     0

b    May-11     22

b    Jun-11     23

b    Jul-11     55

b    Aug-11     565

b    Oct-11     566

b    Nov-11     20

b    Dec-11     20

b    Feb-12     20

b    Mar-12     11

b    Apr-12     0

;

Trusted Advisor
Posts: 1,137

Re: How to ignore leading zeros or read data from first non-zero value in the data set/raw file for every unique observation....

Posted in reply to HarshadMadhamshettiwar

Alternatively please try

data want;

set have;

retain flag;

by name ;

if first.name then flag=.;

if value ne 0 then flag=value;

if flag ne .;

drop value flag;

run;

Thanks,

Jag

Thanks,
Jag
Contributor
Posts: 37

Re: How to ignore leading zeros or read data from first non-zero value in the data set/raw file for every unique observation....

Posted in reply to HarshadMadhamshettiwar

Hai.kuo , this code is showing error.

Jagdishkatam, this seems not working correctly as values are truncating and its not starting "date" from first non zero instance for "name" "b".

Solution
‎12-29-2014 12:43 AM
Trusted Advisor
Posts: 1,137

Re: How to ignore leading zeros or read data from first non-zero value in the data set/raw file for every unique observation....

Posted in reply to HarshadMadhamshettiwar

Sorry, did you try removing the value variable from drop statement as below

data have;

input name$     date$     value;

cards;

a    Jan-11    0

a    Feb-11    0

a    Mar-11    0

a    May-11    0

a    Jun-11    0

a    Jul-11    3

a    Aug-11    15

a    Oct-11    22

a    Nov-11    21

a    Dec-11    20

a    Feb-12    0

a    Mar-12    21

a    Apr-12    0

b    Jan-11    0

b    Feb-11    0

b    Mar-11    0

b    May-11    22

b    Jun-11    23

b    Jul-11    55

b    Aug-11    565

b    Oct-11    566

b    Nov-11    20

b    Dec-11    20

b    Feb-12    20

b    Mar-12    11

b    Apr-12    0

;

data want;

set have;

retain flag;

by name ;

if first.name then flag=.;

if value ne 0 then flag=value;

if flag ne .;

drop flag;

run;

output:

Thanks,

Jag

Thanks,
Jag
Contributor
Posts: 37

Re: How to ignore leading zeros or read data from first non-zero value in the data set/raw file for every unique observation....

Posted in reply to Jagadishkatam

Yes..!

Its working now and giving expected output.

I know there may different ways of achieving the same result, Can you please explain not only how the code is working but also how to approach and think solutions for such data manipulations?

Thanks

Trusted Advisor
Posts: 1,137

Re: How to ignore leading zeros or read data from first non-zero value in the data set/raw file for every unique observation....

Posted in reply to HarshadMadhamshettiwar

In my code i used the concept of last observation carried forward (LOCF). With first.name, i tried to retain the values and if the value is zero then the previous values will replace the next value. so if the value is zero between the records then those values will be replaced with previous non zero value. This way only the first records will have zero values and remaining records get a non zero values. In last i did not consider the records with missing values(zero).

Hope this helps.

Thanks,

Jag

Thanks,
Jag
Frequent Contributor
Posts: 115

Re: How to ignore leading zeros or read data from first non-zero value in the data set/raw file for every unique observation....

Posted in reply to HarshadMadhamshettiwar

Hi, haikuo's code works fine. Just make the code to read a SAS dataset than raw data to understand easily:

data have;

input name$     date$     value;

cards;

a    Jan-11    0

a    Feb-11    0

a    Mar-11    0

a    May-11    0

a    Jun-11    0

a    Jul-11    3

a    Aug-11    15

a    Oct-11    22

a    Nov-11    21

a    Dec-11    20

a    Feb-12    0

a    Mar-12    21

a    Apr-12    0

b    Jan-11    0

b    Feb-11    0

b    Mar-11    0

b    May-11    22

b    Jun-11    23

b    Jul-11    55

b    Aug-11    565

b    Oct-11    566

b    Nov-11    20

b    Dec-11    20

b    Feb-12    20

b    Mar-12    11

b    Apr-12    0

;


data want;
set have;

   
     retain n;

     if name ne lag(name) then

           n=constant('big');

     if value ne 0 then

           n=_n_;

     if _n_ >= n then

           output;
     drop n;
     run;

Contributor
Posts: 37

Re: How to ignore leading zeros or read data from first non-zero value in the data set/raw file for every unique observation....

Posted in reply to naveen_srini

Oh yes it works, but can you also explain the logic behind the code.

I mean first two IF conditions are not interdependent and if i am correct they are mutually exclusive.

And code did not work with raw file/ Infile statement, why?

How can we put this in simple English language for better understanding.

Respected Advisor
Posts: 3,156

Re: How to ignore leading zeros or read data from first non-zero value in the data set/raw file for every unique observation....

Posted in reply to naveen_srini

It didn't work because the "copy & paste" between HTML and my text editor messed up the delimiter,  it converted 'Tabs' into 'Blanks'. In following version, the code is updated that way that data is streaming in to mimic the real life raw file instead of using unbuffered 'CARDS'.

filename FT15F001 temp;

data have;

     infile FT15F001;

     input (name     date) (:$8.) value;

     retain n;

     if name ne lag(name) then

           n=constant('big');

     if value ne 0 then

           n=_n_;

     if _n_ >= n then

           output;

     parmcards;

a Jan-11     0

a Feb-11     0

a Mar-11     0

a May-11     0

a Jun-11     0

a Jul-11     3

a Aug-11     15

a Oct-11     22

a Nov-11     21

a Dec-11     20

a Feb-12     0

a Mar-12     21

a Apr-12     0

b Jan-11     0

b Feb-11     0

b Mar-11     0

b May-11     22

b Jun-11     23

b Jul-11     55

b Aug-11     565

b Oct-11     566

b Nov-11     20

b Dec-11     20

b Feb-12     20

b Mar-12     11

b Apr-12     0

;

Frequent Contributor
Posts: 115

Re: How to ignore leading zeros or read data from first non-zero value in the data set/raw file for every unique observation....

Sir, That's fine.No worries! I know that haikuo's code can't go wrong. But I am just intrigued with your usage of constant function. What a Brilliant FULLPROOF concept that you gave us that can handle any large mega datasets. Did you just like that knew that 1.78e NUMBER?I was simply reading about that number for 2 hours lol..Can you offer a small and simple explanation on that constant application here,when and if you have the time/?

Sorry for the bother,

Naveen

Respected Advisor
Posts: 3,156

Re: How to ignore leading zeros or read data from first non-zero value in the data set/raw file for every unique observation....

Posted in reply to naveen_srini

Thanks, and my codes go wrong all the time unfortunately. Usage of Constant() is just a spontaneous act and I would have chosen a regular flag (like others here) in the real work to make it more readable and to some extend, more robust. The reason to use Constant('big') was a result from my decision to retain row number when first encounter value=0, and only outputs those greater row numbers (_n_>=n), reset to a largest number when switching IDs.

Regards,

Haikuo

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 1073 views
  • 10 likes
  • 4 in conversation