BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
name date value
aJan-110
aFeb-110
aMar-110
aMay-110
aJun-110
aJul-113
aAug-1115
aOct-1122
aNov-1121
aDec-1120
aFeb-120
aMar-1221
aApr-120
bJan-110
bFeb-110
bMar-110
bMay-1122
bJun-1123
bJul-1155
bAug-11565
bOct-11566
bNov-1120
bDec-1120
bFeb-1220
bMar-1211
bApr-120

Experts,

I am trying to read the data starting from first non-zero value for every "Name"  here reading for "a" should start from July-11 and for "b" it should start from May-11.

Can we do this by manipulating while reading raw file or we need have the data in sas dataset format and then only we use datasteps and procs to do task.

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
Jagadishkatam
Amethyst | Level 16

Sorry, did you try removing the value variable from drop statement as below

data have;

input name$     date$     value;

cards;

a    Jan-11    0

a    Feb-11    0

a    Mar-11    0

a    May-11    0

a    Jun-11    0

a    Jul-11    3

a    Aug-11    15

a    Oct-11    22

a    Nov-11    21

a    Dec-11    20

a    Feb-12    0

a    Mar-12    21

a    Apr-12    0

b    Jan-11    0

b    Feb-11    0

b    Mar-11    0

b    May-11    22

b    Jun-11    23

b    Jul-11    55

b    Aug-11    565

b    Oct-11    566

b    Nov-11    20

b    Dec-11    20

b    Feb-12    20

b    Mar-12    11

b    Apr-12    0

;

data want;

set have;

retain flag;

by name ;

if first.name then flag=.;

if value ne 0 then flag=value;

if flag ne .;

drop flag;

run;

output:

Thanks,

Jag

Thanks,
Jag

View solution in original post

11 REPLIES 11
Haikuo
Onyx | Level 15

Here is a possible way:

data have;

     infile cards dlm='09'x;

     input (name     date) (:$8.)    value;

     retain n;

     if name ne lag(name) then

           n=constant('big');

     if value ne 0 then

           n=_n_;

     if _n_ >= n then

           output;

     cards;

a    Jan-11     0

a    Feb-11     0

a    Mar-11     0

a    May-11     0

a    Jun-11     0

a    Jul-11     3

a    Aug-11     15

a    Oct-11     22

a    Nov-11     21

a    Dec-11     20

a    Feb-12     0

a    Mar-12     21

a    Apr-12     0

b    Jan-11     0

b    Feb-11     0

b    Mar-11     0

b    May-11     22

b    Jun-11     23

b    Jul-11     55

b    Aug-11     565

b    Oct-11     566

b    Nov-11     20

b    Dec-11     20

b    Feb-12     20

b    Mar-12     11

b    Apr-12     0

;

Jagadishkatam
Amethyst | Level 16

Alternatively please try

data want;

set have;

retain flag;

by name ;

if first.name then flag=.;

if value ne 0 then flag=value;

if flag ne .;

drop value flag;

run;

Thanks,

Jag

Thanks,
Jag
HarshadMadhamshettiwar
Obsidian | Level 7

Hai.kuo , this code is showing error.

Jagdishkatam, this seems not working correctly as values are truncating and its not starting "date" from first non zero instance for "name" "b".

Jagadishkatam
Amethyst | Level 16

Sorry, did you try removing the value variable from drop statement as below

data have;

input name$     date$     value;

cards;

a    Jan-11    0

a    Feb-11    0

a    Mar-11    0

a    May-11    0

a    Jun-11    0

a    Jul-11    3

a    Aug-11    15

a    Oct-11    22

a    Nov-11    21

a    Dec-11    20

a    Feb-12    0

a    Mar-12    21

a    Apr-12    0

b    Jan-11    0

b    Feb-11    0

b    Mar-11    0

b    May-11    22

b    Jun-11    23

b    Jul-11    55

b    Aug-11    565

b    Oct-11    566

b    Nov-11    20

b    Dec-11    20

b    Feb-12    20

b    Mar-12    11

b    Apr-12    0

;

data want;

set have;

retain flag;

by name ;

if first.name then flag=.;

if value ne 0 then flag=value;

if flag ne .;

drop flag;

run;

output:

Thanks,

Jag

Thanks,
Jag
HarshadMadhamshettiwar
Obsidian | Level 7

Yes..!

Its working now and giving expected output.

I know there may different ways of achieving the same result, Can you please explain not only how the code is working but also how to approach and think solutions for such data manipulations?

Thanks

Jagadishkatam
Amethyst | Level 16

In my code i used the concept of last observation carried forward (LOCF). With first.name, i tried to retain the values and if the value is zero then the previous values will replace the next value. so if the value is zero between the records then those values will be replaced with previous non zero value. This way only the first records will have zero values and remaining records get a non zero values. In last i did not consider the records with missing values(zero).

Hope this helps.

Thanks,

Jag

Thanks,
Jag
naveen_srini
Quartz | Level 8

Hi, haikuo's code works fine. Just make the code to read a SAS dataset than raw data to understand easily:

data have;

input name$     date$     value;

cards;

a    Jan-11    0

a    Feb-11    0

a    Mar-11    0

a    May-11    0

a    Jun-11    0

a    Jul-11    3

a    Aug-11    15

a    Oct-11    22

a    Nov-11    21

a    Dec-11    20

a    Feb-12    0

a    Mar-12    21

a    Apr-12    0

b    Jan-11    0

b    Feb-11    0

b    Mar-11    0

b    May-11    22

b    Jun-11    23

b    Jul-11    55

b    Aug-11    565

b    Oct-11    566

b    Nov-11    20

b    Dec-11    20

b    Feb-12    20

b    Mar-12    11

b    Apr-12    0

;


data want;
set have;

   
     retain n;

     if name ne lag(name) then

           n=constant('big');

     if value ne 0 then

           n=_n_;

     if _n_ >= n then

           output;
     drop n;
     run;

HarshadMadhamshettiwar
Obsidian | Level 7

Oh yes it works, but can you also explain the logic behind the code.

I mean first two IF conditions are not interdependent and if i am correct they are mutually exclusive.

And code did not work with raw file/ Infile statement, why?

How can we put this in simple English language for better understanding.

Haikuo
Onyx | Level 15

It didn't work because the "copy & paste" between HTML and my text editor messed up the delimiter,  it converted 'Tabs' into 'Blanks'. In following version, the code is updated that way that data is streaming in to mimic the real life raw file instead of using unbuffered 'CARDS'.

filename FT15F001 temp;

data have;

     infile FT15F001;

     input (name     date) (:$8.) value;

     retain n;

     if name ne lag(name) then

           n=constant('big');

     if value ne 0 then

           n=_n_;

     if _n_ >= n then

           output;

     parmcards;

a Jan-11     0

a Feb-11     0

a Mar-11     0

a May-11     0

a Jun-11     0

a Jul-11     3

a Aug-11     15

a Oct-11     22

a Nov-11     21

a Dec-11     20

a Feb-12     0

a Mar-12     21

a Apr-12     0

b Jan-11     0

b Feb-11     0

b Mar-11     0

b May-11     22

b Jun-11     23

b Jul-11     55

b Aug-11     565

b Oct-11     566

b Nov-11     20

b Dec-11     20

b Feb-12     20

b Mar-12     11

b Apr-12     0

;

naveen_srini
Quartz | Level 8

Sir, That's fine.No worries! I know that haikuo's code can't go wrong. But I am just intrigued with your usage of constant function. What a Brilliant FULLPROOF concept that you gave us that can handle any large mega datasets. Did you just like that knew that 1.78e NUMBER?I was simply reading about that number for 2 hours lol..Can you offer a small and simple explanation on that constant application here,when and if you have the time/?

Sorry for the bother,

Naveen

Haikuo
Onyx | Level 15

Thanks, and my codes go wrong all the time unfortunately. Usage of Constant() is just a spontaneous act and I would have chosen a regular flag (like others here) in the real work to make it more readable and to some extend, more robust. The reason to use Constant('big') was a result from my decision to retain row number when first encounter value=0, and only outputs those greater row numbers (_n_>=n), reset to a largest number when switching IDs.

Regards,

Haikuo

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 11 replies
  • 2812 views
  • 10 likes
  • 4 in conversation