BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sam369
Obsidian | Level 7

Hi All,

this is my part of dataset

 

SUBJID             PSADN            days       pcfn

70007             08DEC2008       -8              .

70007           15DEC2008         -1              .

70007            05JAN2009        21           1.2

70007            26JAN2009       42           -16.7

70007             23FEB2009     70             -10.6

70007           16MAR2009      91              47.1

70007            06APR2009      112           32.0

70007             28APR2009     134            8.2

70007            18MAY2009      154           46.1

70007            08JUN2009       175           39.6

70007          29JUN2009         196            55.0

70007           20JUL2009         217           51.8

70007          08SEP2009         267          149.9

70007           28SEP2009        287 1          114.5

i have to pick only those records which having pcfn>25. if pcfn>25 , in next records also it is greater than 25 . In the above example 16MAR2009  pcfn value is 47.1 but later the subject at 18may2009 he had <25 . So i dont want to consider those 2 records also. My final o/p be like

SUBJID             PSADN            days       pcfn        newdate                   flag

70007            18MAY2009      154           46.1       18MAY2009              1

70007            08JUN2009       175           39.6       18MAY2009              0

70007          29JUN2009         196            55.0       18MAY2009            0

70007           20JUL2009         217           51.8        18MAY2009           0

70007          08SEP2009         267          149.9       18MAY2009          0

70007           28SEP2009        287 1          114.5     18MAY2009         0

Thanks in advance!!!!!!!!

1 ACCEPTED SOLUTION

Accepted Solutions
MikeZdeb
Rhodochrosite | Level 12

hi ... one idea ...

data x;

input subjid :$5. psadn :date9.  days  pcfn;

datalines;

70007    08DEC2008     -8      .

70007    15DEC2008     -1      .

70007    05JAN2009     21     1.2

70007    26JAN2009     42   -16.7

70007    23FEB2009     70   -10.6

70007    16MAR2009     91    47.1

70007    06APR2009    112    32.0

70007    28APR2009    134     8.2

70007    18MAY2009    154    46.1

70007    08JUN2009    175    39.6

70007    29JUN2009    196    55.0

70007    20JUL2009    217    51.8

70007    08SEP2009    267   149.9

70007    28SEP2009    287   114.5

;

proc sort data=x;

by subjid days;

run;

data y;

do until (last.subjid);

   set x (in=one) x;

   by subjid;

   if one and pcfn le 25 then call missing(newdate);

   else

   newdate = ifn(one and missing(newdate), psadn , newdate);

   flag = psadn eq newdate;

   if ^one and psadn ge newdate then output;

end;

format newdate date9.;

run;

subjid        psadn    days     pcfn      newdate    flag

70007     18MAY2009     154     46.1    18MAY2009      1

70007     08JUN2009     175     39.6    18MAY2009      0

70007     29JUN2009     196     55.0    18MAY2009      0

70007     20JUL2009     217     51.8    18MAY2009      0

70007     08SEP2009     267    149.9    18MAY2009      0

70007     28SEP2009     287    114.5    18MAY2009      0

View solution in original post

5 REPLIES 5
MikeZdeb
Rhodochrosite | Level 12

hi ... one idea ...

data x;

input subjid :$5. psadn :date9.  days  pcfn;

datalines;

70007    08DEC2008     -8      .

70007    15DEC2008     -1      .

70007    05JAN2009     21     1.2

70007    26JAN2009     42   -16.7

70007    23FEB2009     70   -10.6

70007    16MAR2009     91    47.1

70007    06APR2009    112    32.0

70007    28APR2009    134     8.2

70007    18MAY2009    154    46.1

70007    08JUN2009    175    39.6

70007    29JUN2009    196    55.0

70007    20JUL2009    217    51.8

70007    08SEP2009    267   149.9

70007    28SEP2009    287   114.5

;

proc sort data=x;

by subjid days;

run;

data y;

do until (last.subjid);

   set x (in=one) x;

   by subjid;

   if one and pcfn le 25 then call missing(newdate);

   else

   newdate = ifn(one and missing(newdate), psadn , newdate);

   flag = psadn eq newdate;

   if ^one and psadn ge newdate then output;

end;

format newdate date9.;

run;

subjid        psadn    days     pcfn      newdate    flag

70007     18MAY2009     154     46.1    18MAY2009      1

70007     08JUN2009     175     39.6    18MAY2009      0

70007     29JUN2009     196     55.0    18MAY2009      0

70007     20JUL2009     217     51.8    18MAY2009      0

70007     08SEP2009     267    149.9    18MAY2009      0

70007     28SEP2009     287    114.5    18MAY2009      0

Linlin
Lapis Lazuli | Level 10

one way:

data have;

infile cards missover;

input SUBJID             PSADN $           days       pcfn;

cards;

70007             08DEC2008       -8              .

70007           15DEC2008         -1              .

70007            05JAN2009        21           1.2

70007            26JAN2009       42           -16.7

70007             23FEB2009     70             -10.6

70007           16MAR2009      91              47.1

70007            06APR2009      112           32.0

70007             28APR2009     134            8.2

70007            18MAY2009      154           46.1

70007            08JUN2009       175           39.6

70007          29JUN2009         196            55.0

70007           20JUL2009         217           51.8

70007          08SEP2009         267          149.9

70007           28SEP2009        287         114.5

;

data temp;

  set have;

  by subjid;

  if first.subjid then group+1;

  if not first.subjid and pcfn>25 and lag(pcfn)<=25 then group+1;

  run;

  proc sql;

    create table want as

  select * from temp

    where group in (

   select group from temp

     group by group

     having min(pcfn)>25);

  quit;

proc print;run;

sam369
Obsidian | Level 7

Thanks to you both!!!!!!!

Hi LinLin your apporach is good

Hi mike,

I got what i want but i have small question
what does it mean "Interleaving a Dataset with Itself' ?

could you explain little bit

art297
Opal | Level 21

Did you read the article that Mike mentioned in your previous thread? i.e.:

ps  "Interleaving a Dataset with Itself: How and Why" by Howard Schreier

http://www.nesug.org/proceedings/nesug03/cc/cc002.pdf


It does a nice job of explaining the process.


MikeZdeb
Rhodochrosite | Level 12

Hi ... as Art said, Howard's article is great.

Quickly, "official" definition of INTERLEAVE ...

"Interleaving uses a SET statement and a BY statement to combine multiple data sets into one new data set. The number of observations in the new data set is the sum of the number of observations from the original data sets. However, the observations in the new data set are arranged by the values of the BY variable or variables and, within each BY group, by the order of the data sets in which they occur. You can interleave data sets either by using a BY variable or by using an index."
So, if you specify the same data set name twice ...
set x (in=one) x;
by subjid;
the data step makes two passes through your data for each unique value of SUBJID
that allows you to make decisions about the content of your data during the first pass (in your data,  look for the last observation within a SUBJID where all subsequent values of PCFN are > 25) and in the second pass, you apply the rule formulated in the first pass (in your data, output only those observations with a date => the date found in the first pass)

Howard explains all this in his paper (it was the first place I saw anyone interleave a data set with itself ... among the many things I have learned from Howard's paper, talks, and postings).

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 712 views
  • 3 likes
  • 4 in conversation