BookmarkSubscribeRSS Feed
sasuser143
Calcite | Level 5

if there are some missing in 1 to 6 in visitnum how can we get them out

Here I have this type of data

id     visitnum
101   1
101   4
101   6
102   2
102   4
102   5
103   1
104   2
104   4
104   6
105   1
105   4
106   1
106   6
107   1
107   4
107   5
108   2
108   4
109   2
109   6

 

I  want data as 

 

id    visitnum       avisitn
101     1                  1
101     4                  4
101     6                  5
101     .                   6
102     2                  1
102     4                  4
102     5                  5
102      .                  6
103     1                  1
103     .                   4
103     .                   5
103    .                    6
104    2                  1
104    4                  4
104    6                  5
104    .                   6
105    1                   1
105    4                  4
105    .                   5
105    .                  6
106    1                 1
106    6                 4
106    .                  5
106   .                  6
107   1                  1
107    4                4
107     5              5
107   .                6
108    2              1
108     4            4
108     5            5
108    6             6
109     2            1
109    6             4
109    .               5
109    .               6

8 REPLIES 8
andreas_lds
Jade | Level 19

Please explain the logic explaining the values of avistn, especially for

103   1     1
103   .     4
103   .     5
103   .     6
Ksharp
Super User
data have;
input id     visitnum;
cards;
101   1
101   4
101   6
102   2
102   4
102   5
103   1
104   2
104   4
104   6
105   1
105   4
106   1
106   6
107   1
107   4
107   5
108   2
108   4
109   2
109   6
;
proc freq data=have noprint;
table id*visitnum/out=temp sparse;
run;
data want;
 merge have temp(keep=id visitnum rename=(visitnum=avisitn));
 by id;
 output;
 call missing(of _all_);
run;
mkeintz
PROC Star

@Ksharp 

 

I ran your program which produced 45 observations in dataset WANT.  But the OP's sample WANT has only 36 observations.

 

@sasuser143 .   Please clarify the rules you intend to use to generate the results you provided.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
Ha. It looks like OP ignore the visitnum=2 . I don't know what OP want to do .
mkeintz
PROC Star

@Ksharp wrote:
Ha. It looks like OP ignore the visitnum=2 . I don't know what OP want to do .

Mostly, but not entirely true.  If visitnum=2 is the first obs, then the OP wants avisitnum=1.  But there is no example in which visitnum=2 appears, but is not the first obs.  I assumed that in such cases, avisitnum=2.

 

And the OP doesn't show any examples with visitnum=3.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
tarheel13
Rhodochrosite | Level 12

I think this is what you want but agree with others that you need to clarify your requirements for us. 

data dummy;
    do id=101 to 109;
    do avisit='1','4','5','6';
    output;
        end;
        end;
run;

data dummy;
    set dummy;
    avisitn=input(avisit,best.);
    drop avisit;
run;

data have;
input id     visitnum;
cards;
101   1
101   4
101   6
102   2
102   4
102   5
103   1
104   2
104   4
104   6
105   1
105   4
106   1
106   6
107   1
107   4
107   5
108   2
108   4
109   2
109   6
;
run;

data have2;
    set have;
    by id; 
    retain seq;
    if first.id then seq=0;
    seq+1;
run;

data have3;
    set have2;
    if seq=1 then avisitn=1;
    else if seq=2 then avisitn=4;
    else if seq=3 then avisitn=5;
    else if seq=4 then avisitn=6;
run;

data merged(drop=seq);
    merge dummy(in=a) have3;
    by id avisitn;
    if a;
run;
Tom
Super User Tom
Super User

Seems strange but it kind of looks like you are just renumbering whatever is there as 1,4,5,6.

In that case use the list of ID numbers to generate a skeleton file and merge the two. If you want those missing values of VISITN instead of having the last value carried forward like a normal MERGE would add an explicit output and then use CALL MISSING() clear all of the variables that would normally be "retained" because they come from a dataset.

data skeleton;
  set have;
  by id vistnum;
  if first.id then do avistin=1,4,5,6 ;
     output;
  end;
  keep id avisitn;
run;

data want;
  merge have skeleton;
  by id;
  output;
  call missing(of _all_);
run;

 

mkeintz
PROC Star

Thank you for the sample data, and the WANT data set.  Unfortunately, when sample WANT data is provided without a description of the criteria to generate it that I remember why I do not do the NY Times crossword puzzle. Reading between the lines of HAVE and WANT pretty much exhausts my decryption skills and ability to sift through clues.

 

But I've come up with a conjecture to crack this enigma.  Here's what I think you want:

 

  1. For the first observation of any ID group, avisitnum gets a 1, no matter what the actual visitnum is for the first obs of the ID.
  2. There must be 3 obs at the end of each ID group, with avisitnum=4,5,6.
    1. If the last incoming obs is a visitnum=6, then output avisitnum=4 (or 1+prior visitnum  if that is higher) through avisitnum=6.  This avoids multiple outputs for avisitnum 4, 5, or 6.

      Also if last incoming is visitnum=6, then that value is output only with the first corresponding avisitnum.  After that set visitnum to missing for subsequent avisitnum obs.
    2. If the last incoming obs is visitnum=4 or 5, then output with avisitnum=4 (or 5) through avisitnum=6.  Again, set visitnum to missing after the first iteration of this loop.

    3. If the last incoming obs is visitnum<4 then output avisitnum=4 to 6, and set visitnum to missing for ALL iterations of the loop.

 

Using those rules this program produces the results you present (except for ID 108, where I hope you made an error in posting the desired output:

 

data have;
input id     visitnum;
cards;
101   1
101   4
101   6
102   2
102   4
102   5
103   1
104   2
104   4
104   6
105   1
105   4
106   1
106   6
107   1
107   4
107   5
108   2
108   4
109   2
109   6
run;

data _want (drop=_:);
  set have;
  by id;
  avisitnum=ifn(first.id=1,1,visitnum);
  /*Don't immediately output the last obs ... unless it's the ONLY obs*/
  if last.id=0 or first.id=1 then output;

  _lagv=ifn(first.id=1,0,lag(visitnum));
  if last.id;
  if visitnum<4 then visitnum=.;
  do avisitnum=max(_lagv+1,4) to 6;
    output;
    visitnum=.;
  end;
run;

 

 

 

As I said in another note, I would expect the results for ID 108 to be avisitnum=(1,4,5,6) and visitnum=(2,4,missing,missing).  But you have visitnum=(2,4,5,6), which if true, means my decryption above was wrong.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 578 views
  • 1 like
  • 6 in conversation