if there are some missing in 1 to 6 in visitnum how can we get them out
Here I have this type of data
id visitnum
101 1
101 4
101 6
102 2
102 4
102 5
103 1
104 2
104 4
104 6
105 1
105 4
106 1
106 6
107 1
107 4
107 5
108 2
108 4
109 2
109 6
I want data as
id visitnum avisitn
101 1 1
101 4 4
101 6 5
101 . 6
102 2 1
102 4 4
102 5 5
102 . 6
103 1 1
103 . 4
103 . 5
103 . 6
104 2 1
104 4 4
104 6 5
104 . 6
105 1 1
105 4 4
105 . 5
105 . 6
106 1 1
106 6 4
106 . 5
106 . 6
107 1 1
107 4 4
107 5 5
107 . 6
108 2 1
108 4 4
108 5 5
108 6 6
109 2 1
109 6 4
109 . 5
109 . 6
Please explain the logic explaining the values of avistn, especially for
103 1 1 103 . 4 103 . 5 103 . 6
data have;
input id visitnum;
cards;
101 1
101 4
101 6
102 2
102 4
102 5
103 1
104 2
104 4
104 6
105 1
105 4
106 1
106 6
107 1
107 4
107 5
108 2
108 4
109 2
109 6
;
proc freq data=have noprint;
table id*visitnum/out=temp sparse;
run;
data want;
merge have temp(keep=id visitnum rename=(visitnum=avisitn));
by id;
output;
call missing(of _all_);
run;
I ran your program which produced 45 observations in dataset WANT. But the OP's sample WANT has only 36 observations.
@sasuser143 . Please clarify the rules you intend to use to generate the results you provided.
@Ksharp wrote:
Ha. It looks like OP ignore the visitnum=2 . I don't know what OP want to do .
Mostly, but not entirely true. If visitnum=2 is the first obs, then the OP wants avisitnum=1. But there is no example in which visitnum=2 appears, but is not the first obs. I assumed that in such cases, avisitnum=2.
And the OP doesn't show any examples with visitnum=3.
I think this is what you want but agree with others that you need to clarify your requirements for us.
data dummy;
do id=101 to 109;
do avisit='1','4','5','6';
output;
end;
end;
run;
data dummy;
set dummy;
avisitn=input(avisit,best.);
drop avisit;
run;
data have;
input id visitnum;
cards;
101 1
101 4
101 6
102 2
102 4
102 5
103 1
104 2
104 4
104 6
105 1
105 4
106 1
106 6
107 1
107 4
107 5
108 2
108 4
109 2
109 6
;
run;
data have2;
set have;
by id;
retain seq;
if first.id then seq=0;
seq+1;
run;
data have3;
set have2;
if seq=1 then avisitn=1;
else if seq=2 then avisitn=4;
else if seq=3 then avisitn=5;
else if seq=4 then avisitn=6;
run;
data merged(drop=seq);
merge dummy(in=a) have3;
by id avisitn;
if a;
run;
Seems strange but it kind of looks like you are just renumbering whatever is there as 1,4,5,6.
In that case use the list of ID numbers to generate a skeleton file and merge the two. If you want those missing values of VISITN instead of having the last value carried forward like a normal MERGE would add an explicit output and then use CALL MISSING() clear all of the variables that would normally be "retained" because they come from a dataset.
data skeleton;
set have;
by id vistnum;
if first.id then do avistin=1,4,5,6 ;
output;
end;
keep id avisitn;
run;
data want;
merge have skeleton;
by id;
output;
call missing(of _all_);
run;
Thank you for the sample data, and the WANT data set. Unfortunately, when sample WANT data is provided without a description of the criteria to generate it that I remember why I do not do the NY Times crossword puzzle. Reading between the lines of HAVE and WANT pretty much exhausts my decryption skills and ability to sift through clues.
But I've come up with a conjecture to crack this enigma. Here's what I think you want:
Using those rules this program produces the results you present (except for ID 108, where I hope you made an error in posting the desired output:
data have;
input id visitnum;
cards;
101 1
101 4
101 6
102 2
102 4
102 5
103 1
104 2
104 4
104 6
105 1
105 4
106 1
106 6
107 1
107 4
107 5
108 2
108 4
109 2
109 6
run;
data _want (drop=_:);
set have;
by id;
avisitnum=ifn(first.id=1,1,visitnum);
/*Don't immediately output the last obs ... unless it's the ONLY obs*/
if last.id=0 or first.id=1 then output;
_lagv=ifn(first.id=1,0,lag(visitnum));
if last.id;
if visitnum<4 then visitnum=.;
do avisitnum=max(_lagv+1,4) to 6;
output;
visitnum=.;
end;
run;
As I said in another note, I would expect the results for ID 108 to be avisitnum=(1,4,5,6) and visitnum=(2,4,missing,missing). But you have visitnum=(2,4,5,6), which if true, means my decryption above was wrong.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.