BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ramgouveia
Obsidian | Level 7

Hi,

 

Have the following problem, I want to count the number of successive missing values in a row, only in specific lines, also I want to know when the missing values occur at the beginning or at the end of the sequence. Take as an example the following code:

data example;
  input id var1 var2 var3 var4 var5 var6 var7 var8 var9 var10 var11 var12;
cards;

A	1 2 3 . . . . . 1 1 1 3
B	3 3 2 1 3 2 1 . . . . .
C	. . . . 1 2 3 1 2 3 2 .
D	3 . 1 . 3 . 1 . 3 . 1 .
F	1 3 . . 1 3 . . 1 3 . .
E	3 2 1 . . . . . 1 1 1 3
G	3 3 2 1 3 2 1 . . . . .
H	. . . . . 1 2 3 1 2 3 2
I	3 . 1 . 3 . 1 . 3 . 1 .
J	A E . . A E . . A E . . 
; 

In particular, I want to know the successive missing values in B,C,G and F.

 

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Here you go

data have;
  input id $2. var1 $2. var2 $2. var3 $2. var4 $2. var5 $2. var6 $2. var7 $2. var8 $2. var9 $2. var10 $2. var11 $2. var12 $2.;
cards;
A 1 2 3 . . . . . 1 1 1 3
B 3 3 2 1 3 2 1 . . . . .
C . . . . 1 2 3 1 2 3 2 .
D 3 . 1 . 3 . 1 . 3 . 1 .
F 1 3 . . 1 3 . . 1 3 . .
E 3 2 1 . . . . . 1 1 1 3
G 3 3 2 1 3 2 1 . . . . .
H . . . . . 1 2 3 1 2 3 2
I 3 . 1 . 3 . 1 . 3 . 1 .
J A E . . A E . . A E . . 
; 

data want(keep=id miss_lead miss_trail);
  set have;
  miss_lead=0;
  miss_trail=0;
  array v {*} var1-var12; 
  do i=1 to dim(v);
    if missing(v[i]) and miss_lead+1=i then miss_lead=i;
    if missing(v[dim(v)-i+1]) and miss_trail+1=i then miss_trail=i;
  end;
run;

Patrick_0-1618889162908.png

 

View solution in original post

9 REPLIES 9
PeterClemmensen
Tourmaline | Level 20

Why B, C, G and F? Is that arbitrary?

ramgouveia
Obsidian | Level 7
@PeterClemmensen

Because in those cases the missing values occur at the beginning or at the end of the sequence.
cminard
Obsidian | Level 7

I edited your data set a bit since it doesn't quite work as stated.

 

data example;
  input id $2. var1 $2. var2 $2. var3 $2. var4 $2. var5 $2. var6 $2. var7 $2. var8 $2. var9 $2. var10 $2. var11 $2. var12 $2.;
cards;
A 1 2 3 . . . . . 1 1 1 3
B 3 3 2 1 3 2 1 . . . . .
C . . . . 1 2 3 1 2 3 2 .
D 3 . 1 . 3 . 1 . 3 . 1 .
F 1 3 . . 1 3 . . 1 3 . .
E 3 2 1 . . . . . 1 1 1 3
G 3 3 2 1 3 2 1 . . . . .
H . . . . . 1 2 3 1 2 3 2
I 3 . 1 . 3 . 1 . 3 . 1 .
J A E . . A E . . A E . . 
; 
run;

proc sort data=example; by id; run;
proc transpose data=example out=texample;
	var var1-var12;
	by id;
run;

data texample2;
	set texample;
	by id;
	if first.id then misscount=1*(col1='');
	else do;
		if col1='' then misscount+1;
		if col1 ne '' then misscount=0;
	end;
run;

proc tabulate data=texample2;
	class id;
	var misscount;
	tables id='', misscount*(max=''*f=6.0) / box='id';
run;

This provides the number of consecutively missing observations for each id.

id misscount 
A  5 
B  5 
C  4 
D  1 
E  5 
F  2 
G  5 
H  5 
I  1 
J  2 

Is this close to what you want?

 

ramgouveia
Obsidian | Level 7

@cminard

The result I want is:

id misscount
A 0
B 0
C 4
D 0
E 0
F 0
G 5
H 5
I 0
J 0

I want to count the initial missing values

and

id misscount
A 0
B 5
C 1
D 1
E 2
F 0
G 5
H 0
I 1
J 2

I want to count the last missing values.

PeterClemmensen
Tourmaline | Level 20

Why is misscount = 1 for ID = "H"?

ramgouveia
Obsidian | Level 7
@PeterClemmensen

Is 0. Sorry for the error.
PeterClemmensen
Tourmaline | Level 20

Ok. The whole thing can be done like this then.

 

data want(keep = id misscount);
   set example;
   array v var12 - var1;
   misscount = 0;
   do over v;
      if not missing(v) then leave;
      misscount + 1;
   end;
run;

 

Result:

 

id  misscount 
A   0 
B   5 
C   1 
D   1 
F   2 
E   0 
G   5 
H   0 
I   1 
J   2 
Patrick
Opal | Level 21

Here you go

data have;
  input id $2. var1 $2. var2 $2. var3 $2. var4 $2. var5 $2. var6 $2. var7 $2. var8 $2. var9 $2. var10 $2. var11 $2. var12 $2.;
cards;
A 1 2 3 . . . . . 1 1 1 3
B 3 3 2 1 3 2 1 . . . . .
C . . . . 1 2 3 1 2 3 2 .
D 3 . 1 . 3 . 1 . 3 . 1 .
F 1 3 . . 1 3 . . 1 3 . .
E 3 2 1 . . . . . 1 1 1 3
G 3 3 2 1 3 2 1 . . . . .
H . . . . . 1 2 3 1 2 3 2
I 3 . 1 . 3 . 1 . 3 . 1 .
J A E . . A E . . A E . . 
; 

data want(keep=id miss_lead miss_trail);
  set have;
  miss_lead=0;
  miss_trail=0;
  array v {*} var1-var12; 
  do i=1 to dim(v);
    if missing(v[i]) and miss_lead+1=i then miss_lead=i;
    if missing(v[dim(v)-i+1]) and miss_trail+1=i then miss_trail=i;
  end;
run;

Patrick_0-1618889162908.png

 

ramgouveia
Obsidian | Level 7
Thank you very much @Patrick

It worked.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 832 views
  • 8 likes
  • 4 in conversation