BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ramgouveia
Obsidian | Level 7

Hi,

 

Have the following problem, I want to count the number of successive missing values in a row, only in specific lines, also I want to know when the missing values occur at the beginning or at the end of the sequence. Take as an example the following code:

data example;
  input id var1 var2 var3 var4 var5 var6 var7 var8 var9 var10 var11 var12;
cards;

A	1 2 3 . . . . . 1 1 1 3
B	3 3 2 1 3 2 1 . . . . .
C	. . . . 1 2 3 1 2 3 2 .
D	3 . 1 . 3 . 1 . 3 . 1 .
F	1 3 . . 1 3 . . 1 3 . .
E	3 2 1 . . . . . 1 1 1 3
G	3 3 2 1 3 2 1 . . . . .
H	. . . . . 1 2 3 1 2 3 2
I	3 . 1 . 3 . 1 . 3 . 1 .
J	A E . . A E . . A E . . 
; 

In particular, I want to know the successive missing values in B,C,G and F.

 

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Here you go

data have;
  input id $2. var1 $2. var2 $2. var3 $2. var4 $2. var5 $2. var6 $2. var7 $2. var8 $2. var9 $2. var10 $2. var11 $2. var12 $2.;
cards;
A 1 2 3 . . . . . 1 1 1 3
B 3 3 2 1 3 2 1 . . . . .
C . . . . 1 2 3 1 2 3 2 .
D 3 . 1 . 3 . 1 . 3 . 1 .
F 1 3 . . 1 3 . . 1 3 . .
E 3 2 1 . . . . . 1 1 1 3
G 3 3 2 1 3 2 1 . . . . .
H . . . . . 1 2 3 1 2 3 2
I 3 . 1 . 3 . 1 . 3 . 1 .
J A E . . A E . . A E . . 
; 

data want(keep=id miss_lead miss_trail);
  set have;
  miss_lead=0;
  miss_trail=0;
  array v {*} var1-var12; 
  do i=1 to dim(v);
    if missing(v[i]) and miss_lead+1=i then miss_lead=i;
    if missing(v[dim(v)-i+1]) and miss_trail+1=i then miss_trail=i;
  end;
run;

Patrick_0-1618889162908.png

 

View solution in original post

9 REPLIES 9
PeterClemmensen
Tourmaline | Level 20

Why B, C, G and F? Is that arbitrary?

ramgouveia
Obsidian | Level 7
@PeterClemmensen

Because in those cases the missing values occur at the beginning or at the end of the sequence.
cminard
Quartz | Level 8

I edited your data set a bit since it doesn't quite work as stated.

 

data example;
  input id $2. var1 $2. var2 $2. var3 $2. var4 $2. var5 $2. var6 $2. var7 $2. var8 $2. var9 $2. var10 $2. var11 $2. var12 $2.;
cards;
A 1 2 3 . . . . . 1 1 1 3
B 3 3 2 1 3 2 1 . . . . .
C . . . . 1 2 3 1 2 3 2 .
D 3 . 1 . 3 . 1 . 3 . 1 .
F 1 3 . . 1 3 . . 1 3 . .
E 3 2 1 . . . . . 1 1 1 3
G 3 3 2 1 3 2 1 . . . . .
H . . . . . 1 2 3 1 2 3 2
I 3 . 1 . 3 . 1 . 3 . 1 .
J A E . . A E . . A E . . 
; 
run;

proc sort data=example; by id; run;
proc transpose data=example out=texample;
	var var1-var12;
	by id;
run;

data texample2;
	set texample;
	by id;
	if first.id then misscount=1*(col1='');
	else do;
		if col1='' then misscount+1;
		if col1 ne '' then misscount=0;
	end;
run;

proc tabulate data=texample2;
	class id;
	var misscount;
	tables id='', misscount*(max=''*f=6.0) / box='id';
run;

This provides the number of consecutively missing observations for each id.

id misscount 
A  5 
B  5 
C  4 
D  1 
E  5 
F  2 
G  5 
H  5 
I  1 
J  2 

Is this close to what you want?

 

ramgouveia
Obsidian | Level 7

@cminard

The result I want is:

id misscount
A 0
B 0
C 4
D 0
E 0
F 0
G 5
H 5
I 0
J 0

I want to count the initial missing values

and

id misscount
A 0
B 5
C 1
D 1
E 2
F 0
G 5
H 0
I 1
J 2

I want to count the last missing values.

PeterClemmensen
Tourmaline | Level 20

Why is misscount = 1 for ID = "H"?

ramgouveia
Obsidian | Level 7
@PeterClemmensen

Is 0. Sorry for the error.
PeterClemmensen
Tourmaline | Level 20

Ok. The whole thing can be done like this then.

 

data want(keep = id misscount);
   set example;
   array v var12 - var1;
   misscount = 0;
   do over v;
      if not missing(v) then leave;
      misscount + 1;
   end;
run;

 

Result:

 

id  misscount 
A   0 
B   5 
C   1 
D   1 
F   2 
E   0 
G   5 
H   0 
I   1 
J   2 
Patrick
Opal | Level 21

Here you go

data have;
  input id $2. var1 $2. var2 $2. var3 $2. var4 $2. var5 $2. var6 $2. var7 $2. var8 $2. var9 $2. var10 $2. var11 $2. var12 $2.;
cards;
A 1 2 3 . . . . . 1 1 1 3
B 3 3 2 1 3 2 1 . . . . .
C . . . . 1 2 3 1 2 3 2 .
D 3 . 1 . 3 . 1 . 3 . 1 .
F 1 3 . . 1 3 . . 1 3 . .
E 3 2 1 . . . . . 1 1 1 3
G 3 3 2 1 3 2 1 . . . . .
H . . . . . 1 2 3 1 2 3 2
I 3 . 1 . 3 . 1 . 3 . 1 .
J A E . . A E . . A E . . 
; 

data want(keep=id miss_lead miss_trail);
  set have;
  miss_lead=0;
  miss_trail=0;
  array v {*} var1-var12; 
  do i=1 to dim(v);
    if missing(v[i]) and miss_lead+1=i then miss_lead=i;
    if missing(v[dim(v)-i+1]) and miss_trail+1=i then miss_trail=i;
  end;
run;

Patrick_0-1618889162908.png

 

ramgouveia
Obsidian | Level 7
Thank you very much @Patrick

It worked.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2232 views
  • 8 likes
  • 4 in conversation