Help using Base SAS procedures

Dynamic count of null observations that vary in multiple datasets

Accepted Solution Solved
Reply
Contributor PQK
Contributor
Posts: 21
Accepted Solution

Dynamic count of null observations that vary in multiple datasets

Dear Forum,


I have multiple datasets divided into month that each contain many variables based on an ID number. At some point in each dataset, the value for all variables becomes blank--except the value of the ID number. The point at which all observable data (except the ID number) becomes blank is different for each dataset and the total count of ID number per dataset also varies. (Please see below example)

I’ve been trying to devise a way in which to count the ID numbers who only contain null observations. Possibly a do loop will work, but somehow I have to instruct SAS to start counting observations with null data. For example, if SAS encounters at least 50 consecutive ID numbers with null values, begin counting from the start of this consecutive sequence until the end of the dataset. Once I have a working solution I can easily create a macro to find the number of blank observations on all datasets.

Jan20xx

ID         var1      var2

1            xx          xx

2            xx          xx

3            xx          xx

4

5

Feb20xx

ID         var1 var2

1            xx          xx

2           

Mar20xx

ID         var1      var2

1            xx          xx

2            xx          xx

3            xx          xx

4            xx          xx

5

6

7


I tried finding a solution online but somehow this question doesn't lend itself to easily searchable criteria.


Anyhow, I would greatly appreciate any suggestions!!!


PQK



Accepted Solutions
Solution
‎05-22-2014 06:39 PM
Respected Advisor
Posts: 4,173

Re: Dynamic count of null observations that vary in multiple datasets

Not sure that I fully understand how your desired result should look like but hopefully below code will point you into the right direction.

data Jan20xx;

  infile datalines truncover;

  input ID var1 $ var2 $;

  datalines;

1 xx xx

2 xx xx

3 xx xx

4

5

;

run;

proc sql noprint;

  select

    name, count(name)

  into :varlist separated by ',', :n_var

    from dictionary.columns

      where libname='WORK' and memname='JAN20XX' and upcase(name) ne 'ID'

  ;

quit;

data want;

  set Jan20xx end=last;

  if cmiss(&varlist) = &n_var then

    miss_count+1;

  else miss_count=0;

  if last then

    put "number of trailing missing rows: " miss_count;

run;

View solution in original post


All Replies
Super Contributor
Posts: 275

Re: Dynamic count of null observations that vary in multiple datasets

data have;

infile cards missover truncover;

input ID    (var1   var2) ($);

cards;

1            xx          xx

2            xx          xx

3            xx          xx

4

5

;

run;

data want;

set have;

retain count;

array vars _character_;

if cmiss(of vars(*))=dim(vars) then count+1;

run;

Solution
‎05-22-2014 06:39 PM
Respected Advisor
Posts: 4,173

Re: Dynamic count of null observations that vary in multiple datasets

Not sure that I fully understand how your desired result should look like but hopefully below code will point you into the right direction.

data Jan20xx;

  infile datalines truncover;

  input ID var1 $ var2 $;

  datalines;

1 xx xx

2 xx xx

3 xx xx

4

5

;

run;

proc sql noprint;

  select

    name, count(name)

  into :varlist separated by ',', :n_var

    from dictionary.columns

      where libname='WORK' and memname='JAN20XX' and upcase(name) ne 'ID'

  ;

quit;

data want;

  set Jan20xx end=last;

  if cmiss(&varlist) = &n_var then

    miss_count+1;

  else miss_count=0;

  if last then

    put "number of trailing missing rows: " miss_count;

run;

Contributor PQK
Contributor
Posts: 21

Re: Dynamic count of null observations that vary in multiple datasets

Slchen / Patrick,

Many thanks for the responses!Smiley Happy

I tested your suggestions and they both work brilliantly.

Although the datasets I’m dealing with are mixed numeric/char, I was able to adapt them just the same.

The one difference I noticed is that the method using macro variables with PROC SQL and dictionary tables also works even if the dataset has sporadic missing data before the point that all data values become blank.

Thanks again for the help!

PQK

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 300 views
  • 3 likes
  • 3 in conversation