BookmarkSubscribeRSS Feed
statadm
Fluorite | Level 6
I am cleaning a fairly large dataset and need to be able to count the number of missing data fields for each record.

I've been able to do this for most variables, but I have a couple of groups of related variables that I need to be able to define that nothing should be missing if the first variable is not missing. I can't seem to figure out a way to do this in one array. Is this possible?

array d{40} d1start--d1stop d2start--d2stop d3start--d3stop d4start--d4stop;

dflag=0;
do i=1 to 40;
if d1start ne . and d{i}=. then dflag=dflag+1;
end;
(I know this code isn't doing what I want, but I wanted to provide some code to explain what I'm trying to do)

d1start--d1stop contains 10 related variables and I want to count missing data in these variables only if d1start is not missing or d2start is not missing, etc. I actually just need one total missing count variable for all 40 variables, but I was hoping to do this with one array rather than doing a separate array for each of the 10 groups.

I actually have a lot more data than this, but this is just an example of how it is set up.

Thanks and hope this is enough information. Message was edited by: statadm
5 REPLIES 5
chang_y_chung_hotmail_com
Obsidian | Level 7
Why bother with arrays at all?
[pre]
/* test data */
data one;
retain d1start d12-d19 d1stop
d2start d22-d29 d2stop
d3start d32-d39 d3stop
d4start d42-d49 d4stop 1;
call missing(d1start, d12, d24, d25, d32, d43);
run;

/* count missings by "group" */
data two;
set one;
m1 = nmiss(of d1start--d1stop);
m2 = nmiss(of d2start--d2stop);
m3 = nmiss(of d3start--d3stop);
m4 = nmiss(of d4start--d4stop);
mall = sum(of m1-m4);
run;

/* check */
proc print data=two noobs;
var m:;
run;
/* on lst
m1 m2 m3 m4 mall
2 2 1 1 6
*/
[/pre]
statadm
Fluorite | Level 6
Thanks, that looks like something that will work. I have over 100 groups of 10, so I was trying to avoid writing separate statements for each group, but if this is the best way I can certainly do it.
chang_y_chung_hotmail_com
Obsidian | Level 7
If you have 100 groups of 10, then you can do below, too. Just assign 100 to the macro variable, g, instead of 4. This assumes that in your input data the d varaibles are correctly ordered, without gaps, like: d1start ... d1stop d2start ... d2stop ... d100start ... d100stop.

BTW, 1K vars per obs seems quite a strange way to organize information to me, but you may have your reasons... 🙂
[pre]
/* count missings by "group" */
%let g = 4;
data two;
set one;
array d{1:&g,1:10} d1start--d&g.stop;
array m{1:&g} m1-m&g;
drop i;
do i = 1 to &g;
m{i} = nmiss(of d{i,1},d{i,2},d{i,3},
d{i,4}, d{i,5}, d{i,6}, d{i,7},
d{i,8}, d{i,9}, d{i,10});
end;
mall = sum(of m1-m&g);
run;
[/pre]
statadm
Fluorite | Level 6
Great, this looks even better.

The dataset actually has over 3000 variables. I didn't create the dataset it is output from a web based data collection tool. I already reduced it from 5000 to 3000.

Thanks for your help!
statadm
Fluorite | Level 6
Ok, the variables are correctly ordered, but I do have gaps between the groups of 10 so the array won't work this way.

I'm probably just better off using the previous code.

Ideally, this would be better in long format, but I have to send this data to be cleaned in wide format to be read across for each observation. The data is time based and literally reads across for 120 minutes.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 696 views
  • 0 likes
  • 2 in conversation