mauri0623
Quartz | Level 8

Hello,

 

We receive data daily. Each day's data contain missing values for the variables alc, odn, and accounting, which we have to send back to the customer to validate and populate. We want to examine each day's data and see how the number of missing values in those variables changes from day to day. The goal is to understand the differences between days and find ways to improve the process so that fewer values are missing. I am looking for statistical procedures that show the daily trend.

 

Thank you!

 

Mauri

1 ACCEPTED SOLUTION

Rick_SAS
SAS Super FREQ

Well, that's not really what I wanted, but here is some sample data that I invented. I suggest you use PROC MEANS with the NMISS statistic and perform a BY-group analysis of the missing values for each day. You can then generate a report or (even better) create a graph of the daily activity.

 

data Have;
input daily_batch alc odn accounting;
datalines;
1 . . 1 
1 . . 7 
1 . 2 1 
1 . 3 . 
1 . . 8 
2 4 . 1 
2 0 . 7 
2 . 2 1 
2 . 3 . 
2 . 4 8 
3 0 . . 
3 . 0 7 
3 2 2 1 
3 . 3 2 
3 . . 2
;
 
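/* count the missing values of each variable within each batch */
proc means data=Have NMISS noprint;
by daily_batch;   /* data must be sorted by daily_batch */
var alc odn accounting;
output out=Want NMISS=;   /* output variables keep the original names */
run;

/* report: one row per batch, one column of missing counts per variable */
proc print data=Want;
var daily_batch alc odn accounting;
run;

/* graph: one line per variable showing missing counts across batches */
proc sgplot data=Want;
series x=daily_batch y=alc / markers curvelabel;
series x=daily_batch y=odn / markers curvelabel;
series x=daily_batch y=accounting / markers curvelabel;
run;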
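
If the daily batches differ in size, raw counts can be hard to compare from one day to the next. Here is a minimal sketch of one possible extension, assuming the Have data set above. It divides each NMISS count by _FREQ_ (the number of observations in the BY group) to get a percentage. The names Counts, Pct, and the nm_ and pct_ variables are made up for illustration.

```sas
/* Count missing values per batch; the nm_* names are illustrative */
proc means data=Have noprint;
   by daily_batch;                /* Have must be sorted by daily_batch */
   var alc odn accounting;
   output out=Counts nmiss=nm_alc nm_odn nm_accounting;
run;

/* Convert the counts to percentages of the batch size */
data Pct;
   set Counts;
   pct_alc        = 100 * nm_alc        / _freq_;
   pct_odn        = 100 * nm_odn        / _freq_;
   pct_accounting = 100 * nm_accounting / _freq_;
   keep daily_batch pct_alc pct_odn pct_accounting;
run;

/* Plot the percentage of missing values per batch */
proc sgplot data=Pct;
   series x=daily_batch y=pct_alc        / markers curvelabel="alc";
   series x=daily_batch y=pct_odn        / markers curvelabel="odn";
   series x=daily_batch y=pct_accounting / markers curvelabel="accounting";
   yaxis label="Percent missing" min=0 max=100;
run;
```

Percentages put every day on the same 0 to 100 scale, which may make a trend easier to judge when some days have far more records than others.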

 

 


7 REPLIES 7
Rick_SAS
SAS Super FREQ

Please provide samples of the data and tell us what results you would expect from the example data. Probably the best would be a single data set that has a DAY variable with values 1, 2, 3, ....

mauri0623
Quartz | Level 8

Thank you for the quick reply. The daily SAS data set has 50,000 observations and 76 variables. One of the variables is called batch_id (for example, 20190518-0000000001). The odn, ipac, and accounting fields contain a lot of missing values. We want to look at the trend of missing values on a daily basis across these daily files. Please see 10 observations of the data set.

 

mauri0623
Quartz | Level 8

It would be nice to see a graphical view as well.

mauri0623
Quartz | Level 8
Thank you. I will accept this as a solution. Have a great weekend.
mauri0623
Quartz | Level 8

Sorry, my mistake. The alc, odn, and accounting variables are character, not numeric.

Rick_SAS
SAS Super FREQ

Okay, so convert each character variable to a numeric indicator variable and then use my original solution.

 


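/* view that replaces each character variable with a numeric indicator: */
/* missing (.) when the character value is blank, 1 otherwise */
data Two / view=Two;
set Have(rename=(alc=char_alc odn=char_odn accounting=char_accounting));
alc = ifn(char_alc=" ", ., 1);
odn = ifn(char_odn=" ", ., 1);
accounting = ifn(char_accounting=" ", ., 1);
run;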

proc means data=Two NMISS noprint;
by daily_batch;
var alc odn accounting;
output out=Want NMISS=;
run;

proc print data=Want;
var daily_batch alc odn accounting;
run;

proc sgplot data=Want;
series x=daily_batch y=alc / markers curvelabel;
series x=daily_batch y=odn / markers curvelabel;
series x=daily_batch y=accounting / markers curvelabel;
run;
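
For the real data described earlier in the thread, the day would have to come from batch_id. Here is a rough sketch, assuming the first eight characters of batch_id are the date in yyyymmdd form (as in 20190518-0000000001) and that the daily files have been stacked into one data set. Raw, Flags, DailyMissing, batch_date, and the miss_ variables are hypothetical names. The MISSING function returns 1 or 0 for a character value, so the SUM statistic gives the number of missing values per day.

```sas
/* Raw is a hypothetical name for the stacked daily files */
data Flags;
   set Raw;
   /* assumption: batch_id starts with the date as yyyymmdd */
   batch_date = input(substr(batch_id, 1, 8), yymmdd8.);
   format batch_date date9.;
   /* 1 if the character value is blank, 0 otherwise */
   miss_alc        = missing(alc);
   miss_odn        = missing(odn);
   miss_accounting = missing(accounting);
   keep batch_date miss_alc miss_odn miss_accounting;
run;

/* CLASS with NWAY gives one row per day and needs no prior sort */
proc means data=Flags noprint nway;
   class batch_date;
   var miss_alc miss_odn miss_accounting;
   output out=DailyMissing sum=;
run;

/* Plot the number of missing values per day */
proc sgplot data=DailyMissing;
   series x=batch_date y=miss_alc        / markers curvelabel="alc";
   series x=batch_date y=miss_odn        / markers curvelabel="odn";
   series x=batch_date y=miss_accounting / markers curvelabel="accounting";
   yaxis label="Number of missing values";
run;
```

Using 0/1 flags instead of missing/1 indicators is a design choice: the same flags also give the proportion missing if MEAN is requested in place of SUM.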

