Cruise
Ammonite | Level 13

Hi @Reeza, @Rick_SAS, @PaigeMiller, @PGStats and SAS experts:

 

I'm trying to visualize how incomplete data affect a model's parameter estimates compared with complete data.
The gold-standard data have a full 'date of diagnosis' with complete month/day/yyyy fields. However, some records are missing the 'month' and 'day' fields, although all have the 'year' field. Missing dates like these are often replaced by July 1 of the given year. My goal is to simulate data as if 0 to 100% of the records had their date replaced by July 1 of the given year, and to compare the resulting model estimates with those from the gold-standard data.
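
For readers following along, the July 1 replacement can be sketched as below. This is only an illustration and assumes the HAVE1 data set from the data step later in this post has already been created; the names have_check, dx_mid_yr_chk, dur_full_chk and dur_imputed_chk are hypothetical and are not part of the original program. In the posted data the recomputed values appear to match DX_MID_YR, duration_no_missing and duration_missing (for ID 1: 16667 - 16517 = 150 and 16667 - 16618 = 49).

```sas
/* Minimal sketch: rebuild the mid-year diagnosis date and both durations. */
/* Assumes HAVE1 exists; all *_chk variable names are hypothetical.        */
data have_check;
    set have1;
    /* July 1 of the diagnosis year stands in for a missing month/day */
    dx_mid_yr_chk   = mdy(7, 1, date_of_diagnosis_yyyy);
    /* follow-up in days using the full diagnosis date */
    dur_full_chk    = exit - dx;
    /* follow-up in days using the July 1 replacement */
    dur_imputed_chk = exit - dx_mid_yr_chk;
    format dx exit dx_mid_yr_chk date9.;
run;

/* List any rows where the recomputed values differ from the posted ones */
proc print data=have_check;
    where dx_mid_yr_chk   ne DX_MID_YR
       or dur_full_chk    ne duration_no_missing
       or dur_imputed_chk ne duration_missing;
    var ID DX_MID_YR dx_mid_yr_chk duration_no_missing dur_full_chk
        duration_missing dur_imputed_chk;
run;
```

If PROC PRINT lists no observations, the posted columns agree with this reading of the July 1 rule.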

 

Reeza solved this simulation problem previously.

https://communities.sas.com/t5/SAS-Programming/Simulate-parameter-estimates-of-the-model-for-missing...

 

I'm now trying to visualize the result, showing the disparity between the simulated (Missing) and Full data. However, the resulting plot doesn't make sense: the line for the simulated data (Missing) is static.

 

I need a fresh perspective on this. Please let me know if you can point to where I made a gross mistake.

 

Thanks a lot!

 

[Attached image: PLOT_DIDN'T WORK.png]

 

DATA HAVE1;
INPUT ID status date_of_death_mm    date_of_death_dd    date_of_death_yyyy  date_of_diagnosis_mm    date_of_diagnosis_dd    date_of_diagnosis_yyyy  agegrp  dx  exit    stage   DEATH   date_of_diagnosis_mm1   date_of_diagnosis_dd1   DX_MID_YR   duration_no_missing duration_missing;
cards;
1   0   8   19  2005    3   22  2005    3   16517   16667   3   1   7   1   16618   150 49
2   0   6   19  2010    7   20  2005    2   16637   18432   4   1   7   1   16618   1795    1814
3   0   1   7   2006    8   29  2005    4   16677   16808   4   1   7   1   16618   131 190
4   0   5   24  2006    8   30  2005    3   16678   16945   4   1   7   1   16618   267 327
5   0   6   11  2007    11  7   2005    3   16747   17328   4   1   7   1   16618   581 710
6   0   12  27  2005    11  9   2005    4   16749   16797   4   1   7   1   16618   48  179
7   0   3   12  2006    11  25  2005    2   16765   16872   4   1   7   1   16618   107 254
8   0   11  3   2005    7   31  2005    3   16648   16743   4   1   7   1   16618   95  125
9   0   7   20  2005    3   9   2005    4   16504   16637   99  1   7   1   16618   133 19
10  0   12  21  2006    1   9   2006    3   16810   17156   4   1   7   1   16983   346 173
11  0   5   14  2006    2   25  2005    3   16492   16935   99  1   7   1   16618   443 317
12  0   9   8   2005    4   5   2005    4   16531   16687   99  1   7   1   16618   156 69
13  0   12  8   2005    9   12  2005    4   16691   16778   99  1   7   1   16618   87  160
14  0   4   14  2007    1   18  2006    2   16819   17270   4   1   7   1   16983   451 287
15  0   10  30  2009    2   8   2006    2   16840   18200   2   1   7   1   16983   1360    1217
16  0   3   2   2007    6   27  2006    1   16979   17227   3   1   7   1   16983   248 244
17  0   1   13  2007    10  23  2006    4   17097   17179   4   1   7   1   16983   82  196
18  0   12  18  2006    9   28  2006    3   17072   17153   4   1   7   1   16983   81  170
19  0   7   15  2007    12  15  2006    3   17150   17362   4   1   7   1   16983   212 379
20  0   7   11  2006    5   11  2006    4   16932   16993   4   1   7   1   16983   61  10
21  0   2   12  2008    7   24  2006    4   17006   17574   3   0   7   1   16983   568 591
22  0   1   18  2007    11  3   2006    4   17108   17184   4   0   7   1   16983   76  201
23  0   7   20  2010    6   5   2006    2   16957   18463   2   1   7   1   16983   1506    1480
24  0   11  12  2006    9   21  2006    1   17065   17117   4   1   7   1   16983   52  134
25  0   7   2   2007    1   22  2007    2   17188   17349   3   1   7   1   17348   161 1
26  0   3   16  2008    2   2   2007    3   17199   17607   4   1   7   1   17348   408 259
27  0   5   13  2009    3   12  2007    1   17237   18030   4   1   7   1   17348   793 682
28  0   8   28  2007    3   16  2007    4   17241   17406   3   1   7   1   17348   165 58
29  0   2   16  2007    9   26  2006    4   17070   17213   4   1   7   1   16983   143 230
30  0   10  13  2008    5   14  2007    4   17300   17818   3   1   7   1   17348   518 470
31  0   12  22  2007    8   13  2007    3   17391   17522   4   1   7   1   17348   131 174
32  0   4   13  2008    9   7   2007    3   17416   17635   4   1   7   1   17348   219 287
33  0   12  30  2007    7   9   2007    3   17356   17530   3   0   7   1   17348   174 182
34  0   12  1   2006    7   14  2006    4   16996   17136   99  1   7   1   16983   140 153
35  0   2   7   2010    9   15  2007    3   17424   18300   3   1   7   1   17348   876 952
36  0   4   29  2008    6   29  2007    3   17346   17651   3   1   7   1   17348   305 303
37  0   12  4   2007    12  1   2007    3   17501   17504   4   1   7   1   17348   3   156
38  0   9   18  2007    7   16  2007    3   17363   17427   4   1   7   1   17348   64  79
39  0   12  2   2007    10  18  2007    3   17457   17502   3   1   7   1   17348   45  154
40  0   6   23  2008    12  3   2007    4   17503   17706   3   1   7   1   17348   203 358
41  0   8   23  2008    2   4   2008    2   17566   17767   4   1   7   1   17714   201 53
42  0   6   6   2009    4   14  2008    3   17636   18054   4   1   7   1   17714   418 340
43  0   8   21  2011    5   16  2008    3   17668   18860   3   1   7   1   17714   1192    1146
44  0   1   14  2009    9   24  2008    4   17799   17911   4   1   7   1   17714   112 197
45  0   7   9   2009    4   22  2009    4   18009   18087   3   1   7   1   18079   78  8
46  0   9   28  2010    3   17  2009    2   17973   18533   2   0   7   1   18079   560 454
47  0   10  6   2009    4   30  2009    3   18017   18176   4   0   7   1   18079   159 97
48  0   7   10  2009    5   29  2009    4   18046   18088   3   0   7   1   18079   42  9
49  0   8   9   2010    6   4   2009    3   18052   18483   3   1   7   1   18079   431 404
50  0   11  25  2009    7   10  2009    3   18088   18226   4   1   7   1   18079   138 147
51  0   9   16  2009    5   22  2009    4   18039   18156   3   0   7   1   18079   117 77
52  0   4   28  2010    8   21  2009    3   18130   18380   4   1   7   1   18079   250 301
53  0   1   30  2010    9   15  2009    3   18155   18292   2   1   7   1   18079   137 213
54  0   1   21  2010    9   2   2009    3   18142   18283   3   1   7   1   18079   141 204
55  0   10  18  2010    10  9   2009    3   18179   18553   3   0   7   1   18079   374 474
56  0   2   7   2010    10  24  2009    3   18194   18300   4   1   7   1   18079   106 221
57  0   1   26  2010    10  6   2009    3   18176   18288   4   1   7   1   18079   112 209
58  0   2   24  2010    11  19  2009    2   18220   18317   4   1   7   1   18079   97  238
59  0   12  28  2012    11  17  2009    3   18218   19355   4   1   7   1   18079   1137    1276
60  0   6   14  2010    12  2   2009    3   18233   18427   4   1   7   1   18079   194 348
61  0   3   16  2012    11  16  2009    4   18217   19068   3   1   7   1   18079   851 989
62  0   3   20  2010    7   30  2009    3   18108   18341   3   1   7   1   18079   233 262
63  0   5   10  2011    2   23  2010    2   18316   18757   4   1   7   1   18444   441 313
64  0   6   8   2012    3   12  2010    2   18333   19152   4   1   7   1   18444   819 708
65  0   7   10  2010    5   6   2010    4   18388   18453   3   1   7   1   18444   65  9
66  0   7   11  2010    5   25  2010    3   18407   18454   3   1   7   1   18444   47  10
67  0   3   7   2011    7   30  2010    3   18473   18693   4   1   7   1   18444   220 249
68  0   2   7   2012    7   18  2010    2   18461   19030   4   1   7   1   18444   569 586
69  0   12  18  2010    9   3   2010    3   18508   18614   2   0   7   1   18444   106 170
70  0   11  24  2013    11  9   2010    4   18575   19686   3   1   7   1   18444   1111    1242
71  0   4   30  2012    11  12  2010    3   18578   19113   3   1   7   1   18444   535 669
72  0   2   21  2012    11  12  2010    3   18578   19044   3   1   7   1   18444   466 600
73  0   6   12  2011    12  14  2010    3   18610   18790   4   1   7   1   18444   180 346
74  0   7   17  2011    12  20  2010    4   18616   18825   3   1   7   1   18444   209 381
75  0   7   4   2013    11  3   2010    3   18569   19543   3   1   7   1   18444   974 1099
76  0   12  22  2012    10  22  2010    3   18557   19349   3   1   7   1   18444   792 905
77  0   11  28  2013    1   25  2011    3   18652   19690   4   1   7   1   18809   1038    881
78  0   12  27  2011    6   3   2011    3   18781   18988   3   0   7   1   18809   207 179
79  0   12  5   2011    5   2   2011    4   18749   18966   3   1   7   1   18809   217 157
80  0   7   21  2011    5   23  2011    3   18770   18829   4   1   7   1   18809   59  20
81  0   2   27  2014    7   13  2011    3   18821   19781   3   0   7   1   18809   960 972
82  0   12  22  2011    10  12  2011    4   18912   18983   4   1   7   1   18809   71  174
83  0   2   1   2012    9   10  2011    4   18880   19024   4   1   7   1   18809   144 215
84  0   7   14  2012    9   6   2011    4   18876   19188   3   1   7   1   18809   312 379
85  0   12  2   2012    12  9   2011    4   18970   19329   3   1   7   1   18809   359 520
86  0   10  29  2012    12  5   2011    2   18966   19295   4   1   7   1   18809   329 486
87  0   7   10  2015    11  17  2011    1   18948   20279   4   1   7   1   18809   1331    1470
88  0   3   19  2012    12  28  2011    2   18989   19071   4   1   7   1   18809   82  262
89  0   11  16  2012    1   26  2012    3   19018   19313   2   0   7   1   19175   295 138
90  0   2   8   2014    1   5   2012    4   18997   19762   3   1   7   1   19175   765 587
91  0   8   27  2013    2   21  2012    3   19044   19597   4   1   7   1   19175   553 422
92  0   5   2   2013    2   20  2012    3   19043   19480   3   1   7   1   19175   437 305
93  0   2   12  2013    4   13  2012    4   19096   19401   3   1   7   1   19175   305 226
94  0   7   16  2013    4   16  2012    3   19099   19555   2   1   7   1   19175   456 380
95  0   3   4   2013    5   10  2012    3   19123   19421   4   1   7   1   19175   298 246
96  0   7   23  2012    5   14  2012    3   19127   19197   3   1   7   1   19175   70  22
97  0   7   6   2012    5   31  2012    3   19144   19180   4   1   7   1   19175   36  5
98  0   12  22  2012    6   9   2012    2   19153   19349   3   1   7   1   19175   196 174
99  0   6   19  2013    6   28  2012    3   19172   19528   3   1   7   1   19175   356 353
100 0   6   5   2013    8   14  2012    4   19219   19514   3   1   7   1   19175   295 339
101 0   11  5   2012    10  24  2012    4   19290   19302   3   1   7   1   19175   12  127
102 0   8   1   2013    11  9   2012    2   19306   19571   4   1   7   1   19175   265 396
103 0   8   14  2013    11  30  2012    4   19327   19584   4   1   7   1   19175   257 409
104 0   8   3   2012    4   17  2012    3   19100   19208   4   1   7   1   19175   108 33
105 0   3   19  2013    11  5   2012    4   19302   19436   4   1   7   1   19175   134 261
106 0   7   31  2014    11  26  2012    4   19323   19935   4   1   7   1   19175   612 760
107 0   8   18  2014    12  21  2012    3   19348   19953   4   1   7   1   19175   605 778
108 0   11  22  2013    1   10  2013    3   19368   19684   4   1   7   1   19540   316 144
109 0   11  3   2014    1   17  2013    4   19375   20030   3   1   7   1   19540   655 490
110 0   9   14  2013    1   23  2013    4   19381   19615   3   1   7   1   19540   234 75
111 0   12  6   2013    1   30  2013    2   19388   19698   3   1   7   1   19540   310 158
112 0   2   17  2014    3   8   2013    1   19425   19771   4   1   7   1   19540   346 231
113 0   3   4   2014    7   18  2013    2   19557   19786   4   1   7   1   19540   229 246
114 0   8   2   2015    6   11  2013    3   19520   20302   3   1   7   1   19540   782 762
115 0   3   22  2014    10  27  2009    3   18197   19804   2   1   7   1   18079   1607    1725
116 0   11  15  2013    9   26  2013    4   19627   19677   99  1   7   1   19540   50  137
117 0   5   29  2014    11  13  2013    2   19675   19872   4   1   7   1   19540   197 332
118 0   11  6   2014    9   5   2013    3   19606   20033   3   1   7   1   19540   427 493
119 0   7   11  2012    5   10  2012    4   19123   19185   4   1   7   1   19175   62  10
120 0   9   28  2016    12  4   2012    4   19331   20725   3   1   7   1   19175   1394    1550
121 0   8   31  2014    12  21  2013    3   19713   19966   4   1   7   1   19540   253 426
122 0   5   5   2016    1   17  2014    2   19740   20579   4   1   7   1   19905   839 674
123 0   8   1   2015    3   3   2014    2   19785   20301   3   1   7   1   19905   516 396
124 0   11  29  2014    5   16  2014    4   19859   20056   3   1   7   1   19905   197 151
125 0   6   11  2015    7   2   2014    2   19906   20250   3   1   7   1   19905   344 345
126 0   7   18  2015    9   24  2014    3   19990   20287   4   1   7   1   19905   297 382
127 0   7   24  2016    10  30  2014    3   20026   20659   4   1   7   1   19905   633 754
128 0   7   5   2015    12  9   2014    3   20066   20274   4   1   7   1   19905   208 369
129 0   11  19  2015    2   3   2015    1   20122   20411   3   1   7   1   20270   289 141
130 0   10  31  2015    1   8   2015    2   20096   20392   4   1   7   1   20270   296 122
131 0   3   15  2016    3   26  2015    2   20173   20528   4   1   7   1   20270   355 258
132 0   7   4   2015    3   10  2015    3   20157   20273   3   1   7   1   20270   116 3
133 0   8   9   2015    3   17  2015    2   20164   20309   4   1   7   1   20270   145 39
134 0   12  8   2016    5   22  2015    3   20230   20796   2   1   7   1   20270   566 526
135 0   8   15  2016    5   27  2015    4   20235   20681   4   1   7   1   20270   446 411
136 0   7   11  2015    5   8   2015    3   20216   20280   4   1   7   1   20270   64  10
137 0   10  19  2016    7   1   2015    4   20270   20746   3   0   7   1   20270   476 476
138 0   10  28  2015    9   1   2015    3   20332   20389   4   1   7   1   20270   57  119
139 0   2   14  2016    8   6   2015    3   20306   20498   4   1   7   1   20270   192 228
140 0   12  21  2016    11  17  2015    3   20409   20809   4   1   7   1   20270   400 539
141 0   9   12  2016    10  16  2015    3   20377   20709   3   1   7   1   20270   332 439
142 0   4   26  2016    3   2   2015    4   20149   20570   99  0   7   1   20270   421 300
143 0   7   28  2016    8   25  2015    2   20325   20663   3   1   7   1   20270   338 393
;
RUN;

*sort for demo;
proc sort data=have1; 
by stage;
run;
%macro surveyLoop(sample=);
*calculate sample rate;
%let sampRate = %sysevalf(&sample./100);

*create sample;
ods select none;
proc surveyselect data=have1 
                   samprate=&sampRate. 
                   reps=100  
                   out=_sample 
                   seed=995
                   outall;
strata stage;*ensures stage is present for all values - not needed in prod;
run;

*set duration to values if selected;
data _sample;
set _sample;
*assign values as needed;
if selected = 1 then duration = duration_missing;
else duration=duration_no_missing;
run;

*sort for modeling procedures next;
proc sort data=_sample;
by replicate;
run;

/*SURVIVAL ANALYSIS ON THE ACTUAL DATA WITH NO MISSING*/
ods output HazardRatios=PE_MISS;
PROC PHREG DATA =_sample; 
by replicate;
CLASS STAGE(REF='99') AGEGRP(REF='1')/PARAM=REF; 
MODEL duration_no_missing*DEATH(0) = AGEGRP STAGE/RL EVENTCODE=1;      
hazardratio STAGE/diff=REF at (AGEGRP=all); 
RUN;

ods output HazardRatios=PE_ALL;
/*simulate duration_missing for the 0-100% of the data*/
PROC PHREG DATA =_sample; 
by replicate;
CLASS STAGE(REF='99') AGEGRP(REF='1')/PARAM=REF; 
MODEL duration*DEATH(0) = AGEGRP STAGE/RL EVENTCODE=1;      
hazardratio STAGE/diff=REF at (AGEGRP=all); 
RUN;

*combines results and labels estimates;
data PE_COMBO; /*n=*/ set pe_miss (in=t1) /*n=*/ pe_all (in=t2) /*n=*/;
length estType $15.;
PCT_MISS = &sample/100;
if t1 then estType = 'Missing';
else estType = 'Full Sample';
run;

*omit if no replicates;
*if replicates calculates average of HR + STDERR for bootstrap approach;
proc means data=pe_combo NWAY N MEAN STD STDERR;
class estType Description pct_miss;
var hazardratio;
ods output summary = pe_summary;
run;
ods select all;

*append to main data sets;
proc append base=results_combined data=pe_combo force;
run;
proc append base=results_summary data=pe_summary force;
run;

*clean up after;
/*proc datasets lib=work nodetails nolist;*/
/*delete pe_: _sample;*/
/*run;quit;*/

%let sampRate =;
%mend;
*call macro for different sizes;
%surveyLoop(sample=1);
%surveyLoop(sample=2);
%surveyLoop(sample=3);
%surveyLoop(sample=4);
%surveyLoop(sample=5);
%surveyLoop(sample=6);
%surveyLoop(sample=7);
%surveyLoop(sample=8);
%surveyLoop(sample=9);
%surveyLoop(sample=10);
%surveyLoop(sample=11);
%surveyLoop(sample=12);
%surveyLoop(sample=13);
%surveyLoop(sample=14);
%surveyLoop(sample=15);
%surveyLoop(sample=16);
%surveyLoop(sample=17);
%surveyLoop(sample=18);
%surveyLoop(sample=19);
%surveyLoop(sample=20);
%surveyLoop(sample=21);
%surveyLoop(sample=22);
%surveyLoop(sample=23);
%surveyLoop(sample=24);
%surveyLoop(sample=25);
%surveyLoop(sample=26);
%surveyLoop(sample=27);
%surveyLoop(sample=28);
%surveyLoop(sample=29);
%surveyLoop(sample=30);
%surveyLoop(sample=31);
%surveyLoop(sample=32);
%surveyLoop(sample=33);
%surveyLoop(sample=34);
%surveyLoop(sample=35);
%surveyLoop(sample=36);
%surveyLoop(sample=37);
%surveyLoop(sample=38);
%surveyLoop(sample=39);
%surveyLoop(sample=40);
%surveyLoop(sample=41);
%surveyLoop(sample=42);
%surveyLoop(sample=43);
%surveyLoop(sample=44);
%surveyLoop(sample=45);
%surveyLoop(sample=46);
%surveyLoop(sample=47);
%surveyLoop(sample=48);
%surveyLoop(sample=49);
%surveyLoop(sample=50);
%surveyLoop(sample=51);
%surveyLoop(sample=52);
%surveyLoop(sample=53);
%surveyLoop(sample=54);
%surveyLoop(sample=55);
%surveyLoop(sample=56);
%surveyLoop(sample=57);
%surveyLoop(sample=58);
%surveyLoop(sample=59);
%surveyLoop(sample=60);
%surveyLoop(sample=61);
%surveyLoop(sample=62);
%surveyLoop(sample=63);
%surveyLoop(sample=64);
%surveyLoop(sample=65);
%surveyLoop(sample=66);
%surveyLoop(sample=67);
%surveyLoop(sample=68);
%surveyLoop(sample=69);
%surveyLoop(sample=70);
%surveyLoop(sample=71);
%surveyLoop(sample=72);
%surveyLoop(sample=73);
%surveyLoop(sample=74);
%surveyLoop(sample=75);
%surveyLoop(sample=76);
%surveyLoop(sample=77);
%surveyLoop(sample=78);
%surveyLoop(sample=79);
%surveyLoop(sample=80);
%surveyLoop(sample=81);
%surveyLoop(sample=82);
%surveyLoop(sample=83);
%surveyLoop(sample=84);
%surveyLoop(sample=85);
%surveyLoop(sample=86);
%surveyLoop(sample=87);
%surveyLoop(sample=88);
%surveyLoop(sample=89);
%surveyLoop(sample=90);
%surveyLoop(sample=91);
%surveyLoop(sample=92);
%surveyLoop(sample=93);
%surveyLoop(sample=94);
%surveyLoop(sample=95);
%surveyLoop(sample=96);
%surveyLoop(sample=97);
%surveyLoop(sample=98);
%surveyLoop(sample=99);
%surveyLoop(sample=100);

PROC FREQ DATA=RESULTS_SUMMARY;
TABLES Description;
RUN;

DATA RESULTS_SUMMARY1; SET RESULTS_SUMMARY;
length group $ 16;
if Description='stage 2 vs 99 At agegrp=1' then group='Stage 2 /0-44yr'; else 
if Description='stage 2 vs 99 At agegrp=2' then group='Stage 2 /45-59yr'; else 
if Description='stage 2 vs 99 At agegrp=3' then group='Stage 2 /60-74yr'; else 
if Description='stage 2 vs 99 At agegrp=4' then group='Stage 2 /75+yr'; else 
if Description='stage 3 vs 99 At agegrp=1' then group='Stage 3 /0-44yr'; else 
if Description='stage 3 vs 99 At agegrp=2' then group='Stage 3 /45-59yr'; else 
if Description='stage 3 vs 99 At agegrp=3' then group='Stage 3 /60-74yr'; else 
if Description='stage 3 vs 99 At agegrp=4' then group='Stage 3 /75+yr'; else 
if Description='stage 4 vs 99 At agegrp=1' then group='Stage 4 /0-44yr'; else 
if Description='stage 4 vs 99 At agegrp=2' then group='Stage 4 /45-59yr'; else 
if Description='stage 4 vs 99 At agegrp=3' then group='Stage 4 /60-74yr'; else 
if Description='stage 4 vs 99 At agegrp=4' then group='Stage 4 /75+yr'; 
PCT_MISS100=PCT_MISS*100;
RUN;

proc sort data=RESULTS_SUMMARY1; 
by group estType;
run; 
ods graphics / width=12in height=6in;
proc sgpanel data=RESULTS_SUMMARY1 noautolegend;
label PCT_MISS100='Percent of Incomplete Data' HazardRatio_Mean='Hazard Ratio';
panelby group/ novarname columns=4 rows=3;
styleattrs DATACONTRASTCOLORS=(black red);
series  x=PCT_MISS100 y=HazardRatio_Mean / group=estType lineattrs=(pattern=solid);
keylegend/  title="Date Type" position=bottom;
colaxis label='percent of missing in data' fitpolicy=thin valuesformat=best4.0;
rowaxis label='Hazard Ratio / Interaction between Stage and Age';
title "Effect of missing in X on the estimates of Y model"; 
run;

 

PaigeMiller
Diamond | Level 26

Is it possible you have the red and black lines mislabelled? Should the line labelled Full Sample be the red one, and the line labelled Missing the black one?

 

One of the lines should be constant, if it is using the entire data set. The other line should change depending on the percent of missings.

--
Paige Miller
Cruise
Ammonite | Level 13

@PaigeMiller 

I think you're right. I have to correct to: if t2 then estType = 'Missing'; else estType = 'Full Sample'; or switch PE_MISS and PE_ALL. 

 

OK, I got it.

 

1. Can you please explain why the line for the Full data has to be constant? Datasets with full dates and imputed dates produce different hazard ratio estimates, so shouldn't both take varying values on the Y axis? I understand the Y axis maps one-to-one to the hazard ratio estimates. I'd appreciate your corrections, since it seems I'm missing an important concept here.

 

2. Also, is the way I'm feeding the macro 1 through 100 correct?

 

%surveyLoop(sample=1); thru %surveyLoop(sample=100);

 

ods output HazardRatios=PE_MISS;
PROC PHREG DATA =_sample; 
by replicate;
CLASS STAGE(REF='99') AGEGRP(REF='1')/PARAM=REF; 
MODEL duration_no_missing*DEATH(0) = AGEGRP STAGE/RL EVENTCODE=1;      
hazardratio STAGE/diff=REF at (AGEGRP=all); 
RUN;

ods output HazardRatios=PE_ALL;
/*simulate duration_missing for the 0-100% of the data*/
PROC PHREG DATA =_sample; 
by replicate;
CLASS STAGE(REF='99') AGEGRP(REF='1')/PARAM=REF; 
MODEL duration*DEATH(0) = AGEGRP STAGE/RL EVENTCODE=1;      
hazardratio STAGE/diff=REF at (AGEGRP=all); 
RUN;
*combines results and labels estimates;
data PE_COMBO; /*n=*/ set pe_miss (in=t1) /*n=*/ pe_all (in=t2) /*n=*/;
length estType $15.;
PCT_MISS = &sample/100;
if t2 then estType = 'Missing';
else estType = 'Full Sample';
run;
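
On question 2, the hundred separate calls could probably be replaced by a macro %DO loop. The sketch below is only a suggestion: runLoop and its from=/to= parameters are hypothetical names, and it assumes the %surveyLoop macro from the original post has already been compiled. Because %surveyLoop appends to RESULTS_COMBINED and RESULTS_SUMMARY with PROC APPEND, the sketch also deletes those two data sets first so that a rerun does not stack duplicate rows.

```sas
/* Hypothetical wrapper: runLoop, from= and to= are illustrative names. */
/* Assumes %surveyLoop from the original post is already defined.       */
%macro runLoop(from=1, to=100);
    %local i;

    /* Start from empty result tables so reruns do not append duplicates. */
    /* NOWARN avoids an error if the data sets do not exist yet.          */
    proc datasets lib=work nolist nowarn;
        delete results_combined results_summary;
    quit;

    /* One call per sampling percentage */
    %do i = &from. %to &to.;
        %surveyLoop(sample=&i.);
    %end;
%mend runLoop;

%runLoop(from=1, to=100);
```

The %DO loop has to live inside a macro definition, which is why the calls are wrapped rather than looped in open code.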

 

PaigeMiller
Diamond | Level 26

@Cruise wrote:

@PaigeMiller 

I think you're right. I have to correct to: if t2 then estType = 'Missing'; else estType = 'Full Sample'; or switch PE_MISS and PE_ALL. 

 

OK, I got it.

 

1. Can you please explain why the line for the Full data has to be constant? Datasets with full dates and imputed dates produce different hazard ratio estimates, so shouldn't both take varying values on the Y axis? I understand the Y axis maps one-to-one to the hazard ratio estimates. I'd appreciate your corrections, since it seems I'm missing an important concept here.


It's your application, and you should understand the data and manipulations better than I do. Maybe I am mis-understanding the whole thing.

 

So here's my thought process. When you use the full data set, what changes when the % missing changes? As far as I understand this (which is much less than your understanding of what is happening), nothing changes in the full data set when the % missing changes.

 

So you tell me, why should it not be constant?

--
Paige Miller
Cruise
Ammonite | Level 13
Aha moment here. I had understood what Reeza did for me in reverse. Thanks, PaigeMiller. No, nothing is supposed to change, because the full data produce what the full data are supposed to produce. The imputed data produce varying results as a function of the extent of missingness, 0-100%.
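
A quick way to confirm this on the output might look like the sketch below. It assumes RESULTS_SUMMARY has been built by the macro runs and contains estType, Description and HazardRatio_Mean, the names used in the plotting code of the original post. For each estimate type and comparison it reports the spread of the mean hazard ratio across the percent-missing values; the series fitted on duration_no_missing should show a range of zero or very close to it, whichever label it currently carries.

```sas
/* Sketch: spread of the mean hazard ratio across all pct_miss values.   */
/* Assumes RESULTS_SUMMARY exists with the variable names from the post. */
proc means data=results_summary nway min max range;
    class estType Description;
    var HazardRatio_Mean;
run;
```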
Cruise
Ammonite | Level 13

With the correct understanding, and after switching PE_MISS and PE_ALL, below is the plot with updated labeling of the Y axis, which by the way reflects my understanding of what's happening in the simulation process.

 

I still have to figure out what these LOESS-like wavy red lines around the black lines are.

 

[Attached image: SGPanel2.png]

Reeza
Super User
I had assumed you would be taking a ratio or difference and plotting that, not intending to use the number as is from that macro.
Cruise
Ammonite | Level 13
I will do it that way and post back here. For that, I think, I have to merge PE_ALL and PE_MISS to compute the relative difference (PE_MISS-PE_ALL)/PE_ALL or the absolute difference diff(PE_MISS,PE_ALL) and plot this against PCT_MISS100. Just writing this down as a reminder here.
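
A rough sketch of that merge is below. It is not tested against the real results and makes several assumptions: RESULTS_COMBINED from the macro is available; it holds Replicate, Description, HazardRatio, PCT_MISS and estType; and the estType labels have already been corrected so that 'Missing' marks the imputed-duration model. The names hr_diff, hr_diff_summary, hr_imputed, hr_full, rel_diff and abs_diff are hypothetical.

```sas
/* Sketch: pair each imputed-data hazard ratio with its full-data    */
/* counterpart and compute relative and absolute differences.        */
/* Assumes RESULTS_COMBINED exists and estType labels are corrected. */
proc sql;
    create table hr_diff as
    select m.pct_miss,
           m.replicate,
           m.description,
           m.hazardratio as hr_imputed,
           f.hazardratio as hr_full,
           (m.hazardratio - f.hazardratio) / f.hazardratio as rel_diff,
           abs(m.hazardratio - f.hazardratio)              as abs_diff
    from results_combined(where=(estType = 'Missing'))     as m
         inner join
         results_combined(where=(estType = 'Full Sample')) as f
      on  m.pct_miss    = f.pct_miss
      and m.replicate   = f.replicate
      and m.description = f.description;
quit;

/* Average the differences over replicates for plotting against pct_miss */
proc means data=hr_diff nway noprint;
    class description pct_miss;
    var rel_diff abs_diff;
    output out=hr_diff_summary mean=rel_diff_mean abs_diff_mean;
run;
```

The summary data set could then feed the same PROC SGPANEL layout, with rel_diff_mean on the Y axis instead of the raw hazard ratio.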