DATA Step, Macro, Functions and more

Replace missing by random with certain condition met

Accepted Solution Solved
Reply
Super Contributor
Posts: 331
Accepted Solution

Replace missing by random with certain condition met

I'm replacing missing values in data_of_diagnosis_mm by random integer 1 thru 12. However, integers replacing must be logical to death of month. Can you please help specify a condition in macro below where month of diagnosis occurs earlier than month of death? Without such condition specified I end up with bunch of people died before diagnosed. 

 

 


%macro RandBetween(min, max);
   (&min + floor((1+&max-&min)*rand("uniform")))
%mend;

data temp1; set temp; 
if date_of_diagnosis_mm in ('99') then date_of_diagnosis_mm = %RandBetween(1, 12);
run; 
where date_of_death_mm>date_of_diagnosis_mm

 


Accepted Solutions
Solution
3 weeks ago
PROC Star
Posts: 2,311

Re: Replace missing by random with certain condition met

[ Edited ]

Like this?


data _null_;  
  do DATE_OF_DEATH_MM = 1 to 12; 
    DATE_OF_DIAGNOSIS_MM = %RandBetween(1, DATE_OF_DEATH_MM );
    putlog DATE_OF_DEATH_MM= DATE_OF_DIAGNOSIS_MM=;
  end;
run; 

DATE_OF_DEATH_MM=1 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=2 DATE_OF_DIAGNOSIS_MM=2
DATE_OF_DEATH_MM=3 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=4 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=5 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=6 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=7 DATE_OF_DIAGNOSIS_MM=7
DATE_OF_DEATH_MM=8 DATE_OF_DIAGNOSIS_MM=6
DATE_OF_DEATH_MM=9 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=10 DATE_OF_DIAGNOSIS_MM=3
DATE_OF_DEATH_MM=11 DATE_OF_DIAGNOSIS_MM=8
DATE_OF_DEATH_MM=12 DATE_OF_DIAGNOSIS_MM=2

View solution in original post


All Replies
Solution
3 weeks ago
PROC Star
Posts: 2,311

Re: Replace missing by random with certain condition met

[ Edited ]

Like this?


data _null_;  
  do DATE_OF_DEATH_MM = 1 to 12; 
    DATE_OF_DIAGNOSIS_MM = %RandBetween(1, DATE_OF_DEATH_MM );
    putlog DATE_OF_DEATH_MM= DATE_OF_DIAGNOSIS_MM=;
  end;
run; 

DATE_OF_DEATH_MM=1 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=2 DATE_OF_DIAGNOSIS_MM=2
DATE_OF_DEATH_MM=3 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=4 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=5 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=6 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=7 DATE_OF_DIAGNOSIS_MM=7
DATE_OF_DEATH_MM=8 DATE_OF_DIAGNOSIS_MM=6
DATE_OF_DEATH_MM=9 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=10 DATE_OF_DIAGNOSIS_MM=3
DATE_OF_DEATH_MM=11 DATE_OF_DIAGNOSIS_MM=8
DATE_OF_DEATH_MM=12 DATE_OF_DIAGNOSIS_MM=2

Super Contributor
Posts: 331

Re: Replace missing by random with certain condition met

@ChrisNZ Thanks Chris.

The code eliminated 99 in date_of_diagnosis_mm. But, if you look at freq of "date_of_diagnosis_mm" before and after imputing freq of after values decreased after month 6, ie., 1329(before) vs 1192(after). Isn't freq after imputing supposed to increase because they're now added with portions of 99 allocated to them?

 

before imputing

date_of_diagnosis_mm Frequency Percent
1 1776 12.38
2 1542 10.75
3 1586 11.06
4 1495 10.42
5 1360 9.48
6 1326 9.24
7 1153 8.04
8 1112 7.75
9 933 6.50
10 828 5.77
11 630 4.39
12 331 2.31
99 271 1.89

 


after imputing

date_of_diagnosis_mm Frequency Percent
1 2634 18.36
2 2315 16.14
3 1894 13.21
4 1599 11.15
5 1389 9.68
6 1192 8.31
7 973 6.78
8 811 5.65
9 634 4.42
10 483 3.37
11 293 2.04
12 126 0.88

 

 

data b; set temp;
DATE_OF_DIAGNOSIS_MM = %RandBetween(1, DATE_OF_DEATH_MM );
where date_of_death_mm ne . and date_of_diagnosis_yyyy=date_of_death_yyyy; 
run;

proc freq data=temp; 
tables date_of_diagnosis_mm/list missing nocum;
where date_of_death_mm ne . and date_of_diagnosis_yyyy=date_of_death_yyyy; 
title "before imputing";
run; 

proc freq data=b; 
tables date_of_diagnosis_mm/list missing nocum;
where date_of_death_mm ne . and date_of_diagnosis_yyyy=date_of_death_yyyy; 
title "after imputing";
run;

 

Super User
Posts: 9,880

Re: Replace missing by random with certain condition met

Since you have a where condition, all observations where date_of_diagnosis and date_of_death do not fall into the same year are discarded, reducing the overall numbers.

See this slightly changed code, run against the import of your csv:

proc import datafile="$HOME/sascommunity/b.csv"
out=b
dbms=csv replace;
run;

proc freq data=b;
tables date_of_diagnosis_mm;
run;

data b1;
set b;
if
  date_of_diagnosis_mm = 99 and
  date_of_death_mm ne . and
  date_of_diagnosis_yyyy = date_of_death_yyyy
then date_of_diagnosis_mm = rand("INTEGER", date_of_death_mm);
run;

proc freq data=b1;
tables date_of_diagnosis_mm;
run;

Result:

                          The FREQ Procedure

        date_of_                             Cumulative    Cumulative
    diagnosis_mm    Frequency     Percent     Frequency      Percent
---------------------------------------------------------------------
               1        3316        8.23          3316         8.23  
               2        3066        7.61          6382        15.84  
               3        3510        8.71          9892        24.55  
               4        3435        8.53         13327        33.08  
               5        3511        8.72         16838        41.80  
               6        3445        8.55         20283        50.35  
               7        3306        8.21         23589        58.55  
               8        3304        8.20         26893        66.76  
               9        3245        8.05         30138        74.81  
              10        3347        8.31         33485        83.12  
              11        3113        7.73         36598        90.85  
              12        3091        7.67         39689        98.52  
              99         597        1.48         40286       100.00  
                                                                     

                          The FREQ Procedure

        date_of_                             Cumulative    Cumulative
    diagnosis_mm    Frequency     Percent     Frequency      Percent
---------------------------------------------------------------------
               1        3343        8.30          3343         8.30  
               2        3085        7.66          6428        15.96  
               3        3528        8.76          9956        24.71  
               4        3456        8.58         13412        33.29  
               5        3518        8.73         16930        42.02  
               6        3452        8.57         20382        50.59  
               7        3313        8.22         23695        58.82  
               8        3309        8.21         27004        67.03  
               9        3247        8.06         30251        75.09  
              10        3348        8.31         33599        83.40  
              11        3115        7.73         36714        91.13  
              12        3091        7.67         39805        98.81  
              99         481        1.19         40286       100.00  

You can see that a small shift from 99 to the other months (ecept 12) has happened, with the cumulative sum staying the same.

There's still a lot of 99's left where there's no death date or where death happened in another year.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Super Contributor
Posts: 331

Re: Replace missing by random with certain condition met

[ Edited ]

@ChrisNZ

 

I used your approach like below: It appears that problem is SOLVED!!!

 

 

/*if you use data attached to this form:*/

proc import datafile="...\b.csv"
out=cr.b
dbms=csv replace;
run;
data b1; set b; do DATE_OF_DEATH_MM = 1 to 12; IF DATE_OF_DIAGNOSIS_MM=99 THEN DATE_OF_DIAGNOSIS_MM= %RandBetween(DATE_OF_DEATH_MM,12); where date_of_death_mm ne . and date_of_diagnosis_yyyy = date_of_death_yyyy; end; run; proc freq data=b; tables date_of_diagnosis_mm; where date_of_death_mm ne . and date_of_diagnosis_yyyy = date_of_death_yyyy; run; proc freq data=b1; tables date_of_diagnosis_mm; where date_of_death_mm ne . and date_of_diagnosis_yyyy = date_of_death_yyyy; run;

Result:

BEFORE

date_of_diagnosis_mm

n %

AFTER

date_of_diagnosis_mm

n %
1 679 11.7 1 687 11.88
2 645 11.2 2 654 11.31
3 617 10.7 3 625 10.81
4 618 10.7 4 627 10.84
5 562 9.72 5 568 9.82
6 530 9.17 6 535 9.25
7 472 8.16 7 488 8.44
8 449 7.77 8 450 7.78
9 372 6.43 9 387 6.69
10 329 5.69 10 338 5.85
11 257 4.44 11 269 4.65
12 136 2.35 12 154 2.66
99 116 2.01      
Esteemed Advisor
Posts: 5,479

Re: Replace missing by random with certain condition met

Assuming that date_of_diagnosis_mm and date_of_death_mm are month numbers:

 

 

data temp1; 
set temp; 
if date_of_diagnosis_mm in (99) then 
    date_of_diagnosis_mm = rand("INTEGER", date_of_death_mm);
run;
PG
Super Contributor
Posts: 331

Re: Replace missing by random with certain condition met

[ Edited ]

@PGStats

Thanks. There's something wrong. The code replace 99 by . Any idea why?

Otherwise, i really like the idea.

%macro RandBetween(min, max);
   (&min + floor((1+&max-&min)*rand("uniform")))
%mend;

data b1; set b;
if date_of_diagnosis_mm in (99) then
   date_of_diagnosis_mm = rand("INTEGER", date_of_death_mm);
where date_of_death_mm ne . and date_of_diagnosis_yyyy=date_of_death_yyyy;
run;
proc freq data=b1;
tables date_of_diagnosis_mm*date_of_death_mm/list missing;
run;  

 

date_of_diagnosis_mm date_of_death_mm Freq Percent
. 1 8 0.06
. 2 23 0.16
. 3 18 0.13
. 4 19 0.13
. 5 21 0.15
. 6 30 0.21
. 7 25 0.17
. 8 16 0.11
. 9 27 0.19
. 10 33 0.23
. 11 17 0.12
. 12 34 0.24
1 1 307 2.14
1 2 339 2.36
1 3 216 1.51
1 4 166 1.16
1 5 134 0.93
1 6 101 0.7
1 7 84 0.59
1 8 96 0.67
1 9 95 0.66
1 10 85 0.59
1 11 86 0.6
1 12 67 0.47
2 2 303 2.11
2 3 284 1.98
2 4 202 1.41
2 5 163 1.14
2 6 104 0.73
2 7 97 0.68
2 8 84 0.59
2 9 85 0.59
2 10 68 0.47
2 11 73 0.51
2 12 79 0.55
3 3 313 2.18
3 4 318 2.22
3 5 209 1.46
3 6 155 1.08
3 7 115 0.8
3 8 106 0.74
3 9 87 0.61
3 10 95 0.66
3 11 86 0.6
3 12 102 0.71
4 4 317 2.21
4 5 297 2.07
4 6 229 1.6
4 7 156 1.09
4 8 108 0.75
4 9 100 0.7
4 10 89 0.62
4 11 109 0.76
4 12 90 0.63
5 5 303 2.11
5 6 308 2.15
5 7 196 1.37
5 8 171 1.19
5 9 128 0.89
5 10 106 0.74
5 11 67 0.47
5 12 81 0.56
6 6 271 1.89
6 7 314 2.19
6 8 195 1.36
6 9 160 1.12
6 10 143 1
6 11 129 0.9
6 12 114 0.79
7 7 299 2.08
7 8 288 2.01
7 9 186 1.3
7 10 157 1.09
7 11 121 0.84
7 12 102 0.71
8 8 304 2.12
8 9 306 2.13
8 10 217 1.51
8 11 154 1.07
8 12 131 0.91
9 9 266 1.85
9 10 317 2.21
9 11 200 1.39
9 12 150 1.05
10 10 281 1.96
10 11 332 2.31
10 12 215 1.5
11 11 299 2.08
11 12 331 2.31
12 12 331 2.31
Super User
Posts: 9,880

Re: Replace missing by random with certain condition met

If you supply test data in usable form (data step, see my footnotes), we can test code before we suggest it to you.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Super User
Posts: 9,880

Re: Replace missing by random with certain condition met

And when something unexpected happens, posting the log is always a great idea.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Super Contributor
Posts: 331

Re: Replace missing by random with certain condition met

@KurtBremser, @PGStats

 

this is ERROR log I get.

 

925  data cr.b1; set cr.b;
926  if
927    date_of_diagnosis_mm = 99 and
928    date_of_death_mm ne . and
929    date_of_diagnosis_yyyy = date_of_death_yyyy
930  then date_of_diagnosis_mm = rand("INTEGER", date_of_death_mm);
931  run;

ERROR: The first argument of the RAND function must be a character string with a value of
       BERNOULLI, BETA, BINOMIAL, CAUCHY, CHISQUARE, ERLANG, EXPONENTIAL, F, GAMMA, GAUSSIAN,
       GEOMETRIC, HYPERGEOMETRIC, LOGNORMAL, NEGB, NORMAL, POISSON, T, TABLE, TRIANGULAR,
       UNIFORM, or WEIBULL.
NOTE: Argument 1 to function RAND('INTEGER',6) at line 930 column 29 is invalid.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2004 date_of_death_dd=26
date_of_death_mm=6 date_of_death_yyyy=2004 _ERROR_=1 _N_=2899
ERROR: The first argument of the RAND function must be a character string with a value of
       BERNOULLI, BETA, BINOMIAL, CAUCHY, CHISQUARE, ERLANG, EXPONENTIAL, F, GAMMA, GAUSSIAN,
       GEOMETRIC, HYPERGEOMETRIC, LOGNORMAL, NEGB, NORMAL, POISSON, T, TABLE, TRIANGULAR,
       UNIFORM, or WEIBULL.
NOTE: Argument 1 to function RAND('INTEGER',12) at line 930 column 29 is invalid.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2004 date_of_death_dd=21
date_of_death_mm=12 date_of_death_yyyy=2004 _ERROR_=1 _N_=3621
ERROR: The first argument of the RAND function must be a character string with a value of
       BERNOULLI, BETA, BINOMIAL, CAUCHY, CHISQUARE, ERLANG, EXPONENTIAL, F, GAMMA, GAUSSIAN,
       GEOMETRIC, HYPERGEOMETRIC, LOGNORMAL, NEGB, NORMAL, POISSON, T, TABLE, TRIANGULAR,
       UNIFORM, or WEIBULL.
NOTE: Argument 1 to function RAND('INTEGER',2) at line 930 column 29 is invalid.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2004 date_of_death_dd=28
date_of_death_mm=2 date_of_death_yyyy=2004 _ERROR_=1 _N_=3632
ERROR: The first argument of the RAND function must be a character string with a value of
       BERNOULLI, BETA, BINOMIAL, CAUCHY, CHISQUARE, ERLANG, EXPONENTIAL, F, GAMMA, GAUSSIAN,
       GEOMETRIC, HYPERGEOMETRIC, LOGNORMAL, NEGB, NORMAL, POISSON, T, TABLE, TRIANGULAR,
       UNIFORM, or WEIBULL.
NOTE: Argument 1 to function RAND('INTEGER',7) at line 930 column 29 is invalid.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2004 date_of_death_dd=11
date_of_death_mm=7 date_of_death_yyyy=2004 _ERROR_=1 _N_=3694
ERROR: The first argument of the RAND function must be a character string with a value of
       BERNOULLI, BETA, BINOMIAL, CAUCHY, CHISQUARE, ERLANG, EXPONENTIAL, F, GAMMA, GAUSSIAN,
       GEOMETRIC, HYPERGEOMETRIC, LOGNORMAL, NEGB, NORMAL, POISSON, T, TABLE, TRIANGULAR,
       UNIFORM, or WEIBULL.
NOTE: Argument 1 to function RAND('INTEGER',10) at line 930 column 29 is invalid.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2005 date_of_death_dd=18
date_of_death_mm=10 date_of_death_yyyy=2005 _ERROR_=1 _N_=6775
NOTE: Invalid argument to function RAND('INTEGER',6) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2005 date_of_death_dd=11
date_of_death_mm=6 date_of_death_yyyy=2005 _ERROR_=1 _N_=7226
NOTE: Invalid argument to function RAND('INTEGER',2) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2005 date_of_death_dd=8
date_of_death_mm=2 date_of_death_yyyy=2005 _ERROR_=1 _N_=7725
NOTE: Invalid argument to function RAND('INTEGER',9) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2005 date_of_death_dd=28
date_of_death_mm=9 date_of_death_yyyy=2005 _ERROR_=1 _N_=7769
NOTE: Invalid argument to function RAND('INTEGER',10) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2005 date_of_death_dd=31
date_of_death_mm=10 date_of_death_yyyy=2005 _ERROR_=1 _N_=7874
NOTE: Invalid argument to function RAND('INTEGER',7) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2006 date_of_death_dd=1
date_of_death_mm=7 date_of_death_yyyy=2006 _ERROR_=1 _N_=10607
NOTE: Invalid argument to function RAND('INTEGER',5) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2007 date_of_death_dd=31
date_of_death_mm=5 date_of_death_yyyy=2007 _ERROR_=1 _N_=12501
NOTE: Invalid argument to function RAND('INTEGER',3) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2007 date_of_death_dd=30
date_of_death_mm=3 date_of_death_yyyy=2007 _ERROR_=1 _N_=12580
NOTE: Invalid argument to function RAND('INTEGER',7) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2006 date_of_death_dd=10
date_of_death_mm=7 date_of_death_yyyy=2006 _ERROR_=1 _N_=13166
NOTE: Invalid argument to function RAND('INTEGER',4) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2006 date_of_death_dd=16
date_of_death_mm=4 date_of_death_yyyy=2006 _ERROR_=1 _N_=13167
NOTE: Invalid argument to function RAND('INTEGER',5) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2006 date_of_death_dd=29
date_of_death_mm=5 date_of_death_yyyy=2006 _ERROR_=1 _N_=13528
NOTE: Invalid argument to function RAND('INTEGER',6) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2007 date_of_death_dd=23
date_of_death_mm=6 date_of_death_yyyy=2007 _ERROR_=1 _N_=14045
NOTE: Invalid argument to function RAND('INTEGER',9) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2007 date_of_death_dd=25
date_of_death_mm=9 date_of_death_yyyy=2007 _ERROR_=1 _N_=14512
NOTE: Invalid argument to function RAND('INTEGER',4) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2007 date_of_death_dd=27
date_of_death_mm=4 date_of_death_yyyy=2007 _ERROR_=1 _N_=14719
NOTE: Invalid argument to function RAND('INTEGER',4) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2008 date_of_death_dd=22
date_of_death_mm=4 date_of_death_yyyy=2008 _ERROR_=1 _N_=15474
NOTE: Invalid argument to function RAND('INTEGER',6) at line 930 column 29.
WARNING: Limit set by ERRORS= option reached.  Further errors of this type will not be printed.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2007 date_of_death_dd=2
date_of_death_mm=6 date_of_death_yyyy=2007 _ERROR_=1 _N_=16148
NOTE: Mathematical operations could not be performed at the following places. The results of the
      operations have been set to missing values.
      Each place is given by: (Number of times) at (Line)Smiley SadColumn).
      116 at 930:29
NOTE: There were 40286 observations read from the data set CR.B.
NOTE: The data set CR.B1 has 40286 observations and 6 variables.
NOTE: DATA statement used (Total process time):
      real time           0.07 seconds
      cpu time            0.03 seconds


Super User
Posts: 9,880

Re: Replace missing by random with certain condition met

Which SAS version are you on? From digging through the docs, I see that 'INTEGER' must have been added with 9.4.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Super User
Posts: 9,880

Re: Replace missing by random with certain condition met

A month (1-12) on its own is a useless value. Only in connection with a year will it depict a certain real-world month. So you should expand your data accordingly, and then create your random diagnosis month within the last 12 months of the person's life. If you use intelligent data (working with SAS date values instead of year/month pairs), you can use the intnx() function, calculating from the person's death date.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Super Contributor
Posts: 331

Re: Replace missing by random with certain condition met

Posted in reply to KurtBremser

Hi guys, attached is the sample from my data, as @KurtBremser suggested. Thanks for help! @PGStats @ChrisNZ

 

proc sql;
create table b(keep=date_of_death: date_of_diagnosisSmiley Happy as
select *
from data
where ranuni (0)<X;
quit;


proc import datafile="...\b.csv"
out=data
dbms=csv replace;
run;

Super User
Posts: 9,880

Re: Replace missing by random with certain condition met

When running your code

proc import datafile="$HOME/sascommunity/b.csv"
out=data
dbms=csv replace;
run;

proc sql;
create table b(keep=date_of_death: date_of_diagnosis:) as
select *
from data
where ranuni (0)<X;
quit;

, I get this ERROR in the SQL:

65         proc sql;
66         create table b(keep=date_of_death: date_of_diagnosis:) as
67         select *
68         from data
69         where ranuni (0)<X;
ERROR: The following columns were not found in the contributing tables: X.
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Super Contributor
Posts: 331

Re: Replace missing by random with certain condition met

Posted in reply to KurtBremser

@KurtBremser proc sql part is just to show where my data came from. So ignore sql part plz.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 16 replies
  • 227 views
  • 1 like
  • 4 in conversation