I'm replacing missing values in data_of_diagnosis_mm by random integer 1 thru 12. However, integers replacing must be logical to death of month. Can you please help specify a condition in macro below where month of diagnosis occurs earlier than month of death? Without such condition specified I end up with bunch of people died before diagnosed.
%macro RandBetween(min, max);
(&min + floor((1+&max-&min)*rand("uniform")))
%mend;
data temp1; set temp;
if date_of_diagnosis_mm in ('99') then date_of_diagnosis_mm = %RandBetween(1, 12);
run;
where
date_of_death_mm>date_of_diagnosis_mm
Like this?
data _null_;
do DATE_OF_DEATH_MM = 1 to 12;
DATE_OF_DIAGNOSIS_MM = %RandBetween(1, DATE_OF_DEATH_MM );
putlog DATE_OF_DEATH_MM= DATE_OF_DIAGNOSIS_MM=;
end;
run;
DATE_OF_DEATH_MM=1 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=2 DATE_OF_DIAGNOSIS_MM=2
DATE_OF_DEATH_MM=3 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=4 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=5 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=6 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=7 DATE_OF_DIAGNOSIS_MM=7
DATE_OF_DEATH_MM=8 DATE_OF_DIAGNOSIS_MM=6
DATE_OF_DEATH_MM=9 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=10 DATE_OF_DIAGNOSIS_MM=3
DATE_OF_DEATH_MM=11 DATE_OF_DIAGNOSIS_MM=8
DATE_OF_DEATH_MM=12 DATE_OF_DIAGNOSIS_MM=2
Like this?
data _null_;
do DATE_OF_DEATH_MM = 1 to 12;
DATE_OF_DIAGNOSIS_MM = %RandBetween(1, DATE_OF_DEATH_MM );
putlog DATE_OF_DEATH_MM= DATE_OF_DIAGNOSIS_MM=;
end;
run;
DATE_OF_DEATH_MM=1 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=2 DATE_OF_DIAGNOSIS_MM=2
DATE_OF_DEATH_MM=3 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=4 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=5 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=6 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=7 DATE_OF_DIAGNOSIS_MM=7
DATE_OF_DEATH_MM=8 DATE_OF_DIAGNOSIS_MM=6
DATE_OF_DEATH_MM=9 DATE_OF_DIAGNOSIS_MM=1
DATE_OF_DEATH_MM=10 DATE_OF_DIAGNOSIS_MM=3
DATE_OF_DEATH_MM=11 DATE_OF_DIAGNOSIS_MM=8
DATE_OF_DEATH_MM=12 DATE_OF_DIAGNOSIS_MM=2
@ChrisNZ Thanks Chris.
The code eliminated 99 in date_of_diagnosis_mm. But, if you look at freq of "date_of_diagnosis_mm" before and after imputing freq of after values decreased after month 6, ie., 1329(before) vs 1192(after). Isn't freq after imputing supposed to increase because they're now added with portions of 99 allocated to them?
before imputing |
date_of_diagnosis_mm | Frequency | Percent |
---|---|---|
1 | 1776 | 12.38 |
2 | 1542 | 10.75 |
3 | 1586 | 11.06 |
4 | 1495 | 10.42 |
5 | 1360 | 9.48 |
6 | 1326 | 9.24 |
7 | 1153 | 8.04 |
8 | 1112 | 7.75 |
9 | 933 | 6.50 |
10 | 828 | 5.77 |
11 | 630 | 4.39 |
12 | 331 | 2.31 |
99 | 271 | 1.89 |
after imputing |
date_of_diagnosis_mm | Frequency | Percent |
---|---|---|
1 | 2634 | 18.36 |
2 | 2315 | 16.14 |
3 | 1894 | 13.21 |
4 | 1599 | 11.15 |
5 | 1389 | 9.68 |
6 | 1192 | 8.31 |
7 | 973 | 6.78 |
8 | 811 | 5.65 |
9 | 634 | 4.42 |
10 | 483 | 3.37 |
11 | 293 | 2.04 |
12 | 126 | 0.88 |
data b; set temp;
DATE_OF_DIAGNOSIS_MM = %RandBetween(1, DATE_OF_DEATH_MM );
where date_of_death_mm ne . and date_of_diagnosis_yyyy=date_of_death_yyyy;
run;
proc freq data=temp;
tables date_of_diagnosis_mm/list missing nocum;
where date_of_death_mm ne . and date_of_diagnosis_yyyy=date_of_death_yyyy;
title "before imputing";
run;
proc freq data=b;
tables date_of_diagnosis_mm/list missing nocum;
where date_of_death_mm ne . and date_of_diagnosis_yyyy=date_of_death_yyyy;
title "after imputing";
run;
Since you have a where condition, all observations where date_of_diagnosis and date_of_death do not fall into the same year are discarded, reducing the overall numbers.
See this slightly changed code, run against the import of your csv:
proc import datafile="$HOME/sascommunity/b.csv"
out=b
dbms=csv replace;
run;
proc freq data=b;
tables date_of_diagnosis_mm;
run;
data b1;
set b;
if
date_of_diagnosis_mm = 99 and
date_of_death_mm ne . and
date_of_diagnosis_yyyy = date_of_death_yyyy
then date_of_diagnosis_mm = rand("INTEGER", date_of_death_mm);
run;
proc freq data=b1;
tables date_of_diagnosis_mm;
run;
Result:
The FREQ Procedure date_of_ Cumulative Cumulative diagnosis_mm Frequency Percent Frequency Percent --------------------------------------------------------------------- 1 3316 8.23 3316 8.23 2 3066 7.61 6382 15.84 3 3510 8.71 9892 24.55 4 3435 8.53 13327 33.08 5 3511 8.72 16838 41.80 6 3445 8.55 20283 50.35 7 3306 8.21 23589 58.55 8 3304 8.20 26893 66.76 9 3245 8.05 30138 74.81 10 3347 8.31 33485 83.12 11 3113 7.73 36598 90.85 12 3091 7.67 39689 98.52 99 597 1.48 40286 100.00 The FREQ Procedure date_of_ Cumulative Cumulative diagnosis_mm Frequency Percent Frequency Percent --------------------------------------------------------------------- 1 3343 8.30 3343 8.30 2 3085 7.66 6428 15.96 3 3528 8.76 9956 24.71 4 3456 8.58 13412 33.29 5 3518 8.73 16930 42.02 6 3452 8.57 20382 50.59 7 3313 8.22 23695 58.82 8 3309 8.21 27004 67.03 9 3247 8.06 30251 75.09 10 3348 8.31 33599 83.40 11 3115 7.73 36714 91.13 12 3091 7.67 39805 98.81 99 481 1.19 40286 100.00
You can see that a small shift from 99 to the other months (ecept 12) has happened, with the cumulative sum staying the same.
There's still a lot of 99's left where there's no death date or where death happened in another year.
I used your approach like below: It appears that problem is SOLVED!!!
/*if you use data attached to this form:*/
proc import datafile="...\b.csv"
out=cr.b
dbms=csv replace;
run;
data b1; set b;
do DATE_OF_DEATH_MM = 1 to 12;
IF DATE_OF_DIAGNOSIS_MM=99 THEN DATE_OF_DIAGNOSIS_MM= %RandBetween(DATE_OF_DEATH_MM,12);
where date_of_death_mm ne . and
date_of_diagnosis_yyyy = date_of_death_yyyy;
end;
run;
proc freq data=b;
tables date_of_diagnosis_mm;
where date_of_death_mm ne . and
date_of_diagnosis_yyyy = date_of_death_yyyy;
run;
proc freq data=b1;
tables date_of_diagnosis_mm;
where date_of_death_mm ne . and
date_of_diagnosis_yyyy = date_of_death_yyyy;
run;
Result:
BEFORE date_of_diagnosis_mm |
n | % |
AFTER date_of_diagnosis_mm |
n | % |
1 | 679 | 11.7 | 1 | 687 | 11.88 |
2 | 645 | 11.2 | 2 | 654 | 11.31 |
3 | 617 | 10.7 | 3 | 625 | 10.81 |
4 | 618 | 10.7 | 4 | 627 | 10.84 |
5 | 562 | 9.72 | 5 | 568 | 9.82 |
6 | 530 | 9.17 | 6 | 535 | 9.25 |
7 | 472 | 8.16 | 7 | 488 | 8.44 |
8 | 449 | 7.77 | 8 | 450 | 7.78 |
9 | 372 | 6.43 | 9 | 387 | 6.69 |
10 | 329 | 5.69 | 10 | 338 | 5.85 |
11 | 257 | 4.44 | 11 | 269 | 4.65 |
12 | 136 | 2.35 | 12 | 154 | 2.66 |
99 | 116 | 2.01 |
Assuming that date_of_diagnosis_mm and date_of_death_mm are month numbers:
data temp1;
set temp;
if date_of_diagnosis_mm in (99) then
date_of_diagnosis_mm = rand("INTEGER", date_of_death_mm);
run;
Thanks. There's something wrong. The code replace 99 by . Any idea why?
Otherwise, i really like the idea.
%macro RandBetween(min, max);
(&min + floor((1+&max-&min)*rand("uniform")))
%mend;
data b1; set b;
if date_of_diagnosis_mm in (99) then
date_of_diagnosis_mm = rand("INTEGER", date_of_death_mm);
where date_of_death_mm ne . and date_of_diagnosis_yyyy=date_of_death_yyyy;
run;
proc freq data=b1;
tables date_of_diagnosis_mm*date_of_death_mm/list missing;
run;
date_of_diagnosis_mm | date_of_death_mm | Freq | Percent |
. | 1 | 8 | 0.06 |
. | 2 | 23 | 0.16 |
. | 3 | 18 | 0.13 |
. | 4 | 19 | 0.13 |
. | 5 | 21 | 0.15 |
. | 6 | 30 | 0.21 |
. | 7 | 25 | 0.17 |
. | 8 | 16 | 0.11 |
. | 9 | 27 | 0.19 |
. | 10 | 33 | 0.23 |
. | 11 | 17 | 0.12 |
. | 12 | 34 | 0.24 |
1 | 1 | 307 | 2.14 |
1 | 2 | 339 | 2.36 |
1 | 3 | 216 | 1.51 |
1 | 4 | 166 | 1.16 |
1 | 5 | 134 | 0.93 |
1 | 6 | 101 | 0.7 |
1 | 7 | 84 | 0.59 |
1 | 8 | 96 | 0.67 |
1 | 9 | 95 | 0.66 |
1 | 10 | 85 | 0.59 |
1 | 11 | 86 | 0.6 |
1 | 12 | 67 | 0.47 |
2 | 2 | 303 | 2.11 |
2 | 3 | 284 | 1.98 |
2 | 4 | 202 | 1.41 |
2 | 5 | 163 | 1.14 |
2 | 6 | 104 | 0.73 |
2 | 7 | 97 | 0.68 |
2 | 8 | 84 | 0.59 |
2 | 9 | 85 | 0.59 |
2 | 10 | 68 | 0.47 |
2 | 11 | 73 | 0.51 |
2 | 12 | 79 | 0.55 |
3 | 3 | 313 | 2.18 |
3 | 4 | 318 | 2.22 |
3 | 5 | 209 | 1.46 |
3 | 6 | 155 | 1.08 |
3 | 7 | 115 | 0.8 |
3 | 8 | 106 | 0.74 |
3 | 9 | 87 | 0.61 |
3 | 10 | 95 | 0.66 |
3 | 11 | 86 | 0.6 |
3 | 12 | 102 | 0.71 |
4 | 4 | 317 | 2.21 |
4 | 5 | 297 | 2.07 |
4 | 6 | 229 | 1.6 |
4 | 7 | 156 | 1.09 |
4 | 8 | 108 | 0.75 |
4 | 9 | 100 | 0.7 |
4 | 10 | 89 | 0.62 |
4 | 11 | 109 | 0.76 |
4 | 12 | 90 | 0.63 |
5 | 5 | 303 | 2.11 |
5 | 6 | 308 | 2.15 |
5 | 7 | 196 | 1.37 |
5 | 8 | 171 | 1.19 |
5 | 9 | 128 | 0.89 |
5 | 10 | 106 | 0.74 |
5 | 11 | 67 | 0.47 |
5 | 12 | 81 | 0.56 |
6 | 6 | 271 | 1.89 |
6 | 7 | 314 | 2.19 |
6 | 8 | 195 | 1.36 |
6 | 9 | 160 | 1.12 |
6 | 10 | 143 | 1 |
6 | 11 | 129 | 0.9 |
6 | 12 | 114 | 0.79 |
7 | 7 | 299 | 2.08 |
7 | 8 | 288 | 2.01 |
7 | 9 | 186 | 1.3 |
7 | 10 | 157 | 1.09 |
7 | 11 | 121 | 0.84 |
7 | 12 | 102 | 0.71 |
8 | 8 | 304 | 2.12 |
8 | 9 | 306 | 2.13 |
8 | 10 | 217 | 1.51 |
8 | 11 | 154 | 1.07 |
8 | 12 | 131 | 0.91 |
9 | 9 | 266 | 1.85 |
9 | 10 | 317 | 2.21 |
9 | 11 | 200 | 1.39 |
9 | 12 | 150 | 1.05 |
10 | 10 | 281 | 1.96 |
10 | 11 | 332 | 2.31 |
10 | 12 | 215 | 1.5 |
11 | 11 | 299 | 2.08 |
11 | 12 | 331 | 2.31 |
12 | 12 | 331 | 2.31 |
If you supply test data in usable form (data step, see my footnotes), we can test code before we suggest it to you.
And when something unexpected happens, posting the log is always a great idea.
this is ERROR log I get.
925 data cr.b1; set cr.b;
926 if
927 date_of_diagnosis_mm = 99 and
928 date_of_death_mm ne . and
929 date_of_diagnosis_yyyy = date_of_death_yyyy
930 then date_of_diagnosis_mm = rand("INTEGER", date_of_death_mm);
931 run;
ERROR: The first argument of the RAND function must be a character string with a value of
BERNOULLI, BETA, BINOMIAL, CAUCHY, CHISQUARE, ERLANG, EXPONENTIAL, F, GAMMA, GAUSSIAN,
GEOMETRIC, HYPERGEOMETRIC, LOGNORMAL, NEGB, NORMAL, POISSON, T, TABLE, TRIANGULAR,
UNIFORM, or WEIBULL.
NOTE: Argument 1 to function RAND('INTEGER',6) at line 930 column 29 is invalid.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2004 date_of_death_dd=26
date_of_death_mm=6 date_of_death_yyyy=2004 _ERROR_=1 _N_=2899
ERROR: The first argument of the RAND function must be a character string with a value of
BERNOULLI, BETA, BINOMIAL, CAUCHY, CHISQUARE, ERLANG, EXPONENTIAL, F, GAMMA, GAUSSIAN,
GEOMETRIC, HYPERGEOMETRIC, LOGNORMAL, NEGB, NORMAL, POISSON, T, TABLE, TRIANGULAR,
UNIFORM, or WEIBULL.
NOTE: Argument 1 to function RAND('INTEGER',12) at line 930 column 29 is invalid.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2004 date_of_death_dd=21
date_of_death_mm=12 date_of_death_yyyy=2004 _ERROR_=1 _N_=3621
ERROR: The first argument of the RAND function must be a character string with a value of
BERNOULLI, BETA, BINOMIAL, CAUCHY, CHISQUARE, ERLANG, EXPONENTIAL, F, GAMMA, GAUSSIAN,
GEOMETRIC, HYPERGEOMETRIC, LOGNORMAL, NEGB, NORMAL, POISSON, T, TABLE, TRIANGULAR,
UNIFORM, or WEIBULL.
NOTE: Argument 1 to function RAND('INTEGER',2) at line 930 column 29 is invalid.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2004 date_of_death_dd=28
date_of_death_mm=2 date_of_death_yyyy=2004 _ERROR_=1 _N_=3632
ERROR: The first argument of the RAND function must be a character string with a value of
BERNOULLI, BETA, BINOMIAL, CAUCHY, CHISQUARE, ERLANG, EXPONENTIAL, F, GAMMA, GAUSSIAN,
GEOMETRIC, HYPERGEOMETRIC, LOGNORMAL, NEGB, NORMAL, POISSON, T, TABLE, TRIANGULAR,
UNIFORM, or WEIBULL.
NOTE: Argument 1 to function RAND('INTEGER',7) at line 930 column 29 is invalid.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2004 date_of_death_dd=11
date_of_death_mm=7 date_of_death_yyyy=2004 _ERROR_=1 _N_=3694
ERROR: The first argument of the RAND function must be a character string with a value of
BERNOULLI, BETA, BINOMIAL, CAUCHY, CHISQUARE, ERLANG, EXPONENTIAL, F, GAMMA, GAUSSIAN,
GEOMETRIC, HYPERGEOMETRIC, LOGNORMAL, NEGB, NORMAL, POISSON, T, TABLE, TRIANGULAR,
UNIFORM, or WEIBULL.
NOTE: Argument 1 to function RAND('INTEGER',10) at line 930 column 29 is invalid.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2005 date_of_death_dd=18
date_of_death_mm=10 date_of_death_yyyy=2005 _ERROR_=1 _N_=6775
NOTE: Invalid argument to function RAND('INTEGER',6) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2005 date_of_death_dd=11
date_of_death_mm=6 date_of_death_yyyy=2005 _ERROR_=1 _N_=7226
NOTE: Invalid argument to function RAND('INTEGER',2) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2005 date_of_death_dd=8
date_of_death_mm=2 date_of_death_yyyy=2005 _ERROR_=1 _N_=7725
NOTE: Invalid argument to function RAND('INTEGER',9) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2005 date_of_death_dd=28
date_of_death_mm=9 date_of_death_yyyy=2005 _ERROR_=1 _N_=7769
NOTE: Invalid argument to function RAND('INTEGER',10) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2005 date_of_death_dd=31
date_of_death_mm=10 date_of_death_yyyy=2005 _ERROR_=1 _N_=7874
NOTE: Invalid argument to function RAND('INTEGER',7) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2006 date_of_death_dd=1
date_of_death_mm=7 date_of_death_yyyy=2006 _ERROR_=1 _N_=10607
NOTE: Invalid argument to function RAND('INTEGER',5) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2007 date_of_death_dd=31
date_of_death_mm=5 date_of_death_yyyy=2007 _ERROR_=1 _N_=12501
NOTE: Invalid argument to function RAND('INTEGER',3) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2007 date_of_death_dd=30
date_of_death_mm=3 date_of_death_yyyy=2007 _ERROR_=1 _N_=12580
NOTE: Invalid argument to function RAND('INTEGER',7) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2006 date_of_death_dd=10
date_of_death_mm=7 date_of_death_yyyy=2006 _ERROR_=1 _N_=13166
NOTE: Invalid argument to function RAND('INTEGER',4) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2006 date_of_death_dd=16
date_of_death_mm=4 date_of_death_yyyy=2006 _ERROR_=1 _N_=13167
NOTE: Invalid argument to function RAND('INTEGER',5) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2006 date_of_death_dd=29
date_of_death_mm=5 date_of_death_yyyy=2006 _ERROR_=1 _N_=13528
NOTE: Invalid argument to function RAND('INTEGER',6) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2007 date_of_death_dd=23
date_of_death_mm=6 date_of_death_yyyy=2007 _ERROR_=1 _N_=14045
NOTE: Invalid argument to function RAND('INTEGER',9) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2007 date_of_death_dd=25
date_of_death_mm=9 date_of_death_yyyy=2007 _ERROR_=1 _N_=14512
NOTE: Invalid argument to function RAND('INTEGER',4) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2007 date_of_death_dd=27
date_of_death_mm=4 date_of_death_yyyy=2007 _ERROR_=1 _N_=14719
NOTE: Invalid argument to function RAND('INTEGER',4) at line 930 column 29.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2008 date_of_death_dd=22
date_of_death_mm=4 date_of_death_yyyy=2008 _ERROR_=1 _N_=15474
NOTE: Invalid argument to function RAND('INTEGER',6) at line 930 column 29.
WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.
date_of_diagnosis_dd=99 date_of_diagnosis_mm=. date_of_diagnosis_yyyy=2007 date_of_death_dd=2
date_of_death_mm=6 date_of_death_yyyy=2007 _ERROR_=1 _N_=16148
NOTE: Mathematical operations could not be performed at the following places. The results of the
operations have been set to missing values.
Each place is given by: (Number of times) at (Line):(Column).
116 at 930:29
NOTE: There were 40286 observations read from the data set CR.B.
NOTE: The data set CR.B1 has 40286 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.03 seconds
Which SAS version are you on? From digging through the docs, I see that 'INTEGER' must have been added with 9.4.
A month (1-12) on its own is a useless value. Only in connection with a year will it depict a certain real-world month. So you should expand your data accordingly, and then create your random diagnosis month within the last 12 months of the person's life. If you use intelligent data (working with SAS date values instead of year/month pairs), you can use the intnx() function, calculating from the person's death date.
Hi guys, attached is the sample from my data, as @Kurt_Bremser suggested. Thanks for help! @PGStats @ChrisNZ
proc sql;
create table b(keep=date_of_death: date_of_diagnosis:) as
select *
from data
where ranuni (0)<X;
quit;
proc import datafile="...\b.csv"
out=data
dbms=csv replace;
run;
When running your code
proc import datafile="$HOME/sascommunity/b.csv"
out=data
dbms=csv replace;
run;
proc sql;
create table b(keep=date_of_death: date_of_diagnosis:) as
select *
from data
where ranuni (0)<X;
quit;
, I get this ERROR in the SQL:
65 proc sql; 66 create table b(keep=date_of_death: date_of_diagnosis:) as 67 select * 68 from data 69 where ranuni (0)<X; ERROR: The following columns were not found in the contributing tables: X.
@Kurt_Bremser proc sql part is just to show where my data came from. So ignore sql part plz.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.