I will like to write a program that creates a file with 19 records with several columns. The issue is that in some cases a column willb e based on a value for a subsequent record, not yet created.
For example: here for the 3rd record (ageg=5-9), the value of dx is based on the lx value for the 4th record - the lx value for the 3rd record. Does SAS allow you to some these kind of calculations?
Ageg | x | n | ax | pop |
|
Mx | qx | px | lx | dx | |
<1 | 0 | 1 | 0.1 | 73200 | 321 | 0.004385 | 0.0043680 | 0.995632 | 100000 | 437 | |
1-4 | 1 | 4 | 0.5 | 292855 | 57 | 0.000195 | 0.0007782 | 0.999222 | 99563 | 77 | |
5-9 | 5 | 5 | 0.5 | 376322 | 32 | 0.000085 | 0.0004251 | 0.999575 | 99486 | 42 | |
10-14 | 10 | 5 | 0.5 | 398163 |
|
0.000100 | 0.0005022 | 0.999498 | 99443 | 50 | |
15-19 | 15 | 5 | 0.5 | 457196 | 131 | 0.000287 | 0.0014316 | 0.998568 | 99393 | 142 | |
20-24 | 20 | 5 | 0.5 | 491392 | 310 | 0.000631 | 0.0031493 | 0.996851 | 99251 | 313 | |
25-29 | 25 | 5 | 0.5 | 482730 | 497 | 0.001030 | 0.0051346 | 0.994865 | 98939 | 508 | |
30-34 | 30 | 5 | 0.5 | 447672 | 508 | 0.001135 | 0.0056577 | 0.994342 | 98431 | 557 | |
35-39 | 35 | 5 | 0.5 | 404949 | 521 | 0.001287 | 0.0064123 | 0.993588 | 97874 | 628 | |
40-44 | 40 | 5 | 0.5 | 432783 | 709 | 0.001638 | 0.0081578 | 0.991842 | 97246 | 793 | |
45-49 | 45 | 5 | 0.5 | 475213 |
|
0.002595 | 0.0128895 | 0.987110 | 96453 | 1243 | |
50-54 | 50 | 5 | 0.5 | 511398 | 1975 | 0.003862 | 0.0191252 | 0.980875 | 95210 | 1821 | |
55-59 | 55 | 5 | 0.5 | 479748 | 2787 | 0.005809 | 0.0286307 | 0.971369 | 93389 | 2674 | |
60-64 | 60 | 5 | 0.5 | 405550 | 3219 | 0.007937 | 0.0389146 | 0.961085 | 90715 | 3530 | |
65-69 | 65 | 5 | 0.5 | 330937 | 4086 | 0.012347 | 0.0598853 | 0.940115 | 87185 | 5221 | |
70-74 | 70 | 5 | 0.5 | 231859 | 4592 | 0.019805 | 0.0943540 | 0.905646 | 81964 | 7734 | |
75-79 | 75 | 5 | 0.5 | 165604 | 5300 | 0.032004 | 0.1481655 | 0.851834 | 74230 | 10998 | |
80-85 | 80 | 5 | 0.5 | 130643 | 7484 | 0.057286 | 0.2505474 | 0.749453 | 63232 | 15843 | |
85+ | 85 | 15 | 0.5 | 157194 | 21355 | 0.135851 | 1.0000000 | 0.000000 | 47389 | 47389 |
Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.
I'm not going to try to guess from an Excel file what the needed processing may be.
How are you going to calculate them? Any calculation code based on missing values will generally generate a missing value.
Since you don't show any missing values in the example data or which variables are dependent on others it is a bit difficult to see what you are doing.
How many of the variables to you have as a starting point and how many do you want to calculate from the data you have? 🙂
Hi Malena,
You can create a new variable with the value of the next record doing this:
data have;
length ageg $5. dx 8.;
infile datalines dlm="|";
input ageg dx;
datalines;
<1|436.800659691464
1-4|77.4839981349214
5-9|42.2891335719469
10-14|49.9385713503871
15-19|142.293749725199
20-24|312.575508974929
25-29|508.009167195865
30-34|556.895428355652
35-39|627.594050495114
40-44|793.310623042926
45-49|1243.23053041441
50-54|1820.89859881431
55-59|2673.78223114848
60-64|3530.13806725502
65-69|5221.08807892018
70-74|7733.59757376662
75-79|10998.3392139526
80-85|15842.5442862508
85+|47389.1905289392
;
run;
data want ;
retain dx;
set have (rename=(dx=next_dx)) end=end ;
if _n_ > 1 then output;
dx = next_dx;
if not end then return;
call missing(next_dx);
output;
run ;
This example creates a variable, next_dx, with the subsequent value of dx. Later you can use if-then logic according the age group to keep that value or not.
Regards,
You have a table of survivor counts (Lx) for each age group and you want the count of deaths (dx) in each age group calculated by subtracting Lx for next age group from Lx for the current age group. So "merge" the dataset have with itself, in which the 2nd reference to have has only one variable: Lx renamed to next_lx. Also offset the second reference to have by one record ("firstobs=2").
data want;
merge have
have (firstobs=2 keep=Lx rename=(Lx=next_Lx));
dx= Lx-coalesce(next_lx,0);
drop next_lx;
run;
The coalesce function returns the first non-missing value of next_lx and zero. This is relevant only for the last record, in which there is no next_lx (i.e. it has a missing value). Since everybody must die in the oldest age group, subtract zero from Lx to get 100% deaths in the 85+ cohorts.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.