I will like to write a program that creates a file with 19 records with several columns. The issue is that in some cases a column willb e based on a value for a subsequent record, not yet created.
For example: here for the 3rd record (ageg=5-9), the value of dx is based on the lx value for the 4th record - the lx value for the 3rd record. Does SAS allow you to some these kind of calculations?
Ageg | x | n | ax | pop |
|
Mx | qx | px | lx | dx | |
<1 | 0 | 1 | 0.1 | 73200 | 321 | 0.004385 | 0.0043680 | 0.995632 | 100000 | 437 | |
1-4 | 1 | 4 | 0.5 | 292855 | 57 | 0.000195 | 0.0007782 | 0.999222 | 99563 | 77 | |
5-9 | 5 | 5 | 0.5 | 376322 | 32 | 0.000085 | 0.0004251 | 0.999575 | 99486 | 42 | |
10-14 | 10 | 5 | 0.5 | 398163 |
|
0.000100 | 0.0005022 | 0.999498 | 99443 | 50 | |
15-19 | 15 | 5 | 0.5 | 457196 | 131 | 0.000287 | 0.0014316 | 0.998568 | 99393 | 142 | |
20-24 | 20 | 5 | 0.5 | 491392 | 310 | 0.000631 | 0.0031493 | 0.996851 | 99251 | 313 | |
25-29 | 25 | 5 | 0.5 | 482730 | 497 | 0.001030 | 0.0051346 | 0.994865 | 98939 | 508 | |
30-34 | 30 | 5 | 0.5 | 447672 | 508 | 0.001135 | 0.0056577 | 0.994342 | 98431 | 557 | |
35-39 | 35 | 5 | 0.5 | 404949 | 521 | 0.001287 | 0.0064123 | 0.993588 | 97874 | 628 | |
40-44 | 40 | 5 | 0.5 | 432783 | 709 | 0.001638 | 0.0081578 | 0.991842 | 97246 | 793 | |
45-49 | 45 | 5 | 0.5 | 475213 |
|
0.002595 | 0.0128895 | 0.987110 | 96453 | 1243 | |
50-54 | 50 | 5 | 0.5 | 511398 | 1975 | 0.003862 | 0.0191252 | 0.980875 | 95210 | 1821 | |
55-59 | 55 | 5 | 0.5 | 479748 | 2787 | 0.005809 | 0.0286307 | 0.971369 | 93389 | 2674 | |
60-64 | 60 | 5 | 0.5 | 405550 | 3219 | 0.007937 | 0.0389146 | 0.961085 | 90715 | 3530 | |
65-69 | 65 | 5 | 0.5 | 330937 | 4086 | 0.012347 | 0.0598853 | 0.940115 | 87185 | 5221 | |
70-74 | 70 | 5 | 0.5 | 231859 | 4592 | 0.019805 | 0.0943540 | 0.905646 | 81964 | 7734 | |
75-79 | 75 | 5 | 0.5 | 165604 | 5300 | 0.032004 | 0.1481655 | 0.851834 | 74230 | 10998 | |
80-85 | 80 | 5 | 0.5 | 130643 | 7484 | 0.057286 | 0.2505474 | 0.749453 | 63232 | 15843 | |
85+ | 85 | 15 | 0.5 | 157194 | 21355 | 0.135851 | 1.0000000 | 0.000000 | 47389 | 47389 |
Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.
I'm not going to try to guess from an Excel file what the needed processing may be.
How are you going to calculate them? Any calculation code based on missing values will generally generate a missing value.
Since you don't show any missing values in the example data or which variables are dependent on others it is a bit difficult to see what you are doing.
How many of the variables to you have as a starting point and how many do you want to calculate from the data you have? 🙂
Hi Malena,
You can create a new variable with the value of the next record doing this:
data have;
length ageg $5. dx 8.;
infile datalines dlm="|";
input ageg dx;
datalines;
<1|436.800659691464
1-4|77.4839981349214
5-9|42.2891335719469
10-14|49.9385713503871
15-19|142.293749725199
20-24|312.575508974929
25-29|508.009167195865
30-34|556.895428355652
35-39|627.594050495114
40-44|793.310623042926
45-49|1243.23053041441
50-54|1820.89859881431
55-59|2673.78223114848
60-64|3530.13806725502
65-69|5221.08807892018
70-74|7733.59757376662
75-79|10998.3392139526
80-85|15842.5442862508
85+|47389.1905289392
;
run;
data want ;
retain dx;
set have (rename=(dx=next_dx)) end=end ;
if _n_ > 1 then output;
dx = next_dx;
if not end then return;
call missing(next_dx);
output;
run ;
This example creates a variable, next_dx, with the subsequent value of dx. Later you can use if-then logic according the age group to keep that value or not.
Regards,
You have a table of survivor counts (Lx) for each age group and you want the count of deaths (dx) in each age group calculated by subtracting Lx for next age group from Lx for the current age group. So "merge" the dataset have with itself, in which the 2nd reference to have has only one variable: Lx renamed to next_lx. Also offset the second reference to have by one record ("firstobs=2").
data want;
merge have
have (firstobs=2 keep=Lx rename=(Lx=next_Lx));
dx= Lx-coalesce(next_lx,0);
drop next_lx;
run;
The coalesce function returns the first non-missing value of next_lx and zero. This is relevant only for the last record, in which there is no next_lx (i.e. it has a missing value). Since everybody must die in the oldest age group, subtract zero from Lx to get 100% deaths in the 85+ cohorts.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.