DATA Step, Macro, Functions and more

Creating a new record based on a subsequent record

Reply
Contributor
Posts: 40

Creating a new record based on a subsequent record

[ Edited ]

I will like to write a program that creates a file with 19 records with several columns. The issue is that in some cases a column willb e based on a value for a subsequent record, not yet created.

 

For example: here for the 3rd record (ageg=5-9), the value of dx is based on the lx value for the 4th record - the lx value for the 3rd record. Does SAS allow you to some these kind of calculations?

 

 Ageg x n ax pop
death
Mx qx px lx dx
<1 0 1 0.1 73200 321 0.004385 0.0043680 0.995632 100000 437
1-4 1 4 0.5 292855 57 0.000195 0.0007782 0.999222 99563 77
5-9 5 5 0.5 376322 32 0.000085 0.0004251 0.999575 99486 42
10-14 10 5 0.5 398163
40
0.000100 0.0005022 0.999498 99443 50
15-19 15 5 0.5 457196 131 0.000287 0.0014316 0.998568 99393 142
20-24 20 5 0.5 491392 310 0.000631 0.0031493 0.996851 99251 313
25-29 25 5 0.5 482730 497 0.001030 0.0051346 0.994865 98939 508
30-34 30 5 0.5 447672 508 0.001135 0.0056577 0.994342 98431 557
35-39 35 5 0.5 404949 521 0.001287 0.0064123 0.993588 97874 628
40-44 40 5 0.5 432783 709 0.001638 0.0081578 0.991842 97246 793
45-49 45 5 0.5 475213
1233
0.002595 0.0128895 0.987110 96453 1243
50-54 50 5 0.5 511398 1975 0.003862 0.0191252 0.980875 95210 1821
55-59 55 5 0.5 479748 2787 0.005809 0.0286307 0.971369 93389 2674
60-64 60 5 0.5 405550 3219 0.007937 0.0389146 0.961085 90715 3530
65-69 65 5 0.5 330937 4086 0.012347 0.0598853 0.940115 87185 5221
70-74 70 5 0.5 231859 4592 0.019805 0.0943540 0.905646 81964 7734
75-79 75 5 0.5 165604 5300 0.032004 0.1481655 0.851834 74230 10998
80-85 80 5 0.5 130643 7484 0.057286 0.2505474 0.749453 63232 15843
85+ 85 15 0.5 157194 21355 0.135851 1.0000000 0.000000 47389 47389
Super User
Posts: 11,343

Re: Creating a new record based on a subsequent record

Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

 

I'm not going to try to guess from an Excel file what the needed processing may be.

 

How are you going to calculate them? Any calculation code based on missing values will generally generate a missing value.

 

Since you don't show any missing values in the example data or which variables are dependent on others it is a bit difficult to see what you are doing.

PROC Star
Posts: 746

Re: Creating a new record based on a subsequent record

How many of the variables to you have as a starting point and how many do you want to calculate from the data you have? Smiley Happy

SAS Employee
Posts: 13

Re: Creating a new record based on a subsequent record

[ Edited ]

Hi Malena,

You can create a new variable with the value of the next record doing this:

 

data have;
length ageg $5. dx 8.;
infile datalines dlm="|";
input ageg dx;
datalines;
<1|436.800659691464
1-4|77.4839981349214
5-9|42.2891335719469
10-14|49.9385713503871
15-19|142.293749725199
20-24|312.575508974929
25-29|508.009167195865
30-34|556.895428355652
35-39|627.594050495114
40-44|793.310623042926
45-49|1243.23053041441
50-54|1820.89859881431
55-59|2673.78223114848
60-64|3530.13806725502
65-69|5221.08807892018
70-74|7733.59757376662
75-79|10998.3392139526
80-85|15842.5442862508
85+|47389.1905289392
;
run;

data want ;
retain dx;
set have (rename=(dx=next_dx)) end=end ;
if _n_ > 1 then output;
dx = next_dx;
if not end then return;
call missing(next_dx);
output;
run ;

 

This example creates a variable, next_dx, with the subsequent value of dx. Later you can use if-then logic according the age group to keep that value or not.

 

Regards,

Trusted Advisor
Posts: 1,022

Re: Creating a new record based on a subsequent record

You have a table of survivor counts (Lx) for each age group and you want the count of deaths (dx) in each age  group calculated by subtracting Lx for next age group from Lx for the current age group. So "merge" the dataset have with itself, in which the 2nd reference to have has only one variable: Lx renamed to next_lx. Also offset the second reference to have by one record ("firstobs=2").

 

data want;

   merge have

              have (firstobs=2 keep=Lx rename=(Lx=next_Lx));

  dx= Lx-coalesce(next_lx,0);

  drop next_lx;

run;

 

The coalesce function returns the first non-missing value of next_lx and zero.  This is relevant only for the last record, in which there is no next_lx (i.e. it has a missing value). Since everybody must die in the oldest age group, subtract zero from Lx to get 100% deaths in the 85+ cohorts.

Ask a Question
Discussion stats
  • 4 replies
  • 157 views
  • 0 likes
  • 5 in conversation