The first dataset in the UPDATE statement, ds(obs=0), is an empty (zero-observation) version of the dataset ds: it has all the variables of ds but no observations. It is used as the "master file" (as it is called in the documentation). The second dataset in the UPDATE statement, ds, i.e., the full version of your input data with four observations containing missing or non-missing values of the variables, serves as the "transaction dataset" in the update process.
This means: For each BY group (i.e., each group of observations with the same value of variable id), the first observation read from the transaction dataset becomes the basis for an observation in the output dataset WANT, because there is no matching observation in the master dataset. (Otherwise, the existing observation in the master dataset for that BY group would be "updated" with the first observation of the transaction dataset.) The update process then continues with the second observation of the transaction dataset: non-missing values (here, the value of variable date2) replace the existing values of the same variable (here, the missing value of date2 from the first observation). Missing values, however, do not overwrite existing values. This is why the missing value of date1 in the second observation of each BY group leaves the existing non-missing value (copied from the first observation) unchanged. After the last observation of a BY group has been processed, the observation built up by the update process is written to dataset WANT. So, for each ID, variable date1 contains the last non-missing date1 value read from the transaction dataset and, similarly, date2 contains the last non-missing date2 value.
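The process described above can be tried out with a small sketch. The date values below are made up for illustration (only the structure, four observations with id, date1 and date2, follows the description), and the second DATA step, want_manual, is a hypothetical hand-written emulation that should give the same result for this data. It is meant only to make the logic visible, not to show how SAS implements UPDATE internally.

```sas
/* Hypothetical sample data: two IDs, two observations each.       */
/* The date values are invented; "." denotes a missing value.      */
data ds;
  input id date1 :date9. date2 :date9.;
  format date1 date2 date9.;
  datalines;
1 01JAN2020 .
1 . 15FEB2020
2 10MAR2020 .
2 . 20APR2020
;
run;

/* UPDATE with an empty master: ds must be sorted by id.           */
data want;
  update ds(obs=0) ds;
  by id;
run;

/* Illustrative emulation of the same logic with SET/BY/RETAIN.    */
/* The names want_manual, _d1 and _d2 are assumptions of this      */
/* sketch.                                                         */
data want_manual;
  length date1 date2 8;
  format date1 date2 date9.;
  retain date1 date2;
  set ds(rename=(date1=_d1 date2=_d2));
  by id;
  /* Start each BY group with missing values, like an empty master */
  if first.id then call missing(date1, date2);
  /* Only non-missing transaction values overwrite what is held    */
  if not missing(_d1) then date1 = _d1;
  if not missing(_d2) then date2 = _d2;
  /* One output observation per BY group, after its last record    */
  if last.id then output;
  drop _d1 _d2;
run;

proc print data=want noobs;
run;
/* Expected for this sample data: one observation per id,          */
/*   id=1  date1=01JAN2020  date2=15FEB2020                        */
/*   id=2  date1=10MAR2020  date2=20APR2020                        */
```

The UPDATE version is shorter and handles any number of variables without listing them, which is why it is convenient for this kind of "collapse to the last non-missing value per ID" task.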
See "DATA Step Processing with the UPDATE Statement" for more details, and also the examples in the documentation of the UPDATE statement.