Hi Everyone
I have a very simple issue, but I can't see what the problem is. I have a data step that uses two 'if' statements with 'missing' to check whether a variable is populated, and sets a 'y' or 'n' flag in another variable based on this.
When I run this, the 'conferenced' variable always comes out 'n' when this is clearly not the case: there are many instances of 'ConfDate' having a date populated. Is there something really simple I am missing?
Paul
data tendayconf4
(rename=(cnty_name=County docno1=Docket start=PlaceStart stop=PlaceStop durat=TimeInCare
filing1=FilingDate adjud1=AdjudDate dur_adjud1=FilingToAdjud
dispo1=DispoDate dur_dispo1=FilingToDispo issuejoindate=IJDate
appearance_date=ConfDate IjDateConfDate=IJDateToConfDate Col3=Part));
length cnty_name $ 50 docno1 $ 20 CohortYear 4 start 8 stop 8 Status $ 50
durat 8 filing1 8 adjud1 8 dur_adjud1 8 dispo1 8
dur_dispo1 8 issuejoindate 8 appearance_date 8
IjDateConfDate 8 Col3 $ 50;
set tendayconf3;
if missing(ConfDate) then Conferenced="N";
if not missing(ConfDate) then Conferenced="Y";
run;
You only "generate" the ConfDate variable when the step writes to the output dataset tendayconf4 (by renaming appearance_date).
"Inside" the data step the variable is still called appearance_date. The way you wrote those statements, ConfDate is a new variable that is always missing (because it most probably is not present in the input dataset), so Conferenced is always "N".
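One way to act on this without moving the rename is to test the original name inside the step. This is only a minimal sketch: it shows a single rename for brevity (the other renames from the question would stay in the same option list), and it assumes tendayconf3 contains appearance_date as in the question.

```sas
/* Sketch: keep RENAME= on the output dataset, but test the name
   that actually exists while the step is running. */
data tendayconf4(rename=(appearance_date=ConfDate));
  set tendayconf3;
  /* Inside the step the variable is still called appearance_date */
  if missing(appearance_date) then Conferenced = "N";
  else Conferenced = "Y";
run;
```

The output dataset still ends up with a ConfDate column, because the rename is applied as each observation is written out.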
Thank you Kurt!! I am so stupid. Thanks again!
Paul
You are renaming variables in the wrong place. Put the rename on the input dataset in the SET statement:
data tendayconf4;
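/* RENAME= on SET is applied before the variables reach the PDV, so LENGTH must use the new names */
length County $ 50 Docket $ 20 CohortYear 4 PlaceStart 8 PlaceStop 8 Status $ 50 TimeInCare 8 FilingDate 8 AdjudDate 8 FilingToAdjud 8 DispoDate 8
FilingToDispo 8 IJDate 8 ConfDate 8 IJDateToConfDate 8 Part $ 50;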
set tendayconf3(rename=(cnty_name=County docno1=Docket start=PlaceStart stop=PlaceStop durat=TimeInCare
filing1=FilingDate adjud1=AdjudDate dur_adjud1=FilingToAdjud
dispo1=DispoDate dur_dispo1=FilingToDispo issuejoindate=IJDate
appearance_date=ConfDate IjDateConfDate=IJDateToConfDate Col3=Part));
if missing(ConfDate) then Conferenced="N";
if not missing(ConfDate) then Conferenced="Y";
run;
Hi,
Sorry for the bother, but I just thought of adding an extra point to aid understanding. Following Paigemiller's and Kurt's responses, I wanted to mention what happens in the PDV (program data vector) when SAS reads variables from an input dataset and when it writes variables to the output dataset. When you use a dataset option such as rename= or drop= on the set statement, SAS applies it (in your case, renames the variables) before the variables are brought into the PDV. When you use the same option on the data statement, SAS applies it only as each observation is written to the output dataset, so inside the step the variables keep their old names.
Also, dataset options on the input dataset are powerful compared with the rename/drop statements in the data step: they let you filter and rename at an early stage, before the variables are read into the PDV, which can save execution time.
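To make the timing concrete, here is a small sketch on made-up data. The dataset names have and want and the variables id and unused_note are hypothetical and only there for illustration. Note that when keep= and rename= appear on the same dataset, keep= is applied first, so it must use the old name.

```sas
/* Hypothetical toy input: one populated date, one missing */
data have;
  length unused_note $ 10;
  format appearance_date date9.;
  id = 1; appearance_date = '05JAN2024'd; unused_note = "a"; output;
  id = 2; appearance_date = .;            unused_note = "b"; output;
run;

/* Input-side options: unused_note never enters the PDV, and the
   date variable arrives already renamed to ConfDate. */
data want;
  set have(keep=id appearance_date
           rename=(appearance_date=ConfDate));
  if missing(ConfDate) then Conferenced = "N";
  else Conferenced = "Y";
run;

proc print data=want;
run;
```

With this input, id 1 should be flagged "Y" and id 2 "N", since ConfDate is a real variable in the PDV by the time the 'if' statements run.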
I hope that helps,