summerkk
Calcite | Level 5

Hi all,

I have a set of survey data which gives each person's personal identification code (let's call this variable ID for now). Most people were called more than once, and each call was added as a separate entry in the data set. As a result, there are many duplicate identification codes, each with different values for the other variables (the questions they were asked).

I collapsed the ID variable so each unique ID only shows up once in the data set.

I created a variable (ref_teen_sex_questions_firstdate) that should hold the date (taken from the variable calldatetime) of the first record within an ID where CALL_RESULT_CODE (another variable) = 'RT'.

Here is the code I have written:

data new (keep=ID calldatetime ref_teen_sex_questions);
  set old;
  by ID;
  retain ref_teen_sex_questions;
  if first.ID then do;
  end;
  if CALL_RESULT_CODE='RT' then do;
    ref_teen_sex_questions + 1;
  end;
  if CALL_RESULT_CODE NE 'RT' then do;
    ref_teen_sex_questions=0;
  end;
  if last.ID then output;
run;

data newer (keep=ID ref_teen_sex_questions_firstdate);
  set new;
  if ref_teen_sex_questions>0 then ref_teen_sex_questions_firstdate=calldatetime;
  if ref_teen_sex_questions=0 then ref_teen_sex_questions_firstdate=',';
run;

My question is this: when I run PROC FREQ on CALL_RESULT_CODE='RT' in my original data set, I get observations. However, after I create this variable, there are no observations in the new data set. I have used this same code to create other variables for other values of CALL_RESULT_CODE and nothing went wrong. It seems that only this value of CALL_RESULT_CODE is giving me problems.

Does anyone know why this would be?


Thanks!!

5 REPLIES
art297
Opal | Level 21

I'd hope the following was just a typo:

by ID;

retain ref_teen_sex_questions;

if first.BASEID then do;


If not, you should have gotten an error, or at least an "uninitialized variable" note in the log, as first. can ONLY be used with a variable identified in a BY statement.


summerkk
Calcite | Level 5

Sorry, you are right. It was a typo in the question, but not in my code. The actual variable name is BASEID; I just wanted to shorten it for this post.

Tom
Super User

Your logic doesn't make any sense.  The IF FIRST.... DO... END; does nothing since there are no statements between the DO and END.

This should get you the new variable, and you can merge it onto your list of valid IDs to fill in those that did NOT have a date.

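data newer ;
  keep id calldatetime;
  rename calldatetime = ref_teen_sex_questions_firstdate ;
  /* OLD must already be sorted by id and calldatetime */
  set old ;
  by id calldatetime ;
  /* WHERE filters before BY processing, so FIRST.ID is the earliest RT call */
  where CALL_RESULT_CODE='RT';
  if first.id ;
run;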
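
The merge mentioned above could look something like the following sketch. It assumes the step above has been run and that its output is still named newer. The data set names all_ids and want are made up for illustration.

```sas
/* One row per ID, taken from the original data (all_ids is a hypothetical name) */
proc sort data=old(keep=id) out=all_ids nodupkey;
  by id;
run;

/* Attach the first-RT date to the full ID list.                 */
/* IDs with no RT call stay in WANT with a missing date value.   */
data want;
  merge all_ids(in=in_all) newer;
  by id;
  if in_all;
run;
```

Both inputs are in id order here (all_ids from PROC SORT, newer from the BY-group step), which the MERGE with BY requires.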

summerkk
Calcite | Level 5

Thanks!

However, because of the WHERE statement, it deletes all the IDs which don't have CALL_RESULT_CODE='RT'. I would like to keep those IDs and have ref_teen_sex_questions_firstdate show as missing (.) for them.

Tom
Super User

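data newer ;
  keep id ref_teen_sex_questions_firstdate ;
  retain ref_teen_sex_questions_firstdate ;
  /* DATE9. assumes calldatetime holds date values. If it holds datetime values, use a datetime format such as DATETIME20. instead */
  format ref_teen_sex_questions_firstdate DATE9. ;
  set old ;
  by id calldatetime ;
  /* reset at the start of each ID so the retained value does not carry over */
  if first.id then ref_teen_sex_questions_firstdate = . ;
  /* capture only the first RT call for this ID */
  if CALL_RESULT_CODE='RT' and ref_teen_sex_questions_firstdate = . then ref_teen_sex_questions_firstdate = calldatetime;
  /* one row per ID, with a missing date when there was no RT call */
  if last.id ;
run;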
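
For anyone who wants to try the step above, here is a small made-up input to run first. The values are invented for illustration, and calldatetime is assumed to hold plain SAS date values, which is what the DATE9. format in the step above expects.

```sas
/* Made-up test data, already in id / calldatetime order. */
/* ID 1 has two RT calls, ID 2 has none.                  */
data old;
  input id call_result_code $ calldatetime :date9.;
  format calldatetime date9.;
  datalines;
1 NA 01JAN2012
1 RT 05JAN2012
1 RT 09JAN2012
1 CO 12JAN2012
2 NA 02JAN2012
2 CO 03JAN2012
;
run;
```

With this input, the step above should return one row per ID: 05JAN2012 for ID 1 (the earlier of its two RT calls) and a missing value for ID 2.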
