BookmarkSubscribeRSS Feed
harleyworm
Calcite | Level 5

HI everyone, thanks in advance for your help!!

I am working with call history data from a survey. A household may be called several times throughout the survey, and I need to generate a variable to capture:

Given that the household completed the adult screener interview, how many calls did it take?

Therefore, I need to know a) whether the household completed the adult screener interview at all, and b) if so, how many calls did it take before they completed it.

The variables are CALL_CODE= 'CS' (stands for completed screener), and the household ID is called ID.

I know I need to do something with retain, first, and last, but I am stuck.

I hope this wasn't too unclear!

14 REPLIES 14
Reeza
Super User

I think you do as well. However without some sample data, input and expected output it's hard to say more Smiley Happy

harleyworm
Calcite | Level 5

how about for example;

ID          CALL_CODE

1              CA

1              IO

1              CS

How would I count that it took two calls within this ID to get 'CS'?

Astounding
PROC Star

Why would you say that it took 2 calls?  It looks like it took 3 calls.  Do you want to stop counting at the first CS?

Reeza
Super User

You'll have to provide more cases. What happens if CS is the first occurrence? The second and there are calls after?

If its only the last then count the number per ID and subtract one.

harleyworm
Calcite | Level 5

Thanks so much! I will try the code. However, will the 'then delete' part delete all the observations within the household prior to 'CS'? I would like to keep these because I am going to use them to construct other variables later.

morgalr
Obsidian | Level 7

Please forgive the unchecked code, but I'm just about out the door for the night.

data completed_calls;

set myData;

if call_code ^= 'CS' then delete;

keep ID;

run;

proc sql;

  create table myAnswer as

    select ID, count(1) as call_count from (select t1.ID from myData t1 join completed_calls t2 on t1.ID = t2.ID) group by ID;

quit;

Assumption: call_code = 'CS' will show up on a valid call ID line, if not, then just subtract 1 from the count in the SQL Code. You may have to create your destination table for the sql  first depending on your data source, then use a insert/select sql statement.

harleyworm
Calcite | Level 5

Thanks for the prompt replies! Sorry about the confusing example earlier, I want to count cases up to and including 'CS'. Since there could be as many as 40 calls to the same household, 'CS' could occur in the first call, the last, or anywhere in the middle. I would like to have a variable that counts all the calls up to and including the first occurrence of 'CS'. For example:

ID    CALL_CODE           NEWVARIABLE

1          IO                             5

1          CA                            5

1         CC                            5

1          C1                            5

1          CS                           5

1          R1                            5

2          CS                            1

2          NA                            1

2          CA                            1

2           IO                            1

2          GA                            1

2           CC                           1

2           TO                            1

Reeza
Super User

There are many ways, but this is an easy to understand method - first find the CS and then merge the results back in.

Also, you say 5 here for when you identify CS, but previously you would have counted it as 4.

data have;

input ID    CALL_CODE  $;

cards;

1          IO        

1          CA        

1         CC         

1          C1        

1          CS        

1          R1        

2          CS        

2          NA        

2          CA        

2           IO       

2          GA        

2           CC       

2           TO

;

run;

data CS;

set have;

by id;

retain count;

if first.id then do;

  count=0;

end;

count+1;

if call_code='CS';

run;

data want;

merge have cs;

by id;

run;

harleyworm
Calcite | Level 5

Thanks so much!! this worked. Now to piggyback on this, how would I code if I wanted to count how many calls up to and including CALL_CODE='2' or '02' or 'R1'? That is, I would like to stop counting once the CALL_CODE is in '2' or '02' or 'R1' within the ID.

Would it be wrong to do....

data CS;

set have;

by id;

retain count;

if first.id then do;

  count=0;

end;

count+1;

if call_code in ('2' '02' 'R1');

??

Reeza
Super User

Post sample data/output.

harleyworm
Calcite | Level 5

My specification: create variable that counts number of calls up to and including first instance of CALL_CODE='2', '02', or 'R1

For example:

ID        CALL_CODE      NEW VARIABLE

1              IO                         4

1             CC                        4

1             TO                         4

1             2                            4

1             R1                         4

2             TI                          2

2             02                          2

2             2                           2

2             R1                         2

2             CS                        2

2            CA                         2

Reeza
Super User

If you have multiple occurrences in your data you'll get more than 1 record per ID for your CS data set.

Add in a data step before the merge to keep the first per ID.

harleyworm
Calcite | Level 5

Thanks so much! I got it to work!

Ksharp
Super User

Modify Reeza's code a little bit.

data CS;

set have;

by id;

retain found ;

if first.id then do;

  count=0;found=0;

end;

count+1;

if not found and call_code in ('2'  '02' 'R1') then do;

found=1;

  output;

end;

run;

data want;

merge have cs;

by id;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 14 replies
  • 2543 views
  • 10 likes
  • 5 in conversation