DATA Step, Macro, Functions and more

Concatenate if same month

Reply
Contributor
Posts: 37

Concatenate if same month

I have a dataset with id, release_dare, and a a variable called datenum

Id $4
release_date mmddyy.
Datenum $4

There are several rows per id since each can have many release dates.

I want to create a one row per person with the following:

Id, month1-month24 that each contains the datenum variable for the corresponding month of the release from release_date. If there sre 2 or more on the same month, concatenate the datenum.

So if i have this as input:

Id. Release_date. Datenum
1 03/02/11 1245
2 01/01/11 3423
2 01/10/11 5432

At the end, i want this

Id. month1. month2. month3.
1. 1245
2. 34235432
Valued Guide
Posts: 632

Re: Concatenate if same month

I am not sure what you are asking for.  Your WANT looks like multiple variables for month, but that does not make sense if concatenating.  The following concatenates the DATENUM values into one long character string.

 

data have;
input Id Release_date :mmddyy8. Datenum;
datalines;
1 03/02/11 1245
2 01/01/11 3423
2 01/10/11 5432
run;
proc sort data=have;
   by id release_date;
   run;
data want(keep=id monthnum);
   set have;
   by id;
   length monthnum $40;
   retain monthnum ;
   if first.id then monthnum=' ';
   monthnum=cats(monthnum,datenum);
   if last.id then output want;
   run;

If you actually want the values in separate variables you may want to do a PROC TRANSPOSE.

PROC Star
Posts: 63

Re: Concatenate if same month

Malena,

I I understand you correctly, you want something like this:

data have;
input Id Release_date :mmddyy8. Datenum;
datalines;
1 03/02/11 1245
2 01/01/11 3423
2 01/10/11 5432
run;

proc sort data=have;
  by id Release_date;
run;

Data want;
  do until(last.id);
    set have;
    by id;
    array months (0:23) $20 month1-month24;
    if first.id then 
      date0=Release_date;
    idx=intck('month',Release_date,date0);
    if idx>23 then 
Error 'Date out of range, more than 24 months since first release!'; else
call cats(months(idx),Datenum); end; drop Release_date date0 idx; run;

A couple of notes:

Instead of adding 1 to the array index IDX in every calculation, I zero-based the array (using (0:23) in the array definition). So, in the actual datastep, the month[0] is the month1 variable, etc.

I put in a check for array index out of range, just in case. SAS would also throws an errror message if you referenced the array with an index larger than 24, but I think it is better to get a reasonable informative error message.

 

Regards,

Søren

Ask a Question
Discussion stats
  • 2 replies
  • 122 views
  • 0 likes
  • 3 in conversation