Retain + Looping among multiple row records

Accepted Solution Solved
Reply
Occasional Contributor LLW
Occasional Contributor
Posts: 15
Accepted Solution

Retain + Looping among multiple row records

Hi everyone,

I'm a student who is relatively new to SAS, and I haven't been able to solve this problem. Thank you in advance!! I have a dataset composed of a couple hundred thousand entries grouped by ID. I want to be able to use the retain function by id and for each of the entries.

An example of my dataset:

IDDaystoEvent
A100
A301
A499
B200
B250
C100

What I would like in my dataset:
IDDaystoEvent1_DaystoEvent2_DaystoEvent3_DaystoEvent
A100100301499
A301100301499
A499100301499
B200200250.
B250200250.
C100100..

Basically, I'd like to use the "retain" function for the variable DaystoEvent for ALL the row records by "ID". I cannot create a simple array because each "ID" has a different number of entries, and my dataset is over 300,000 "Ids". I have figured out how to use the retain function for the first.ID (example below), but I cannot figure out how to do it for all records by ID.

data work;

set work;

by I

Retain R_Daystoevent;

If first.ID then do;

R_Daystoevent=Daystoevent;

end;

Thank you!

Louise


Accepted Solutions
Solution
‎04-07-2015 12:44 AM
Frequent Contributor
Posts: 115

Re: Retain + Looping among multiple row records

data work;

  input ID $ DaystoEvent;

  cards;

A 100

A 301

A 499

B 200

B 250

C 100

;


proc sql; /* copied from Reeza's*/
select max(count) into :max_count separated by ' ' from
(select count(ID) as count from work
  group by ID);
quit;


data want;
do _n_=1 by 1 until(last.id);
  set work;
  by id;
  array days(*) Daystoevent1-Daystoevent&max_count;
  days(_n_)=DaystoEvent;
end;

do until(last.id);
  set work;
  by id;
  output;
end;
run;

Regards,

Naveen Srinivasan

L&T Infotech

View solution in original post


All Replies
Grand Advisor
Posts: 17,444

Re: Retain + Looping among multiple row records

Do you absolutely need the second column?

If not proc transpose is a direct solution.


I also recommend naming your variables Days_to_event1 Days_to_event2. This way you can use the shortcut reference for variables if required.


proc sort data=have; by ID daystoEvent; run;


proc transpose data=have out=want prefix=Days_to_event;

by ID;

var DaysToEvent;

run;

Occasional Contributor LLW
Occasional Contributor
Posts: 15

Re: Retain + Looping among multiple row records

Hi Fareeza,

Thank you so much--really appreciate it. Unfortunately, I need to keep the second variable (DaysToEvent) because I need to take the difference between DaytoEvent_1- DaystoEvent, DaystoEvent_2-DaystoEvent, etc. Is there a way to keep the DaystoEvent column?

Thank you so much,

Louise

Grand Advisor
Posts: 17,444

Re: Retain + Looping among multiple row records

I think this is the same: DaysToEvent(n)-daystoevent(1) be the same  as DaysToEvent(n)-DaysToEvent?


*Get max count of records;

proc sql;

select max(count) into :max_count from

(select count(ID) as count from Have

group by ID);

quit;


*Transpose;

proc sort data=have; by ID daystoEvent; run;


proc transpose data=have out=want prefix=Days_to_event;

by ID;

var DaysToEvent;

run;


*calculate differences;


data want2;

set want;

array days2(*) days_to_event1-days_to_event&max_count;

array diff(*) days_diff1-days_diff&max_count;


do i=1 to dim(days2);

diff(i)=days2(i)-days2(1);

end;

run;

Grand Advisor
Posts: 9,596

Re: Retain + Looping among multiple row records

Did I miss something ? Why not just a simple MERGE ?

data have;
  input ID $ DaystoEvent;
  cards;
A 100
A 301
A 499
B 200
B 250
C 100
;
run;

proc transpose data=have out=temp(drop=_name_) prefix=Days_to_event;
by ID;
var DaysToEvent;
run;
data want;
 merge have temp;
 by id;
run;

Xia Keshan

Esteemed Advisor
Posts: 7,300

Re: Retain + Looping among multiple row records

Similar to Fareeza's code, but producing the extra lines as well:

data work;

  input ID $ DaystoEvent;

  cards;

A 100

A 301

A 499

B 200

B 250

C 100

;

proc transpose data=work out=need(drop=_Smiley Happy prefix=DaystoEvent;

  by id;

run;

data want (drop=counter);

  length id $5;

  format daystoevent best12.;

  set need;

  by ID;

  array days(3) DaystoEvent1-DaystoEvent3;

  counter=1;

  do while(counter le 3 and not missing(days(counter)));

    Daystoevent=Days(counter);

    counter+1;

    output;

  end;

run;

Occasional Contributor LLW
Occasional Contributor
Posts: 15

Re: Retain + Looping among multiple row records

Hi Arthur,

Thank you so much! Is there a way to make the array number for DaystoEvent a variable as opposed to just 3? This is because in my sample data, I know that the largest number of records per ID is 3. However because I have such a large dataset (300,000 IDs) with varied #'s of records per ID, I'm not sure sure how many variables of DaystoEvent I will need for the array in the actual dataset.

Thank you so much,

Louise

Esteemed Advisor
Posts: 7,300

Re: Retain + Looping among multiple row records

You could use:

data work;

  input ID $ DaystoEvent;

  cards;

A 100

A 301

A 499

B 200

B 250

C 100

;

proc transpose data=work out=need(drop=_Smiley Happy prefix=DaystoEvent_;

  by id;

run;

data want (drop=counter);

  length id $5;

  format daystoevent best12.;

  set need;

  by ID;

  array days(*) DaystoEvent_:;

  counter=1;

  do while(counter le dim(days) and not missing(days(counter)));

    Daystoevent=Days(counter);

    counter+1;

    output;

  end;

run;

Solution
‎04-07-2015 12:44 AM
Frequent Contributor
Posts: 115

Re: Retain + Looping among multiple row records

data work;

  input ID $ DaystoEvent;

  cards;

A 100

A 301

A 499

B 200

B 250

C 100

;


proc sql; /* copied from Reeza's*/
select max(count) into :max_count separated by ' ' from
(select count(ID) as count from work
  group by ID);
quit;


data want;
do _n_=1 by 1 until(last.id);
  set work;
  by id;
  array days(*) Daystoevent1-Daystoevent&max_count;
  days(_n_)=DaystoEvent;
end;

do until(last.id);
  set work;
  by id;
  output;
end;
run;

Regards,

Naveen Srinivasan

L&T Infotech

Esteemed Advisor
Esteemed Advisor
Posts: 7,253

Re: Retain + Looping among multiple row records

Hi,

Sorry, still not clear why you need a transposed dataset.  As you state, there could be many per id, so you could end up with thousands of columns.

Why not just assign a number in the dataset, then using merging to get days from:

data have;

  input ID $ DaystoEvent;

datalines;

A 100

A 301

A 499

B 200

B 250

C 100

;

run;

data have;

  set have;

  by id;

  retain no;

  if _n_=1 or first.id then no=1;

  else no=no+1;

run;

proc sql;

  create table WANT as

  select  A.ID,

          A.NO,

          A.DAYSTOEVENT,

          A.DAYSTOEVENT - B.DAYSTOEVENT as RES

  from    WORK.HAVE A

  left join (select * from WORK.HAVE where NO=1) B

  on      A.ID=B.ID

  order by A.ID,

           A.NO;

quit;

Respected Advisor
Posts: 5,001

Re: Retain + Looping among multiple row records

At least 90% of the time, this sort of restructuring of the data is not necessary.  If you were to post a different question, describing the analysis you would like to perform, you would likely receive many answers that utilize your original structure to the data.

Good luck.

Occasional Contributor LLW
Occasional Contributor
Posts: 15

Re: Retain + Looping among multiple row records

Hi everyone!

This was my first time posting a question, and thank you to everyone for your help. I have received so many different ways of getting what I wanted. Part of the reason that I wanted all the "DaysToEvents" in columns was because I need to find the difference between all combinations of the difference of DaystoEvents by ID, and I thought that the best way to take differences was to use columns. But Astounding is right, I will have very many columns of new DaystoEvents (233 to be exact!).

I will think about how to move forward with the analysis that I want using this data structure, given the large columns.

Thank you to everyone!

Louise

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 380 views
  • 10 likes
  • 7 in conversation