Re: Transpose long to wide with multiple variables & create an indicat...

Sam20001 · Posted 08-04-2022 05:42 PM

Hello,

I have spent a few days reading articles about how to reshape data but cannot seem to figure out what to do. My dataset consists of emergency and hospital data with >18,000 observations. Variables: studyID, visit_type (ED or Hospital visit), visitdatetime (date of visit in datetime format). I also want to create indicator variables between every 2 visits to determine whether the ED visit precedes the hospital visit (0 =No, 1 = Yes). Some subjects had more than 100 visits over the course of 3 years (2018-2021). The data are sorted by studyID and visitdatetime because I need to reshape it so that visitdatetimes are in ascending order for each studyID. This requirement has me confused, so that's why I've turned to this community for help. Thank you.

Reeza · Posted 08-04-2022 05:45 PM

Is this your final goal, or is this something you think you need to get to a final result?
What's the final summary you're looking for? This data structure is possible but cumbersome, I suspect there's an easier to get what you want.

Sam20001 · Posted 08-04-2022 06:19 PM

Thank you for your reply @Reeza . My goal is to exclude an ED visit if it's within 48 hours, 7 days, or 14 days before a hospitalization for each subject. I think I might be able to create the 3 additional variables (within_48hr, within_7, within_14) after I get the data in the right format. I have reinserted photos of the data because I can't see them in my original post. Sorry I'm very new to this forum. I appreciate you help.

Reeza · Posted 08-04-2022 06:31 PM

@Sam20001 wrote:

Thank you for your reply @Reeza . My goal is to exclude an ED visit if it's within 48 hours, 7 days, or 14 days before a hospitalization for each subject.

That's much easier to do via SQL than this approach. Please repost your data as text and we can help you get that coded. No need to flip it to that structure.

Sam20001 · Posted 08-04-2022 07:03 PM

Thank you @Reeza . I have included text data below.

Current data (sorted by studyID and visitdatetime)

studyID visit_type visitdatetime

1 ED 05-22-2018 10:10:00

1 Hospital 06-27-2019 11:00:19

2 Hospital 01-16-2020 12:17:34

2 ED 06-24-2020 08:14:36

3 ED 09-24-2020 09:30:41

3 Hospital 09-24-2020 11:47:12

3 ED 01-10-2021 12:00:01

4 ED 07-16-2019 04:45:36

4 ED 12-22-2020 08:17:40

5 Hospital 06-19-2021 09:12:22

ballardw · Posted 08-04-2022 06:13 PM

" want to create indicator variables between every 2 visits" should likely be clarified.

Do you mean every sequential pair?

Other wise if you have 6 visits then you are looking a 5*4*3*2*1 combinations to get "every pair". Which by the time you play with 100 visits is an extremely large number of combinations (look up (n-1)! , ! meaning factorial Hint: 50! is on the order of 3.0414093E64 )

You might consider exactly what you are saying with that indicator variable. If you have a Hospital record that has 0 for the indicator then you say the Hospital visit does NOT follow an ED visit when in effect it does. What about 2 ED visits followed by Hospital? Would only the second indicator for the ED be set to 1? What if both of the ED visits are on the same date?

Can you provide an example of the type(s) of reports you expect to write with the new data set?

You can sequentially process data and look at previous values using either the LAG functions or Retain values from record to record.

Sam20001 · Posted 08-04-2022 06:47 PM

Hi @ballardw . Yes. I mean every sequential pair. The indicator variable suppose to identify an ED visit that comes before a hospitalization (1 if the ED comes before the hospital, 0 if the ED visit doesn't come before the hospital visit. I think I entered them backwards).

"What about 2 ED visits followed by Hospital? Would only the second indicator for the ED be set to 1?" You are correct. If a subject has 2 ED visits followed by hospital then only the second indicator would be set to 1.

"What if both of the ED visits are on the same date?" The dataset is ordered by studyID and visitdatetime. If the visitdatetime of the ED is before the visitdatetime of the hospital then the indicator would be set to 1.

Thanks for recommending the lag function. I actually tried using lag in the long file so that visitdatetimes of ED and hospital visits would be on the same row for each subject. I thought it worked until my supervisor reviewed the data and stated that the lag function in some cases filled in a visitdatetime from the previous record when it was for a different subject. I have not tried the retain function. I hope this makes sense.

Reeza · Posted 08-04-2022 06:16 PM

PS. If you do need help with programming, its very helpful to post your data as text not images. To work with images we have to either type it in or pass it through a text extractor which can always be problematic and more work.

Ksharp · Posted 08-05-2022 08:01 AM

data have;
input studyID          visit_type  $         visitdatetime $40.;
cards;
1                     ED                    05-22-2018 10:10:00
1                     Hospital            06-27-2019 11:00:19
2                     Hospital            01-16-2020 12:17:34
2                     ED                    06-24-2020 08:14:36
3                     ED                    09-24-2020 09:30:41
3                     Hospital            09-24-2020 11:47:12
3                     ED                    01-10-2021 12:00:01
3                     Hospital            09-24-2022 11:47:12
4                     ED                    07-16-2019  04:45:36
4                     ED                   12-22-2020 08:17:40
5                     Hospital           06-19-2021 09:12:22
;

proc sql noprint;
select max(n) into :n separated by ' '
 from (select count(*) as n from have group by studyID);

select ceil(max(n)/2) into :n2 separated by ' '
 from (select count(*) as n from have group by studyID);
quit;

proc summary data=have ;
by studyID;
output out=temp(drop=_:) idgroup(out[&n.] (visit_type visitdatetime)=);
run;

data want;
 set temp;
 array x{*} visit_type:;
 array y{*} ED_before_Hosp1-ED_before_Hosp&n2. ;
 n=0;
 do i=1 to dim(x) by 2;
   n+1;
   if i+1<=dim(x) then do;
     if x{i}='ED' and x{i+1}='Hospital' then y{n}=1;
       else  y{n}=0;
   end;
 end;
 drop n i;
run;

Sam20001 · Posted 08-05-2022 10:04 AM

Thank you very much for your help @Ksharp . However when I run the PROC Summary portion of the code, I get this error the following error message: "the parameter value 206 is not within the required range of 1 and 100." The full log is below. Is there another option? Thank you.

proc sql noprint;
select max(n) into :n separated by ' '
from (select count(*) as n from test group by studyID);

select ceil(max(n)/2) into :n2 separated by ' '
from (select count(*) as n from test group by studyID);
quit;

proc summary data=test;
by studyID;
output out=temp(drop=_:) idgroup(out[&n.] (visit_type visitdatetime)=);
run;

/*Log*/
137 proc sql noprint;
138 select max(n) into :n separated by ' '
139 from (select count(*) as n from test group by studyID);
140 select ceil(max(n)/2) into :n2 separated by ' '
141 from (select count(*) as n from test group by studyID);
142 quit;
/*NOTE: PROCEDURE SQL used (Total process time):*/
/* real time 28.14 seconds*/
/* cpu time 0.10 seconds*/

143 proc summary data=test ;
144 by studyID;
145 output out=temp(drop=_:) idgroup(out[&n.] (visit_type visitdatetime)=);
SYMBOLGEN: Macro variable N resolves to 206
ERROR: The parameter value 206 is not within the required range of 1 and 100.
146 run;

Ksharp · Posted 08-06-2022 05:50 AM

Sure. Of course.

data have;
input studyID          visit_type  $         visitdatetime $40.;
cards;
1                     ED                    05-22-2018 10:10:00
1                     Hospital            06-27-2019 11:00:19
2                     Hospital            01-16-2020 12:17:34
2                     ED                    06-24-2020 08:14:36
3                     ED                    09-24-2020 09:30:41
3                     Hospital            09-24-2020 11:47:12
3                     ED                    01-10-2021 12:00:01
3                     Hospital            09-24-2022 11:47:12
4                     ED                    07-16-2019  04:45:36
4                     ED                   12-22-2020 08:17:40
5                     Hospital           06-19-2021 09:12:22
;




data have;
 set have;
 by studyID;
 if first.studyID then n=0;
 n+1;
run;

proc sql noprint;
select max(n) into :n separated by ' '
 from (select count(*) as n from have group by studyID);

select ceil(max(n)/2) into :n2 separated by ' '
 from (select count(*) as n from have group by studyID);

select distinct catt('have(where=(n=',n,') 
 rename=(visit_type=visit_type_',n,' visitdatetime=visitdatetime_',n,'))')
into :merge separated by ' ' 
 from have ;
quit;

data temp;
 merge &merge.;
 by studyID;
 drop n;
run;


data want;
 set temp;
 array x{*} visit_type:;
 array y{*} ED_before_Hosp1-ED_before_Hosp&n2. ;
 n=0;
 do i=1 to dim(x) by 2;
   n+1;
   if i+1<=dim(x) then do;
     if x{i}='ED' and x{i+1}='Hospital' then y{n}=1;
       else  y{n}=0;
   end;
 end;
 drop n i;
run;

Transpose long to wide with multiple variables & create an indicator variable

Re: Transpose long to wide with multiple variables & create an indicator variable

Re: Transpose long to wide with multiple variables & create an indicator variable

Re: Transpose long to wide with multiple variables & create an indicator variable

Re: Transpose long to wide with multiple variables & create an indicator variable

Re: Transpose long to wide with multiple variables & create an indicator variable

Re: Transpose long to wide with multiple variables & create an indicator variable

Re: Transpose long to wide with multiple variables & create an indicator variable

Re: Transpose long to wide with multiple variables & create an indicator variable

Re: Transpose long to wide with multiple variables & create an indicator variable

Re: Transpose long to wide with multiple variables & create an indicator variable

Catch up on SAS Innovate 2026

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away