SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Transpose then merge the two data sets

Reply
Occasional Contributor
Posts: 17

Transpose then merge the two data sets

data offenders;
   input  id   offence_DATE mmddyy10. sex$;
   format offence_DATE  mmddyy10.;
   datalines;
1 05/23/05  M 
1 08/01/09  M
2 04/18/01  F
2 06/10/12  F
3 01/01/11  M
;
run;


proc transpose data=offenders out=offenders1 (drop=_name_ _label_) prefix=offence_DATE;
 by id;
 var offence_DATE;
run; 
 


data HOSPITAL;
   input  id   ADMIT_date mmddyy10. ;
   format ADMIT_date mmddyy10.;
   datalines;
1 10/21/03      
1 06/29/05      
1 02/03/07 
1 09/21/08
1 08/17/09         
3 12/31/10      
3 01/02/13      
;
run;

proc transpose data=HOSPITAL out=HOSPITAL1 (drop=_name_ _label_) prefix=ADMIT_date;
 by id;
 var ADMIT_date;
run; 

I tried the following code, but  it doesn't work because I have a multiple admit_date

 

/*

data want;

merge offenders1 HOSPITAL1;

where i=1 to end(admit_date);

by id;

if offence_DATE1 < ADMIT_date_i  < offence_DATE2 then flag_admit=1;

else flag_admit=0;

run;

*/

The result will be one row for each id such as

id    offence_date1      offence_date2      admit_date1  admit_date2   admit_date3...

1

2

3

 

Could you please suggest for me the best method for doing it.

PROC Star
Posts: 618

Re: Transpose then merge the two data sets

Posted in reply to elias_2020

May i ask what does this   where i=1 to end(admit_date); statement accomplish?

 

/*

data want;

merge offenders1 HOSPITAL1;

where i=1 to end(admit_date);

by id;

if offence_DATE1 < ADMIT_date_i  < offence_DATE2 then flag_admit=1;

else flag_admit=0;

run;

*/

The result will be one row for each id such as

id    offence_date1      offence_date2      admit_date1  admit_date2   admit_date3...

1

2

3

Contributor
Posts: 32

Re: Transpose then merge the two data sets

Posted in reply to novinosrin

I am also quite mystified as to what you're aiming to achieve with the statement:

where i=1 to end(admit_date);

 

The variable i doesn't appear in either of your input datasets.  My guess is you're trying to analyse the correspondence between offence_date and offence_date using array processing, but don't have an array statement.  In any case, array are not good at handling highly variable data (you need to define a fixed number of array elements in the datastep).

 

It is a good idea when asking for technical assistance in a forum like this, to step back and describe your problem in business terms rather than in technical details of the approach you're first attempting.  This allows others to offer solutions that may be better able to solve your underlying problem.

 

If you are trying to analyse the correspondence between offence_date and offence_date, then I'd suggest renaming, or duplicating the date variables (to event_date ?), merging by id event_date, and maybe using lag functions to assist in identifying the correspondence between offence and hospital admission.

 

Hope this helps.

Occasional Contributor
Posts: 17

Re: Transpose then merge the two data sets

Posted in reply to DaveBirch

Could you please show me how I can do it.

Contributor
Posts: 32

Re: Transpose then merge the two data sets

Posted in reply to elias_2020

You could try something like this:

 

proc sort data=offenders;
  by id offence_date;
run;
data offenders1;
  set offenders;
  by id offence_date;
  prev_offence_date = lag(offence_date);
  if first.id then delete;
  /* assuming only interested in offenders with multiple offences */
  /* otherwise we may need the following line and others */
  /*if first.id then prev_offence_date = .;*/
run;

 

proc sort data=hospital;
  by id admit_date;
run;

 

/* SQL handles M:N instances better than data step MERGE */
proc sql noprint;
  create table want1 as
  select id, prev_offence_date, offence_date,
     admit_date, sex
  from offenders1 t1
  left join hospital t2
  on (t1.id eq t2.id)
  where (t1.prev_offence_date le t2.admit_date lt t1.offence_date)
  order by t1.id, t1.offence_date, t2.admit_date;
quit;

 

You now have a dataset with a row for every hospital admission that occurs "between" offences for offenders.  This is in many ways easier to analyse (and more flexible) than a more "flattened" structure.  If you do desire to flatten it, there are many business issues still to be resolved to determine the best approach.

 

Cheers 

Super User
Posts: 11,793

Re: Transpose then merge the two data sets

[ Edited ]
Posted in reply to elias_2020

You might describe what you are attempting to accomplish.

Combining two data sets has many approaches depending on what the desired result may be.

 

Hopefully your result is small enough that you could post a third data set showing what the result should look like and describe the purpose.

Such as should anything special be done with records with a hospital date but not and offense date?

Offense date without hospital date? What about a hospital date between multiple offenses?

 

Since it appears that you may be wanting to compare multiple values of offense and admin dates then you likely want one array to hold each of those. And since it would not make sense to have a single flag to hold all of the comparisons you likely need another array to hold the flags. But to make a more concrete example we really should see what the full set of comparison results would look like.

 

Occasional Contributor
Posts: 17

Re: Transpose then merge the two data sets

[ Edited ]
Posted in reply to elias_2020

sorry, but I'm trying to make as loop

after transpose the data from multiple row for each id to multiple column (one row)  for each id.

I have such as ADMIT_date1  ADMIT_date2  ADMIT_date3  ....

I would like to know if the offender has  any admission to the hospital between the first offence and the second offence.

if YES, when ? how many times ?

 

 

Thanks

 

 

PROC Star
Posts: 618

Re: Transpose then merge the two data sets

[ Edited ]
Posted in reply to elias_2020

Please give us(the community-you included) a sample data set of your input and a sample data set of your wanted output. We can help you with the code you may require

PS give a brief description of the logic for us to implement

Occasional Contributor
Posts: 17

Re: Transpose then merge the two data sets

[ Edited ]
Posted in reply to novinosrin
data offenders;
   input  id   offence_DATE mmddyy10. sex$;
   format offence_DATE  mmddyy10.;
   datalines;
1 05/23/05  M 
1 08/01/09  M
2 04/18/01  F
2 06/10/12  F
3 01/01/11  M
;
run;


proc transpose data=offenders out=offenders1 (drop=_name_ _label_) prefix=offence_DATE;
 by id;
 var offence_DATE;
run; 
 


data HOSPITAL;
   input  id   ADMIT_date mmddyy10. ;
   format ADMIT_date mmddyy10.;
   datalines;
1 10/21/03      
1 06/29/05      
1 02/03/07 
1 12/28/09         
3 12/31/10      
3 01/02/13      
;
run;

proc transpose data=HOSPITAL out=HOSPITAL1 (drop=_name_ _label_) prefix=ADMIT_date;
 by id;
 var ADMIT_date;
run; 




the result shop be look like as below:

 

id    offence_DATE1   offence_DATE2    ADMIT_date1   ADMIT_date2   ADMIT_date3     contact between offences      #

1      05/23/05                08/01/09                 10/21/03          06/29/05              12/28/09                       yes                          1

2       04/18/01               06/10/11                   .                       .                          .                                   no                             0

3      01/01/11                  .                           12/31/10              01/02/13            .                                    yes                          1

 

 

 

 

/*** # means the count number of admission to the hospital between the two offences if the offender has two or after the first offence when the offender didn't have the second offence*/

 

 

Thank you

Super User
Super User
Posts: 7,255

Re: Transpose then merge the two data sets

Posted in reply to elias_2020

It is probably going to be easier to combine the files in their original format instead of the transposed format. Perhaps something like this?

data want ;
  row+1;
  length id date_no 8;
  set offenders(in=in1 rename=(offence_date=DATE))
      hospital (in=in2 rename=(admit_date=DATE))
  ;
  by id date ;
  if in1 then offence_date=date ;
  if in2 then admit_date=date ;
  date_no + first.date;
  if first.id then date_no=1 ;
  format date admit_date offence_date yymmdd10. ;
run;

image.png

So what does your new FLAG variable represent?  Which rows do you want to FLAG for this sample data?  

Ask a Question
Discussion stats
  • 9 replies
  • 194 views
  • 1 like
  • 5 in conversation