Re: Insert rows into a data set

JasonL · Posted 12-12-2018 09:19 AM

Hi there,

I have a data set like this (The blank rows between different Patient_IDs are just for easy reading):

Patient_ID	Service_Type	Service_Volume
001	Serv_01	5
001	Serv_02	8
001	Serv_03	10

002	Serv_01	5

003	Serv_03	6

Many patients did not receive all three service types. I need to insert rows for each patient so that they all have three Service_Types and are with the value of 0 for the Service_Types that they did not receive service on. The resulting data set would be something like the following table:

Patient_ID	Service_Type	Service_Volume
001	Serv_01	5
001	Serv_02	8
001	Serv_03	10

002	Serv_01	5
002	Serv_02	0
002	Serv_03	0

003	Serv_01	0
003	Serv_02	0
003	Serv_03	6

How should I be accomplish this?

Thanks a lot!

Jason

RW9 · Posted 12-12-2018 09:50 AM

Sorry, why would you put blank rows in a dataset? That is not the way to use datasets and will make further programming harder. It is simple enough to put blanks in when you report the data out, but don't keep it in your dataset.

As for expanding the data, simplest method is to create a template dataset and then merge back e.g (note this is not tested as you have not put test data in the form of a datastep, we are not here to do data entry!):

proc sort data=have out=tmp nodupkey;
  by patient_id;
run;
data tmp (drop=i);
  set tmp;
  do i=1 to 3;
    service_type=cats("Serv_",put(i,z2.));
    output;
  end;
run;
data want;
  merge have tmp;
  by patient_id service_type;
run;

Note of course, as I mentioned above, that blanks will play havoc with this (and any other programming you might do, bad idea!).

JasonL · Posted 12-12-2018 10:02 AM

Thanks lot RW9!

I've solved the problem with your suggested method of creating a template data set.

Jason

PaigeMiller · Posted 12-12-2018 09:59 AM

proc sql;
	create table all_patient as select distinct patient_id from have;
	create table all_service_type as select distinct service_type from have;
	create table all_possible as select * from all_patient,all_service_type;
	create table want as select a.*,
            case when missing(h.service_volume)
            then 0 else h.service_volume end as service_volume
        from all_possible as a left join have as h 
        on a.patient_id=h.patient_id and a.service_type=h.service_type;
quit;

--
Paige Miller

JasonL · Posted 12-12-2018 10:05 AM

Thank you Paige!

r_behata · Posted 12-12-2018 10:01 AM

data patient;
input patient_id $ service_type $ service_volume;
cards;
001 serv_01 5
001 serv_02 8
001 serv_03 10
002 serv_01 5
003 serv_03 6
;
run;

proc sql;
create table pat as
	select distinct  patient_id from patient;
create table serv as
	select distinct  service_type from patient;

create table pat_serv as select a.patient_id,b.service_type
	from pat  a
	cross join serv b;
quit;

data want;
	merge patient(rename=(service_volume=service_volume_)) pat_serv;
	by patient_id service_type;

	if missing(service_volume_) then service_volume=0;
		else service_volume=service_volume_;

	drop service_volume_;
run;

JasonL · Posted 12-12-2018 10:06 AM

Thanks r_behata!

novinosrin · Posted 12-12-2018 10:17 AM

Hi @JasonL If you may play with this too

data have;
input Patient_ID$	Service_Type $	Service_Volume;
cards;
001	Serv_01	5
001	Serv_02	8
001	Serv_03	10
002	Serv_01	5
003	Serv_03	6
;
proc sql;
create table want as
select  Patient_ID,a.Service_Type ,((b.Service_Type=a.Service_Type)*Service_Volume) as Service_Volume 
from (select distinct Service_Type from have) a ,   have b
group by Patient_ID,a.Service_Type
having Service_Volume=max(Service_Volume)
order by Patient_ID,2;
quit;

Reeza · Posted 12-12-2018 10:50 AM

proc freq data=have noprint;
     table patient_id*service_type / out=want sparse;
     weight service_volume;
run;

You can use the SPARSE option on PROC FREQ to get this easily. Set the value to a weight so that you get 0 for no entries.

@JasonL wrote:

Hi there,

I have a data set like this (The blank rows between different Patient_IDs are just for easy reading):

Patient_ID Service_Type Service_Volume

001 Serv_01 5

001 Serv_02 8

001 Serv_03 10

002 Serv_01 5

003 Serv_03 6

Many patients did not receive all three service types. I need to insert rows for each patient so that they all have three Service_Types and are with the value of 0 for the Service_Types that they did not receive service on. The resulting data set would be something like the following table:

Patient_ID Service_Type Service_Volume

001 Serv_01 5

001 Serv_02 8

001 Serv_03 10

002 Serv_01 5

002 Serv_02 0

002 Serv_03 0

003 Serv_01 0

003 Serv_02 0

003 Serv_03 6

How should I be accomplish this?

Thanks a lot!

Jason

JasonL · Posted 12-12-2018 10:53 AM

Thank you everyone for your suggestions and for your time!

I have now solved the problem.

Jason

Reeza · Posted 12-12-2018 11:03 AM

Please mark the question as solved, by selecting an answer.

@JasonL wrote:

Thank you everyone for your suggestions and for your time!

I have now solved the problem.

Jason

Registration is open