BookmarkSubscribeRSS Feed
JasonL
Quartz | Level 8

Hi there,

 

I have a data set like this (The blank rows between different Patient_IDs are just for easy reading):

 

Patient_ID Service_Type Service_Volume
001 Serv_01 5
001 Serv_02 8
001 Serv_03 10
     
002 Serv_01 5
     
003 Serv_03 6

 

Many patients did not receive all three service types.  I need to insert rows for each patient so that they all have three Service_Types and are with the value of 0 for the Service_Types that they did not receive service on.  The resulting data set would be something like the following table:

 

Patient_ID Service_Type Service_Volume
001 Serv_01 5
001 Serv_02 8
001 Serv_03 10
     
002 Serv_01 5
002 Serv_02 0
002 Serv_03 0
     
003 Serv_01 0
003 Serv_02 0
003 Serv_03 6

 

How should I be accomplish this?

 

Thanks a lot!

Jason

10 REPLIES 10
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Sorry, why would you put blank rows in a dataset?  That is not the way to use datasets and will make further programming harder.  It is simple enough to put blanks in when you report the data out, but don't keep it in your dataset.

As for expanding the data, simplest method is to create a template dataset and then merge back e.g (note this is not tested as you have not put test data in the form of a datastep, we are not here to do data entry!):

proc sort data=have out=tmp nodupkey;
  by patient_id;
run;
data tmp (drop=i);
  set tmp;
  do i=1 to 3;
    service_type=cats("Serv_",put(i,z2.));
    output;
  end;
run;
data want;
  merge have tmp;
  by patient_id service_type;
run;

Note of course, as I mentioned above, that blanks will play havoc with this (and any other programming you might do, bad idea!).

 

JasonL
Quartz | Level 8

Thanks lot RW9!

 

I've solved the problem with your suggested method of creating a template data set.

 

Jason

PaigeMiller
Diamond | Level 26
proc sql;
	create table all_patient as select distinct patient_id from have;
	create table all_service_type as select distinct service_type from have;
	create table all_possible as select * from all_patient,all_service_type;
	create table want as select a.*,
            case when missing(h.service_volume)
            then 0 else h.service_volume end as service_volume
        from all_possible as a left join have as h 
        on a.patient_id=h.patient_id and a.service_type=h.service_type;
quit;
--
Paige Miller
JasonL
Quartz | Level 8
Thank you Paige!
r_behata
Barite | Level 11
data patient;
input patient_id $ service_type $ service_volume;
cards;
001 serv_01 5
001 serv_02 8
001 serv_03 10
002 serv_01 5
003 serv_03 6
;
run;

proc sql;
create table pat as
	select distinct  patient_id from patient;
create table serv as
	select distinct  service_type from patient;

create table pat_serv as select a.patient_id,b.service_type
	from pat  a
	cross join serv b;
quit;

data want;
	merge patient(rename=(service_volume=service_volume_)) pat_serv;
	by patient_id service_type;

	if missing(service_volume_) then service_volume=0;
		else service_volume=service_volume_;

	drop service_volume_;
run;
JasonL
Quartz | Level 8

Thanks r_behata!

novinosrin
Tourmaline | Level 20

Hi @JasonL If you may play with this too

 

data have;
input Patient_ID$	Service_Type $	Service_Volume;
cards;
001	Serv_01	5
001	Serv_02	8
001	Serv_03	10
002	Serv_01	5
003	Serv_03	6
;
proc sql;
create table want as
select  Patient_ID,a.Service_Type ,((b.Service_Type=a.Service_Type)*Service_Volume) as Service_Volume 
from (select distinct Service_Type from have) a ,   have b
group by Patient_ID,a.Service_Type
having Service_Volume=max(Service_Volume)
order by Patient_ID,2;
quit;
Reeza
Super User
proc freq data=have noprint;
     table patient_id*service_type / out=want sparse;
     weight service_volume;
run;

You can use the SPARSE option on PROC FREQ to get this easily. Set the value to a weight so that you get 0 for no entries. 

 


@JasonL wrote:

Hi there,

 

I have a data set like this (The blank rows between different Patient_IDs are just for easy reading):

 

Patient_ID Service_Type Service_Volume
001 Serv_01 5
001 Serv_02 8
001 Serv_03 10
     
002 Serv_01 5
     
003 Serv_03 6

 

Many patients did not receive all three service types.  I need to insert rows for each patient so that they all have three Service_Types and are with the value of 0 for the Service_Types that they did not receive service on.  The resulting data set would be something like the following table:

 

Patient_ID Service_Type Service_Volume
001 Serv_01 5
001 Serv_02 8
001 Serv_03 10
     
002 Serv_01 5
002 Serv_02 0
002 Serv_03 0
     
003 Serv_01 0
003 Serv_02 0
003 Serv_03 6

 

How should I be accomplish this?

 

Thanks a lot!

Jason


 

JasonL
Quartz | Level 8

Thank you everyone for your suggestions and for your time!

 

I have now solved the problem.

Jason

Reeza
Super User

Please mark the question as solved, by selecting an answer.

 


@JasonL wrote:

Thank you everyone for your suggestions and for your time!

 

I have now solved the problem.

Jason


 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 1386 views
  • 1 like
  • 6 in conversation