Hi there,
I have a data set like this (The blank rows between different Patient_IDs are just for easy reading):
Patient_ID | Service_Type | Service_Volume |
001 | Serv_01 | 5 |
001 | Serv_02 | 8 |
001 | Serv_03 | 10 |
002 | Serv_01 | 5 |
003 | Serv_03 | 6 |
Many patients did not receive all three service types. I need to insert rows for each patient so that they all have three Service_Types and are with the value of 0 for the Service_Types that they did not receive service on. The resulting data set would be something like the following table:
Patient_ID | Service_Type | Service_Volume |
001 | Serv_01 | 5 |
001 | Serv_02 | 8 |
001 | Serv_03 | 10 |
002 | Serv_01 | 5 |
002 | Serv_02 | 0 |
002 | Serv_03 | 0 |
003 | Serv_01 | 0 |
003 | Serv_02 | 0 |
003 | Serv_03 | 6 |
How should I be accomplish this?
Thanks a lot!
Jason
Sorry, why would you put blank rows in a dataset? That is not the way to use datasets and will make further programming harder. It is simple enough to put blanks in when you report the data out, but don't keep it in your dataset.
As for expanding the data, simplest method is to create a template dataset and then merge back e.g (note this is not tested as you have not put test data in the form of a datastep, we are not here to do data entry!):
proc sort data=have out=tmp nodupkey; by patient_id; run; data tmp (drop=i); set tmp; do i=1 to 3; service_type=cats("Serv_",put(i,z2.)); output; end; run; data want; merge have tmp; by patient_id service_type; run;
Note of course, as I mentioned above, that blanks will play havoc with this (and any other programming you might do, bad idea!).
Thanks lot RW9!
I've solved the problem with your suggested method of creating a template data set.
Jason
proc sql;
create table all_patient as select distinct patient_id from have;
create table all_service_type as select distinct service_type from have;
create table all_possible as select * from all_patient,all_service_type;
create table want as select a.*,
case when missing(h.service_volume)
then 0 else h.service_volume end as service_volume
from all_possible as a left join have as h
on a.patient_id=h.patient_id and a.service_type=h.service_type;
quit;
data patient;
input patient_id $ service_type $ service_volume;
cards;
001 serv_01 5
001 serv_02 8
001 serv_03 10
002 serv_01 5
003 serv_03 6
;
run;
proc sql;
create table pat as
select distinct patient_id from patient;
create table serv as
select distinct service_type from patient;
create table pat_serv as select a.patient_id,b.service_type
from pat a
cross join serv b;
quit;
data want;
merge patient(rename=(service_volume=service_volume_)) pat_serv;
by patient_id service_type;
if missing(service_volume_) then service_volume=0;
else service_volume=service_volume_;
drop service_volume_;
run;
Thanks r_behata!
Hi @JasonL If you may play with this too
data have;
input Patient_ID$ Service_Type $ Service_Volume;
cards;
001 Serv_01 5
001 Serv_02 8
001 Serv_03 10
002 Serv_01 5
003 Serv_03 6
;
proc sql;
create table want as
select Patient_ID,a.Service_Type ,((b.Service_Type=a.Service_Type)*Service_Volume) as Service_Volume
from (select distinct Service_Type from have) a , have b
group by Patient_ID,a.Service_Type
having Service_Volume=max(Service_Volume)
order by Patient_ID,2;
quit;
proc freq data=have noprint; table patient_id*service_type / out=want sparse; weight service_volume; run;
You can use the SPARSE option on PROC FREQ to get this easily. Set the value to a weight so that you get 0 for no entries.
@JasonL wrote:
Hi there,
I have a data set like this (The blank rows between different Patient_IDs are just for easy reading):
Patient_ID Service_Type Service_Volume 001 Serv_01 5 001 Serv_02 8 001 Serv_03 10 002 Serv_01 5 003 Serv_03 6
Many patients did not receive all three service types. I need to insert rows for each patient so that they all have three Service_Types and are with the value of 0 for the Service_Types that they did not receive service on. The resulting data set would be something like the following table:
Patient_ID Service_Type Service_Volume 001 Serv_01 5 001 Serv_02 8 001 Serv_03 10 002 Serv_01 5 002 Serv_02 0 002 Serv_03 0 003 Serv_01 0 003 Serv_02 0 003 Serv_03 6
How should I be accomplish this?
Thanks a lot!
Jason
Thank you everyone for your suggestions and for your time!
I have now solved the problem.
Jason
Please mark the question as solved, by selecting an answer.
@JasonL wrote:
Thank you everyone for your suggestions and for your time!
I have now solved the problem.
Jason
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.