How can I use PROC SQL to create dummy rows based on certain conditions?
I have a table that looks like this:
ID | Tech Clients | Legal Clients |
12345 | 54 | 35 |
87564 | 20 | 100 |
I left join into this table a table that has commission information which creates a table that looks like this:
ID | Tech Clients | Legal Clients | Commission | Commission Type |
12345 | 54 | 35 | 10000 | Tech |
12345 | 54 | 35 | 5000 | Legal |
87564 | 0 | 100 | 8000 | Legal |
However, if someone is missing a commission type in the commission table, such as with ID 87564 who doesn't have a tech commission, they simply don't have a row for tech commission in the joined table. How do I create a table like the one below where, if a person is missing Tech or Legal commission in the commissions table, a row is still created for them where the commission amount is 0?
ID | Tech Clients | Legal Clients | Commission | Commission Type |
12345 | 54 | 35 | 10000 | Tech |
12345 | 54 | 35 | 5000 | Legal |
87564 | 0 | 100 | 0 | Tech |
87564 | 0 | 100 | 8000 | Legal |
Is it a requirement to use Proc SQL?
I wrote that because the rest of my query is written with PROC SQL, but it's not necessarily a requirement
Ok. Of course, you can do this directly in the join process, but you haven't posted the data being joined against.
So, in the original question. Here is an easy way to get the desired number of observations. From there, you can easily insert the desired amounts and counts in the 'inserted' obs.
data have;
input ID TechClients LegalClients Commission CommissionType $;
datalines;
12345 54 35 10000 Tech
12345 54 35 5000 Legal
87564 0 100 8000 Legal
;
proc summary data = have completetypes nway;
class ID CommissionType;
var TechClients LegalClients Commission;
output out = want(drop = _:) sum=;
run;
I have a sneaking suspicion that if you turn this
ID | Tech Clients | Legal Clients |
12345 | 54 | 35 |
87564 | 20 | 100 |
into something that looks more like
ID Client_type Count,
where Client_type has values of "Tech" and "Legal" that likely joining on ID and Client_type would make more sense in the long run. You don't show what the table with the commission information actually looks like. Your desired table is likely going to be a bit awkward to work with for many purposes.
If this were my data I would do something more like:
data have; input ID $ TechClients LegalClients; datalines; 12345 54 35 87564 20 100 ; proc transpose data=have out=trans; by id; run; data need; set trans; length ClientType $ 8; if _name_='TechClients' then ClientType='Tech'; if _name_='LegalClients' then ClientType='Legal'; Rename Col1=Clients; drop _name_; run; Data commission; input id $ Commision_type $ Commission; datalines; 12345 Legal 5000 12345 Tech 10000 87564 Legal 8000 ; proc sql; create table want as select a.id,a.Commision_type, a.commission, b.clients from commission as a left join need as b on a.id=b.id and b.clienttype=a.Commision_type ; quit;
Which would allow me to display "dummy row" with something like this if were needed:
proc tabulate data=want; class id Commision_type; var commission clients; table id='', Commision_type*(commission clients)*sum='' /misstext='0' box='Id' ; run;
Hope this helps
data Clients;
input ID TechClients LegalClients ;
datalines;
12345 54 35
87564 20 100
;
run;
data ClientType;
input CType$;
datalines;
Tech
Legal
;
run;
Proc sql;
Create table ClientsWithClientType as
Select * from Clients
cross join ClientType
;Quit;
data Commission;
input ID Commission CommissionType $;
datalines;
12345 10000 Tech
12345 5000 Legal
87564 8000 Legal
;
run;
Proc sql;
Create table Output as
Select
a.Id
, case when b.Commission=. then 0 else a.TechClients end as TechClients
, a.LegalClients
, coalesce(b.Commission,0) as Commission
, a.CType
from ClientsWithClientType as a
left join Commission as b
on a.Id=b.Id and a.Ctype=b.CommissionType
order by Id, Ctype desc
;Quit;
Output
ID | TechClients | LegalClients | Commission | CType |
12345 | 54 | 35 | 10000 | Tech |
12345 | 54 | 35 | 5000 | Legal |
87564 | 0 | 100 | 0 | Tech |
87564 | 20 | 100 | 8000 | Legal |
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.