Hi,
I have a data set with several variable. Out of this, I want to create a separate data set with some existing variables and some new variables, that are produced out of an existing variable. This works pretty fine with this SAS code:
proc sql;
create table dataset_new as
select id,
year,
count(distinct medication) as n_medication,
case
when count(distinct medication) = 1 then max(medication)
end as medication_count_1,
case
when count(distinct medication) = 2 then catx(', ', put(min(medication), 8.), put(max(medication), 8.))
end as medication_count_2,
case
when count(distinct medication) = 3 then '1, 2, 3'
end as medication_count_3,
case
when count(distinct medication) not in (1, 2, 3) then 'Unknown'
end as medication_count_other
from dataset_have
group by id, year;
quit;
medication is eighter 1, 2, or 3.
Now I would like to include another variable ("factor") into the dataset_new, next to "id" and "year", I tried to addapt the code as followed:
proc sql;
create table dataset_new as
select id,
year, factor,
count(distinct medication) as n_medication,
case
when count(distinct medication) = 1 then max(medication)
end as medication_count_1,
case
when count(distinct medication) = 2 then catx(', ', put(min(medication), 8.), put(max(medication), 8.))
end as medication_count_2,
case
when count(distinct medication) = 3 then '1, 2, 3'
end as medication_count_3,
case
when count(distinct medication) not in (1, 2, 3) then 'Unknown'
end as medication_count_other
from dataset_have
group by id, year;
quit;
But, now the dataset_new has way more records / rows than before and "id"s appear more than once per year (should only be once per year, as before).
Why is that, where is the fault? How can I fix the code?
Thank you very much!
... View more