Hello, First time poster so apologies if I am unclear. I am working with a large study cohort which has some claims data. In the data, there can be multiple (duplicate) observations per day. I want to define a duplicate as an observation with the same studyID and dispensing date and aggregate the data based on these two variables. I also want to sum the payment variables ('pay', 'ded') that occur in these separate observations. I want to keep some of the other variables in the data as well (eg diag1, diag2, age, region, SES). I have tried: proc sql; create table test as select distinct studyid, diag1, diag2, age, region, SES, dispensedate, (sum(pay)) as totalpay, (sum(ded)) as totalded, from studydata group by studyid, dispensedate; quit; Looking at the proc means min/max, I do not think the payment variables were summed...
... View more