About jusjolly

jusjolly · 05-03-2022

Hello,

First-time poster, so apologies if I am unclear. I am working with a large study cohort which has some claims data. In the data, there can be multiple (duplicate) observations per day. I want to define a duplicate as an observation with the same studyID and dispensing date, and aggregate the data based on these two variables. I also want to sum the payment variables ('pay', 'ded') that occur in these separate observations. I want to keep some of the other variables in the data as well (e.g. diag1, diag2, age, region, SES).

I have tried:

    proc sql;
      create table test as
      select distinct studyid, diag1, diag2, age, region, SES, dispensedate,
             (sum(pay)) as totalpay, (sum(ded)) as totalded
      from studydata
      group by studyid, dispensedate;
    quit;

Looking at the proc means min/max, I do not think the payment variables were summed...
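One possible reading of the problem, offered as a sketch rather than a confirmed answer: in PROC SQL, selecting columns that are neither in the GROUP BY clause nor inside a summary function makes SAS remerge the summary statistics back onto the original rows, and DISTINCT then only removes rows that are identical in every selected column. The version below collapses each studyid/dispensedate pair to a single row by wrapping the carried-along variables in MAX(). The choice of MAX() is an assumption made for illustration; it presumes these variables are either constant within a pair or that keeping the largest value is acceptable.

```sas
/* Sketch only. Dataset and variable names are taken from the post.
   Assumption: diag1, diag2, age, region and SES are constant within a
   studyid/dispensedate pair, or keeping the largest value is acceptable. */
proc sql;
  create table test as
  select studyid,
         dispensedate,
         max(diag1)  as diag1,     /* carried along, not summed */
         max(diag2)  as diag2,
         max(age)    as age,
         max(region) as region,
         max(SES)    as SES,
         sum(pay)    as totalpay,  /* total payment per person per day */
         sum(ded)    as totalded   /* total deductible per person per day */
  from studydata
  group by studyid, dispensedate;
quit;
```

If the carried-along variables can genuinely differ between duplicate rows, a different rule would be needed to decide which value to keep, and that rule is not stated in the post.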

jusjolly · 10-25-2021

Hi folks,

The data looks like this:

    caseID  servicedate  indicator  sBP
    10000   2020-01-10   1          110
    10000   2020-01-11   0           99
    10000   2020-03-10   1          101
    10000   2020-04-11   0          124
    10001   2020-01-11   1          127
    10001   2020-02-20   1           98
    10001   2020-03-15   0           88
    10001   2020-03-29   1          109

I am trying to tabulate (1) the number of unique case IDs which had an sBP measurement per month (it does not matter what the sBP measurement was, just that it occurred) where the indicator variable equals 1. In the above example, I would want to count rows 1, 3, 5, 6 and 8. As well as (2) the total number of sBP measurements overall per month (again, the value of the measurement is not important). Here, we would like to count all rows. The date format is yymmdn6. I was hoping to somehow specify the month in the servicedate
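A minimal sketch of one way both counts could be produced in a single PROC SQL step. It assumes servicedate is stored as a numeric SAS date and that the dataset is called have; that name is hypothetical, since the post does not give one. The month is derived with the YYMMN6. format, which writes a date as yyyymm.

```sas
/* Sketch only. "have" is a hypothetical dataset name.
   Assumption: servicedate is a numeric SAS date value. */
proc sql;
  create table monthly as
  select put(servicedate, yymmn6.) as month,
         /* (1) distinct case IDs with a measurement where indicator = 1 */
         count(distinct case when indicator = 1 then caseID end) as n_cases_ind1,
         /* (2) all sBP measurements; count() skips missing sBP values */
         count(sBP) as n_sbp
  from have
  group by calculated month;
quit;
```

If rows with a missing sBP should also be counted, count(*) could replace count(sBP).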

Date Last Visited: 05-04-2022 08:27 PM

Remove multiple observations, aggregate on studyID and dispensing date...

How to count distinct case IDs based on month and indicator variable
