I'm working with NPCR cancer incidence data. It turns out I was accidentally counting more cancer events than the CDC/SEER rules would. After an extensive search, I found the cause: sequence 00 is correct for a person’s only primary; however, sequence 01 could, for example, include both a first lung primary and a first colon primary for the same person, which CDC sometimes excludes or combines under site-specific rules.
In other words, by including all “01” records in the cancer count, I am double- (or triple-) counting a person who had two separate primaries, whereas CDC’s rule set may count only one of those two as an “incident” case for the site/time window selected. For example, if a person had a lung primary in 2015 and a colon primary in 2017, my code adds 2 to the cancer count, but CDC’s “2015-2020 lung + colon rate” might count only 1 (only the first lung primary, for example), depending on the site-specific multiple primary rules.
I am trying to find a table referred to as the “Multiple Primary (MP)” decision table, which would help decide whether a record counts as a primary cancer occurrence. So far I haven't been able to find it. Does anyone know how to locate this table and incorporate it as a macro in SAS?
Here is a link to a paper that you may be able to use:
https://pmc.ncbi.nlm.nih.gov/articles/PMC3638182/
If you are looking for basic counts, you may be able to use first.byvar logic in a DATA step to add a count variable that is 1 for "sequence 00" and 0 for "sequence 01". However, you would need to post some sample data and show the result you expect before anyone can tell whether that is an option.
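As a rough illustration of the first.byvar idea, here is a minimal sketch. The data set and the variable names (patient_id, site_group, dx_year, seq_num) are made up for the example and are not actual NPCR field names. It only flags the earliest record per person; it does not implement the CDC/SEER multiple primary rules.

```sas
/* Hypothetical sample data: one row per reported tumor.            */
/* Variable names are assumptions for illustration, not NPCR names. */
data tumors;
   input patient_id $ site_group $ dx_year seq_num $;
   datalines;
P1 Lung  2015 01
P1 Colon 2017 02
P2 Lung  2016 00
P3 Colon 2018 00
;
run;

/* BY-group processing requires the data to be sorted by the BY variable. */
proc sort data=tumors;
   by patient_id dx_year seq_num;
run;

data flagged;
   set tumors;
   by patient_id;
   /* first.patient_id is 1 on the earliest record for each person, 0 otherwise */
   count_flag = first.patient_id;
run;

/* Tabulate flagged versus unflagged records by site group. */
proc freq data=flagged;
   tables site_group*count_flag / norow nocol nopercent;
run;
```

With this sample, person P1's 2015 lung record gets count_flag = 1 and the 2017 colon record gets count_flag = 0, so P1 contributes one counted record instead of two. If the count should be per person within each site, sorting and using BY patient_id site_group with first.site_group would be the analogous approach. Whether either matches the published rates depends on the rule set, which is why sample data and expected results would help.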