I would like to efficiently output frequencies, percentages, and cumulative follow-up time (person-years) for a number of categorical variables.
My dataset variables include:
ID number
Many variables for genes with the prefix "rs", with three possible categories per variable
Case (1 or 0)
Years of follow-up per case
Death (1 or 0)
I would like my final output to look like this for each gene:
rs#: Cases # (% of total cases): PYears: Deaths # (% of total deaths):
Category1
Category2
Category3
My program to calculate Years for each gene and category:
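/* Sort so that BY-group processing works in PROC MEANS */
proc sort data=analytic;
  by case rs1;
run;

/* Sum follow-up years within each case status and rs1 category */
proc means data=analytic nway noprint;
  var Years;
  by case rs1;
  output out=survival sum=PYears;
run;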
I would be grateful if someone could post a macro to do this efficiently since I have many genes to consider.
If you have a program that is working for one gene, it's just a hop, skip, and a jump away. Right now you are sorting/processing:
by case rs1;
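Instead, once the data are in long form (one row per ID per gene, with a gene variable naming the rs variable and rs1 holding that gene's category), you can process: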
by gene case rs1;
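Here is a minimal sketch of that idea, not a tested solution for your data. It assumes the gene variables are numeric, that they are the only variables whose names start with "rs", and that the other variables are named ID, Case, Years, and Death as described in the question. The names long, gene, category, and i are made up for illustration; category plays the role that rs1 plays in the BY statement above.

```sas
/* Reshape to one row per ID per gene.
   Assumption: all rs: variables are numeric; use "array g{*} $ rs:;"
   instead if they are character. */
data long;
  set analytic;
  length gene $32;
  array g{*} rs: ;              /* every variable whose name starts with rs */
  do i = 1 to dim(g);
    gene     = vname(g{i});     /* name of the gene variable, e.g. rs1 */
    category = g{i};            /* that gene's category for this ID */
    output;
  end;
  keep ID Case Years Death gene category;
run;

proc sort data=long;
  by gene case category;
run;

/* Same PROC MEANS as in the question, now with gene as an extra BY variable */
proc means data=long nway noprint;
  var Years;
  by gene case category;
  output out=survival sum=PYears;
run;
```

The output data set has one row per gene, case status, and category. The automatic _FREQ_ variable holds the number of rows in each group, which gives the counts; the percentages would still need to be computed in a later step.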
For pre-written macros, search lexjansen.com for "clinical reports" or "summary tables".
You'll find a ton of papers with detailed explanations.
There's one that's a great reference: Creating Complex Reports by Cynthia Zender.
FYI: as phrased, your question comes across less as "help me do something I can't do" and more as "can you please do my work?" Obviously, it is up to each person how they respond to questions of this nature.