I have to use proc tabulate to create the report that I am attaching in the reply to this message. I don't care if it is 3 different tables I just need the code to reflect the report.
The final result does not have to be in EXCEL. I attached the excel dataset, to this message, because that it what we will always be using to create the report. I am not certain how to pull only values from a variable that I need. A little more about the dataset. All TB "contacts" are listed by the StateCaseNo. Repeated StateCaseNo means that is the same contact for one TB case. The purpose of the report is to use the contact's datasheet, which I have attached, to make a report showing how many contacts, how many cases, how many contacts got the disease (disease status is listed in the DispositionClass_Desc variable; blanks indicate not infected), how many contact got the infection (once again, it's listed in DispositionClass_Desc), how many started treatment (I'm assuming we could tell this by looking a started treatment date variable), and how many completed treatment (i used the therapy_stop_reason variable). The "others" variable on the report are referring to kids ages 2 and under. I do not know how to get the table to format like the report I attached on my initial post. I was going to use the Therapy_Stop_Reason variable for the middle table of the report. The last table calculates rates using the same variables from the 1st table. This is the part where I have no idea what to do. The (b1/a1),% next to the numbers in the table I am not concerned about. I am really just wanting the calculation rates. Please let me know if I cleared this up a little or if it is still vague. I have copied and pasted the first few lines of the dataset below. The enitre dataset is attached. Thank you!
StateCaseNo | CaseCount | SPUTUM_CULTURE_RESULT | SPUTUM_SMEAR_RESULT | CASE_VERIFICATION | SPUTUM_SMEAR_COLLECT_DATE | INIT_REGIMEN_START_DATE | CASE_STATUS | SPUTUM_CULT_COLLECT_DATE | CALC_DISEASE_SITE | AGE_REPORTED | CHEST_XRAY_CAVITY_EVIDENCE | CHEST_XRAY_MILIARY_EVIDENCE | CHEST_XRAY_RESULT | CONFIRMATION_DATE | COUNTRY_OF_VERIFIED_CASE | COUNT_DATE | COUNT_STATUS | DATE_ARRIVED_IN_US | DATE_REPORTED | DATE_SUBMITTED | DISEASE_SITE_1 | FINAL_ISOLATE_COLLECT_DATE | FINAL_ISOLATE_IS_SPUTUM_IND | HOMELESS_IND | INIT_SUSCEPT_ETHAMBUTOL | INIT_SUSCEPT_ISONIAZID | INIT_SUSCEPT_PYRAZINAMIDE | INIT_SUSCEPT_RIFAMPIN | INIT_SUSCEPT_STREPTOMYCIN | OCCUPATION_RISK | PATIENT_BIRTH_SEX | PATIENT_BIRTH_COUNTRY | PATIENT_ETHNICITY | PATIENT_DOB | PATIENT_ZIP | PATIENT_WITHIN_CITY_LIMITS | PREVIOUS_DIAGNOSIS_IND | RACE_CALCULATED | SMR_EXAM_TY_1 | STATUS_AT_DIAGNOSIS | TST_MM_INDURATION | TST_RESULT | PATIENT_COUNTY | HIV_STATUS | THERAPY_STOP_DATE | THERAPY_STOP_REASON | THERAPY_STOP_CAUSE_OF_DEATH | THERAPY_EXTEND_GT_12_OTHER | INIT_REGIMEN_ETHAMBUTOL | INIT_REGIMEN_ISONIAZID | INIT_REGIMEN_PYRAZINAMIDE | INIT_REGIMEN_RIFABUTIN | INIT_REGIMEN_RIFAMPIN | INIT_REGIMEN_STREPTOMYCIN | CORRECTIONAL_FACIL_RESIDENT | CORRECTIONAL_FACIL_TY | CT_SCAN_CAVITY_EVIDENCE | CT_SCAN_RESULT | CULT_TISSUE_RESULT | CULT_TISSUE_RESULT_RPT_DATE | CULT_TISSUE_RESULT_RPT_LAB_TY | CULT_TISSUE_SITE | EXCESS_ALCOHOL_USE_PAST_YEAR | IGRA_RESULT | NAA_RESULT | NONINJECT_DRUG_USE_PAST_YEAR | PATIENT_CITY | PATIENT_COUNTRY | PATIENT_STREET_ADDRESS_1 | PATIENT_STREET_ADDRESS_2 | PREGNANT | PREVIOUS_DIAGNOSIS_YEAR | RACE_ASIAN_ALL | RACE_NAT_HI_ALL | SMR_EXAM_TY_ALL | SMR_PATH_CYTO_RESULT | TB_SPUTUM_CULTURE_NEGATIVE_DAT | US_BORN_IND | Relationship_Desc | NaturalBornUSCitizen | ContactCity | ContactCounty | ContactState | ContactZip | ContactBirthCountry | ContactRace | Ethnicity | ContactGender | ContactCountryofOrigin | ContactArrivedUS | InformantCase | InformantContact | InformantThirdParty | InformantPersonalObservation | Exposure | TheTime | FirstExposureDate | ContactBrokenDate | Impression | Ventilation | TheSize | Place | RelevantTSTDate | RelevantTSTResult | RelevantTSTInduration | RelevantIGRADate | RelevantIGRAResult | RelevantIGRAType | RelevantIGRAValue | DispositionClass_Desc | DispositionClassDate |
2014ARAR1400001 | 1 | Positive | Positive | 1 Positive Culture | 12JAN14:00:00:00 | 13JAN14:00:00:00 | Confirmed | 19JAN14:00:00:00 | Pulmonary | 25 | No | No | Abnormal | 17JAN14:00:00:00 | Count as a TB Case | . | 17JAN14:00:00:00 | 17JAN14:00:00:00 | Pulmonary | 04JUN14:00:00:00 | Yes | No | Susceptible | Susceptible | Susceptible | Susceptible | Susceptible | Unemployed | Male | UNITED STATES | Not Hispanic or Latino | 28JUN88:00:00:00 | 71801 | Y | No | Black or African American | Alive | 18 | Positive | Hempstead County | Positive | 27JAN15:00:00:00 | Completed Therapy | Yes | Yes | Yes | No | Yes | No | No | Not Done | Not Done | . | No | Not Done | Positive | Yes | Hope | United States | 903 North Walker | Not Done | 18JUN14:00:00:00 | Yes | Household | No | AR | . | UNITED STATES | Black | Non-Hispanic | Male | No | No | No | No | Close | House | Household | . | . | 4-Mar-14 | Positive | T-Spot | . | |||||||||||||||||||||||||||
2014ARAR1400001 | Positive | Positive | 1 Positive Culture | 12JAN14:00:00:00 | 13JAN14:00:00:00 | Confirmed | 19JAN14:00:00:00 | Pulmonary | 25 | No | No | Abnormal | 17JAN14:00:00:00 | Count as a TB |
Attached is the report I am trying to create.
Please post what you've tried. Is there a specific issue that you're having trouble with?
Posting a request like the above comes across more like a request of do my work rather than a request for help.
options center ps=60 ls=120;
/************************************************************** */
/*File: AutoCode to convert Excel Data to SAS data set */
/*Date: 01/21/2016 */
/*Data: Data from C:\Users\Shaquina\ */
*/ AutoCode.xlsx */
*/Create: C:\Users\Shaquina\AutoCode.sas7bdat */
/****************************************************************/
;
* First: Import data using the import wizard
1. Save file to be imported as a CSV(MS-DOS) file.
2. In SAS, click on File, Import Data...
3. From the drop down menu, click on Comma Separated Values (*.csv).
4. Choose the file destination and click next.
5. Choose library and name the new sas file and click next.
6. Click finish.
7. Proc print data= XX.XXX (obs=5) *This will confirm you successfully uploaded the file**;
******************************************************************************************************;
libname Danielle 'C:\Users\Shaquina\';
Title color= black "Tuberculosis Control Program-Arkansas";
title2 color= black "Preliminary";
title3 color = black "Aggregate Reports for Tuberculosis Program Evaluation:";
title4 color = black "Follow-up and Treatment for Contacts to Tuberculosis Cases";
title5 color = black "Reporting Area AR0000";
title6 color = black "Cohort Year: 2014";
title7 color = black "Closure Date for Follow-up: 08/15/2015";
title8 color = black "Total TB Cases Reported: 93";
run;
footnote ' ';
*merging the data;
PROC PRINT data = Danielle.AutoCode1 (obs=5);
run;
proc print data = Danielle.AutoCode_2 (obs=5); run;
proc sort data=Danielle.Autocode1;
by StateCaseNo;
run;
proc sort data= Danielle.Autocode_2;
by StateCaseNo;
run;
data Danielle.AutoCodeAll_;
merge Danielle.AutoCode1 Danielle.AutoCode_2;
by StateCaseNo;
run;
proc print data = Danielle.AutoCodeAll (obs=5); run;
Data Danielle.NewAutoCode_1;
set Danielle.AutocodeAll;
if Sputum_Smear_Result= 'Positive' or Sputum_Smear_Result= 'Negative';
if Therapy_Stop_Reason = 'Completed Therapy';
run;
proc contents data = Danielle.NewAutoCode_1; run;
*Part 1. Cases and Contacts;
proc tabulate data =Danielle.NewAutoCode_1 missing;
TITLE9 "Types of Cases for Investigation" ;
table DispositionClass_Desc="Cases For Investigation" Therapy_Stop_Reason,
Sputum_Smear_Result="Sputum Smear";
Class Sputum_Smear_Result DispositionClass_Desc Therapy_Stop_Reason;
run;
*Reasons Treatment Not Completed;
title '';
data Danielle.NewAutoCode_2;
set Danielle.AutoCodeAll;
if Sputum_Smear_Result='Positive' or Sputum_Smear_Result='Negative';
run;
proc tabulate data = Danielle.NewAutoCode_2;
table Therapy_Stop_Reason='Reason Treatment Not Completed:', Sputum_Smear_Result='Sputum Smear';
Class Sputum_Smear_Result Therapy_Stop_Reason;
run;
Footnote ' Public reporting burden for this collection of information is estimated to average 3 hours per manual by data clerks; 30 minutes per hour
manual response by program managers; 30 minutes per electronic response by data clers and program managers, including the time for reviewing instructions,
searching existing data sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. An agency may
not conduct or sponsor, and a person is not required to respond to a collection of information unless it displays a currently valid OMB control
number. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this
burden to CDC/ASTDR Information Collection Review Office, 1600 Clfton Road, MS-D-24 Atlanta, GA 30333, ATTN: PRA(0920-0457). Do not send the completed form to this address.';
run;
I forgot to put the code I already have. I have attached it above. My code is printing out the first two tables (the best I could get it to mimic the report). The last table is computing variables that I do not know how to do.
If my question seems too vague...I am mostly concerned with how to compute the variables in the third table printed on the ARPE report.
options center ps=60 ls=120;
/************************************************************** */
/*File: AutoCode to convert Excel Data to SAS data set */
/*Date: 01/21/2016 */
/*Data: Data from C:\Users\Shaquina\ */
*/ AutoCode.xlsx */
*/Create: C:\Users\Shaquina\AutoCode.sas7bdat */
/****************************************************************/
;
* First: Import data using the import wizard
1. Save file to be imported as a CSV(MS-DOS) file.
2. In SAS, click on File, Import Data...
3. From the drop down menu, click on Comma Separated Values (*.csv).
4. Choose the file destination and click next.
5. Choose library and name the new sas file and click next.
6. Click finish.
7. Proc print data= XX.XXX (obs=5) *This will confirm you successfully uploaded the file**;
******************************************************************************************************;
libname Danielle 'C:\Users\Shaquina\';
Title color= black "Tuberculosis Control Program-Arkansas";
title2 color= black "Preliminary";
title3 color = black "Aggregate Reports for Tuberculosis Program Evaluation:";
title4 color = black "Follow-up and Treatment for Contacts to Tuberculosis Cases";
title5 color = black "Reporting Area AR0000";
title6 color = black "Cohort Year: 2014";
title7 color = black "Closure Date for Follow-up: 08/15/2015";
title8 color = black "Total TB Cases Reported: 93";
run;
footnote ' ';
*merging the data;
PROC PRINT data = Danielle.AutoCode1 (obs=5);
run;
proc print data = Danielle.AutoCode_2 (obs=5); run;
proc sort data=Danielle.Autocode1;
by StateCaseNo;
run;
proc sort data= Danielle.Autocode_2;
by StateCaseNo;
run;
data Danielle.AutoCodeAll_;
merge Danielle.AutoCode1 Danielle.AutoCode_2;
by StateCaseNo;
run;
proc print data = Danielle.AutoCodeAll (obs=5); run;
Data Danielle.NewAutoCode_1;
set Danielle.AutocodeAll;
if Sputum_Smear_Result= 'Positive' or Sputum_Smear_Result= 'Negative';
if Therapy_Stop_Reason = 'Completed Therapy';
run;
proc contents data = Danielle.NewAutoCode_1; run;
*Part 1. Cases and Contacts;
proc tabulate data =Danielle.NewAutoCode_1 missing;
TITLE9 "Types of Cases for Investigation" ;
table DispositionClass_Desc="Cases For Investigation" Therapy_Stop_Reason,
Sputum_Smear_Result="Sputum Smear";
Class Sputum_Smear_Result DispositionClass_Desc Therapy_Stop_Reason;
run;
*Reasons Treatment Not Completed;
title '';
data Danielle.NewAutoCode_2;
set Danielle.AutoCodeAll;
if Sputum_Smear_Result='Positive' or Sputum_Smear_Result='Negative';
run;
proc tabulate data = Danielle.NewAutoCode_2;
table Therapy_Stop_Reason='Reason Treatment Not Completed:', Sputum_Smear_Result='Sputum Smear';
Class Sputum_Smear_Result Therapy_Stop_Reason;
run;
Footnote ' Public reporting burden for this collection of information is estimated to average 3 hours per manual by data clerks; 30 minutes per hour
manual response by program managers; 30 minutes per electronic response by data clers and program managers, including the time for reviewing instructions,
searching existing data sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. An agency may
not conduct or sponsor, and a person is not required to respond to a collection of information unless it displays a currently valid OMB control
number. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this
burden to CDC/ASTDR Information Collection Review Office, 1600 Clfton Road, MS-D-24 Atlanta, GA 30333, ATTN: PRA(0920-0457). Do not send the completed form to this address.';
run;
Code I already have.
We don't have the definitions of what the numerator or denominator might be for those rates or which variables might go into those.
I think what you may be looking for is after you read in the data to add some variables that are coded as 0, 1 or missing for each record in your data.
Dummy code of a possibly code example:
Select (THERAPY_STOP_REASON);
when ("Completed Therapy") Completion = 1;
when ("Other","Uncooperative or Refused") Completion = 0; /* assumes other means not completed*/
when ("Died"," ") Completion = .; /* assuming death should not be included in the rate at all*/
otherwise ; /*may want to put a message about unexpected value here*/
end;
In Proc tabulate Completion would be a VAR variable. Ask for the mean and format as a percent. The No contacts rate should use a -1 instead of 1.
var Completion;
table Completion="Completion Rate" *mean=""*format=percent6.0 ,
Sputum_Smear_Result='Sputum Smear';
With multiple of these index variables if you want the same statistic and format then put them in ( )
table
(Evaluation="Evaluation Rate" Disease="Disease Rate" Latent="Latent Infection Rate" Treatment="Treatment rate" Completion="Completion Rate" ) *mean=""*format=percent6.0 ,
Or add the label in a data set or label statement in proc tabulate to clean up the code.
I can't tell quickly from the example data how the number of contacts per case would be derived. For a single call to Tabulate you would want to have a variable in the data that has the count per contact appear ONCE for whatever a "Case" may be in this context.
You would want a mean but use Format=F5.1 or similar
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.