BookmarkSubscribeRSS Feed
sdwhite1989
Calcite | Level 5

Hello, I am working on creating a SAS code that will automate a report on Tuberculosis cases and contacts. I have some sample code that I have been working on but it is not looking how I the report looks. I have attached report that I am trying to create. I have also attached my SAS code so far. I can attach the dataset if someone needs it. 

options center ps=60 ls=120;

/**************************************************************	*/
/*File: 	AutoCode to convert Excel Data to SAS data set		*/
/*Date:		01/21/2016											*/
/*Data:		Data from C:\Users\Shaquina\	 */
*/			AutoCode.xlsx										*/
*/Create:	C:\Users\Shaquina\AutoCode.sas7bdat					*/
/****************************************************************/
;


* First: Import data using the import wizard
	1. Save file to be imported as a CSV(MS-DOS) file.
	2. In SAS, click on File, Import Data...
	3. From the drop down menu, click on Comma Separated Values (*.csv).
	4. Choose the file destination and click next.
	5. Choose library and name the new sas file and click next.
	6. Click finish.
	7. Proc print data= XX.XXX (obs=5) *This will confirm you successfully uploaded the file**;
******************************************************************************************************;

libname Danielle 'C:\Users\Shaquina\';


Title color=	black "Tuberculosis Control Program-Arkansas";
title2 color= black  "Preliminary";
title3 color = black "Aggregate Reports for Tuberculosis Program Evaluation:";
title4 color = black "Follow-up and Treatment for Contacts to Tuberculosis Cases";
title5 color = black "Reporting Area AR0000";
title6 color = black "Cohort Year: 2014";
title7 color = black "Closure Date for Follow-up: 08/15/2015";
title8 color = black "Total TB Cases Reported: 93";

run; 
footnote ' ';

*merging the data;

PROC PRINT data = Danielle.AutoCode1 (obs=5);
run; 

proc print data = Danielle.AutoCode_2 (obs=5); run; 

proc sort data=Danielle.Autocode1;
by StateCaseNo;
run;

proc sort data= Danielle.Autocode_2;
by StateCaseNo;
run; 

data Danielle.AutoCodeAll_;
merge Danielle.AutoCode1 Danielle.AutoCode_2;
by StateCaseNo;
run; 


proc print data = Danielle.AutoCodeAll (obs=5); run;
Data Danielle.NewAutoCode_1;
set Danielle.AutocodeAll;
if Sputum_Smear_Result= 'Positive' or Sputum_Smear_Result= 'Negative';
if Therapy_Stop_Reason = 'Completed Therapy';
run;

 proc contents data = Danielle.NewAutoCode_1; run; 

 *Part 1. Cases and Contacts;


proc tabulate data =Danielle.NewAutoCode_1 missing;
TITLE9 "Types of Cases for Investigation" ;
table 	DispositionClass_Desc="Cases For Investigation"  Therapy_Stop_Reason, 
		Sputum_Smear_Result="Sputum Smear";
Class  Sputum_Smear_Result DispositionClass_Desc Therapy_Stop_Reason;

run; 






*Reasons Treatment Not Completed; 
title '';
data Danielle.NewAutoCode_2;
set Danielle.AutoCodeAll;
if Sputum_Smear_Result='Positive' or Sputum_Smear_Result='Negative';
run;
proc tabulate data = Danielle.NewAutoCode_2;

table Therapy_Stop_Reason='Reason Treatment Not Completed:', Sputum_Smear_Result='Sputum Smear';	
Class Sputum_Smear_Result Therapy_Stop_Reason;
run; 

Footnote ' Public reporting burden for this collection of information is estimated to average 3 hours per manual by data clerks; 30 minutes per hour
manual response by program managers; 30 minutes per electronic response by data clers and program managers, including the time for reviewing instructions,
searching existing data sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. An agency may
not conduct or sponsor, and a person is not required to respond to a collection of information unless it displays a currently valid OMB control 
number. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this
burden to CDC/ASTDR Information Collection Review Office, 1600 Clfton Road, MS-D-24 Atlanta, GA 30333, ATTN: PRA(0920-0457). Do not send the completed form to this address.';
run;

4 REPLIES 4
paulnevinthomas
Calcite | Level 5

In principle, you can't create this type of output as a default part of the TABULATE function; in essence, you are asking for two different table definitions. Anything you do with the SAS syntax will basically amount to adding more dimensions to the table, but it won't fix your core problem.

You can use this code to get the tables you want, but they're still different tables:

PROC TABULATE DATA=want NOSEPS;
VAR stake winnings margin;
TABLE (stake winnings),(N SUM MEAN MEDIAN STDDEV MIN MAX);
TABLE (margin),(N MEAN MEDIAN STDDEV MIN MAX);
RUN;
There are some guides out there on hacking ODS to do what you want (namely, create "stacked tables" where several child tables are assembled into a single table. Check out here for an example. If you Google "SAS stack tables" you'll find more examples.

I've done this in HTML by creating a new tagset - basically, a special ODS destination that removes spaces between tables, etc. I don't have the code that I used anymore, unfortunately; I moved to R to do automated reporting.

[url=http://goo.gl/maoapk]informatica training in chennai[/url]
jklaverstijn
Rhodochrosite | Level 12

I advise you to be specific in what you want from the forum. If you ask an actual question you stand a better chance of getting feedback.

 

Reg's Jan

ballardw
Super User

Please be explicit about which part(s) "is not looking how the report looks".

Without actual data it will be next to impossible to say what may be needed to generate your report. If you provide some data please make in in the form of a data step which reads datalines. If we have to "import" an excel file the results may be different.

Also anything appearing in the body of a report is likely to need a variable of some sort. So the parenthetical elements such as (a1) may be difficult to incorporate.

 

Also does the final result have to be in Excel? There can be major headaches getting some strickly appearance options such as the dot leaders ( ......... ) to appear as desired.

 

And a not about repeated processes:

First: Import data using the import wizard
	1. Save file to be imported as a CSV(MS-DOS) file.
	2. In SAS, click on File, Import Data...
	3. From the drop down menu, click on Comma Separated Values (*.csv).
	4. Choose the file destination and click next.
	5. Choose library and name the new sas file and click next.
	6. Click finish.
	7. Proc print data= XX.XXX (obs=5) *This will confirm you successfully uploaded the file**;

If you are going to do this with multiple files, either different data sources or over time, with the same layout do 1 through 5 one time.

 

The log will have data step code that you can extract and save as a program file and then you only have to change the source file, output set and possible the library. You control informats and especially length of character variables and whether a variable is numeric or character as well as setting variable labels formats, add some data checking for ranges or such and possibly calculating additional needed variables. If you keep using proc import I am willing to make a small wager that eventually something doesn't come in as expected because one of your data sources provides unexpected values an a variable that was defaulting to numeric suddenly becomes character. Which may mean a failure later on.

 

Whether you could hope to make such tables in a single pass through Proc tabulate depends on your data and many definitions. I've had multiple go-arounds with CDC over what the appropriate numerator and denominators may be for percentages. Often they lead to serious preprocessing before getting to actually preparing the tables.

sdwhite1989
Calcite | Level 5

To clarify my question:

 

 

The final result does not have to be in EXCEL. I attached the excel dataset, rather than datalines, because that it what we will always be using to create the report. When I stated it is not looking how the report looks, I was referring to the table values. I am not certain how to pull only values from a variable that I need. A little more about the dataset. All TB "contacts" are listed by the StateCaseNo. Repeated StateCaseNo means that is the same contact for one TB case. The purpose of the report is to use the contact's datasheet, which I have attached, to make a report showing how many contacts, how many cases, how many contacts got the disease (disease status is listed in the DispositionClass_Desc variable; blanks indicate not infected), how many contact got the infection (once again, it's listed in DispositionClass_Desc), how many started treatment (I'm assuming we could tell this by looking a started treatment date variable), and how many completed treatment (i used the therapy_stop_reason variable). The "others" variable on the report are referring to kids ages 2 and under. I do not know how to get the table to format like the report I attached on my initial post. I was going to use the Therapy_Stop_Reason variable for the middle table of the report. The last table calculates rates using the same variables from the 1st table. This is the part where I have no idea what to do. The (b1/a1),% next to the numbers in the table I am not concerned about. I am really just wanting the calculation rates. Please let me know if I cleared this up a little or if it is still vague. I have copied and pasted the first few lines of the dataset below. The enitre dataset is attached. Thank you!

 

StateCaseNoCaseCountSPUTUM_CULTURE_RESULTSPUTUM_SMEAR_RESULTCASE_VERIFICATIONSPUTUM_SMEAR_COLLECT_DATEINIT_REGIMEN_START_DATECASE_STATUSSPUTUM_CULT_COLLECT_DATECALC_DISEASE_SITEAGE_REPORTEDCHEST_XRAY_CAVITY_EVIDENCECHEST_XRAY_MILIARY_EVIDENCECHEST_XRAY_RESULTCONFIRMATION_DATECOUNTRY_OF_VERIFIED_CASECOUNT_DATECOUNT_STATUSDATE_ARRIVED_IN_USDATE_REPORTEDDATE_SUBMITTEDDISEASE_SITE_1FINAL_ISOLATE_COLLECT_DATEFINAL_ISOLATE_IS_SPUTUM_INDHOMELESS_INDINIT_SUSCEPT_ETHAMBUTOLINIT_SUSCEPT_ISONIAZIDINIT_SUSCEPT_PYRAZINAMIDEINIT_SUSCEPT_RIFAMPININIT_SUSCEPT_STREPTOMYCINOCCUPATION_RISKPATIENT_BIRTH_SEXPATIENT_BIRTH_COUNTRYPATIENT_ETHNICITYPATIENT_DOBPATIENT_ZIPPATIENT_WITHIN_CITY_LIMITSPREVIOUS_DIAGNOSIS_INDRACE_CALCULATEDSMR_EXAM_TY_1STATUS_AT_DIAGNOSISTST_MM_INDURATIONTST_RESULTPATIENT_COUNTYHIV_STATUSTHERAPY_STOP_DATETHERAPY_STOP_REASONTHERAPY_STOP_CAUSE_OF_DEATHTHERAPY_EXTEND_GT_12_OTHERINIT_REGIMEN_ETHAMBUTOLINIT_REGIMEN_ISONIAZIDINIT_REGIMEN_PYRAZINAMIDEINIT_REGIMEN_RIFABUTININIT_REGIMEN_RIFAMPININIT_REGIMEN_STREPTOMYCINCORRECTIONAL_FACIL_RESIDENTCORRECTIONAL_FACIL_TYCT_SCAN_CAVITY_EVIDENCECT_SCAN_RESULTCULT_TISSUE_RESULTCULT_TISSUE_RESULT_RPT_DATECULT_TISSUE_RESULT_RPT_LAB_TYCULT_TISSUE_SITEEXCESS_ALCOHOL_USE_PAST_YEARIGRA_RESULTNAA_RESULTNONINJECT_DRUG_USE_PAST_YEARPATIENT_CITYPATIENT_COUNTRYPATIENT_STREET_ADDRESS_1PATIENT_STREET_ADDRESS_2PREGNANTPREVIOUS_DIAGNOSIS_YEARRACE_ASIAN_ALLRACE_NAT_HI_ALLSMR_EXAM_TY_ALLSMR_PATH_CYTO_RESULTTB_SPUTUM_CULTURE_NEGATIVE_DATUS_BORN_INDRelationship_DescNaturalBornUSCitizenContactCityContactCountyContactStateContactZipContactBirthCountryContactRaceEthnicityContactGenderContactCountryofOriginContactArrivedUSInformantCaseInformantContactInformantThirdPartyInformantPersonalObservationExposureTheTimeFirstExposureDateContactBrokenDateImpressionVentilationTheSizePlaceRelevantTSTDateRelevantTSTResultRelevantTSTIndurationRelevantIGRADateRelevantIGRAResultRelevantIGRATypeRelevantIGRAValueDispositionClass_DescDispositionClassDate
2014ARAR14000011PositivePositive1 Positive Culture12JAN14:00:00:0013JAN14:00:00:00Confirmed19JAN14:00:00:00Pulmonary25NoNoAbnormal 17JAN14:00:00:00Count as a TB Case.17JAN14:00:00:0017JAN14:00:00:00Pulmonary04JUN14:00:00:00YesNoSusceptibleSusceptibleSusceptibleSusceptibleSusceptibleUnemployedMaleUNITED STATESNot Hispanic or Latino28JUN88:00:00:0071801YNoBlack or African AmericanAlive18PositiveHempstead CountyPositive27JAN15:00:00:00Completed Therapy YesYesYesNoYesNoNo  Not DoneNot Done.  NoNot DonePositiveYesHopeUnited States903 North Walker     Not Done18JUN14:00:00:00YesHouseholdNo  AR.UNITED STATESBlackNon-HispanicMale  NoNoNoNo    Close HouseHousehold. .4-Mar-14PositiveT-Spot  .
2014ARAR1400001PositivePositive1 Positive Culture12JAN14:00:00:0013JAN14:00:00:00Confirmed19JAN14:00:00:00Pulmonary25NoNoAbnormal 17JAN14:00:00:00Count as a TB

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1325 views
  • 1 like
  • 4 in conversation