About Astounding

Astounding · ‎02-10-2012

One approach would be to make all your &RISK variables global. If that sounds acceptable, here is one way to do it. Here's what you started with:

%do i=1 %to &ngroups;
   %let i=%sysfunc(compress(&i));
   proc sql noprint;
      select left into :risk&i.00-:risk&i.&maxtick. from estimate where stratum=&i.;
   quit;
%end;

Here's a replacement. (Other changes, such as moving the PROC statement and removing a %LET statement, are made on purpose, not by accident.)

%local i j;
proc sql noprint;
%do i=1 %to &ngroups;
   %do j=0 %to &maxtick;
      %global risk&i%sysfunc(putn(&j,z2));
   %end;
   select left into :risk&i.00-:risk&i.%sysfunc(putn(&maxtick,z2)) from estimate where stratum=&i;
%end;
quit;

Once the &RISK variables are global, I suspect the rest of the code would work. Good luck.

Astounding · ‎02-09-2012

hnam, What you are noticing is that the fields are in alphabetical order. While you could conceivably change your variable names in TAB4B, it's just as easy to use this variation. Add to your OUTPUT statement within PROC MEANS: mean = day003 day007 day015 day030 day090 day180 day360 day720 That will change the variable names in TAB4C, without having to change TAB4B. Also, notice how the word MEAN on the PROC statement is affecting the printed report, but does not affect the output data set TAB4C. If you use this change that I'm recommending, you may have to change your subsequent program. For example, there will only be one observation in TAB4C, so when you transpose you will end up using COL1 instead of COL4. But it's worth the effort to familiarize yourself with the structure of output data sets from PROC MEANS. Good luck.
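Here is a minimal sketch of that suggestion. The thread does not show the real contents of TAB4B, so the input variables (d3, d7, d15) and the data values below are made up for illustration; only the MEAN= naming on the OUTPUT statement and the switch to COL1 come from the advice above.

```sas
/* Hypothetical stand-in for TAB4B: names and values are assumptions */
data tab4b;
   input d3 d7 d15;
   datalines;
1 2 3
4 5 6
;
run;

/* MEAN on the PROC statement controls the printed report only.      */
/* MEAN= on the OUTPUT statement names the variables stored in TAB4C */
proc means data=tab4b mean;
   var d3 d7 d15;
   output out=tab4c (drop=_type_ _freq_) mean=day003 day007 day015;
run;

/* TAB4C holds a single observation, so the transposed values land in COL1 */
proc transpose data=tab4c out=tab4d;
run;
```

Printing TAB4C and TAB4D side by side is a quick way to see how the output data set is structured.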

Astounding · ‎02-09-2012

bbb_NG, There are a few options that are related to your question. Any of these could be added to the PROC TABULATE statement.

MISSING: Normally, PROC TABULATE removes any observations that have a missing value for a CLASS variable. MISSING requests that they remain in the analysis.

CLASSDATA=: To use this, you have to create a SAS data set holding all the combinations of the CLASS variables that you would like to see in your table (even if no observations actually exist for some of the combinations). That will get the full set of combinations to appear on the table. I suspect this is the one that you are looking for, but you will need to create that "shell" SAS data set first.

EXCLUSIVE: In combination with CLASSDATA=, specifies that combinations that appear in the data, but are not in your CLASSDATA= data set, should be removed.

Good luck.
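A small sketch of the CLASSDATA= approach, under stated assumptions: the data sets HAVE and SHELL, the variables SEX, GRP and AMOUNT, and all the values are invented for illustration. The CLASS variables are given the same type and length in both data sets so that they line up.

```sas
/* Hypothetical sample data: no observations for group D, none for F in B or C */
data have;
   length sex grp $ 8;
   input sex $ grp $ amount;
   datalines;
F A 10
M A 20
M B 30
M C 40
;
run;

/* "Shell" data set: every combination that should appear in the table */
data shell;
   length sex grp $ 8;
   do sex = 'F', 'M';
      do grp = 'A', 'B', 'C', 'D';
         output;
      end;
   end;
run;

/* CLASSDATA= brings in the empty combinations; MISSING keeps */
/* observations with a missing CLASS value in the analysis   */
proc tabulate data=have classdata=shell missing;
   class sex grp;
   var amount;
   table sex, grp*amount*(n sum);
run;
```

Adding EXCLUSIVE to the PROC TABULATE statement would additionally drop any combination found in HAVE but not listed in SHELL.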

Astounding · ‎02-08-2012

Sorry, I can make that clearer. Here's a more complete example:

proc format;
   value outer 15 - 25 = 'Small'
               28 - 32 = 'Medium'
               other   = [inner.];
   value inner 20 - 30 = 'Young';
run;

Now a few sample mappings would be:

25 = Small
29 = Medium
27 = Young

The full inner range of 20 - 30 cannot be used. Because inner is the OTHER= definition for outer, it should only be applied to values that outer has not defined. In this case, that would limit the application of inner to values greater than 25 and less than 28.
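One way to check those mappings is to run the three sample values through the outer format and write the results to the log. This is only a sketch: the formats are the ones from the example above, while the variable names X and RESULT are made up.

```sas
proc format;
   value inner 20 - 30 = 'Young';
   value outer 15 - 25 = 'Small'
               28 - 32 = 'Medium'
               other   = [inner.];
run;

/* Write each sample value and its formatted result to the log */
data _null_;
   do x = 25, 27, 29;
      result = put(x, outer.);
      put x= result=;
   end;
run;
```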

Astounding · ‎02-08-2012

There's definitely a question or two about what needs to be solved. If this is a one-time effort that has to process ranges, and should not use multilabel formats, and can stand to have human intervention applied, here's a pragmatic approach.

Combine the two CNTLOUT= data sets, but eliminate from the INNER definition just the records that identify exactly the same range (match on BOTH START and END). Then try to use the result as a CNTLIN= data set and let SAS give you an error message for the ranges that overlap. Modify the CNTLIN= data set to eliminate the overlap and try again. Repeat until the errors stop.

Here's an example of a difficult situation when ranges come into play. The outer format definition includes:

15 - 25 = 'Small'
28 - 32 = 'Medium'

And the inner format definition includes:

20 - 30 = 'Young'

So the inner range has to change to the equivalent of:

25 <-< 28 = 'Young'

There's no easy, automated solution that I see. Tip of the hat to anyone who can do it.

Astounding · ‎02-08-2012

The discussion seems to have died out a bit here. Here's a set of assumptions, plus the outline of a program that you can work with. The assumptions are at least within the realm of reason:

- The "outer" format defines values only, not ranges.
- The "outer" format uses the "inner" format for its OTHER= definition.
- The "inner" format defines values only, not ranges, except that it can define an OTHER= category.

I know those are restrictive, and may not fit what you need to do. But if the assumptions are valid, the code is pretty straightforward.

/* HLO can hold more than one flag character, so test for 'O' anywhere in it */
proc format cntlout=outer (where=(index(HLO, 'O') = 0));
   select outer;   /* substitute the name of your "outer" format */
run;
proc format cntlout=inner;
   select inner;   /* substitute the name of your "inner" format */
run;

If necessary (I forget the order of CNTLOUT= data sets):

proc sort data=outer; by start; run;
proc sort data=inner; by start; run;

/* When both formats define the same START, the outer record comes last and wins */
data combine;
   set inner outer;
   by start;
   if last.start;
   fmtname='Combined';
run;

proc format cntlin=combine;
run;

My recollection is that if you use a multilabel format, you get different results. The same observation can be counted twice, and contribute to the count of two separate cells of your report. Good luck.

Astounding · ‎02-08-2012

If you try this approach, you may find that it's worth changing your original question. You started out asking how to achieve your goal with a minimum number of steps. Instead, consider how to achieve your goal with the fastest-running program. They're not really the same thing. Expect to do some learning if you go this route. In this case, you'll need to examine the structure of the output data set from PROC SUMMARY.

The strategy (using your original variable names) would be to summarize your large data set once, getting counts for each STATE/COUNTY combination. Save the output data set from that summary. Re-summarize it later to get counts for each STATE, or counts for each COUNTY. That way, you end up processing the large data set once, and processing the smaller summary data set multiple times. The program may be longer, but it will run faster if your summary is considerably smaller than the original.

If you adopt this approach, you may need to change the statistics you save in the summary data set. If you want to end up with means, save the N and SUM statistics in your summary data set. Aggregate those later, and then compute an aggregated mean. In similar fashion, but with a more complex formula, you can save the sum of the squared values in your output data set. The standard deviation can be computed later, using aggregated versions of the sum of squared values, plus the N and SUM statistics.

SAS Press publishes a book on efficiency in common programming situations (author is Bob Virgile). Good luck.
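A rough sketch of the summarize-once, re-summarize-later strategy. STATE and COUNTY come from the discussion above; the data set BIG, the analysis variable AMOUNT, the output names, and the sample values are assumptions for illustration. USS (uncorrected sum of squares) is used here as the saved "sum of the squared values".

```sas
/* Hypothetical stand-in for the large data set */
data big;
   input state $ county $ amount;
   datalines;
NY Kings 10
NY Kings 14
NY Queens 7
NJ Essex 3
NJ Essex 9
;
run;

/* Pass 1: read the large data set once, one record per STATE/COUNTY */
proc summary data=big nway;
   class state county;
   var amount;
   output out=cells (drop=_type_ _freq_) n=amt_n sum=amt_sum uss=amt_uss;
run;

/* Pass 2: re-summarize the much smaller data set by STATE alone. */
/* SUM= with no names keeps the original variable names.          */
proc summary data=cells nway;
   class state;
   var amt_n amt_sum amt_uss;
   output out=by_state (drop=_type_ _freq_) sum=;
run;

/* Rebuild the mean and standard deviation from the aggregated pieces */
data by_state;
   set by_state;
   mean = amt_sum / amt_n;
   if amt_n > 1 then std = sqrt((amt_uss - amt_sum**2 / amt_n) / (amt_n - 1));
run;
```

The same CELLS data set can be re-summarized with CLASS COUNTY to get the county-level figures without touching BIG again.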

Astounding · ‎02-07-2012

Art, The two I mentioned are well-respected, and I know they have written on the subject at one time or another. But you're certainly right. There must be others as well.

Astounding · ‎02-07-2012

Two authors you can look for: Ron Cody and Ben Cochran. You may have to search through the titles, though, to find the papers you're looking for. Good luck.

Astounding · ‎02-07-2012

Guilty as charged. I was hoping for an easy solution. If your formats define values only (not ranges), it shouldn't be too difficult to combine them into a single format. (If either one contains ranges, this becomes a nasty problem.) Create a CNTLOUT data set for each format, and eliminate any reference to the nested format in the "outer" CNTLOUT data set. Then merge the two CNTLOUT data sets BY START. If the nested format was the OTHER= definition for the "outer" format, take the outer format's LABEL when there is a match. Use the result as a CNTLIN= data set. That gives you a single format you can preload. Does this sound about right?

Astounding · ‎02-07-2012

I'm not sure you can do that, but there may be a workaround. It might help to know a little more about how you are using the format. For example, if it is with a CLASS variable you might be able to apply the format ahead of time:

data temp / view=temp;
   set my_original_data;
   new_var = put(old_var, my_nested_format.);
run;

proc summary data=temp;
   class new_var;
run;

This would change the order of the output categories. But it's not clear if that is an issue or not (and there might be another workaround for that using ID variables). Good luck.

Astounding · ‎02-06-2012

Another approach is possible, which is to create your own format using your own "business day" scale. Assuming you have a list of holidays, you would need to create the equivalent of:

value busdt today='1' tomorrow='2' nextday='3' etc.

You have already made one key decision: if a date falls on a holiday or weekend, it should map to the same outcome as the next (rather than the previous) business day. In that case, the programming will be easier if you work backwards. Start with the final day of the time period you want to measure, equivalent to:

value busdt finalday='9999' priorday='9998' etc.

The details of such a program are complex but relatively short. The advantage of such a format is that you can save it permanently and use it (relatively) easily:

interval = input(put(ending_day, busdt.), 4.) - input(put(starting_day, busdt.), 4.);

Using CNTLIN= data sets, it would be equally plausible to set up a separate format for each subject in a study where the "holidays" vary by subject. The CNTLIN= data set can generate a separate, permanent format for each subject, using values for FMTNAME like pid001_, pid002_, etc. Using these formats would be a little trickier. You would have to switch from PUT to PUTN, to allow the name of the format to be data-driven based on the patient ID. Good luck.
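One possible way to build such a format from a CNTLIN= data set, numbering backwards from the final day. Treat this as a sketch under assumptions: the HOLIDAYS data set, its dates, the date range, the hash lookup, and the choice of a business day as the final day are all illustrative and not taken from the original thread.

```sas
/* Hypothetical holiday list */
data holidays;
   input holiday :date9.;
   datalines;
02JAN2012
16JAN2012
20FEB2012
;
run;

/* Walk backwards from the final day (assumed to be a business day). */
/* The counter drops only on business days, so a weekend or holiday  */
/* inherits the number of the next business day.                     */
data busdt_cntl (keep=fmtname type start label);
   if 0 then set holidays;              /* puts HOLIDAY in the PDV for the hash */
   declare hash h (dataset: 'holidays');
   h.defineKey('holiday');
   h.defineDone();
   retain fmtname 'busdt' type 'N';
   length label $ 4;
   daynum = 10000;
   do start = '30MAR2012'd to '01JAN2012'd by -1;
      /* check() returns 0 when the date is found in the holiday list */
      if weekday(start) not in (1, 7) and h.check(key: start) ne 0 then daynum = daynum - 1;
      label = put(daynum, 4.);
      output;
   end;
   stop;
run;

proc format cntlin=busdt_cntl;
run;

/* Business days between two dates, using the formula from the post */
data _null_;
   interval = input(put('12MAR2012'd, busdt.), 4.) - input(put('05MAR2012'd, busdt.), 4.);
   put interval=;
run;
```

Adding LIBRARY= to the final PROC FORMAT step would store the format permanently, which is the main attraction of this approach.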

Astounding · ‎02-03-2012

Sorry, forgot to add the code. You have:

and (filedate between "&FILEDATE_MIN"d and "&FILEDATE_MAX"d);

Here's one possible replacement (the %IF logic has to sit inside a macro definition):

and
%if %length(&termdate_min) > 0 %then (termdate between "&TERMDATE_MIN"d and "&TERMDATE_MAX"d);
%else (filedate between "&FILEDATE_MIN"d and "&FILEDATE_MAX"d);
;

Don't omit that final semicolon. The semicolons that follow the two parenthesized conditions end the %THEN and %ELSE clauses, so the last one is what actually ends the SAS statement. As long as the user entered something for &TERMDATE_MIN, the program uses TERMDATE in the WHERE clause. Otherwise it uses FILEDATE. Good luck.

Astounding · ‎02-03-2012

Ashley, The programming part is easy. The part that you have to work through is setting up the rules. In real life, an end user might enter anywhere from 0 to 4 pieces of information. You'll need to spell out what should happen in each case. There's no right and wrong here, from a programming point of view. For example, if the end user enters only FILEDATE_MIN, what should happen? Should the program send a message to the user, and allow the user a second chance? Should the program shut down? Should it send a warning to the user and select all records from FILEDATE_MIN going forward? These decisions are much harder than the programming part. It's fairly trivial to program the case where the user enters all four: use one set and ignore the other. Good luck.

Astounding · ‎02-02-2012

Assuming I understand the question, this would do the trick.

%macro subset (variable_to_use=);
   where filedate between "&&&variable_to_use._MIN"d and "&&&variable_to_use._MAX"d;
%mend subset;

You might add more code inside the macro if you would like, and you need values assigned to &VAR1_MIN, &VAR1_MAX, &VAR2_MIN, and &VAR2_MAX. Again, if I understand the question properly, the user would call the macro using either of these:

%SUBSET (variable_to_use=VAR1)
%SUBSET (variable_to_use=VAR2)

And the macro generates the appropriate WHERE statement. Good luck.

