06-11-2018
Adam_Black
Obsidian | Level 7
Member since
08-24-2015
- 13 Posts
- 7 Likes Given
- 0 Solutions
- 3 Likes Received
Activity Feed for Adam_Black
- Posted Re: How to aggregate columns based on plurality? on SAS Programming. 07-27-2017 10:31 PM
- Liked Re: How to aggregate columns based on plurality? for Reeza. 07-27-2017 10:28 PM
- Tagged How to aggregate columns based on plurality? on SAS Programming. 07-20-2017 05:12 PM
- Posted How to aggregate columns based on plurality? on SAS Programming. 07-20-2017 05:11 PM
- Liked Re: Indexing vs. Sorting for LinusH. 10-31-2016 01:20 PM
- Liked Re: Indexing vs. Sorting for Astounding. 10-31-2016 01:20 PM
- Posted Indexing vs. Sorting on SAS Programming. 10-31-2016 12:38 PM
- Tagged Indexing vs. Sorting on SAS Programming. 10-31-2016 12:38 PM
- Posted Re: Confused by how sas handles missing values on SAS Programming. 07-06-2016 12:54 PM
- Liked Re: Confused by how sas handles missing values for Loko. 07-06-2016 12:44 PM
- Liked Re: Confused by how sas handles missing values for KachiM. 07-06-2016 12:44 PM
- Tagged Confused by how sas handles missing values on SAS Programming. 07-06-2016 08:38 AM
- Posted Confused by how sas handles missing values on SAS Programming. 07-06-2016 08:37 AM
- Got a Like for Re: Group option with PROC STDRATE when using indirect method. 05-18-2016 09:18 PM
- Got a Like for Re: Group option with PROC STDRATE when using indirect method. 05-18-2016 05:04 PM
07-27-2017
10:31 PM
Yes, the mode is what I am after. So I guess I have to do this aggregation one column at a time and then merge the results since there is no MODE aggregate function in PROC SQL. Thanks for the help!
07-20-2017
05:11 PM
I would like to aggregate the columns of a dataset based on the plurality of non-missing values for each column. Suppose my dataset was

Name  Color  Food
Jane  Red    Sushi
Jane  Blue
Jane  Red
John  Green  Yogurt
John  Green  Sushi
John  Green  Yogurt
John  Red

I would like to summarize my dataset using something like this:

proc sql;
select Name, plurality(Color) as Color, plurality(Food) as Food
from raw_data
group by Name;

The result would be

Name  Color  Food
Jane  Red    Sushi
John  Green  Yogurt

The plurality function would return the value that occurs most often after missing values are removed. Ties could be handled using alphabetical order. What is the best way to accomplish this data transformation in SAS (version 9.3 or 9.4)? (Is it possible to combine a user-defined function with proc sql to accomplish this?)
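
A minimal sketch of the one-column-at-a-time approach mentioned in the reply, assuming the example data from the question. The macro name col_mode and the intermediate table names (Color_counts, Color_mode, and so on) are hypothetical, chosen only for illustration; this is one possible way to do it, not necessarily the best one. It counts each non-missing value per Name, sorts so the most frequent value comes first (ties broken alphabetically), keeps the first row per Name, and merges the per-column results.

```sas
/* Example data from the question; trailing commas mark missing Food values */
data raw_data;
   infile datalines dsd dlm=',' truncover;
   input Name :$8. Color :$8. Food :$8.;
   datalines;
Jane,Red,Sushi
Jane,Blue,
Jane,Red,
John,Green,Yogurt
John,Green,Sushi
John,Green,Yogurt
John,Red,
;
run;

/* Hypothetical helper: most frequent non-missing value of &col per Name */
%macro col_mode(col);
   proc sql;
      create table &col._counts as
      select Name, &col, count(*) as n
      from raw_data
      where not missing(&col)
      group by Name, &col;
   quit;

   /* Highest count first; alphabetical order breaks ties */
   proc sort data=&col._counts;
      by Name descending n &col;
   run;

   /* Keep only the top row for each Name */
   data &col._mode;
      set &col._counts;
      by Name;
      if first.Name;
      drop n;
   run;
%mend col_mode;

%col_mode(Color)
%col_mode(Food)

/* Combine the per-column results into one row per Name */
data summary;
   merge Color_mode Food_mode;
   by Name;
run;

proc print data=summary;
run;
```

For the example data this should give Jane / Red / Sushi and John / Green / Yogurt, matching the result requested in the question.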
- Tags:
- sql
10-31-2016
12:38 PM
Hi, I would like to add two simple indexes to a large dataset based on "column1" and "column2". Is adding the two simple indexes effectively the same thing as sorting the dataset on "column1" and adding an index based on "column2", assuming that the datasets will not be sorted again in the future? The options I'm considering are:

proc datasets library=mylib;
modify largeDataset;
index create column1;
index create column2;
quit;

vs.

proc sort data=mylib.largeDataset;
by column1;
run;
proc datasets library=mylib;
modify largeDataset;
index create column2;
quit;

Wouldn't the second option be more space efficient than the first? Thanks for your help!
Adam
SAS version 9.3
07-06-2016
12:54 PM
Thanks for the documentation references. I was expecting all three of the methods that add one to the variable 'a' to give me the same result. I also mistakenly thought that the sum statement, "a+1;", is equivalent to "a = a + 1;". In fact, the sum statement is equivalent to using the SUM function and the RETAIN statement, as shown here:

retain variable 0;
variable=sum(variable,expression);

Thanks.
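
To make the equivalence concrete, here is a small sketch that compares the sum statement with its RETAIN plus SUM form across several observations. The dataset name three and the variable names a, r, and n are assumptions made up for this illustration. The variable n uses the SUM function without RETAIN, so it is reset to missing at the start of every iteration.

```sas
/* Hypothetical three-observation input so the step iterates three times */
data three;
   do i = 1 to 3;
      output;
   end;
run;

data _null_;
   set three;
   retain r 0;        /* explicit RETAIN with an initial value of 0 */
   a + 1;             /* sum statement: a is retained and starts at 0 */
   r = sum(r, 1);     /* the documented equivalent of the sum statement */
   n = sum(n, 1);     /* no RETAIN: n is missing at the top of each iteration */
   put i= a= r= n=;
run;
```

The log should show a and r counting 1, 2, 3 together while n stays at 1, because SUM ignores the missing value of n on every pass.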
07-06-2016
08:37 AM
I'm confused about the way SAS handles missing values. I've recently realized I have to be very careful about assuming what SAS will do when it encounters a missing value. Here is a simple example.

data _null_;
a = .;
b = a + 1;
c = sum(a,1);
a+1;
put a= b= c=;
run;

The result is:

a=1 b=. c=1

This means that adding 1 to missing with + results in missing, but adding 1 to missing with either the sum function or increment operator results in 1. Is there any logical reason for this behavior? Thanks!
05-18-2016
04:49 PM
3 Likes
For future reference, in case anyone else has the same issue, the solution below is what I was really after. Its output is a single table with statistics for each area. This is very helpful if you have many areas. I'm gradually learning how to use the ODS!

ods exclude all;
proc stdrate data=counts refdata=aggregate
method=indirect
stat=rate(mult=100)
plots=smr
;
population event=death total=denom;
reference event=death total=denom;
strata Age;
by area;
ods output smr=Smr_Cs;
run;
ods exclude none;
proc print data=Smr_Cs;
run;
12-08-2015
09:15 AM
I would like to reproduce this simple example using PROC STDRATE. I do not see any way to specify a group option when using the indirect standardization method.

http://www.dartmouthatlas.org/downloads/methods/indirect_adjustment.pdf

Here is my code so far.

data counts;
input area $ age $ death denom ;
datalines;
Area1 65-69 6 500
Area1 70-74 15 300
Area1 75-79 20 200
Area2 65-69 3 300
Area2 70-74 12 300
Area2 75-79 36 400
;
run;
proc sql;
create table aggregate as
select age, sum(death) as death, sum(denom) as denom
from counts group by age;
quit;
/* I need an option to group by area */
proc stdrate data=counts refdata=aggregate
method=indirect
stat=rate(mult=100)
;
population event=death total=denom;
reference event=death total=denom;
strata Age;
run;

Thanks for your help!
08-25-2015
02:10 PM
In the following code...

data all;
input x @@;
datalines;
1 2 3 4 5 6 7 8 9 10
;
data even;
do _n_=1 to howmany;
set all nobs=howmany;
if ^mod(x,2) then output;
end;
run;

I would love it if you could explain the control flow of the second data step. Namely, is there an implied loop created by the set statement or is the implied loop overridden by the outer do loop? Thanks again for your help!
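
One way to explore the control flow is to trace the step in the log. The sketch below is the same code with a hypothetical counter variable, iteration, and two PUT statements added purely for tracing; none of these additions are part of the original program. As far as I understand the DATA step, the implied loop is still there: the DO loop reads every observation during the first pass, and the step ends when SET tries to read past the last observation on the second pass.

```sas
data all;
   input x @@;
   datalines;
1 2 3 4 5 6 7 8 9 10
;

data even;
   iteration + 1;   /* hypothetical counter of implied-loop passes */
   put 'Start of implied-loop pass ' iteration;
   do _n_ = 1 to howmany;
      set all nobs=howmany;        /* reads one observation per DO iteration */
      if ^mod(x,2) then output;    /* keep even values only */
   end;
   put 'Reached the bottom of pass ' iteration;
   drop iteration;
run;
```

If that reading is right, the log should show the start and bottom messages for pass 1, then only the start message for pass 2, because the step stops as soon as SET finds no more observations.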
08-25-2015
02:04 PM
In response to "Values on the RIGHT of the = are known (variables or constants). Values on the LEFT of the = can be new or already known variables."

The behavior that I think is odd is that SAS allows new variables on the RIGHT side of the =. For example:

data out;
new_var1 = new_var2;
run;

NOTE: Variable new_var2 is uninitialized.
NOTE: The data set WORK.OUT has 1 observations and 2 variables.

SAS does print a note telling me that new_var2 is uninitialized but allows it nevertheless. This note could get lost in the log of a large program, making a variable name typo a hard error to find.
08-24-2015
05:37 PM
Thank you all for your help. Here are a couple of examples of what I am talking about. The simplest example is the following.

data output;
if new_var = . then put "new_var exists and was never declared";
run;

A more complicated example comes from a problem I was trying to solve involving a sophisticated merge. Imagine we have a dataset with babies and the days they were born. We also have a dataset with doctors containing flags for the days they worked at the hospital. I wanted to create a dataset that would list all the possible baby-doctor combinations such that the doctor might have delivered the baby, i.e., the doctor worked on the baby's birthday. Below is the solution, which I adapted from code someone posted online in response to this question.

data babies;
input baby_name $ birth_day birth_day_name $;
datalines;
Jake 1 day1
Sonny 4 day4
North 5 day5
Apple 6 day6
;
run;

data doctors;
input DrLastname $ day1 day2 day3 day4 day5 day6;
datalines;
Jones 1 0 0 1 1 1
Lewis 1 1 1 0 0 1
Smith 0 1 1 1 0 1
;
run;

data babies_doctors_array;
array drnames[3] $10 _temporary_;
array drdays[3,6] _temporary_;
/* load doctors dataset into temp arrays */
if _n_=1 then do i = 1 to nobs_doctors;
  set doctors point=i nobs=nobs_doctors;
  array days day1-day6;
  drnames[i]=DrLastname;
  do j = 1 to dim(days);
    drdays[i,j]=days[j];
  end;
end;
/* go through babies to find doctors that worked on their birthday */
set babies;
do k = 1 to nobs_doctors;
  if drdays[k,birth_day]=1 then do;
    babys_doctor = drnames[k];
    output;
  end;
end;
keep baby_name birth_day babys_doctor;
run;

proc print data=babies_doctors_array;
run;

The variable nobs_doctors is used in the do loop before the set statement in which it is declared.

The most recent case of this I've encountered, the one that prompted me to start this discussion, looks like a coding error to me. Here is a really stripped-down version of the code.

data raw;
format dos date9.;
input id dos mmddyy. comp1 comp2 comp3;
datalines;
1 121299 1 0 0
1 121299 0 1 0
1 101103 0 1 0
2 030400 1 1 0
2 030400 0 0 0
2 040400 0 0 1
3 041190 0 1 0
4 092090 0 0 1
4 051589 0 1 0
5 040300 0 0 0
5 071710 1 0 0
5 070899 0 1 0
6 030299 0 1 0
7 121200 1 0 0
;
run;

proc print data=raw;
run;

proc sort data=raw;
by id dos;
run;

data fin;
set raw;
by id dos;
/* not sure about using compsum before it is defined */
if compsum = 0 then no_comps = 1;
compsum = sum(comp1, comp2, comp3);
run;

This just looks like a mistake to me and illustrates why I think this behavior is dangerous. It makes this kind of coding error hard to catch. Thanks again for all your help.
-Adam
08-24-2015
04:29 PM
Thank you for the references! I have found "The SAS Supervisor..." paper particularly helpful. I was aware of the different ways to declare variables in SAS but did not understand how to think about undeclared variables, as in the following example.

data output;
if new_var = . then put "new_var exists but was never declared";
run;

It sounds like in this example the SAS supervisor creates new_var at compile time and initializes it to missing. Then the if statement is performed during execution. This seems like dangerous behavior to me. I could imagine that a typo in a variable name would be a difficult error to find since it would not create a warning. Instead, SAS automatically defines a new variable. Thanks for the help.
08-24-2015
10:20 AM
I have encountered the following situation a handful of times and it has always confused me. As I read through a data step, I notice that a variable is used before it is declared or assigned an initial value. That is, the first mention of a variable as I read the code from top to bottom is in a statement that assumes the variable already has a value. I think I remember reading that the data step does some pre-processing, perhaps in which all variables are created, before any statements are executed. Would someone please explain when referring to a variable before it is declared is allowed in a data step, and how to think about this situation correctly? Thanks! Adam Black