09-28-2020
dr2014
Quartz | Level 8
Member since
07-22-2014
- 162 Posts
- 22 Likes Given
- 2 Solutions
- 1 Likes Received
-
Latest posts by dr2014
Subject Views Posted 1480 09-28-2020 01:33 PM 1523 09-28-2020 11:19 AM 1387 06-04-2020 09:44 AM 5687 04-27-2020 01:33 PM 5731 04-27-2020 12:47 PM 2232 11-21-2018 10:44 AM 2264 11-21-2018 09:46 AM 2273 11-21-2018 09:36 AM 2289 11-21-2018 09:09 AM 929 11-21-2018 09:02 AM -
Activity Feed for dr2014
- Posted Re: How do I plot line graphs for multiple groups with gplot? on Statistical Procedures. 09-28-2020 01:33 PM
- Posted How do I plot line graphs for multiple groups with gplot? on Statistical Procedures. 09-28-2020 11:19 AM
- Posted Adding title to gplot on Statistical Procedures. 06-04-2020 09:44 AM
- Posted Re: Remove exponents from the values on SAS Programming. 04-27-2020 01:33 PM
- Posted Remove exponents from the values on SAS Programming. 04-27-2020 12:47 PM
- Posted Re: Running frequency on Statistical Procedures. 11-21-2018 10:44 AM
- Posted Re: Running frequency on Statistical Procedures. 11-21-2018 09:46 AM
- Posted Re: Running frequency on Statistical Procedures. 11-21-2018 09:36 AM
- Posted Running frequency on Statistical Procedures. 11-21-2018 09:09 AM
- Posted Re: creating unique window periods on SAS Programming. 11-21-2018 09:02 AM
- Posted Re: creating unique window periods on SAS Programming. 11-16-2018 12:48 PM
- Posted creating unique window periods on SAS Programming. 11-16-2018 12:26 PM
- Got a Like for Re: convert year date value from character to numeric. 11-12-2018 12:08 PM
- Posted Re: convert year date value from character to numeric on SAS Programming. 11-12-2018 12:04 PM
- Posted Re: convert year date value from character to numeric on SAS Programming. 11-12-2018 12:04 PM
- Posted convert year date value from character to numeric on SAS Programming. 11-12-2018 11:53 AM
- Posted Re: proc sql using where or case and when clause on SAS Data Management. 06-04-2018 10:38 PM
- Posted proc sql using where or case and when clause on SAS Data Management. 05-23-2018 12:47 PM
- Posted Re: Accessing big data using SAS PC on SAS Data Management. 04-30-2018 11:22 AM
- Liked Re: Accessing big data using SAS PC for Kurt_Bremser. 04-30-2018 11:20 AM
-
Posts I Liked
Subject Likes Author Latest Post 1 1 1 1 1 -
My Liked Posts
Subject Likes Posted 1 11-12-2018 12:04 PM
04-25-2018
10:49 AM
Thanks for your reply @Tom Yes, there was an error in my value for the 'new_variable' in the last sample and also it doesn't make a difference if the values for new_variable is character or numeric. Let me try your code. An array could be a better solution than the macro. I will get back to the post in a bit.
... View more
04-25-2018
10:41 AM
Thanks for your reply @Patrick, however &og is not pa pa1 pa2 pa3 pa4 but the values in it which are many in the real dataset. Please review my reply to @novinosrin
... View more
04-25-2018
10:34 AM
@novinosrinThe macro was available and I wanted to make use of it.
This is what I am trying to accomplish. Here are the basic statements...
%let og=(5,6,7);
%let tw=(9,3);
data want;
set have;
if ( a in (&og.) or
pa1 in (&og.) or
pa2 in (&og.) or
pa3 in (&og.) or
pa4 in (&og.) ) and
( lt=1 and lp in (&tw.) ) then new_variable =1;
else new_variable=0;
run;
I wanted to avoid using the multiple 'or' syntax hence the macro.
Please let me know if there is way to avoid that and/or make use of the macro. Also I had an error in the dataset 'want' and I have copy pasted both below. The 'have' dataset have is large and there are multiple records for each id.
have
id
pa
pa1
pa2
pa3
pa4
lt
lp
1
5
5
5
5
1
9
2
6
6
6
1
3
3
7
7
7
7
7
2
2
want
id
pa
pa1
pa2
pa3
pa4
lt
lp
new_variable
1
5
5
5
5
1
9
1
2
6
6
6
1
3
1
3
7
7
7
7
7
2
2
0
... View more
04-24-2018
07:09 PM
Hi all,
I am using SAS EG and a macro with 'if then else' and 'do loop' to create a 'new variable' with values of '1' or '0' based on the values of more than 1 variables (pa, pa1 - pa4, lt and lp). Below is the structure of my dataset called 'have'
id
pa
pa1
pa2
pa3
pa4
lt
lp
001
5
5
5
5
1
9
002
6
6
6
1
3
003
7
7
7
7
7
2
2
bellow is the dataset I need called 'want'
id
pa
pa1
pa2
pa3
pa4
lt
lp
new_variable
001
5
5
5
5
1
9
1
002
6
6
6
1
3
0
003
7
7
7
7
7
2
2
1
%macro midds;
data want;
set have;
if (pa in (&og.) %do i=1 %to 4; or pa&i. in (&og.) %end; and
(lt='1' and (lp in (&tw.))then new_variable='1';
else new_variable='0';
run;
%mend;
%midds;
&og is a macro for all required values in pa, pa1- pa4
&tw is a macro for all required values in lp
When I run the code, I get an error which says 'no matching if then clause'. Could you please point out to me what is wrong with my code? Thanks.
... View more
08-17-2017
09:32 PM
@ChrisBrooks I got what I needed. I wasn't going about it the right way. To get to table_c , here is the new approach . I first compressed the perod_id and date fields to get a unique id , relabeled the variables code1 and code2 in table_a and table_b respectively as 'code' and then stacked table_a and table_b . I then used conditional logic using the unique id and its associated unique values in the code variable to get to table_c as I needed an addtional derived variabale as well.
Thanks!
... View more
08-17-2017
07:11 PM
Hi @Reeza and @ballardw,
I provided the tables as examples. The full code is here:
proc sql;
create table table_c as
select *
from table_a c full join table_b t
on c.id=t.id and c.date=t.date;
quit;
I ran the above code and it gave me an error saying there are duplicates in 'id' column. I also realized joins do not overlay columns. I was wondering if there was way through joins to go about getting table_c with unique rows for values in code1 and code2. Hope this explains what I am trying to acheive. Thanks.
... View more
08-17-2017
05:57 PM
Hi,
I am looking to use a full join versus a merge. Below are 2 datasets Table_a and Table_b that I want to do a full join on to get Table_c:
table_a
id
Code1
date
source
1
h
d1
abc
1
h
d2
abc
1
c1
d3
abc
1
c2
d4
abc
table_b
id
Code2
date
source
1
n1
D1
abc
1
n1
D4
abc
1
n2
D5
abc
table_c
id
Code1
Code2
date
source
1
h
ndc1
d1
abc
1
h
ndc1
d2
abc
1
c1
ndc2
d3
abc
1
c2
d4
abc
would the below code help me acheive table_c?
select *
from table_a c full join table_b t
on c.id=t.id and c.date=t.date
Please let me know asap. Thanks.
... View more
05-31-2017
12:18 PM
@ballardw Thanks much for the solution! I realized you had provided it already and it was vey helpful.
... View more
05-24-2017
05:52 PM
Thanks @ballardw for the tips. Actually the array worked for me and I converted the character values to numeric after tweaking it.
I gave a value of :
'1' to the event1 value of interest (which is "in_target")
' 2' to those of event1 values of no interest and
a period for missing and blank values.
I am now concerned about calculating the average proportion for each event and then its confidence interval.
@Reeza , I am looking to use the proc freq with the binomial option. At present I have transposed my dataset and it looks like this:
Value '1' for variable "Event" is the event value fo interest.
ID
Period
Event
1
period1_event1
1
1
period2_event1
.
1
period3_event1
.
1
period4_event1
.
2
period1_event1
2
2
period2_event1
1
2
period3_event1
.
2
period4_event1
.
3
period1_event1
1
3
period2_event1
1
3
period3_event1
.
3
period4_event1
.
Is this how I should be executing the proc freq with the binomial option?
proc freq data=have;
ods output Binomial=ci;
by id;
tables event / binomial;
run;
If 'yes' then I will be getting proportion and Upper Conf Limit and Lower Conf Limit by each id. Do I then extract all these estimates for each id from output dataset 'ci' and take its average for the population? Please advice.
Also, another way I apporached this is to calculate the numerator,denominator and proportion for each distint id and then calculate the average proportion for the population.
This is how my output looks for the population for event 1:
Table 1
ID
Period1
Period2
Period3
Period4
Numerator
Denominator
Proportion Calculation
Average_proportion
ID1
in_target
Missing
low
in_target
2
3
=0.66
(0.66+0+0.75)/3=0.47
ID2
low
low
Missing
missing
0
2
=0
ID3
low
in_target
in_target
in_target
3
4
=0.75
Should I calculate the confidence intervals manually using
P = PERCENT/100 ;
N = COUNT + LAG(COUNT) ;
LB = P - ( 1.96*SQRT( P*(1-P)/N ) ) ;
* reset lower bound to 0 if <0 ;
IF LB < 0 THEN LB = 0 ;
UB = P + ( 1.96*SQRT( P*(1-P)/N ) ) ;
* reset upper bound to 1 if >1 ;
IF UB > 1 Then UB = 1 ;
What would be the most efficiat way to do this? Please reaply as soon as possible. Much thanks for your help.
... View more
05-24-2017
01:32 PM
Hi @ballardw. Thanks for your reply.Yes, I should have known to display my data in a form that can be tested. Also, I realized after that proc sql will require me to transpose the data as count will not be applicable to multiple columns.
I wanted to avoid that. If possible, I would like to use the sum function. I anyways have to convert the values of period1 to period5 to numeric. I also have blanks in the data that have to be converted into missing numeric values. So on a clean slate here is my data and a data manipulation question as well. The values of interest to me are 'in_target'. The other values are 'low' and 'high' which are the 'event vaues not of interest'. 'missing' and blank values both need to be treated as 'missing'.
data have; input id period1 period2 period3 period4 period5
datalines;
1 in_target
2 in_target missing low in_target
3 high high missing missing missing
4 low in_target in_target in_target in_target
run;
I am looking to convert the values of in_target to 1 , low and high as '0' ,missing and blank with a 'period'.
I used the array
array ch(*) $ period:;
array nu(*) periodnew:;
do i = 1 to dim(ch);
if ch(i)='in_target' then do;
nu(i)=1;
end;
else if ch(i)='low' then do;
nu(i)=0;
end;
else if ch(i)='high' then do;
nu(i)=0;
end;
else if ch(i)='missing' then do;
nu(i)=.;
end;
else if ch(i)=' ' then do;
nu(i)=.;
end;
end;
run;
I am getting an error stating
ERROR: Array subscript out of range at line # column #
Could you first please tell me what is wrong with my array code?
... View more
05-24-2017
10:51 AM
Hi all,
I need your advice urgently to calculate the ‘confidence intervals’ of an average proportion. Please refer to example Table 1 for the original data. My final table has to look like Table 2.
Table 1. displays the example of the calculation of the ‘average proportion’ of the population for ‘event 1’. The numerator is the count of ‘event1_of interest’ across the time periods (period 1 to period 4) divided by the ‘total count of the time periods’ for the id excluding the period with ‘missing value’ for the event1. After identifying the numerator and denominator for each id, I have to calculate the proportion for each id and then the ‘average proportion’ for the population which is displayed in Table 1.
I have to then estimate the ‘confidence intervals’ for the ‘average proportion’ for each event for the population and create Table 2.
I am looking to calculate the counts(numerator, denominator) and proportion using proc sql. How can I calculate the confidence intervals for the average proportion? Please let me know as soon as possible. Thanks.
Table 1
ID
Period1
Period2
Period3
Period4
Proportion of Event1_Time in target
Proportion Calculation
ID1
Event1_of interest
Missing
Event1_not of interest
Event_of interest
= 2 / 3
=0.66
ID2
Event1_not of interest
Event1_not of interest
Missing
missing
= 0 / 2
=0
ID3
Event1_not of interest
Event1_of interest
Event1_of interest
Event1_of interest
= 3 / 4
=0.75
Average_Proportion
=(0.66+0+0.75)/3=0.47
Table 2
Average_Proportion
95% Confidence Limit
Event
Event 1
0.47
Event 2
Event 3
Event 4
... View more
05-15-2017
04:06 PM
Addtionally, when I output the chart I get a inner and outer box. Is there a way to get rid of it?
Thanks.
... View more
05-15-2017
04:01 PM
Hi all,
I am creating a stacked bar chart for percent scored on test which is categorized in difefrent groups as low , midddle and high. Below is the data and my code. The 'Score' variable is in percent.
(1) I need to reduce the width of the bar and
(2) also add 'Score' of the 'Category' to each segment in the bars.
(3) Remove the word 'Category' on the legend.
(4) adjust the percent 'interval' on the y-axis and add the symbol % next to the numbes on the grid lines. Please advise as soon as possible. Thanks.
proc sgplot data=want;
title 'Distribution XXXXX;
vbar Test / response=value group=Category;
xaxis display=(nolabel);
yaxis grid label='Percent %';
run;
Category
Test
Score
low
CHE
11
middle
CHE
83
high
CHE
3
low
PHY
7
middle
PHY
54
high
PHY
38
low
MAT
10
middle
MAT
69
high
MAT
19
low
BIO
16
middle
BIO
33
high
BIO
51
... View more
05-15-2017
02:07 PM
Hi DanH_sas, thank you I tried it and I can understand its value. I will definitely remember to use it, but now I am required to display the chart in the previous manner. Also any help on displaying the 'N; below the time periods would be appreciated!
... View more