quickbluefish
Lapis Lazuli | Level 10
Member since
07-27-2018
- 201 Posts
- 54 Likes Given
- 20 Solutions
- 66 Likes Received
Activity Feed for quickbluefish
- Got a Like for Re: Using sql subquery:thoughts and what I learnt from a practice question of SQL1:Essentials. Wednesday
- Got a Like for Re: Using sql subquery:thoughts and what I learnt from a practice question of SQL1:Essentials. Wednesday
- Got a Like for Re: How do I identify Subtypes based on specific algorithm. Tuesday
- Posted Re: Using sql subquery:thoughts and what I learnt from a practice question of SQL1:Essentials on Advanced Programming. Monday
- Posted Re: How to score elements of an array specified in another dataset on SAS Programming. Monday
- Got a Like for Re: Proc T Test Help. Saturday
- Posted Re: How do I identify Subtypes based on specific algorithm on SAS Programming. a week ago
- Got a Like for Re: Can I add linebreaks to the select into :macro_vars?. a week ago
- Posted Re: Can I add linebreaks to the select into :macro_vars? on SAS Programming. a week ago
- Liked Re: I need help using sas with unix commands for ballardw. a week ago
- Got a Like for Re: Odds Ratio Interpretation. a week ago
- Got a Like for SQL LIKE operator and whitespace. a week ago
- Got a Like for Re: Another way(and simpler) to solve practice question m104p11 on page 224 ,Macro1: Essentials pdf. a week ago
- Got a Like for Re: Another way(and simpler) to solve practice question m104p11 on page 224 ,Macro1: Essentials pdf. a week ago
- Posted Re: Another way(and simpler) to solve practice question m104p11 on page 224 ,Macro1: Essentials pdf on Advanced Programming. a week ago
- Posted Re: PROC SQL to find nonmatches on Programming 1 and 2. a week ago
- Posted Re: Assign Flag as minimum value , if same value then last records flag on SAS Programming. 2 weeks ago
- Posted Re: Prediction Interval on Statistical Procedures. 2 weeks ago
- Posted Re: How to import the name of all .png/jpeg files in a folder? on SAS Programming. 2 weeks ago
- Posted Re: How to import the name of all .png/jpeg files in a folder? on SAS Programming. 2 weeks ago
Monday
2 Likes
Not that it's shorter in this case, but you could also solve this using subqueries in a join like this:
proc sql;
select a.countrycode, a.indicatorname,
a.estyear1/100 as estpct1 format=percent8.2,
a.estyear3/100 as estpct3 format=percent8.2,
estyear1
from
(select * from sq.globalfindex
where indicatorname="Borrowed for health or medical purposes (% age 15+)") A
inner join
(select countrycode from sq.globalmetadata
where upcase(region)=upcase("Europe & Central Asia") and
upcase(incomegroup)=upcase("High income")) B
on a.countrycode=b.countrycode
order by a.estyear1 desc;
quit;
Monday
Here is some fake data to more or less match what you have in your screenshots - run this and use PROC PRINT to look at these if you like:
data seqs;
length seq $20 sequences $6 s1-s8 3;
array s {*} s1-s8;
do i=1 to dim(s);
s[i]=(rand('uniform')<0.3);
end;
drop i;
call symputx("nseqs",_N_-1);
infile cards dsd truncover firstobs=1 dlm='|';
input seq sequences;
cards;
1,2,3,4,5,6,7,8|seq1
1,2,3,4|seq2
1,3,6,8|seq3
3,4,7|seq4
;
run;
data values;
length score1-score8 3;
array s {*} score1-score8;
do r=1 to 20;
do i=1 to dim(s);
s[i]=(rand('uniform')<0.6);
end;
output;
end;
drop r i;
run;
I can't say I really understand your request - especially whether you're trying to sum things vertically or horizontally. My assumption based on what you said is that you want to sum horizontally... :
data want;
set
seqs (in=A)
values
;
array T {&nseqs} $20 _temporary_;
array sc {*} score1-score8;
array sq {*} seq1-seq4;
if A then T[_N_]=seq;
else do;
call missing(of sq[*]);
do i=1 to dim(sq);
do s=1 to countW(T[i],',');
sq[i]+sc[scan(T[i],s,',')*1];
end;
end;
output;
end;
keep score1-score8 seq1-seq4;
run;
proc print data=want; run;
a week ago
1 Like
The first part is just a slight re-work of your input dataset - it was producing all sorts of errors trying to read as it was:
data WORK.SUBTYPE_SAMPLE;
infile cards dsd truncover firstobs=1 dlm=',';
length ID $12 type $12 reference_date service_start service_end 4;
informat reference_date service_start service_end date9.;
format Reference_date DATE9. service_start DATE9. service_end DATE9.;
input ID Type Reference_date service_start service_end;
cards;
1,A,04JAN2016,10JAN2016,21JAN2016
1,B,04JAN2016,09JUL2018,09NOV2019
1,Unspecified,04JAN2016,06JAN2016,10FEB2016
2,B,08JUN2019,08DEC2019,19DEC2019
2,Unspecified,08JUN2019,22OCT2019,09AUG2019
3,Unspecified,02FEB2017,02APR2017,15APR2017
4,A,01JAN2020,03MAR2020,24MAR2020
4,A,01JAN2020,05MAY2018,10MAY2018
4,Unspecified,01JAN2020,02JAN2020,03JAN2020
5,A,09SEP2016,11NOV2016,15NOV2016
5,B,09SEP2016,09SEP2016,10NOV2016
6,A,03MAR2016,30AUG2016,02NOV2016
6,A,03MAR2016,14OCT2016,19OCT2016
6,A,03MAR2016,26MAR2016,19DEC2016
6,Unspecified,03MAR2016,20OCT2016,21OCT2016
6,Unspecified,03MAR2016,12DEC2016,28DEC2016
6,B,03MAR2016,28JUN2016,15AUG2016
7,B,10OCT2022,11OCT2022,14NOV2022
8,Unspecified,01JAN2019,05MAY2019,06MAY2019
8,Unspecified,01JAN2019,07MAY2019,08MAY2019
;
run;
proc sort data=subtype_sample; by id; run;
data want;
set subtype_sample;
by ID;
length true_type $12 closest 4 anyAB 3;
retain true_type closest anyAB;
if first.ID then do;
closest=10000;
true_type='';
anyAB=0;
end;
dist=min(
abs(service_start-reference_date),
abs(service_end-reference_date)
);
if type in ('A', 'B') then do;
anyAB=1;
if dist<closest then do;
true_type=type;
closest=dist;
end;
end;
else if anyAB=0 then true_type='Unspecified';
if last.ID then output;
keep ID true_type closest;
run;
proc print data=want; run;
a week ago
1 Like
Here's another option if you want to save them into separate macro variables:
proc contents noprint data=sashelp.cars out=namy (keep=name label); run;
data _null_;
set namy end=last;
nm=quote(strip(name));
lbl=quote("put text here");
c=',';
if last then c='';
msg=compbl('{ name= ' || nm ||' label=' || lbl || '}' || c);
put msg;
call symputx(compress("msg" || _N_), msg);
run;
* for example, the 3rd name/label combination ;
%put MSG3: &msg3;
a week ago
2 Likes
Agree with others here - I would just add that with COUNTW and SCAN (and similar), it's a good idea to specify your delimiter as the optional last argument - otherwise, SAS falls back on its default delimiter list, which treats a lot of punctuation (commas, periods, hyphens, slashes and so on) as delimiters along with the blank. And further, if the delimiter is a space (as it is here), it's never a bad idea to clean up potential multiple whitespace characters with %CMPRES:
%let YRLIST=2012 2014 2016;
%let YRLIST=%CMPRES(&YRLIST);
%let nYRS=%sysfunc(countW(&YRLIST,' '));
.... %let YR=%SCAN(&YRLIST, &i, ' ');
/* in macro code the quotes in ' ' are themselves treated as delimiter characters, so %STR( ) is the cleaner way to specify a space */
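To show the pieces above working together, here is a minimal sketch of a complete loop over the list. The macro name LOOP_YEARS and the %PUT message are just placeholders I made up for illustration - swap in whatever you actually need to do per year:

```sas
/* Hypothetical macro name - loops over a space-delimited list of years */
%macro loop_years(yrlist);
  %local i nyrs yr;
  /* collapse repeated blanks so the word count is reliable */
  %let yrlist=%cmpres(&yrlist);
  /* %str( ) passes a single blank as the only delimiter */
  %let nyrs=%sysfunc(countw(&yrlist,%str( )));
  %do i=1 %to &nyrs;
    %let yr=%scan(&yrlist,&i,%str( ));
    %put NOTE: processing year &yr;
  %end;
%mend loop_years;

%loop_years(2012   2014 2016)
```

%CMPRES is an autocall macro, so this assumes the default autocall library is available in your session (it normally is).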
a week ago
It's a little hard to follow what you're doing - for one thing, you can comment out multiple lines like this to make things easier to read:
/*
Here is a
comment that is
3 lines long. */
Merging / joining on strings (especially when they're just names that are entered freehand) is not ideal as you've discovered, but sometimes that's all you have. You could try using some sort of "fuzzy" merge like this:
https://blogs.sas.com/content/sgf/2021/09/21/fuzzy-matching/
As for the join, I think you will have an easier time assessing with a single LEFT join and then doing some counts on the resulting data:
PROC SQL;
create table want as
select b.permno, a.bobnamesNew, b.company_name_header, a.bobnamesOriginal
from
work.Temp A
LEFT JOIN
(select distinct permno, company_name_header from names.CRSPnames) B
on a.bobnamesNew=b.company_name_header;
title "# distinct bobNames without a match in CRSP";
select count(distinct bobnamesNew) from WANT where missing(permno);
title;
QUIT;
Again, you're going to either have to try a fuzzy merge or, more likely, actually manually correct the bobnames. If there are just things like differences in case, whitespace, special characters, etc., then you probably could make this more automated, but you'd have to provide some examples here in order for people to help.
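As a rough illustration of the "more automated" route, here is a sketch that builds a normalized join key by upper-casing the name and keeping only letters and digits. The dataset name, variable names and sample names are all made up - I don't know what the real bobnames look like, so treat this as a starting point only:

```sas
/* Hypothetical example names that differ only in case, spacing and punctuation */
data names_demo;
  length raw $40 key $40;
  input raw $char40.;
  /* 'k' = keep the listed classes, 'a' = letters, 'd' = digits */
  key = upcase(compress(raw, , 'kad'));
  cards;
Acme Corp.
ACME  CORP
acme-corp
;
run;

proc print data=names_demo; run;
```

You would build the same KEY on both tables and join on that instead of the raw name. It won't help with genuine misspellings or abbreviations - that's where the fuzzy matching comes in.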
2 weeks ago
This is a pretty tricky problem. It might help if we understood what the purpose of the NEW_FLAG variable is?
2 weeks ago
This sounds like homework? Not sure this is really a SAS question so much as a basic stats question, but standard deviation is the square root of the variance. And standard error (SE) is the standard deviation divided by the square root of the sample size. Your 95% prediction/confidence interval (CI), in this case, is just going to be evenly distributed on either side of your mean, with the lower limit being:
mean - 1.96 * SE
...and the upper limit being:
mean + 1.96 * SE
If your current variance is, say, 16, and your sample size is 18, then your SE would be 4/sqrt(18). So.... if you increased your sample size by 10, then what is your SE? And how does that affect the upper and lower bounds of the CI?
The population standard deviation has a very slightly different formula than the sample SD: it divides the sum of squared deviations by n rather than n-1.
ChatGPT is your friend. Better yet, find a tiny fake dataset and calculate variance, SD, SE and CI by hand - really.
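Once you've done it by hand, something like this sketch can be used to check your numbers. The six values are invented, and the dataset and variable names (TINY, STATS, CI, etc.) are just placeholders:

```sas
/* Hypothetical toy data: 6 made-up measurements */
data tiny;
  input x @@;
  cards;
4 8 6 5 9 10
;
run;

/* N, mean, sample variance (n-1 divisor), SD and SE in one row */
proc means data=tiny noprint;
  var x;
  output out=stats n=n mean=mean var=variance std=sd stderr=se;
run;

data ci;
  set stats;
  se_check = sd/sqrt(n);     /* should match SE from PROC MEANS */
  lower = mean - 1.96*se;    /* normal approximation, as described above */
  upper = mean + 1.96*se;
run;

proc print data=ci;
  var n mean variance sd se se_check lower upper;
run;
```

Note that if you ask PROC MEANS for confidence limits directly (the CLM keyword), it uses the t distribution rather than 1.96, so with a tiny sample those limits will be a bit wider than the ones computed here.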
2 weeks ago
Great. The only reason it's not resolving the macro name is because you have the whole string (starting with dir) inside single quotes. SAS will only resolve macro variables in double quotes (or no quotes at all). You'd have to do something a little trickier to get that particular thing to work with a macro variable, but probably just easier to hardcode your user name unless this really needs to be dynamic.
It might work just like this (assuming no spaces in your file path):
filename have pipe "dir /b /s C/&user/myfolder\*.png";
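If you want to see the single vs. double quote behavior for yourself, here is a small sketch - the user name jdoe is made up:

```sas
%let user=jdoe;   /* hypothetical user name */

data _null_;
  single = 'dir /b &user';   /* single quotes: stays as the literal text &user */
  double = "dir /b &user";   /* double quotes: &user is resolved before the step runs */
  put single= / double=;
run;
```

Check the log - only the double-quoted version should show the resolved name.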
2 weeks ago
How about this - the first data step is just generating some fake data. This allows gaps in months. I am not sure what your ID variable is supposed to be, though:
data have;
date='01Jan2023'd;
format date date9.;
do i=1 to 50;
date+rand('integer',5,45);
co2=rand('integer',1,20);
output;
end;
drop i;
run;
proc sort data=have; by date; run;
proc sql noprint;
select min(intnx('month',date,0)), max(intnx('month',date,0))
into :firstmonth trimmed, :lastmonth trimmed from have;
quit;
data _null_;
call symputx('nmonths',intck('month',&firstmonth,&lastmonth)+1);
run;
%put NMONTHS: &nmonths;
data want;
set have end=last;
array T {-1:&nmonths} _temporary_;
T[intck('month',&firstmonth,date)+1]+co2;
if last then do;
do i=1 to &nmonths;
yrmonth=put(intnx('month',&firstmonth,i-1),yymmn6.);
month_m0=T[i];
month_m1=T[i-1];
month_m2=T[i-2];
output;
end;
end;
keep yrmonth month_:;
run;
proc print data=want; run;
2 weeks ago
By 'predictive', I just mean it's positively associated with hypertension, not that rurality causes hypertension.
2 weeks ago
2 Likes
I think your life will be easier if you use the DESCENDING option in the proc logistic statement -- that makes SAS model the probability of the higher response value (1), so 0 becomes the referent ("no") category for the outcome. With RURAL entered as a plain numeric 0/1 variable (no CLASS statement), the odds ratio is then for rural=1 vs. rural=0. Right now, you've got 0 as the referent group for your dependent (left hand side) variable (because of the event= syntax) and *1* as the referent group for the rural variable. So interpretation is pretty non-intuitive at the moment. Instead, do this:
proc logistic data=work.research_lr DESCENDING;
model hot_spot = rural;
run;
Doing the above should result in an odds ratio that's the reciprocal of what you currently have -- 1/0.627 = 1.595. An OR of 1.595 (assuming a confidence interval that does not include 1) would mean that rurality is predictive of hot spot hypertension. More specifically, the interpretation is that the odds of hypertension are about 1.6 times as high for rural people as for non-rural people.
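If you'd rather spell out the reference levels than rely on sort order, a sketch like this should give the same odds ratio. It assumes both HOT_SPOT and RURAL are coded 0/1 with no formats attached:

```sas
proc logistic data=work.research_lr;
  /* 0 (non-rural) is the reference level; reference-cell coding */
  class rural (ref='0') / param=ref;
  /* model the probability that hot_spot = 1 */
  model hot_spot(event='1') = rural;
run;
```

This is more typing, but it makes the referent categories obvious to anyone reading the code later.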
3 weeks ago
Somehow I've never seen the ASPECT option in SGPLOT / SGPANEL - that is really good to know.
3 weeks ago
1 Like
Agree this is odd - especially the ones that show up on the bottom when the lines are close together. At least for the ones that place the label on top, I think the algorithm is just trying to put it in the least ambiguous position possible, even if it doesn't look great. It's more obvious when you have a lot of lines, some of which (like in your plot) only extend part way down the x-axis:
You could play around with something like this instead -- labeling with the TEXT statement instead of the CURVELABEL options:
data test;
do grp=1 to 10;
ymean=rand('erlang',2)*5;
nwks=25;
if ranuni(0)<0.4 then nwks=rand('integer',5,25);
x=.; y=.;
do wk=1 to nwks;
yval=ymean+rand('normal')*10;
if wk=nwks then do;
x=wk;
y=yval;
end;
output;
end;
end;
run;
proc sgplot data=test noautolegend;
series x=wk y=yval / group=grp;
text x=x y=y text=grp / group=grp textattrs=(size=12pt) position=right backfill backlight;
scatter x=x y=y / group=grp markerattrs=(size=12pt);
run;
Looks a little silly as-is, but if you fiddle with the options, I would bet you can get something good.