About christinakwang

christinakwang · ‎06-11-2017

@s_lassen Thanks for your response! The answers to questions 1 and 2 make sense. Re: 3, I think this is Truven pharmacy claims data.

data __drugs;
  set
  %do i=1 %to %sysfunc(countw(&pharm_files.));
    %let filename = %scan(&pharm_files., &i);
    raw.&filename.
  %end;
  ;
/* This was subsequently cleaned (i.e. removed negative values, etc.) and made into collapse_drugs_2; The purpose of the code in the OP was "get the mode of daysupp per NDC" */

I added in the comments. Not sure what else I can provide? I don't even think I have access to the actual data, though I do have the data dictionary + the standard data summary that Truven gives? Thanks!!

christinakwang · ‎06-11-2017

Thanks for the quick response! It doesn't seem like there's an option 😞 There is:
Enable autocomplete
Enable hint
Tab width
Substitute spaces for tabs
Enable color coding
Show line numbers
Font size
Enable autosave
Autosave interval

christinakwang · ‎06-11-2017

Currently it's on a white background. Does anyone know how to change this? Thanks.

christinakwang · ‎06-03-2017

Thanks!

2. Re: why "group by" is necessary. In that case, what happens if we have fewer group by variables than selected variables? I.e. below, where we select ndcnum, metqty, daysupp alongside the count, but we only group by ndcnum and daysupp?

proc sql;
  create table mode_b_1 as
  select count(*) as num_obs, ndcnum, metqty, daysupp
  from collapse_drugs_2 (where=(daysupp^=.))
  group by ndcnum, daysupp
  order by ndcnum, num_obs desc;
quit;

3. Re: proc means step: Can you please explain the difference between mode_a_1 and mode_a_2? From my understanding of the code, they are identical to each other?
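For what it's worth, here is a minimal sketch of what PROC SQL does when a selected column (metqty) is left out of the group by. The toy_claims table and its values are made up for illustration; only the variable names come from the post. As far as I know, SAS does not reject this query: it computes the count per (ndcnum, daysupp) group and remerges it onto every input row, and the log carries a note about remerging summary statistics.

```sas
/* Hypothetical toy data; names mirror the post, values are invented. */
data toy_claims;
    input ndcnum $ metqty daysupp;
    datalines;
A 30 30
A 60 30
A 30 30
A 90 90
;
run;

proc sql;
    /* metqty is selected but not grouped: the count is taken per
       (ndcnum, daysupp) and merged back onto all four input rows,
       so the result is NOT collapsed to one row per group. */
    create table toy_remerged as
    select count(*) as num_obs, ndcnum, metqty, daysupp
    from toy_claims
    group by ndcnum, daysupp
    order by ndcnum, num_obs desc;
quit;
```

On this toy input, the three rows with daysupp=30 each carry num_obs=3 and the single daysupp=90 row carries num_obs=1, so duplicated (ndcnum, metqty, daysupp) rows survive into the output.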

christinakwang · ‎06-03-2017

I am trying to understand this code inherited from a coworker who is no longer here:

proc sql;
  create table mode_a_1 as
  select count(*) as num_obs, ndcnum, metqty, daysupp
  from collapse_drugs_2 (where=(daysupp^=.))
  group by ndcnum, daysupp, metqty
  order by ndcnum, metqty, num_obs desc;
quit;

proc means data=mode_a_1 noprint nway missing;
  class ndcnum num_obs metqty;
  var daysupp;
  output out=mode_a_2 (drop=_:) mean=;
proc sort;
  by ndcnum metqty descending num_obs;
run;

For context, collapse_drugs_2 is a dataset with drug claims.

Questions:

1. For the proc sql step, is this an accurate understanding of what the code is doing? Create a dataset called "mode_a_1" that is based on all rows of collapse_drugs_2 (where daysupp values are not missing), where we also create a new variable called "num_obs" that counts the number of observations for each unique combination of ndcnum, metqty, daysupp. Then order by ascending ndcnum, metqty, and then by descending num_obs.

2. If the above is true, why is the "group by" line necessary? Isn't the combination of ndcnum, metqty, daysupp already specified in the select count(*) line?

3. Why is the proc means step necessary? Aren't all daysupp values (and therefore the mean of daysupp) for each ndcnum, num_obs, metqty combination the same, because that's how we by definition organized the mode_a_1 dataset? Isn't mode_a_1 the same dataset as mode_a_2?

Thanks,
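If the goal is the most frequent daysupp per ndcnum (that is my assumption about the intent of the inherited code), a rough sketch on made-up data could look like the following. The table names toy_claims, toy_counts and toy_mode are hypothetical, and this is not the coworker's method, just one way to get a mode with PROC SQL. It groups by every non-aggregated column in the first step, then keeps the highest count per ndcnum in the second.

```sas
/* Hypothetical toy data; names mirror the post, values are invented. */
data toy_claims;
    input ndcnum $ metqty daysupp;
    datalines;
A 30 30
A 30 30
A 60 30
A 90 90
B 30 .
B 30 15
;
run;

proc sql;
    /* One row per (ndcnum, daysupp) with its frequency;
       rows with missing daysupp are dropped first. */
    create table toy_counts as
    select ndcnum, daysupp, count(*) as num_obs
    from toy_claims (where=(daysupp ^= .))
    group by ndcnum, daysupp;

    /* Keep the most frequent daysupp per ndcnum.
       Ties are all kept, so a tie yields more than one row. */
    create table toy_mode as
    select ndcnum, daysupp, num_obs
    from toy_counts
    group by ndcnum
    having num_obs = max(num_obs);
quit;
```

The second query relies on PROC SQL remerging max(num_obs) back onto each row of the group, which is SAS-specific behaviour; other SQL dialects would need a subquery there.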

christinakwang · ‎02-19-2017

Thanks Reeza!! This works!!!!!!!!!! 🙂

christinakwang · ‎02-06-2017

Still not sure if I'm getting a solution... 😞

christinakwang · ‎02-01-2017

I'm just a beginner here, but are you hoping to output all duplicates into another dataset, or just remove duplicates? If just to remove duplicates, then

proc sort data=duplicates out=deduped;
  by var1 var2;
run;

proc sort data=deduped nodupkey;
  by var1 var2;
run;
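For the other case (sending the duplicates to their own dataset), a minimal sketch using the DUPOUT= option of PROC SORT is below. The input values and the dataset name dups are made up for illustration; var1 and var2 follow the post.

```sas
/* Hypothetical input; values are invented. */
data duplicates;
    input var1 var2 other $;
    datalines;
1 1 a
1 1 b
2 1 c
;
run;

/* NODUPKEY keeps the first row of each var1/var2 key in DEDUPED;
   DUPOUT= writes the rows that were dropped to DUPS. */
proc sort data=duplicates out=deduped nodupkey dupout=dups;
    by var1 var2;
run;
```

This does the sort and the de-duplication in one step, so the second PROC SORT is not needed.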

christinakwang · ‎02-01-2017

Still not sure how I can use a sql join?

proc sql;
  create table atissue.peers as
  select a.fund_name, b.*
  from atissue.atissue a inner join atissue.potentialpeers b
    on a.year = b.year
    AND a.index_flag = b.index_flag
    AND a.load = b.load
    AND a.fof_flag = b.fof_flag
  where b.class in (select distinct class from atissue.atissue);
  *Used to have (select distinct class from atissue.atissue where fund_name = "XXXX")
   Not sure how we can make the code go through all XXX *separately*
   i.e. assemble all the matches for Hamilton only. and then Hairspray. And then In the Heights. and then Les Miserables, etc.;
quit;

For example, the above code is wrong because I end up getting this:

data ATISSUE.PEERS;
  infile datalines dsd truncover;
  input Fund_name:$14. Peer_name:$8. Year:BEST. Class:$2. Peer_sales:BEST. Index_Flag:BEST. Load:$2. FoF_Flag:BEST.;
datalines4;
Hamilton,Rubella,2005,GG,3,1,CW,1
Hamilton,Rubella,2006,NK,7,0,NK,0
Hamilton,Rubella,2007,NK,9,0,NK,0
Hamilton,Smallpox,2007,NK,4,0,NK,0
Hamilton,Rubella,2008,CW,2,1,NK,1
Hamilton,Rubella,2010,NK,9,0,GG,0
Hairspray,Rubella,2007,NK,9,0,NK,0
Hairspray,Smallpox,2007,NK,4,0,NK,0
Hairspray,Smallpox,2008,NK,2,0,NK,0
Hairspray,Mumps,2008,NK,9,0,NK,0
Hairspray,Smallpox,2009,NK,5,0,HC,0
In the Heights,Measles,2007,HC,5,1,CW,0
In the Heights,Mumps,2009,GG,7,1,NK,1
In the Heights,Mumps,2010,NK,5,0,NK,0
Les Miserables,Measles,2008,HC,6,1,GG,1
Les Miserables,Measles,2009,HC,2,0,CW,0
Les Miserables,Measles,2010,GG,3,1,CW,0
;;;;

But the first line (Hamilton, Rubella, 2005, GG, 3, 1, CW, 1) should not have been output, because the "GG" class is not in any of the Hamilton years.
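One possible way to do the per-fund class check inside a single join is a correlated subquery, where the inner query is restricted to the fund on the current row of the outer query. The sketch below is only that, a sketch: it uses a cut-down copy of the two datasets in WORK (the post keeps them in the ATISSUE library), and the alias c and the output name peers are my own choices.

```sas
/* Miniature copies of the post's datasets, placed in WORK for illustration. */
data atissue;
    infile datalines dsd truncover;
    input Fund_name :$14. Year Class :$2. Sales Index_Flag Load :$2. FoF_Flag;
    datalines;
Hamilton,2005,CW,1,1,CW,1
Hamilton,2006,CW,5,0,NK,0
Hamilton,2007,CW,4,0,NK,0
Hamilton,2008,NK,8,1,NK,1
Hairspray,2005,GG,3,1,HC,0
Hairspray,2007,HC,4,0,NK,0
;
run;

data potentialpeers;
    infile datalines dsd truncover;
    input Peer_name :$8. Year Class :$2. Peer_sales Index_Flag Load :$2. FoF_Flag;
    datalines;
Rubella,2005,GG,3,1,CW,1
Rubella,2006,NK,7,0,NK,0
Rubella,2007,NK,9,0,NK,0
Smallpox,2007,NK,4,0,NK,0
;
run;

proc sql;
    create table peers as
    select a.fund_name, b.*
    from atissue a
         inner join potentialpeers b
         on  a.year = b.year
         and a.index_flag = b.index_flag
         and a.load = b.load
         and a.fof_flag = b.fof_flag
    /* Correlated subquery: only classes that appear for the SAME fund
       as the current row of a are accepted. */
    where b.class in (select c.class
                      from atissue c
                      where c.fund_name = a.fund_name);
quit;
```

With this subset, the Hamilton/Rubella/2005/GG row is dropped because Hamilton only ever has classes CW and NK, while the Hamilton rows for 2006 and 2007 stay in.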

christinakwang · ‎01-31-2017

Thanks! I added what I think is the correctly-formatted data in the original post :D. Going to try to update my code... but not sure how to do this without a macro. ...

christinakwang · ‎01-31-2017

Thanks! I've added the data to this post! Digesting the rest of your reply... Can we add more than one file? 😮

christinakwang · ‎01-31-2017

Dataset A:

data ATISSUE;
  infile datalines dsd truncover;
  input Fund_name:$14. Year:BEST. Class:$2. Sales:BEST. Index_Flag:BEST. Load:$2. FoF_Flag:BEST.;
datalines4;
Hamilton,2005,CW,1,1,CW,1
Hamilton,2006,CW,5,0,NK,0
Hamilton,2007,CW,4,0,NK,0
Hamilton,2008,NK,8,1,NK,1
Hamilton,2009,NK,2,0,GG,1
Hamilton,2010,NK,9,0,GG,0
Hairspray,2005,GG,3,1,HC,0
Hairspray,2006,GG,7,1,HC,0
Hairspray,2007,HC,4,0,NK,0
Hairspray,2008,HC,5,0,NK,0
Hairspray,2009,NK,2,0,HC,0
Hairspray,2010,NK,1,0,HC,1
In the Heights,2005,HC,8,0,GG,1
In the Heights,2006,HC,9,0,NK,1
In the Heights,2007,GG,4,1,CW,0
In the Heights,2008,NK,1,1,NK,0
In the Heights,2009,CW,5,1,NK,1
In the Heights,2010,NK,3,0,NK,0
Les Miserables,2005,NK,1,0,NK,1
Les Miserables,2006,NK,4,1,CW,0
Les Miserables,2007,NK,5,0,CW,1
Les Miserables,2008,CW,6,1,GG,1
Les Miserables,2009,CW,1,0,CW,0
Les Miserables,2010,GG,3,1,CW,0
;;;;

Dataset B:

data POTENTIALPEERS;
  infile datalines dsd truncover;
  input Peer_name:$8. Year:BEST. Class:$2. Peer_sales:BEST. Index_Flag:BEST. Load:$2. FoF_Flag:BEST.;
datalines4;
Rubella,2005,GG,3,1,CW,1
Rubella,2006,NK,7,0,NK,0
Rubella,2007,NK,9,0,NK,0
Rubella,2008,CW,2,1,NK,1
Rubella,2009,CW,5,1,GG,1
Rubella,2010,NK,9,0,GG,0
Smallpox,2005,GG,6,0,HC,1
Smallpox,2006,NK,3,0,HC,0
Smallpox,2007,NK,4,0,NK,0
Smallpox,2008,NK,2,0,NK,0
Smallpox,2009,NK,5,0,HC,0
Smallpox,2010,NK,5,1,HC,1
Mumps,2005,NK,7,1,GG,1
Mumps,2006,CW,2,1,NK,1
Mumps,2007,CW,6,0,CW,0
Mumps,2008,NK,9,0,NK,0
Mumps,2009,GG,7,1,NK,1
Mumps,2010,NK,5,0,NK,0
Measles,2005,NK,3,1,NK,1
Measles,2006,HC,2,0,CW,0
Measles,2007,HC,5,1,CW,0
Measles,2008,HC,6,1,GG,1
Measles,2009,HC,2,0,CW,0
Measles,2010,GG,3,1,CW,0
;;;;

For a given Fund_name in Dataset A, I want to output every row from Dataset B that 1) has a "class" value that appears in any row for that fund_name AND 2) has exactly the same year, index_flag, load, and FoF_flag as a row for that fund_name.

The following code lets me do this for each Fund_name:

%MACRO findpeers(fundname);
proc sql;
  create table atissue.&fundname._peers as
  select a.fund_name, b.*
  from atissue.atissue a inner join atissue.potentialpeers b
    on a.year = b.year
    AND a.index_flag = b.index_flag
    AND a.load = b.load
    AND a.fof_flag = b.fof_flag
  where b.class in (select distinct class from atissue.atissue where a.fund_name = "&fundname");
quit;
%MEND findpeers;

%findpeers (Hamilton);
%findpeers (Hairspray);
%findpeers (In the Heights);
%findpeers (Les Miserables);

Questions:

1) The above code seems to work for Hamilton and Hairspray, but why not In the Heights and Les Miserables? Is that because there are spaces? If so, how can I get around that?

2) How can I simplify the last four lines such that, instead of calling %findpeers four times, I can "do over" the unique values in the fund_name column?

3) Do I then have to stack all these datasets from each fund_name in a separate step? I want to stack them vertically.

Thank you!
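On question 1, the spaces are very likely the culprit: atissue.&fundname._peers resolves to something like atissue.In the Heights_peers, which is not a valid dataset name. Below is a rough sketch of one way to handle all three questions together, under several assumptions of mine: the macro name findpeers_all, the stacked output all_peers, the underscore substitution, and the extra a.fund_name filter are not from the original code, and the inputs are cut-down copies in WORK rather than the ATISSUE library.

```sas
/* Miniature copies of the post's datasets, placed in WORK for illustration. */
data atissue;
    infile datalines dsd truncover;
    input Fund_name :$14. Year Class :$2. Sales Index_Flag Load :$2. FoF_Flag;
    datalines;
Hamilton,2006,CW,5,0,NK,0
Hamilton,2008,NK,8,1,NK,1
In the Heights,2009,CW,5,1,NK,1
In the Heights,2010,NK,3,0,NK,0
;
run;

data potentialpeers;
    infile datalines dsd truncover;
    input Peer_name :$8. Year Class :$2. Peer_sales Index_Flag Load :$2. FoF_Flag;
    datalines;
Rubella,2006,NK,7,0,NK,0
Mumps,2009,GG,7,1,NK,1
Mumps,2010,NK,5,0,NK,0
;
run;

%macro findpeers_all;
    %local i n fundlist fundname dsname;

    /* Distinct fund names, pipe-delimited so embedded spaces survive. */
    proc sql noprint;
        select distinct fund_name into :fundlist separated by '|'
        from atissue;
    quit;
    %let n = &sqlobs;

    /* Start from an empty stack so reruns do not append twice. */
    proc datasets lib=work nolist nowarn;
        delete all_peers;
    quit;

    %do i = 1 %to &n;
        %let fundname = %scan(&fundlist,&i,|);
        /* Dataset names cannot contain spaces; swap them for underscores. */
        %let dsname = %sysfunc(translate(&fundname,_,%str( )));

        proc sql;
            create table &dsname._peers as
            select a.fund_name, b.*
            from atissue a
                 inner join potentialpeers b
                 on  a.year = b.year
                 and a.index_flag = b.index_flag
                 and a.load = b.load
                 and a.fof_flag = b.fof_flag
            where a.fund_name = "&fundname"
              and b.class in (select class from atissue
                              where fund_name = "&fundname");
        quit;

        /* Stack this fund's matches under the previous ones. */
        proc append base=all_peers data=&dsname._peers;
        run;
    %end;
%mend findpeers_all;

%findpeers_all
```

The loop produces one dataset per fund (for example In_the_Heights_peers) plus the vertically stacked all_peers, so no separate stacking step is needed afterwards.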

christinakwang · ‎01-31-2017

Thanks! The final code ended up being:

%MACRO findpeers_inclusive(fundname);
proc sql;
  create table atissue.&fundname._peers_inclusive as
  select *
  from atissue.potentialpeers
  where class in (select distinct class from atissue.atissue where fund_name = "&fundname");
quit;
%MEND findpeers_inclusive;

%findpeers_inclusive (Hairspray);
%findpeers_inclusive (Hamilton);

christinakwang · ‎01-31-2017

OK I revised the question. I realized I wasn't representing the question accurately. In case of the new question, not sure if a sql join would work...because it's not exactly matching anymore?

christinakwang · ‎01-31-2017

Please do teach 🙂 (I am such a beginner)


