About Cruise

PaigeMiller · ‎04-07-2020

Use the ENDPOINTS= option of the HISTOGRAM statement.
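A minimal sketch of what that could look like, assuming the HISTOGRAM statement in question belongs to PROC UNIVARIATE. The SASHELP.HEART data set, the WEIGHT variable, and the endpoint list are illustrative choices, not taken from the original thread:

```sas
/* Assumes PROC UNIVARIATE; data set and endpoint values are illustrative only. */
proc univariate data=sashelp.heart noprint;
  /* ENDPOINTS= places the bin boundaries (and their tick labels) at these values */
  histogram weight / endpoints=50 to 325 by 25;
run;
```

With ENDPOINTS=, the axis labels sit at the edges of the bars rather than at their midpoints.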

Cruise · ‎04-07-2020

Thank you for chiming in. I really appreciate it. I agree that the assumption needs to be assessed first. I found this PharmaSUG paper on testing the PH assumption and will follow its suggestions. https://www.pharmasug.org/proceedings/china2018/SP/Pharmasug-China-2018-SP75.pdf

Cruise · ‎04-05-2020

Nice spot that I'm aiming for. But I gotta graduate first at least 🙂 I just completed a 6-month Co-Op at a large pharma and loved it.

Cruise · ‎04-02-2020

GUESSINGROWS=MAX; fixed the problem. Yes, the date variable that broke was missing for about 98% of the data. Thank you very much Reeza. I greatly appreciate it.
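For readers who have not used the statement, here is a hedged sketch of where it goes. It assumes a delimited (CSV) source. The file path and output data set name are hypothetical, and the original import may have used a different DBMS= engine:

```sas
/* Hypothetical path and names; assumes a CSV source. */
proc import datafile="/path/to/visits.csv"
            out=work.visits
            dbms=csv
            replace;
  guessingrows=max;  /* scan all rows before choosing each column's type and length */
run;
```

When a column is mostly missing, a short scan can see only blanks and guess the wrong type, which is why scanning every row helps here.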

Cruise · ‎04-02-2020

quite handy! thanks a lot for the trick.

PGStats · ‎04-01-2020

Using DIF, and taking care of the first row:
data want;
  set have;
  dtest = coalesce(dif(test), test);
  dneg = coalesce(dif(negative), negative);
run;
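To see what the COALESCE fallback does, here is a small self-contained run. The three rows of cumulative counts are made up for illustration; only the variable names come from the post:

```sas
/* Hypothetical cumulative counts; variable names follow the post above. */
data have;
  input test negative;
  cards;
10 8
25 20
45 37
;

data want;
  set have;
  /* DIF returns missing on the first row, so COALESCE falls back to the raw value */
  dtest = coalesce(dif(test), test);
  dneg  = coalesce(dif(negative), negative);
run;

proc print data=want;
run;
```

On this input, DTEST is 10, 15, 20 and DNEG is 8, 12, 17. Without COALESCE, the first row of each would be missing.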

Cruise · ‎03-23-2020

I follow the logic. I appreciate how you took advantage of 'lag' function here. Amazing.

Cruise · ‎03-21-2020

@novinosrin Worked like magic. Thanks much.
data c.concatenated(compress=yes);
  set c.data1995-c.data2013 (keep=gender ethnic race age lensty zip dob source1-source6 ....);
run;

Cruise · ‎03-19-2020

I ended up using Option #2.

ballardw · ‎03-18-2020

Depending on what you need the truncated versions for, you may not even need to change the data. Use a format for display or group creation purposes:
data HAVE;
  input DX01 $ DX02 $ DX03 $ CASE;
  cards;
15701 1576 2007 1
15701 2006 1007 1
10001 1576 1007 1
;
proc print data=have;
  format dx: $3.;
run;
The groups created by using a different format are honored in most of the analysis, reporting, and graphics procedures.
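As a sketch of the grouping behavior described above, the same format can be applied in PROC FREQ. The data step repeats the post's sample data, and the choice of PROC FREQ is an illustration, not something from the original reply:

```sas
data have;
  input DX01 $ DX02 $ DX03 $ CASE;
  cards;
15701 1576 2007 1
15701 2006 1007 1
10001 1576 1007 1
;

proc freq data=have;
  tables dx01;
  format dx01 $3.;  /* rows are grouped by the formatted, 3-character value */
run;
```

Here DX01 is tabulated as two groups, 100 (one row) and 157 (two rows), while the stored values stay untouched.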

Cruise · ‎03-18-2020

Thanks a lot. I ended up using the IN operator.

Reeza · ‎02-23-2020

Close, tertiles or turtles if you get caught by autocorrect.

novinosrin · ‎01-17-2020

Hi @Cruise
SQL is indeed a great, ready-made, and super convenient set of clauses. It provides fantastic utility for summary statistics, makes it easy to push a query to a third-party database, and much more. This kind of compatibility is pretty much unparalleled: a SQL programmer who does not know SAS can use it to its full extent and be done with the task. So yes, you are right to ask why SQL would not be the robust choice given all these advantages, and I have probably left out many others. However, your case is specific enough that certain constraints need addressing.
1. Your sample (assuming it is representative of your real data) appears to be a clean dataset sorted by ID, with the observations within each ID presumably in sequence, since you said you prefer the earliest (first occurrence) in case of a tie.
2. Point 1 means a row number has to be computed to identify the earliest observation. This can be done with MONOTONIC(), which is undocumented, which some argue is error prone, and which in any case costs an extra pass over the dataset, as you noticed in the sub-query.
3. Though seemingly simple and concise, the query made 3 passes over the dataset: the 1st to compute row_num, the 2nd to determine min(diff), and the 3rd to choose the minimum row_num among the rows with min(diff).
4. It is always better to think through a data step solution the moment you are certain that your dataset is sorted and you know how your data is arranged. The GROUP BY clause does a lot of work: it performs an internal sort, then computes any summary statistics involved, and so forth. One caveat to keep in mind: in some operating environments the SQL optimizer does some magic and beats data step solutions by building a shortcut plan internally. This topic is complex and need not confuse us at this point.
5. The data step offers the user better control, and it is actually easier and more intuitive than SQL. Some would argue against me here, but trust me, it takes longer to get the hang of SQL's internal mechanisms than of the clearly defined sequential execution of the data step, one observation at a time. In SQL you can go from row processing to column processing and back to row processing, and beyond, all in one SELECT clause. I find this crazy, but I believe I have become very thorough with it 🙂
6. So if, instead of needing a preliminary row number partitioned by group, your data had a DATE variable, and if it were an unsorted dataset, and if your operating environment were highly conducive to the SQL optimizer picking the shortest internal algorithm, my oh my! You might just win.
Hope this helps!
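The data step approach argued for above could look roughly like the following. This is only a sketch: the data set HAVE, the variables ID and DIFF, and the sample rows are assumptions, since the original thread's data is not shown here. It uses a double DO UNTIL loop, so each BY group is read twice, but no sort and no row number are needed.

```sas
/* Hypothetical data: HAVE is sorted by ID; DIFF is the value to minimize. */
data have;
  input id diff;
  cards;
1 5
1 2
1 2
2 7
2 9
;

data want;
  /* First loop: find the minimum DIFF within the current ID */
  do until (last.id);
    set have;
    by id;
    min_diff = min(min_diff, diff);
  end;
  done = 0;
  /* Second loop: re-read the same ID and output the first row that matches */
  do until (last.id);
    set have;
    by id;
    if not done and diff = min_diff then do;
      output;
      done = 1;
    end;
  end;
  drop min_diff done;
run;
```

Because MIN_DIFF is not read from HAVE, it is reset to missing at the start of each data step iteration, so each ID starts fresh. On a tie (the two rows with DIFF=2 for ID 1), only the earlier row is written.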

novinosrin · ‎01-16-2020

Hi again @Cruise
First off, my sincere apologies for overlooking a minor piece of logic, though that's no excuse. Try the modified version below:
data temp;
  input ctc_id date_contact;
  cards;
1 1
1 2
2 1
2 1
3 1
3 2
;
proc sql;
  create table want as
  select count(ctc_id) as count label='num of indivs with multiple'
  from (select distinct ctc_id
        from temp
        group by ctc_id
        having count(distinct date_contact)>1);
quit;

Cruise · ‎01-11-2020

The suggested change solved the problem. I would not have thought of this modification had you not pointed it out. I greatly appreciate your help. Thanks a lot!

proc sgplot - how to show / force fixed values on x-axis?

PROC SGPLOT how to get more diverse colors

Re: Proc sgplot how to achieve specific order for labels in keylegend?

Re: Swimmer's plot, how to show dose level and text inside the bars

Proc sgplot how to achieve specific order for labels in keylegend?

Swimmer's plot, how to show dose level and text inside the bars

Re: Data merge by multiple variables keeping distinct levels of both d...

Re: PROC SGPLOT how to get more diverse colors

Split mixture of strings separated by multiple different delimiters

Re: Compute IQR and STD per record to proc gmap

Re-organize table using proc tabulate or report or transpose?

Re: Place the value labels at the border of each bar in a histogram

Re: Cox proportional HM - missing data

Re: How to interleave multiple data sets one-way aggregated by categor...

Re: Proc import / two date variables with the same format in an excel ...

Re: Converting YYMMDD10. character date to numeric date

Re: How to reverse the cumulative sum across the rows?

Re: Assign Identifier accounting the end of the series of rows by the ...

Re: Concatenate datasets efficiently

Re: Assign unique identifier by multiple variables

Re: Substring multiple variables with same prefix

Re: Look up in array to identify a specific value

Re: How many observations within each percentile / proc means

Re: Select row with min value otherwise select the first row when rows...

Re: Count the number of individuals with multiple observations

Re: Proc tabulate, adding across variable to the table and maintain th...