About sbxkoenk

sbxkoenk · ‎03-27-2021

Hello,

This is one way to do it:

data abc;
  Length Id $ 1 Z1 $ 1 Z2 $ 1;
  input Id $ Z1 $ Z2 $;
  cards;
1 A 1
1 B 1
1 A 2
1 D 4
2 S 3
2 A 1
3 A 1
;
run;

data def(drop=accumZ:);
  Length accumZ1 accumZ2 $ 100;
  set abc;
  retain accumZ1 '' accumZ2 '';
  by Id;
  if first.Id then do;
    accumZ1=''; accumZ2='';
    Accum_distinct_Z1=0;
    Accum_distinct_Z2=0;
  end;
  if indexc(accumZ1,Z1)=0 then Accum_distinct_Z1+1;
  if indexc(accumZ2,Z2)=0 then Accum_distinct_Z2+1;
  accumZ1 = strip(accumZ1)!!strip(Z1);
  accumZ2 = strip(accumZ2)!!strip(Z2);
run;
/* end of program */

Cheers,
Koen

sbxkoenk · ‎03-27-2021

On top of the suggestion by @Ksharp ...

For more extensive / detailed info about inserting a substring into a SAS string, see:

Inserting a substring into a SAS string
By Leonid Batkhan on SAS Users, February 15, 2021

Have a nice day,
Koen
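As a minimal sketch of the general idea (this is not taken from the blog post, and the variable names and the insert position are assumptions for illustration), a character can be inserted into a string by concatenating the part before the insert position, the new substring, and the remainder:

```sas
data _null_;
  length old $ 5 new $ 6;
  old = 'AB123';
  /* Hypothetical rule: insert '0' after the second character. */
  /* CATS concatenates its arguments and strips leading and trailing blanks. */
  new = cats(substr(old, 1, 2), '0', substr(old, 3));
  put old= new=;
run;
```

The new variable needs a length that leaves room for the inserted characters, otherwise the result is truncated.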

sbxkoenk · ‎03-27-2021

Hello, The Type of the RETAIN statement is "Declarative". A RETAIN statement is a declarative statement which is used in building the Program Data Vector (PDV) during the compilation phase of the DATA step. In your example I would put it right before or right after the SET statement. The difference is in the order of the variables in the output dataset. See this usage note for more insight: Usage Note 8395: How to reorder the variables in a SAS® data set https://support.sas.com/kb/8/395.html Have a nice day, Koen
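A small sketch of that ordering effect, using a made-up dataset work.have (the dataset and variable names are assumptions, not from the original thread):

```sas
/* Hypothetical input: variables are stored in the order a b c */
data work.have;
  a = 1; b = 2; c = 3;
run;

/* RETAIN before SET: c and a enter the PDV first, so the output order is c a b */
data work.reordered;
  retain c a;
  set work.have;
run;

/* RETAIN after SET: the SET statement has already fixed the order as a b c */
data work.unchanged;
  set work.have;
  retain c a;
run;
```

Running PROC CONTENTS with the VARNUM option on both outputs should show the difference in variable position.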

sbxkoenk · ‎03-27-2021

Hello, As always there are multiple solutions to this, but why don't you use an ID statement? The variables listed in the ID statement are displayed beside each observation. These variables can be used to identify each observation. Koen

sbxkoenk · ‎03-26-2021

Hello @Schtroumpfette ,

No idea if you have meanwhile found out how to proceed, but here's an example of a Hash Object Table Look-up for the case where you are confronted with a Cartesian product!

Suppose you have a look-up table 'LOOKUP_TABLE' of 20 000 records. For every observation in the dataset 'HAVE', you want to scan each and every observation in the lookup table for a possible match. You have no key-variable (no by-variable) to merge on.

Here's the classical (non-SQL) way of doing this kind of Cartesian product. It takes time!

data work.wanted;
  set work.have;
  do pointer = 1 to 20000;
    set work.lookup_table point=pointer;
    if whatever_condition_is_TRUE then output;
  end;
run;

Here's how to do the same with a hash table look-up. Much faster! Especially for BIG datasets.

/* See: https://support.sas.com/resources/papers/proceedings/proceedings/forum2007/271-2007.pdf */
/* See: https://support.sas.com/resources/papers/proceedings16/10200-2016.pdf */
data work.wanted(drop=rc);
  /* never executed; only gives var_1 and var_2 the right type and length in the PDV */
  if 0 then set work.lookup_table;
  if _N_=1 then do;
    declare hash h(dataset: "work.lookup_table", ordered: "A", multidata: "Y");
    h.definekey ("key");
    h.definedata ('var_1','var_2');
    h.definedone();
    call missing(var_1, var_2);
  end;
  set work.have;
  /* loop over all hash entries that share the key value */
  do rc = h.find() by 0 while (rc = 0);
    if whatever_condition_is_TRUE then output;
    rc = h.find_next();
  end;
run;

You don't have a key variable, but the hash table requires a key-variable. Therefore, make a key variable in both of your datasets and give the key a constant value for every observation, something like: key='k';
Joining on such a key is the ultimate nxn merge (Cartesian product).

Make sure your datasets 'have' and 'lookup_table' have no variables with the same name (apart from the key). Do a (rename=()) if needed.

I hope you can translate this to your own concrete situation.

Cheers,
Koen

sbxkoenk · ‎03-24-2021

Hello,

To build upon the reply of @STAT_Kathleen ... if your concern is heteroskedasticity and you want to correct for it, I have just come across this interesting (recent) PROC PANEL usage note:

Usage Note 67322: Heteroscedasticity and cluster correction of standard errors using the PANEL procedure

Cheers,
Koen

sbxkoenk · ‎03-24-2021

Hello @Schtroumpfette , I would like to provide you with a quick code-example, but don't have time right now. I will check the thread of this discussion again in the coming days and if you haven't sorted it out yet (or nobody else has provided you with an example), I will surely post some hash table lookup code for your specific use case where there is NO key-variable to merge on. Meanwhile you can try to sort it out yourself. Here's an introduction on 'Using the Hash Object'. Using the Hash Object https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=lrcon&docsetTarget=n1b4cbtmb049xtn1vh9x4waiioz4.htm&locale=en Look at Example 2: Loading a Data Set and Using the FIND Method to Retrieve Data. It's not entirely what you are looking for as there is a key variable (by-variable) involved, but you will get ideas. You can also search all boards within these communities.sas.com with keywords like: hash object table lookup SAS. You will probably find numerous examples and good papers on the subject of hash table lookup. Cheers, Koen
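To give a rough idea of what a keyed look-up with the FIND method looks like, here is a minimal sketch. It is not the documentation example itself; the datasets work.lookup_table and work.have and all variable names are made up for illustration:

```sas
/* Hypothetical look-up table */
data work.lookup_table;
  input key $ var_1 $ var_2;
  datalines;
A apple 10
B pear 20
;
run;

/* Hypothetical driver table; key C has no match */
data work.have;
  input key $ amount;
  datalines;
A 1
C 2
B 3
;
run;

data work.wanted(drop=rc);
  /* never executed; only defines var_1 and var_2 in the PDV */
  if 0 then set work.lookup_table;
  if _N_ = 1 then do;
    declare hash h(dataset: "work.lookup_table");
    h.definekey("key");
    h.definedata("var_1", "var_2");
    h.definedone();
  end;
  set work.have;
  rc = h.find();   /* rc = 0 when the key is found */
  /* clear values left over from an earlier match */
  if rc ne 0 then call missing(var_1, var_2);
run;
```

Because var_1 and var_2 come from a SET statement they are retained across iterations, which is why the sketch resets them when no match is found.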

sbxkoenk · ‎03-24-2021

Hello, Use the 'Insert SAS Code' icon when copy/pasting code in your post. 😉 Your code won't lose structure and formatting. Further on, ... I haven't interpreted your code fully (it's too long!) but it seems to me you need a kind of Cartesian product (for every obs in dataset 1, scan all obs in dataset 2 for a match). There exist multiple ways to do this but by far the fastest solution is to do a Hash Object Table Lookup. You can probably cut your processing time by 80 or 90% by using a hash table (in-memory lookup). Hash tables are a great and easy to use tool introduced in SAS 9. Cheers, Koen

sbxkoenk · ‎03-24-2021

Sorry, you still have to build upon the above program to include the lines with Description='not found'. I hadn't read the 'assignment' thoroughly enough. Koen

sbxkoenk · ‎03-24-2021

Hello,

I can imagine less greedy solutions exist, but this should work.

data _NULL_;
  if 0 then set one nobs=count;
  call symput('numobs',strip(put(count,8.)));
  STOP;
run;
%PUT &=numobs;

data three;
  set two;
  do pointer=1 to &numobs.;
    set one point=pointer;
    if find(Description,key1,'it')^=0 AND find(Description,key2,'it')^=0 then output;
  end;
run;

Cheers,
Koen

sbxkoenk · ‎03-23-2021

Hello,

The post of @PaigeMiller has given you the right response. I've also seen that you liked it. It's even better to mark it as a solution!

With my post I just want to make you aware that there exists a procedure in SAS/ETS that directly supports your [t-31] logic. It's PROC TIMEDATA. Have a look at the program below.

PROC TIMEDATA data=sashelp.citiday out=_NULL_ /*PRINT=(ARRAYS)*/ OUTARRAY=wanted1;
  outarrays dtbd3m_31 my_dtbd3m;
  /*by by_var*/;
  id date interval=day accumulate=total format=date9.;
  var dtbd3m;
  do t = 32 to dim(dtbd3m);
    dtbd3m_31[t] = dtbd3m[t-31];
    if dtbd3m_31[t]>6 then my_dtbd3m[t]=dtbd3m[t];
    else my_dtbd3m[t]=-99999;
  end;
run;
QUIT;

data wanted1;
  set wanted1;
  if _N_<32 then delete;
  if my_dtbd3m=-99999 then delete;
run;

PROC SQL noprint;
  create table wanted2 as
  select * from wanted1
  where date IN (select date from sashelp.citiday);
QUIT;

I get the same 411 rows in return as @PaigeMiller of course.

Cheers,
Koen

sbxkoenk · ‎03-23-2021

Just FYI, as I see the question has already been solved. Something like this works as well:

data combined;
  set ODSHMS.TABLE_202101-ODSHMS.TABLE_202112 indsname=dsn;
  datestamp=scan(dsn,2,'_');
run;

It's called "Using Data Set Lists with SET", but it's obviously less generic than the previous two solutions.

Koen
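A related variant is the name prefix list (colon notation), which picks up every dataset whose name starts with the given prefix. The sketch below uses two hypothetical monthly tables in WORK instead of the ODSHMS library, so it can be run as is:

```sas
/* Hypothetical monthly tables */
data work.table_202101; x = 1; run;
data work.table_202102; x = 2; run;

data work.combined;
  length datestamp $ 6;
  /* name prefix list: reads every WORK dataset whose name starts with TABLE_2021 */
  set work.table_2021: indsname=dsn;
  /* dsn holds e.g. WORK.TABLE_202101; the part after the underscore is the date */
  datestamp = scan(dsn, 2, '_');
run;
```

The prefix form does not need the first and last member to be known, but it also reads any other dataset that happens to share the prefix.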

sbxkoenk · ‎03-23-2021

Hello,

While I cannot help with the error message (I don't grasp why you would have 'invalid characters'), I can suggest that you read these articles. I'm sure they will interest you, given your original question:

Paper 3018-2019 (SAS Global Forum 2019)
Predicting Inside the Dead Zone of Complete Separation in Logistic Regression
Robert Derr, SAS Institute Inc., Cary, NC
https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2019/3018-2019.pdf

Odds ratio plots with a logarithmic scale in SAS
By Rick Wicklin on The DO Loop, July 29, 2015
https://blogs.sas.com/content/iml/2015/07/29/or-plots-log-scale.html

Cheers,
Koen

sbxkoenk · ‎03-23-2021

Hello @ahschnell , Could you solve your "problem" with the suggestion of @PaigeMiller ? I haven't seen any reaction from your side. If your question is still alive, you may consider the usage of PROC NLIN (SAS/Stat) for your segmented regression and accompanying hypothesis test. See: Segmented regression models in SAS By Rick Wicklin on The DO Loop December 14, 2020 https://blogs.sas.com/content/iml/2020/12/14/segmented-regression-sas.html and The NLIN Procedure Example 87.1 Segmented Model https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=statug&docsetTarget=statug_nlin_examples01.htm&locale=en and The NLIN Procedure Example 87.5 Comparing Nonlinear Trends among Groups https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=statug&docsetTarget=statug_nlin_examples05.htm&locale=en Good luck, Koen
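For orientation only, here is a minimal sketch of a segmented (linear-plateau) model in PROC NLIN on simulated data. The dataset, the parameter names b0, b1 and x0, and the starting values are assumptions for illustration; this is not the code from the linked blog post or documentation examples:

```sas
/* Simulated data: slope 2 up to x = 10, flat afterwards */
data work.sim;
  call streaminit(1);
  do x = 1 to 20;
    y = 1 + 2*min(x, 10) + rand('normal', 0, 0.5);
    output;
  end;
run;

proc nlin data=work.sim;
  /* starting values are guesses; x0 is the unknown breakpoint */
  parms b0=0 b1=1 x0=8;
  if x < x0 then mean = b0 + b1*x;
  else           mean = b0 + b1*x0;
  model y = mean;
run;
```

Convergence of such models can depend strongly on the starting values, so it is worth plotting the data first to pick a plausible breakpoint.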

sbxkoenk · ‎03-23-2021

Hello,

The code below may not be entirely correct as I have to type it "blindly" (i.e. without being able to run it and see the LOG). But you will surely get the idea.

PROC SQL noprint;
  create table work.ds_to_append as
  select memname
  from dictionary.tables
  where libname='ODSHMS' and substr(memname,1,8) = 'TABLE_20';
QUIT;

data _NULL_;
  if 0 then set work.ds_to_append nobs=count;
  call symput('numobs',strip(put(count,8.)));
  STOP;
run;
%PUT &=numobs;

%MACRO append_loop;
  %LOCAL i;
  PROC DATASETS library=WORK NoList;
    delete appendresult / memtype=DATA;
  run;
  QUIT;
  %DO i = 1 %TO &numobs.;
    data _NULL_;
      set work.ds_to_append(firstobs=&i. obs=&i.);
      call symput('memname' , strip(memname) );
      call symput('date_to_add',substr(strip(memname),7));
    run;
    data ODSHMS.&memname.;
      set ODSHMS.&memname.;
      datestamp=&date_to_add.;
    run;
    PROC APPEND base=work.appendresult data=ODSHMS.&memname.;
    run;
  %END;
  QUIT;
%MEND append_loop;

options mprint symbolgen;
%append_loop
/* end of program */

Cheers,
Koen

Re: PROC COMPARE label attribute is missing

Re: PROC COMPARE label attribute is missing

Re: How to Calculate the Standardized Mean Difference between two trea...

Re: SAS Heat Map

Re: SAS Heat Map

Re: NLMIXED: Weibull survival model with longitudinal covariate

Re: ods pdf and gmap: PDF output different than EG

Re: SGPANEL plot is too narrow

Re: Visualizing associations between categorical variables

Re: proc expand functionality

Re: PROC COMPARE label attribute is missing

Re: SAS Heat Map

Re: Burr distribution CDF

Re: SAS Heat Map

From Napkin Sketch to Automated Agents: Assessing Climate Risks with L...

Re: proc expand functionality

Re: Visualizing associations between categorical variables

Re: Bai-Perron test

Re: How to make the symbol outline/border thinner in %sgrectangle

Re: how to get the last day of the month using %sysfunc and intnx func...

How to use the CAS-tables, that Model Studio creates, elsewhere (in ot...

MidPoint: the half-way point along a great circle path between two poi...

Kalman Filtering and Kalman Smoothing in PROC UCM

Re: Count Accumulate number of unique values-Data step

Re: add a 0 within the character

Re: calculate accumulate value of X-location of Retain

Re: Absolute Value Difference

Re: Macro is taking a long time to run - Risk set sampling

Re: Proc Panel with regression weight

Re: Macro is taking a long time to run - Risk set sampling

Re: Macro is taking a long time to run - Risk set sampling

Re: How to find strings that contain values from other variables?

Re: How to find strings that contain values from other variables?

Re: Problem to Organize a Table

Re: Automatically select and append tables from the library

Re: Extrem odd ratio with firth logistic regression

Re: Test difference in slopes

Re: Automatically select and append tables from the library