About Dess

Dess · ‎11-14-2014

Thank you. This worked like a charm straight away, not even a typo.

Dess · ‎11-10-2014

I have a dataset dsEvents which contains multiple variables/attributes of the same value domain. I would like to change these values to arbitrary ranking numbers. Consider the dataset:

    data dsEvents;
      input ID $ e1 $ e2 $ e3 $ e4 $ e5 $ e6 $ e7 $ e8 $ e9 $ e10 $ e11 $;
      cards;
    077123 A A4 B I1 L L U8 A E2 A3 W5
    622941 B B B2 L L5 C9 E1 A L Q D9
    2452 E E B A4 B2 O C L Q D9 W2
    ;
    run;

I want to swap these values to numbers according to some arbitrary ranking system:

    A = -1
    A4 = 0
    L = 0
    W5 = 22
    B = 2
    E = 3
    ...etc

Yielding as a result (later attributes removed for brevity):

    data dsRanking;
      input ID $ e1 e2 ...;
      cards;
    077123 -1 0 ...
    622941 2 2 ...
    2452 3 3 ...
    ;
    run;

My current understanding is that I would need a regexp prxchange(s///g) for every variable and for every possible value. That is a lot of calls, so I ask for your assistance in finding a better way. Keep in mind that every value can appear in every variable/attribute. There are about 16 possible values, and up to 50 variables/attributes. Any idea how to solve this?
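One possible approach, sketched here under assumptions rather than taken from the thread: define the ranking once as a user-defined informat and apply it to every variable through an array. The informat name `evrank` and the new variables `r1-r11` are hypothetical, the sketch assumes `e1-e11` are character variables, and only the six mappings listed in the post are filled in; the remaining values would have to be added by hand.

```sas
/* Hypothetical informat holding the ranking; only the mappings */
/* given in the post are listed, everything else maps to missing. */
proc format;
  invalue evrank
    'A'  = -1
    'A4' = 0
    'L'  = 0
    'W5' = 22
    'B'  = 2
    'E'  = 3
    other = .;
run;

data dsRanking;
  set dsEvents;
  array ev {*} e1-e11;   /* existing character variables        */
  array rk {*} r1-r11;   /* new numeric variables for the ranks */
  do i = 1 to dim(ev);
    rk{i} = input(ev{i}, evrank.);
  end;
  drop i e1-e11;
  rename r1-r11 = e1-e11; /* reuse the original names for the ranks */
run;
```

Because a character variable cannot become numeric in place, the sketch writes the ranks to new variables and renames them afterwards. Scaling to 50 variables only means widening the two array ranges.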

Dess · ‎10-27-2014

This worked right off the bat. Excellent and elegant answer. Thank you for the help.

Dess · ‎10-27-2014

This was a great way to do it I think. There was however an issue with counting, as expanding on your solution caused some errors regarding 'conversion from integer to character', though it is probable that this was due to my understanding rather than any irregularity in the procedure. Thank you for the great answer.

Dess · ‎10-27-2014

Thank you very much. Unfortunately, the procedure you listed did not just count the individual frequencies for each observation; it also carried the counts over and accumulated them in each subsequent observation. It did however prove very helpful in showing me how it could be done. So thank you.

Dess · ‎10-23-2014

Hi, I am in need of some assistance here. I am trying to create a dataset with frequencies, as mapped over several variables/attributes. My 'have' set is as follows:

    data have;
      input ID a $ b $ c $ d $ e $ f $ END $;
      cards;
    1 X X X Y X Z K
    922 X Y Y Z Y X K
    33 W Z Y Y X X K
    12 X X W W X Y F
    ;
    run;

The value domain of each variable/attribute a .. f is the same. From this dataset 'have' I wish to create a new set where each variable is the frequency of that value in the corresponding observation of 'have'. For instance, using the above dataset 'have', the resultant 'want' is:

    data want;
      input ID X Y Z W END $;
      cards;
    1 4 1 1 0 K
    922 2 3 1 0 K
    33 2 2 1 1 K
    12 3 1 0 2 F
    ;
    run;

Can you give me some suggestions on how to accomplish this feat?
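A minimal sketch of one way to get per-observation counts, assuming `a` through `f` are character variables and the value domain is exactly X, Y, Z and W as in the example. It is not necessarily the answer given in the thread. The counters are reset to zero at the top of every observation, so nothing carries over from one row to the next.

```sas
data want;
  set have;
  array vars {*} a b c d e f;   /* the variables to scan          */
  X = 0; Y = 0; Z = 0; W = 0;   /* reset counters for this row    */
  do i = 1 to dim(vars);
    select (vars{i});
      when ('X') X = X + 1;
      when ('Y') Y = Y + 1;
      when ('Z') Z = Z + 1;
      when ('W') W = W + 1;
      otherwise;                /* ignore any other value         */
    end;
  end;
  keep ID X Y Z W END;
run;
```

Writing `X = X + 1` instead of the sum statement `X + 1` avoids the implicit retain that a sum statement brings with it, which is one common way counts end up accumulating across observations.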

Dess · ‎10-10-2014

Now it's working flawlessly. Thank you so much for the assistance. Hopefully I have learnt something in the process as well.

Dess · ‎10-10-2014

Awesome, now it's almost working. I got the RESULT column to keep track of everything no more than 6 years from b. Now I just need the timeframe to slide with each new entry (that is, if e is within range of b, make e + 6 years the new cutoff point).

    proc sql;
      create table WANT as
      select *,
             MDS.MIN_DATE as MIN_DATE format=DATETIME20.,
             MDS.MAX_DATE as MAX_DATE format=DATETIME20.,
             case
               when ExamsE between CALCULATED MIN_DATE
                    and intnx("DTYEAR", CALCULATED MIN_DATE, 6) then "Y"
               else "N"
             end as RESULT
      from dsHistory
      left join
           (select distinct dsHistory.ItemsX,
                   min(ExamsE) as MIN_DATE,
                   max(ExamsE) as MAX_DATE
            from dsHistory
            group by dsHistory.ItemsX) MDS
        on dsHistory.ItemsX = MDS.ItemsX;
    quit;

Do you have suggestions on how to add in the moving timeframe? Thank you in advance

Dess · ‎10-09-2014

I'm afraid I do not understand where DS comes from. Is it a new set? Is it a copy of dsHistory? Thank you.

Dess · ‎10-08-2014

The algorithm should go by 6 years, not 6 months. Hmm, it will take me some time to read up on and hopefully understand the functions you are using. Thank you

Dess · ‎10-08-2014

Heluuu, I am trying to find a window of interest in my time data. For an item x I have an interval [a, b], and inside this interval at various points I have examinations e. Each examination comes with a date d. My problem is that in some cases the interval [a, b] is large, and there is a significant gap between the last examination e_n and b. I would like to scratch all examinations that occur prior to this gap. I have the following relevant points in a table:

    DATA dsHistory;
      INPUT ItemsX $ ExamsE :ddmmyy8. EndpointB :ddmmyy8.;
      FORMAT ExamsE EndpointB ddmmyy8.;
      DATALINES;
    0001 02/03/95 04/08/13
    0001 07/11/98 04/08/13
    0001 09/09/02 04/08/13
    0001 01/06/06 04/08/13
    0205 01/05/87 02/11/10
    0205 14/03/01 02/11/10
    0205 22/02/04 02/11/10
    0205 17/08/08 02/11/10
    0205 11/08/09 02/11/10
    ;
    RUN;

Using this example, I would like x_0001 to disappear (as there is no e_i close to b_0001), and I would like the first observation of x_0205 to disappear as well (there is a significant gap between the first observation and the others). In short, my proposed algorithm looks like:

    foreach unique subset ItemX x in X
        t_in_sequence = b(x)
        foreach ExamsE e in x
            if t_in_sequence - e > 6 years
                Remove observation corresponding to e(x) from set;
            else
                t_in_sequence = e;

Now, my problem is that I don't know how to do this other than splitting the dataset into several thousand subsets, and reiterating through them with a new t_in_sequence every time. How can I go through each x in place? How can I update t_in_sequence while going through x? Does anyone have suggestions on how to do this? Thank you for reading
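The proposed algorithm above can be sketched with BY-group processing and a retained variable, which avoids splitting the dataset. This is only a sketch under assumptions, not the solution given in the thread: it assumes ExamsE and EndpointB are stored as SAS date values (for datetime values the interval would be 'dtyear' rather than 'year'), and the output names `sorted` and `want` are hypothetical. Examinations are visited from latest to earliest within each item so that the cutoff can slide backwards.

```sas
/* Latest examination first within each item */
proc sort data=dsHistory out=sorted;
  by ItemsX descending ExamsE;
run;

data want;
  set sorted;
  by ItemsX;
  retain t_in_sequence;                 /* survives across observations */
  if first.ItemsX then t_in_sequence = EndpointB;
  /* More than 6 years before the current reference point: drop it. */
  /* DELETE stops processing here, so t_in_sequence is unchanged.   */
  if ExamsE < intnx('year', t_in_sequence, -6, 'same') then delete;
  t_in_sequence = ExamsE;               /* slide the window back        */
  drop t_in_sequence;
run;
```

Once one examination falls outside the window, every earlier one for that item is also older than the unchanged reference point and is removed as well, which matches the intent of scratching everything prior to the gap.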

Dess · ‎08-29-2014

Tom wrote:

    proc sort data=dsOriginalSet;
      by pkey xpot;
    run;

    proc transpose data=dsOriginalSet out=want prefix=diag;
      by pkey;
      var diag;
    run;

Tried this, and it works great. Thank you. For some reason I got two additional variables, _NAME_ and _LABEL_, filled on every row with DIAG, but that's not a problem, only a curiosity. Thank you all for the great answers!
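On the two extra variables: PROC TRANSPOSE writes _NAME_ (the name of the transposed variable) and, when that variable has a label, _LABEL_ to its output. If they are unwanted, one option is a DROP= data set option on the output, as in this small variation of the quoted code. It assumes both variables are present, as reported above.

```sas
proc sort data=dsOriginalSet;
  by pkey xpot;
run;

/* Same transpose as quoted, minus the bookkeeping variables */
proc transpose data=dsOriginalSet
               out=want(drop=_name_ _label_)
               prefix=diag;
  by pkey;
  var diag;
run;
```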

Dess · ‎08-29-2014

Hi, I could really use some help transforming my sets. Consider the following:

    data dsOriginalSet;
      input pKey diag $ pid xpot xdist;
      cards;
    1998 G03 17752 080707 5
    2727 E003M 185355 040188 9
    1124 G03 85174 110101 3
    1998 G03 65510 030912 5
    2727 G03 319013 011096 8
    4493 G02 881244 050709 3
    ;

The primary key here is pid, with some data being derivative. From this set, I would like to construct a sequence of diag, using pKey as the primary key, where each occurrence of diag is sorted by date (as provided by xpot). The result from the above set should be:

    data dsTransformed;
      input pKey diag1 diag2 (diag3 ...);
      cards;
    1998 G03 G03
    2727 E003M G03
    1124 G03
    4493 G02
    ;

The number of diag variables is arbitrary, but will not exceed 10. Do you have any suggestions on how to accomplish this? Thank you in advance

Online Status	Offline
Date Last Visited	‎09-01-2015 07:12 AM

Re: Altering var values across the entire dataset

Altering var values across the entire dataset

Re: How to create a frequency table from multiple variables

Re: How to create a frequency table from multiple variables

Re: How to create a frequency table from multiple variables

How to create a frequency table from multiple variables

Re: How to build an in-place sliding window sequence

Re: How to build an in-place sliding window sequence

Re: How to build an in-place sliding window sequence

Re: How to build an in-place sliding window sequence

How to build an in-place sliding window sequence

Re: Dataset transformation and horizontal appends

Dataset transformation and horizontal appends