About soumri

soumri · ‎11-28-2019

@DarthPathos Yes Chris, it works very well. Thank you for this valuable collaboration.

soumri · ‎11-18-2019

Hi DarthPathos, nice to hear from you. Would it be possible to send you my dataset privately and discuss exactly what I need, with the variables exactly as they appear? We could then publish the solution for the whole SAS community using just the example I've sent.

soumri · ‎11-15-2019

Thank you, DarthPathos. I could send you my dataset with the real variables if tests are needed. A colleague told me that I have to use ARRAYs and DO loops (because it sounds like a loop, maybe), but I do not know. I remain at your disposal for more details.

soumri · ‎11-15-2019

Thank you for paying attention to my request. I see that the first part of the code (just a counter, which I have already tried) should be replaced by this one:

PROC SQL;
  create table id_count as
  select ID, Rank, count(test_variable) as Cnt
  from table_name
  group by ID, Rank
  order by ID, Rank;
QUIT;

The second and third conditions must not be separated: they have to be applied together. Treating them separately leads to false results. Moreover, the code for the second condition generates an error, and I do not see where the problem is:

16558  select distinct group, ID, year into Temp_Table
                                            ----------
                                            79
ERROR 79-322: Expecting a :.

Thanks again!
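A note on the error in the post above: in PROC SQL, the INTO clause stores query results in macro variables, so the parser expects a colon-prefixed macro variable name after INTO. Writing a result set to a table is done with CREATE TABLE ... AS instead. The sketch below is only an illustration under assumptions: the table name `table_name` and the target `Temp_Table` are taken from the post, the three sample rows are copied from the later example data, and the macro variable `n_ids` is a hypothetical name.

```sas
/* Tiny stand-in for the poster's table (rows copied from the sample data) */
data table_name;
  input GROUP :$4. ID :$7. YEAR;
  datalines;
LH84 B000001 2011
LH84 B000001 2012
LH84 B000002 2012
;
run;

proc sql noprint;
  /* To create a table, use CREATE TABLE ... AS rather than INTO */
  create table Temp_Table as
    select distinct t.GROUP, t.ID, t.YEAR
    from table_name as t;

  /* INTO is for macro variables and needs the leading colon */
  select count(distinct ID) into :n_ids
    from table_name;
quit;

%put Number of distinct IDs: &n_ids;
```

Qualifying the columns with the alias `t` is a defensive choice here, since GROUP is also an SQL keyword.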

soumri · ‎11-15-2019

Hi,

It's a somewhat complicated question I'm going to ask here, but many of my analyses depend heavily on it, so your help would be very valuable to me. I have a database with more than 700,000 observations that looks like this:

GROUP  ID       RANK  TD         YEAR  SEASON  GYS        Value
LH84   B000001  1     28MAR2011  2011  1       LH8420111  11.2
LH84   B000001  1     30MAY2011  2011  2       LH8420111  9.6
LH84   B000001  1     27JUN2011  2011  3       LH8420113  7.8
LH84   B000001  1     01AUG2011  2011  3       LH8420113  7.2
LH84   B000001  2     09FEB2012  2012  1       LH8420121  19.3
LH84   B000001  2     10APR2012  2012  2       LH8420122  20.6
LH84   B000002  1     10APR2012  2012  2       LH8420122  9.4
LH84   B000002  1     05JUN2012  2012  3       LH8420123  10.9
LH84   B000002  1     14AUG2012  2012  3       LH8420123  8.7
KC01   B000013  4     18JUN2000  2000  3       KC0120003  9.6
KC01   B000013  4     14AUG2000  2000  3       KC0120003  9.2
KC01   B000013  4     14OCT2000  2000  4       KC0120004  7.2
etc...

With:
- TD is a test date.
- YEAR is the year of TD (extracted from the TD variable).
- SEASON is the season of TD (the month of the test date was used to create the season variable).
- GYS is a composite variable, created from the GROUP, YEAR and SEASON variables.

* In my raw database there are almost 1100 different GROUPs, more than 45,000 different IDs, 5 different RANKs, 15 YEARs (from 2000 to 2014) and 4 SEASONs.
* Each ID can have up to 11 different tests for a particular RANK.

How do I create a new dataset (keeping the same variables) that fulfills the following conditions?
- Only IDs with at least 3 tests for a particular RANK are considered.
- Only GROUPs that contain at least 5 different IDs in a given YEAR are considered.
- Each GYS class has at least 4 observations.

The sample I gave is very small and cannot be used to test any code, but I could provide a larger sample if the need arises.

My best thanks.
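Because the three conditions interact (dropping rows for one condition can push another class below its threshold), one possible approach is to apply all three filters repeatedly until the row count stops changing. The macro below is a minimal sketch of that idea, not the solution given in the thread. The macro name `filter_until_stable`, its parameters, the work tables `_step1` and `_step2`, and the input name `have` are all hypothetical. It relies on PROC SQL remerging summary statistics, so it assumes the NOREMERGE option is not in effect.

```sas
%macro filter_until_stable(in=have, out=want, maxiter=20);
  %local i n_before n_after;
  %let i = 0;

  /* Start from a copy of the input so the original stays untouched */
  data &out;
    set &in;
  run;

  %do %until (&n_before = &n_after or &i >= &maxiter);
    %let i = %eval(&i + 1);

    proc sql noprint;
      select count(*) into :n_before from &out;

      /* Condition 1: at least 3 tests per ID within a RANK */
      create table _step1 as
        select t.* from &out as t
        group by t.ID, t.RANK
        having count(*) >= 3;

      /* Condition 2: at least 5 distinct IDs per GROUP within a YEAR */
      create table _step2 as
        select t.* from _step1 as t
        group by t.GROUP, t.YEAR
        having count(distinct t.ID) >= 5;

      /* Condition 3: at least 4 observations per GYS class */
      create table &out as
        select t.* from _step2 as t
        group by t.GYS
        having count(*) >= 4;

      select count(*) into :n_after from &out;
    quit;
  %end;

  %put NOTE: filter_until_stable ended after &i pass(es).;

  /* Remove the intermediate tables */
  proc datasets library=work nolist;
    delete _step1 _step2;
  quit;
%mend filter_until_stable;

/* Usage, assuming a dataset named have with the variables shown in the post */
%filter_until_stable(in=have, out=want);
```

The loop stops when a full pass removes no rows, or after `maxiter` passes as a safety limit. The output is not guaranteed to keep the original row order, so a final sort may be needed.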

soumri · ‎08-31-2018

GYM is a new variable created by concatenating Group || Test Year || Test Month. Test Year (Y) and Test Month (M) were taken from the TD variable with Y = year(TD) and M = month(TD).
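As an illustration of the construction described above, here is a small sketch that builds GYM with explicit PUT conversions instead of relying on automatic numeric-to-character conversion. The dataset names `have` and `with_gym`, the two sample rows, and the `$10` length are assumptions for the example. Padding the month to two digits with `z2.` is a design choice that differs from plain concatenation.

```sas
/* Two sample rows in the layout of the poster's data */
data have;
  input G :$4. TD :ddmmyy10.;
  format TD ddmmyy10.;
  datalines;
H1 28/02/2008
H1 02/10/2009
;
run;

data with_gym;
  set have;
  length GYM $10;      /* assumes G is at most 4 characters */
  Y = year(TD);        /* test year */
  M = month(TD);       /* test month */
  /* CATS strips blanks; z2. writes the month with a leading zero */
  GYM = cats(G, put(Y, 4.), put(M, z2.));
run;
```

With the zero-padded month, the two rows get the classes H1200802 and H1200910, which also sort in chronological order within a group.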

soumri · ‎08-31-2018

Yes, I understood what you did, but it did not work when I adjusted it to my actual data. Please find attached part of my file in CSV format, which may still help (dates are given in SAS format).

soumri · ‎08-31-2018

Hi,

I want to distribute individuals into classes while respecting certain conditions. I have the following data: ID: identifier; G: group; W: work; TD: date of the test; OBS: the measure. Each ID normally belongs to a single group (G) but can have 1, 2, 3, 4 or 5 Works (W), and in each Work it has been tested 3, 4, ... or 10 times (number of TD):

ID  G   W  TD          OBS
b1  H1  1  28/02/2008  13.5
b1  H1  1  01/04/2008  17.2
b1  H1  1  17/05/2008  16.8
b1  H1  1  05/08/2008  12.5
b1  H1  1  22/09/2008  10.0
b1  H1  3  27/03/2009  22.3
b1  H1  3  23/02/2009  20.1
b1  H1  3  11/05/2009  18.3
b1  H1  3  19/06/2009  18.9
b1  H1  3  29/07/2009  16.9
b2  H1  1  02/10/2009  20.0
b2  H1  1  28/11/2009  22.2
b2  H1  1  20/12/2009  24.6
b2  H1  1  31/01/2010  23.0
b2  H1  1  27/02/2010  20.5
....

(This is just an example; you can't test the code with it, as that requires more data.)

What I am trying to do is to form classes of [Group * Year * Month of the test date] (GYM). In the new dataset I only want to keep observations that respect the following conditions:
- Each GYM class must contain at least 6 observations.
- Each group G kept must have at least 4 different IDs.
- Each selected ID must be present with at least its first 3 observations for a given Work.

Please note that we may lose whole Works for a given ID, or some IDs for a given group and all their controls, or even entire groups with all the IDs they contain.

data new;
  set have;
  Y = year(TD);
  M = month(TD);
  GYM = compress(G || Y || M);
run;

proc sort data=new;
  by ID W;
run;

proc sql;
  create table new2 as
  select *, count(distinct TD) as N_TD
  from new
  group by GYM;
quit;

/* Here, I noticed that the number of observations per GYM class varies from 1 to hundreds.
   (I want to keep only the groups with at least 6 observations, while respecting the conditions above.) */

Can someone help me formulate this code or tell me if there is another way? Thank you for your expertise and insight.
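One way to express the three conditions from the post above is a chain of PROC SQL queries, each using GROUP BY with a HAVING clause so that SAS remerges the count with the detail rows. This is a sketch under stated assumptions, not the answer given in the thread. It assumes the dataset `new` from the post already exists with the GYM variable. It reads "at least its first 3 observations for a given Work" as "at least 3 rows per ID and W", which is an interpretation. The table names `step1`, `step2` and `want` are hypothetical.

```sas
proc sql;
  /* Keep ID*W combinations with at least 3 observations (interpretation) */
  create table step1 as
    select * from new
    group by ID, W
    having count(*) >= 3;

  /* Keep groups G that still have at least 4 distinct IDs */
  create table step2 as
    select * from step1
    group by G
    having count(distinct ID) >= 4;

  /* Keep GYM classes that still have at least 6 observations */
  create table want as
    select * from step2
    group by GYM
    having count(*) >= 6
    order by ID, W, TD;
quit;
```

A single pass like this can leave some ID*W combinations or groups below their thresholds again after the last filter, so the result should be checked and the steps repeated on `want` if needed.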

soumri · ‎08-27-2018

Yes, I tried several joins, but I couldn't find a solution. Could you give me your proposal in SAS code?

soumri · ‎08-27-2018

soumri · ‎10-05-2017

Thank you @Patrick for the very satisfactory clarifications.

soumri · ‎10-04-2017

@FredrikE @Patrick Why did you use f16.? I don't understand what it means.

soumri · ‎10-04-2017

@FredrikE @Patrick Thank you both, very good solution. Thank you, FredrikE, for the correction you made so that Patrick's code works with my very old version of SAS. I'm sorry, but I'll accept Patrick's solution so that the SAS community benefits from the latest updates. Thank you both again.

soumri · ‎10-04-2017

Hi,

I have a database in which the Ids are character strings of different lengths, whereas they should normally be identical in length. Here is an example:

Id
CA000015432204
CA008552136654
CA8552136654
CA08552136654
CA000000300205
CA300205

The third and fourth Ids should be the same as the second, but because of typing errors some zeros are missing after the "CA" code. The same applies to the fifth and sixth Ids. What I am looking for is a way to add as many zeros as needed right after the "CA" code so that each Id has a total of 14 characters, as the first, second and fifth Ids do. The result would look like this:

Id
CA000015432204
CA008552136654
CA008552136654
CA008552136654
CA000000300205
CA000000300205

My data contain more than a million observations and many country codes like "CA" for "Canada". I think substr and compress can do the trick, but I can't find a way to do it.

Thank you very much.
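A possible way to do the padding described above is to split each Id into its two-letter prefix and its numeric part, read the numeric part as a number, and write it back with the `z12.` format, which pads with leading zeros. This is a minimal sketch, not necessarily the solution accepted in the thread. It assumes every Id starts with exactly two letters followed only by digits, with at most 12 digits. The names `have`, `want` and `Id_fixed` are hypothetical.

```sas
/* The example Ids from the post */
data have;
  input Id :$14.;
  datalines;
CA000015432204
CA008552136654
CA8552136654
CA08552136654
CA000000300205
CA300205
;
run;

data want;
  set have;
  length Id_fixed $14;
  /* substr(Id, 3) is the part after the 2-letter country code.
     input(..., 12.) reads it as a number,
     put(..., z12.) writes it back as 12 digits with leading zeros. */
  Id_fixed = cats(substr(Id, 1, 2),
                  put(input(substr(Id, 3), 12.), z12.));
run;
```

With this approach, CA8552136654 and CA08552136654 both become CA008552136654. Ids that contain non-digit characters after the prefix would produce missing values and should be handled separately.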

soumri · ‎10-02-2017

Yes, it works well. Thank you.

Date Last Visited	‎03-15-2021 11:11 AM
