About kk13

kk13 · 04-04-2025

Below is an example of the data I have. I want to find an array of unique types based on startmonth and startyear, and an array of valid end dates for each unique type.

data HAVE;
  input caseid
        startmonth961 startyear961 startmonth962 startyear962
        startmonth963 startyear963 startmonth964 startyear964
        startmonth981 startyear981 startmonth982 startyear982
        startmonth983 startyear983 startmonth984 startyear984
        endmonth961 endyear961 endmonth962 endyear962
        endmonth963 endyear963 endmonth964 endyear964
        endmonth981 endyear981 endmonth982 endyear982
        endmonth983 endyear983 endmonth984 endyear984
        type961 type962 type963 type964 type981 type982 type983 type984;
  datalines;
1 1 1995 5 1996 10 1997 -4 -4 1 1998 10 1997 -4 -4 -4 -4 1 1995 5 1988 -4 -4 -4 -4 -4 -4 12 1997 -4 -4 -4 -4 6 8 1 . 6 1 . .
;
run;

This is my code. I was able to find the unique types, but I'm unable to get the valid end dates.

data mycode;
  set have;
  array train type961-type964 type981-type984;
  array start_mo startmonth961-startmonth964 startmonth981-startmonth984;
  array start_yr startyear961-startyear964 startyear981-startyear984;
  array end_mo endmonth961-endmonth964 endmonth981-endmonth984;
  array end_yr endyear961-endyear964 endyear981-endyear984;
  array unique u01-u10;
  array stmo_unique stmo01-stmo10;
  array styr_unique styr01-styr10;
  array m m01-m10;
  array endmo_unique endmo01-endmo10;
  array endyr_unique endyr01-endyr10;
  unique_count = 0;
  do i = 1 to dim(train);
    if (train{i} ne .) then do;
      match_found = 0;
      do j = 1 to unique_count;
        if (train{i} = unique{j} and start_mo{i} = stmo_unique{j} and start_yr{i} = styr_unique{j}) then do;
          if end_mo{i} ne -4 and end_yr{i} ne -4 then do;
            if endmo_unique{j} = -4 and endyr_unique{j} = -4
               or (end_yr{i} > endyr_unique{j})
               or (end_yr{i} = endyr_unique{j} and end_mo{i} > endmo_unique{j}) then do;
              endmo_unique{j} = end_mo{i};
              endyr_unique{j} = end_yr{i};
            end;
          end;
          match_found = 1;
          m{j} = 1;
          leave;
        end;
      end;
      if (match_found = 0) then do;
        unique_count + 1;
        unique{unique_count} = train{i};
        stmo_unique{unique_count} = start_mo{i};
        styr_unique{unique_count} = start_yr{i};
        endmo_unique{unique_count} = end_mo{i};
        endyr_unique{unique_count} = end_yr{i};
      end;
    end;
  end;
  keep caseid run;

This is what I want. If a type has duplicates and some have invalid end dates while others have valid end dates, the end date should be the most recent valid end date. If the duplicates all have valid end dates, it should be the most recent of them. If the duplicates all have invalid end dates, the end date should be invalid. If a type is unique, it keeps its own end date (which may be valid or invalid).

For example, type963 and type982 have the same type (1) and the same start month and year (10/1997). However, endmonth963 (-4) and endyear963 (-4) differ from endmonth982 (12) and endyear982 (1997). So the third unique type should be: u03=1, stmo03=10, styr03=1997, endmo03=12, endyr03=1997, m03=1.

data want;
  input id u01-u05 stmo01-stmo05 styr01-styr05 endmo01-endmo05 endyr01-endyr05 m01-m05;
  datalines;
1 6 8 1 6 . 1 5 10 1 . 1995 1996 1997 1998 . 1 5 12 -4 . 1995 1988 1997 -4 . . . 1 . .
;
run;

kk13 · 05-07-2024

Is there an option to get the count of each of the top 3 values? For example, variable a has two occurrences of 12, one of 80, and one of 100.

kk13 · 05-07-2024

I forgot to add that I would like to look at the top 10 distinct values. PROC UNIVARIATE does not output the top distinct values.

kk13 · 05-07-2024

I have a data set with multiple variables (columns). I'm trying to get the top 10 values of each variable. However, I would like the output to be in a new data set with a new variable called variable_names, where each row shows the top 10 values from smallest to largest. The data below is just an example. I have more than 4 input variables. Here, I am just looking at the top 3. For a bigger data set, I may look at the top 10.

data example;
  input a b c d;
  datalines;
100 120 123 140
12 23 43 42
12 12 23 23
5 7 5 2
80 88 98 2
3 4 101 4
;

%let nTop = 3;
proc univariate data = new_data NExtrObs=&nTop;
  ods select ExtremeObs;
run;

PROC UNIVARIATE gives a separate output for each variable.

The UNIVARIATE Procedure
Variable: a

Extreme Observations
     Lowest           Highest
  Value    Obs     Value    Obs
      3      6        12      3
      5      4        80      5
     12      3       100      1

I would like to create a new data set whose columns are the variable name and the top values. For example:

Variables  Top1  Top2  Top3
a            12    80   100
b            23    88   120
c             5     6     1

kk13 · 03-03-2021

I would like to find out whether the array a1-a8 contains 1 or 0, using the date1-date8 array. check1 and check2 give the range of dates that I need to look at. For id=1, I need to find the dates where check1<=date{i}<=check2. Here, date1 through date5 are within check1 and check2, so I need to check whether a1 through a5 contain any 1. If yes, then HAS=1. If a1 through a5 are all zeros, then HAS=0. If a1 through a5 are some zeros and some missing, then HAS is missing.

data z;
  input id a1-a8 date1-date8 check1 check2;
  datalines;
1 0 0 1 1 . . . . 15 15 16 16 17 18 18 20 11 17
2 . . 1 1 0 1 1 1 14 16 17 18 19 21 22 24 8 18
3 . . . 1 1 1 1 1 14 15 17 17 18 19 21 22 16 23
4 0 0 0 0 0 1 1 1 13 15 18 18 21 21 23 23 10 18
;

data want;
  input id a1-a8 b1-b8 check1 check2 HAS;
  datalines;
1 0 0 1 1 . . . . 15 15 16 16 17 18 18 20 11 17 1
2 . . 1 1 0 1 1 1 14 16 17 18 19 21 22 24 8 18 1
3 . . . 1 1 1 1 1 14 15 17 17 18 19 21 22 16 23 1
4 0 0 0 0 0 1 1 1 13 15 18 18 21 21 23 23 10 18 0
5 0 0 . . 1 1 1 1 10 11 11 13 13 17 17 18 9 11 .
;

I found the index j where the date is within check1 and check2. I need to check whether a1 through a{j} contain any 0 or whether a1 through a{j} are all equal to 0.

data z1;
  set z;
  array a a1-a8;
  array date date1-date8;
  do i = 1 to dim(a);
    if check1 <= date{i} <= check2 then do;
      j = i;
    end;
  end;
run;
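One possible way to get HAS from the rule described in this post, as a minimal sketch rather than a definitive answer. It assumes that only the elements whose date falls inside [check1, check2] are inspected, and that HAS is left missing when no date falls in the window. The output data set name has_flag, the array name dt, and the counters n_in, n_one, and n_zero are hypothetical.

```sas
data has_flag;
  set z;
  array a{8} a1-a8;
  array dt{8} date1-date8;
  n_in = 0;    /* elements whose date is inside the window */
  n_one = 0;   /* of those, how many equal 1 */
  n_zero = 0;  /* of those, how many equal 0 */
  do i = 1 to dim(a);
    if check1 <= dt{i} <= check2 then do;
      n_in = n_in + 1;
      if a{i} = 1 then n_one = n_one + 1;
      else if a{i} = 0 then n_zero = n_zero + 1;
    end;
  end;
  if n_one > 0 then HAS = 1;                        /* any 1 in the window */
  else if n_in > 0 and n_zero = n_in then HAS = 0;  /* all zeros */
  else HAS = .;                                     /* zeros mixed with missing, or empty window */
  drop i n_in n_one n_zero;
run;
```

With the four rows of z above, this gives HAS = 1, 1, 1, 0 for id 1 through 4.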

kk13 · 03-03-2021

I would like to check whether the array "a" contains any 1s. check1, check2, and the "b" array are used to determine how far into array "a" I need to check. For id=1, check1=11 and check2=17. I found that 11<=b{i}<=17 holds at i=2 (b2=15). Hence, I need to check whether a1 through a2 contain any 1s.

data z;
  input id a1-a4 b1-b4 check1 check2;
  datalines;
1 1 . . 1 10 15 18 20 11 17
2 . . 1 1 17 18 19 21 16 18
3 . . . 1 18 19 21 22 16 23
;

data want;
  input id a1-a4 b1-b4 check1 check2 HAS;
  datalines;
1 1 . . 1 10 15 18 20 11 17 1
2 . . 1 1 17 18 19 21 16 18 0
3 . . . 1 18 19 21 22 23 23 1
;

data z1;
  set z;
  array a a1-a4;
  array b b1-b4;
  do i = 1 to dim(a);
    if check1 <= b{i} <= check2 then do;
      j = i;
    end;
  end;
run;
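Building on the z1 step in this post, a second loop over a1 through a{j} is one way to set HAS. This is only a sketch: it assumes j is the last index whose b value falls inside [check1, check2], and that HAS should be 0 when no 1 is found (including when no b value is in range). The data set name z2 is hypothetical.

```sas
data z2;
  set z;
  array a{4} a1-a4;
  array b{4} b1-b4;
  j = 0;
  do i = 1 to dim(b);
    if check1 <= b{i} <= check2 then j = i;  /* keep the last index in range */
  end;
  HAS = 0;
  do i = 1 to j;                             /* runs zero times when j = 0 */
    if a{i} = 1 then HAS = 1;
  end;
  drop i;
run;
```

For the three rows of z above, this yields HAS = 1, 0, 1.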

kk13 · 02-25-2021

Age1520 is the total number of visits from ages 15 through 20, and so on.

data have;
  input id age1520 age2125 age2630;
  datalines;
1 7 8 3
2 4 10 5
3 7 13 2
4 5 15 6
5 4 11 2
;

data weights;
  input id weights15-weights30;
  datalines;

kk13 · 02-25-2021

I have longitudinal data on the number of hospital visits for individuals from ages 15 through 30. I have counted the total number of visits in different age ranges (15-20, 21-25, 26-30, and 15-30). Each individual has a different weight assigned at each age. I would like to calculate the weighted average of visits in the different age ranges. Does it make sense to use the weights at age 30 (weight30) for all of the weighted averages? Or should I use weight20 to calculate the weighted average number of visits for ages 15-20 (visits1520), weight25 for visits2125, weight30 for visits2630, and weight30 for visits1530?

kk13 · 12-09-2020

I have a multi-dimensional array of numeric values. I would like to create a separate multi-dimensional array that shows only the first occurrence of each unique number.

data have1;
  input a1_1-a1_4 b1_1-b1_4 c1_1-c1_4;
  datalines;
2000 2001 2002 . 2001 2004 . . 2002 2004 2005 2006
2000 2002 2003 2004 2003 2004 . . 2002 2005 2006 .
;

data want1;
  input fa1_1-fa1_4 fb1_1-fb1_4 fc1_1-fc1_4;
  datalines;
2000 2001 2002 . . 2004 . . . . 2005 2006
2000 2002 2003 2004 . . . . . 2005 2006 .
;
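A minimal sketch of one way to keep only the first occurrence of each value within a row. It treats the twelve variables as a single one-dimensional array scanned left to right, which is an assumption about what "first observed" means here. The output data set name first_seen and the array names v and f are hypothetical; the fa/fb/fc variable names come from want1.

```sas
data first_seen;
  set have1;
  array v{12} a1_1-a1_4 b1_1-b1_4 c1_1-c1_4;
  array f{12} fa1_1-fa1_4 fb1_1-fb1_4 fc1_1-fc1_4;
  do i = 1 to dim(v);
    f{i} = v{i};
    /* blank the value out if it already appeared earlier in the row */
    do k = 1 to i - 1;
      if v{i} = v{k} then f{i} = .;
    end;
  end;
  drop i k;
run;
```

For the second row of have1, 2003 and 2004 in the b variables and 2002 in the c variables were already seen, so only 2005 and 2006 survive outside the a variables.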

kk13 · 10-02-2020

I have a data set with account numbers and balances for 2018 and 2020. I also have the account numbers for 2019, some of which appear in 2018 and 2020. I would like to create an array of balances for 2019: if a 2019 account number matches an account in 2018 or 2020, pick the corresponding balance, using the 2018 balance when the account appears in both years. (The want data set shows what I would like.) For example, the first observation has account number 101 in 2019. This account was present in both 2018 and 2020. Therefore, I would like the 2019 balance (bal2019_1) to equal bal2018_1.

data have;
  input acct2018_1 acct2018_2 acct2018_3 acct2018_4 acct2018_5
        bal2018_1 bal2018_2 bal2018_3 bal2018_4 bal2018_5
        acct2020_1 acct2020_2 acct2020_3 acct2020_4 acct2020_5
        bal2020_1 bal2020_2 bal2020_3 bal2020_4 bal2020_5;
  datalines;
101 407 103 . . 40 60 80 . . 101 604 505 . . 10 19 20 . .
303 203 . . . 70 80 . . . 507 205 406 907 . 14 19 89 99 .
901 602 801 . . 10 14 24 . . 404 901 505 802 . 20 90 99 87 .
301 501 904 905 . 84 90 20 95 . 808 . . . . 34 . . . .
;
run;

data account_balance2019;
  input acct2019_1 acct2019_2 acct2019_3 acct2019_4 acct2019_5;
  datalines;
101 . . . .
907 203 . . .
801 505 901 . .
808 301 . . .
;
run;

data want;
  input acct2018_1 acct2018_2 acct2018_3 acct2018_4 acct2018_5
        bal2018_1 bal2018_2 bal2018_3 bal2018_4 bal2018_5
        acct2019_1 acct2019_2 acct2019_3 acct2019_4 acct2019_5
        bal2019_1 bal2019_2 bal2019_3 bal2019_4 bal2019_5
        acct2020_1 acct2020_2 acct2020_3 acct2020_4 acct2020_5
        bal2020_1 bal2020_2 bal2020_3 bal2020_4 bal2020_5;
  datalines;
101 407 103 . . 40 60 80 . . 101 . . . . 40 . . . . 101 604 505 . . 10 19 20 . .
303 203 . . . 70 80 . . . 907 203 . . . 99 80 . . . 507 205 406 907 . 14 19 89 99 .
901 602 801 . . 10 14 24 . . 801 505 901 . . 24 99 10 . . 404 901 505 802 . 20 90 99 87 .
301 501 904 905 . 84 90 20 95 . 808 301 . . . 34 84 . . . 808 . . . . 34 . . . .
;
run;
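A minimal sketch of one way to fill the 2019 balances. It assumes the rows of have and account_balance2019 line up one to one (there is no key to merge on), and that the 2018 balance takes priority when an account appears in both years, which is what the want data suggests. The data set name want_sketch and the array names are hypothetical. WHICHN returns the position of the first matching value, or 0 if there is none.

```sas
data want_sketch;
  merge have account_balance2019;   /* one-to-one merge by row position, no BY statement */
  array acct18{5} acct2018_1-acct2018_5;
  array bal18{5}  bal2018_1-bal2018_5;
  array acct19{5} acct2019_1-acct2019_5;
  array bal19{5}  bal2019_1-bal2019_5;
  array acct20{5} acct2020_1-acct2020_5;
  array bal20{5}  bal2020_1-bal2020_5;
  do i = 1 to dim(acct19);
    if not missing(acct19{i}) then do;
      k = whichn(acct19{i}, of acct18{*});      /* look in 2018 first */
      if k > 0 then bal19{i} = bal18{k};
      else do;
        k = whichn(acct19{i}, of acct20{*});    /* fall back to 2020 */
        if k > 0 then bal19{i} = bal20{k};
      end;
    end;
  end;
  drop i k;
run;
```

In the second row, for instance, account 907 is found only in 2020 and gets 99, while 203 is found in 2018 and gets 80. The column order of the result differs from the want data set above.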

kk13 · 09-27-2020

I'm trying to create a new array called newtopic (with 5 elements, or variables) that holds the non-missing values of topic1-topic10.

data have;
  input begin end topic1-topic10;
  datalines;
3 8 . . 1 0 1 0 1 . . .
2 6 . 0 0 1 1 1 . . . .
4 8 . . . 1 1 0 0 0 . .
1 5 1 1 1 1 0 . . . . .
;

data want;
  input begin end newtopic1-newtopic5;
  datalines;
3 8 1 0 1 0 1
2 7 0 0 1 1 1
4 9 1 1 0 0 0
1 5 1 1 1 1 0
;
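One way to collect the non-missing topics, sketched under the assumption that the values should be copied in their original order and that anything beyond the fifth non-missing value is ignored. The data set name newtopic_sketch and the counter n are hypothetical.

```sas
data newtopic_sketch;
  set have;
  array topic{10} topic1-topic10;
  array newtopic{5} newtopic1-newtopic5;
  n = 0;                                   /* number of values copied so far */
  do i = 1 to dim(topic);
    if not missing(topic{i}) and n < dim(newtopic) then do;
      n = n + 1;
      newtopic{n} = topic{i};
    end;
  end;
  keep begin end newtopic1-newtopic5;
run;
```

For the first row of have this gives newtopic1-newtopic5 = 1 0 1 0 1. The begin and end values are carried over unchanged.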

kk13 · 09-21-2020

I would like to check where a number is located in a multi-dimensional array. Then, I would like to take the minimum of the entire row in which it was found. For example, I have the following data set and a two-dimensional array with 3 rows and 5 columns.

array temp{3,5} a1_1-a1_5 a2_1-a2_5 a3_1-a3_5;

In the first observation, check1=111 is located at temp{2,3}. I would like to take the minimum of the non-missing values in that row, so min(of a2_1-a2_5)=1. In the second observation, check1=222 is located at temp{2,4} and min(of a2_1-a2_5)=1. check2=223 is located at temp{3,2} and min(of a3_1-a3_5)=3.

data k;
  input check1 check2 a1_1-a1_5 a2_1-a2_5 a3_1-a3_5;
  datalines;
111 . 12 122 133 . . 1 11 111 13 . 777 2 . . .
222 223 5 555 232 . . 1 111 123 222 224 11 223 3 . .
;
run;

Thank you
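A minimal sketch of the lookup described in this post. It scans the two-dimensional array for each check value and, when it finds a match, takes the minimum over that row. The output variables min1 and min2, the data set name k_min, and the array names check and rowmin are hypothetical. If a value appeared in more than one row, this version would take the minimum across all matching rows, which is an assumption.

```sas
data k_min;
  set k;
  array temp{3,5} a1_1-a1_5 a2_1-a2_5 a3_1-a3_5;
  array check{2} check1-check2;
  array rowmin{2} min1-min2;
  do c = 1 to dim(check);
    if not missing(check{c}) then do;
      do r = 1 to dim(temp, 1);
        do j = 1 to dim(temp, 2);
          if temp{r, j} = check{c} then do;
            /* MIN ignores missing arguments, so only non-missing values count */
            do m = 1 to dim(temp, 2);
              rowmin{c} = min(rowmin{c}, temp{r, m});
            end;
          end;
        end;
      end;
    end;
  end;
  drop c r j m;
run;
```

For the two rows of k above, min1 is 1 in both observations, and min2 is missing in the first and 3 in the second.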

Date Last Visited	‎04-04-2025 03:25 PM

Find the array of unique type and valid date

Re: Proc Univariate for top n largest values for multiple variables (c...

Re: Proc Univariate for top n largest values for multiple variables (c...

Proc Univariate for top n largest values for multiple variables (colum...

Check if an array contains a value in specific dimension

Check if an array contains a value.

Re: weighted average

weighted average

Create separate array for first observed unique number

SAS Array Creating A New Dataset And Matching

SAS array. Create a new array with nonmissing value

Multi-Dimensional Array- Checking for match and take the minimum