About ercksh8

ercksh8 · ‎06-16-2021

I would like to use the earliest date as the index date.

ercksh8 · ‎06-16-2021

Hello all.

DATA item1;
  INPUT ID $ Transaction_number $ Date1;
  CARDS;
1 1000 20190112
1 1001 20190121
1 1002 20200111
2 1003 20190102
2 1004 20200110
3 1005 20210123
5 1006 20210102
6 1007 20200101
;
RUN;

DATA item2;
  INPUT ID $ Transaction_number $ Date2;
  CARDS;
1 2000 20190211
1 2001 20210102
2 2002 20200521
3 2003 20210101
4 2004 20200101
5 1006 20210102
5 2005 20210202
;
RUN;

DATA item3;
  INPUT ID $ Transaction_number $ Date3;
  CARDS;
1 3000 20210411
1 3001 20200102
2 3002 20200521
3 3003 20200101
4 3004 20190101
5 3006 20200102
5 3005 20210202
;
RUN;

I am working with data that looks something like this. In my actual dataset the dates were stored as character values, so I converted them into numeric values like this:

DATA item1;
  set item1;
  date_1 = INPUT(Date1, YYMMDD8.);
  FORMAT date_1 YYMMDDN8.;
  PUT date_1=;
RUN;

DATA item2;
  set item2;
  date_2 = INPUT(Date2, YYMMDD8.);
  FORMAT date_2 YYMMDD8.;
  PUT date_2=;
RUN;

DATA item3;
  set item3;
  date_3 = INPUT(Date3, YYMMDD8.);
  FORMAT date_3 YYMMDD8.;
  PUT date_3=;
RUN;

I am trying to create a dataset that includes only those who purchased item 1, with variables YN_2 and YN_3 indicating whether each of those individuals purchased item 2 or item 3 within a year of purchasing item 1.

PROC SQL;
  CREATE TABLE YN_2 as
  SELECT SA.ID, SA.Transaction_number,
         CASE WHEN SA.Date1 <= SB.Date2 <= (SA.Date1 + 365) THEN 1 ELSE 0 END AS YN_2
  FROM item1 AS SA
  LEFT JOIN item2 AS SB
    ON SA.ID = SB.ID
  ORDER BY ID
  ;
QUIT;

I have been able to find YN_2 and YN_3 separately using the code above, but I am struggling to put them all together. I would like to keep only the first date of purchase in item1. I don't want to do the same for items 2 and 3, because the first purchase of those items might fall outside the 1-year period while a later purchase falls within it.
The end product I would like is a dataset that includes every ID that purchased item 1, with variables YN_2, YN_3, ..., YN_n (n being an arbitrary number) covering all the items I have (more than the three I included above). The data above is only a sample, and I'm sorry in advance if it feels too incomplete. Thank you for the help.
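One possible way to put the flags together is sketched below. It is not a definitive solution: it assumes the dates can be read directly as SAS dates with the `yymmdd8.` informat, it treats "within a year" as the index date plus 365 days, and the table names `index_dates` and `flags` are made up for illustration. The index date is the earliest item 1 purchase per ID, and each flag is 1 if any purchase of that item falls in the window.

```sas
/* Sample data from the post, with dates read directly as SAS dates */
data item1;
  input ID $ Transaction_number $ date_1 :yymmdd8.;
  format date_1 yymmddn8.;
  datalines;
1 1000 20190112
1 1001 20190121
1 1002 20200111
2 1003 20190102
2 1004 20200110
3 1005 20210123
5 1006 20210102
6 1007 20200101
;
run;

data item2;
  input ID $ Transaction_number $ date_2 :yymmdd8.;
  format date_2 yymmddn8.;
  datalines;
1 2000 20190211
1 2001 20210102
2 2002 20200521
3 2003 20210101
4 2004 20200101
5 1006 20210102
5 2005 20210202
;
run;

data item3;
  input ID $ Transaction_number $ date_3 :yymmdd8.;
  format date_3 yymmddn8.;
  datalines;
1 3000 20210411
1 3001 20200102
2 3002 20200521
3 3003 20200101
4 3004 20190101
5 3006 20200102
5 3005 20210202
;
run;

proc sql;
  /* Earliest item 1 purchase per ID is used as the index date */
  create table index_dates as
  select ID, min(date_1) as index_date format=yymmddn8.
  from item1
  group by ID;

  /* One row per item 1 buyer; a flag is 1 if ANY purchase of that item
     falls between the index date and index date + 365 days (assumption).
     IDs with no matching rows get 0 because a missing date fails BETWEEN. */
  create table flags as
  select a.ID, a.index_date,
         max(case when b.date_2 between a.index_date and a.index_date + 365
                  then 1 else 0 end) as YN_2,
         max(case when c.date_3 between a.index_date and a.index_date + 365
                  then 1 else 0 end) as YN_3
  from index_dates as a
       left join item2 as b on a.ID = b.ID
       left join item3 as c on a.ID = c.ID
  group by a.ID, a.index_date
  order by a.ID;
quit;
```

Because the query starts from `index_dates`, IDs that never purchased item 1 (ID 4 in the sample) do not appear in the result. Additional items would need one more join and one more `max(case ...)` expression each; with many items, generating those pieces in a macro loop may be more practical.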

ercksh8 · ‎04-26-2021

Thank you. I will also keep that in mind in the future.

ercksh8 · ‎04-23-2021

ID  Year  Case  Expenditure  Type
1   1     1     100          1
1   1     2     200          1
1   1     3     300          2
2   1     4     400          1
2   1     5     .            .
2   1     6     500          2
3   1     7     600          2
4   1     8     100          1
1   2     9     100          1
1   2     10    200          2
1   2     11    300          1
2   2     12    400          1
2   2     13    500          2
2   2     14    600          2
3   2     15    .            .
4   2     16    .            .
1   3     17    100          1
1   3     18    200          1
1   3     19    300          2
2   3     20    400          2
2   3     21    500          1
2   3     22    600          2
3   3     23    .            .
4   3     24    100          1

Sorry. I was trying to simplify my data and didn't realize I uploaded it that way. I included cases because when I attempted this, I realized that instead of giving me a dataset with one cost observation for each year, it gave me the total cost value on each observation. I was hoping there would be a way to simplify this. The reason I kept case in was that I was hoping to identify the total expenditure by type of purchase later on, and that information was in a separate data table that was sorted by case and did not have the ID variable. The code I tried looked something like this.

proc sql;
  create table A as
  select ID, year, case, expenditure, type,
         sum(expenditure) as total_expenditure
  from original
  group by ID;
quit;
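A possible explanation for the repeated totals: when the select list contains columns that are not in GROUP BY, PROC SQL remerges the summary value back onto every original row. A minimal sketch of a per-ID, per-year total follows. It assumes the long layout shown above is in a dataset called `original`, and it guesses at the exclusion rule (keep only IDs with a non-missing expenditure in at least two years); the names `yearly` and `total_expenditure` are illustrative.

```sas
/* Long-format sample data from the post ("." is a missing value) */
data original;
  input ID Year Case Expenditure Type;
  datalines;
1 1 1 100 1
1 1 2 200 1
1 1 3 300 2
2 1 4 400 1
2 1 5 . .
2 1 6 500 2
3 1 7 600 2
4 1 8 100 1
1 2 9 100 1
1 2 10 200 2
1 2 11 300 1
2 2 12 400 1
2 2 13 500 2
2 2 14 600 2
3 2 15 . .
4 2 16 . .
1 3 17 100 1
1 3 18 200 1
1 3 19 300 2
2 3 20 400 2
2 3 21 500 1
2 3 22 600 2
3 3 23 . .
4 3 24 100 1
;
run;

proc sql;
  /* One row per ID and year; only grouped columns and summaries are selected,
     so no remerging happens. SUM skips missing values, and COALESCE turns an
     all-missing year into 0. */
  create table yearly as
  select ID, Year,
         coalesce(sum(Expenditure), 0) as total_expenditure
  from original
  /* Assumed rule: keep IDs with spending recorded in at least two years */
  where ID in (select ID
               from original
               where Expenditure is not missing
               group by ID
               having count(distinct Year) >= 2)
  group by ID, Year
  order by ID, Year;
quit;

/* One trend line per ID: year on the X-axis, yearly total on the Y-axis */
proc sgplot data=yearly;
  series x=Year y=total_expenditure / group=ID markers;
  xaxis values=(1 2 3);
run;
```

With this rule, ID 3 drops out because it has spending in year 1 only, while ID 4 stays and shows 0 for year 2. Totals by purchase type could be obtained the same way by adding `Type` to both the select list and the GROUP BY clause.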

ercksh8 · ‎04-23-2021

This is after merging each year's data by ID.

ercksh8 · ‎04-23-2021

Hi all. I am currently working with panel data and would like to find the total expenditure of each individual for each year. The data has a list of all purchases, and the expenditure for each purchase is listed separately.

ID  Expenditure_year1  Case  Expenditure_year2  Case  Expenditure_year3  Case
1   200                1     500                9     100                17
1   300                2     200                10    200                18
1   100                3     100                11    300                19
2   400                4     100                12    400                20
2   .                  5     200                13    500                21
2   200                6     300                14    600                22
3   300                7     .                  15    .                  23
4   100                8     .                  16    200                24

ID  Total_expenditure_year1  Total_expenditure_year2  Total_expenditure_year3
1   600                      800                      600
2   600                      600                      1500
4   100                      0                        200

Since ID 3 does not have any observation in year 2 and year 3, I would like ID 3 excluded. However, for the missing value in year 1 for ID 2, I would like to skip over just that observation and still have the rest added. The resulting table would look something like the second table. If an ID is like ID 4, which has observations in 2 years but a missing value in year 2 (perhaps because no purchase was made that year), I would like the missing value to be replaced with 0 so that the yearly expenditure shows up as 0 in the final result.

If possible, I would afterwards like to plot each ID's trend in expenditure by year. The X-axis would be years (1, 2, 3) and the Y-axis would be total_expenditure for each year. Thank you.

ercksh8 · ‎04-20-2021

data scale;
  set cleaned;
  by ID;
  IF a = 1 then score1 = 1;
  if a = 2 | 3 then score1 = 2;
  if a >= 4 & a <= 6 then score1 = 3;
  IF b = 1 then score2 = 1;
  if b = 2 | 3 then score2 = 2;
  if b >= 4 & b <= 6 then score2 = 3;
  IF c = 1 then score3 = 1;
  if c = 2 | 3 then score3 = 2;
  if c >= 4 & c <= 6 then score3 = 3;
  total_score = score + score + score;
  if d = 1 then act1 = 2; else act1 = 1;
  if e = 1 then act2 = 2; else act2 = 1;
  if f = 1 then act3 = 2; else act3 = 1;
  if g = 1 then act4 = 2; else act4 = 1;
  total_act = act1 + act2 + act3 + act4;
run;

This is the code that I have been trying. I know for sure there are IDs in which a, b, and c are all 1, but there are no cases where total_score = 3. I would like to know if there is a way to fix my code so that I can get my desired output.
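Two things in the code above may explain the result. SAS evaluates `a = 2 | 3` as `(a = 2) | 3`, and because 3 is nonzero the condition is always true, so a score of 1 is overwritten with 2. In addition, `score + score + score` refers to a variable named `score` that is never assigned, rather than to `score1`, `score2`, and `score3`. A minimal sketch of one way to rewrite the scoring part follows, assuming a, b, and c are numeric; the three-row `cleaned` dataset here is invented purely to make the example runnable.

```sas
/* Tiny made-up input so the step can run on its own */
data cleaned;
  input ID a b c;
  datalines;
1 1 1 1
2 2 3 5
3 4 1 6
;
run;

data scale;
  set cleaned;
  array src{3} a b c;               /* source items */
  array sc{3} score1-score3;        /* recoded scores */
  do i = 1 to 3;
    if src{i} = 1 then sc{i} = 1;
    else if src{i} in (2, 3) then sc{i} = 2;     /* IN replaces "= 2 | 3" */
    else if 4 <= src{i} <= 6 then sc{i} = 3;
  end;
  /* Add the three recoded variables, not an undefined "score" */
  total_score = score1 + score2 + score3;
  drop i;
run;
```

The `else if` chain makes the branches mutually exclusive, so a later condition can no longer overwrite an earlier assignment. In this made-up input, ID 1 has a, b, and c all equal to 1, which gives total_score = 3.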

ercksh8 · ‎04-16-2021

Hello all. I am currently trying to create a table that merges data from 4 datasets (say Y1, Y2, Y3, Y4). The 4 datasets are panel data from years 1, 2, 3, and 4 respectively. I have merged the four datasets using a DATA step MERGE with BY ID. When I viewed the resulting data, I realized that some IDs are not present in all four datasets. I would like to make a dataset that keeps only the IDs that are present in all four years. After doing so, I would like to create a new variable that adds all expenditures made by each ID across all 4 years, if the ID meets a separate condition (e.g. its zip code starts with the number 5).

Y1
ID  Expenditure  Zipcode
1   50           20000
2   140          30000
3   30           51000
4   40           52000
5   50           50000

Y2
ID  Expenditure  Zipcode
6   100          50000
2   110          30000
3   120          51000
4   130          52000
5   140          50000

Y3
ID  Expenditure  Zipcode
1   50           20000
2   140          30000
3   30           51000
4   40           52000
5   50           50000

Y4
ID  Expenditure  Zipcode
1   50           20000
2   140          30000
3   30           51000
4   40           52000

If the datasets look something like this, the first step would create a data table with IDs 2, 3, and 4, since they are the only ones present in all 4 datasets. The second step would then create a new table with only IDs 3 and 4 and an additional column sum_expenditure. I'm sorry my question is so all over the place. Thank you all in advance!
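The two steps described above could be sketched as follows. This is only one possible approach: it assumes Zipcode is stored as a numeric variable, and the names `all_four`, `zip5`, and `exp1` to `exp4` are made up for illustration. The `IN=` dataset option marks which inputs contributed to each merged row, so a subsetting IF can keep only IDs found in all four years.

```sas
data y1;
  input ID Expenditure Zipcode;
  datalines;
1 50 20000
2 140 30000
3 30 51000
4 40 52000
5 50 50000
;
run;

data y2;
  input ID Expenditure Zipcode;
  datalines;
6 100 50000
2 110 30000
3 120 51000
4 130 52000
5 140 50000
;
run;

data y3;
  set y1;   /* Y3 in the post has the same rows as Y1 */
run;

data y4;
  input ID Expenditure Zipcode;
  datalines;
1 50 20000
2 140 30000
3 30 51000
4 40 52000
;
run;

/* MERGE with BY requires each input to be sorted by ID */
proc sort data=y1; by ID; run;
proc sort data=y2; by ID; run;
proc sort data=y3; by ID; run;
proc sort data=y4; by ID; run;

/* Step 1: keep only IDs present in all four years.
   Expenditure is renamed per year so the values do not overwrite each other. */
data all_four;
  merge y1(in=in1 rename=(Expenditure=exp1))
        y2(in=in2 rename=(Expenditure=exp2))
        y3(in=in3 rename=(Expenditure=exp3))
        y4(in=in4 rename=(Expenditure=exp4));
  by ID;
  if in1 and in2 and in3 and in4;
run;

/* Step 2: keep zip codes starting with 5 and total the four years */
data zip5;
  set all_four;
  if substr(put(Zipcode, z5.), 1, 1) = '5';
  sum_expenditure = sum(exp1, exp2, exp3, exp4);
run;
```

On the sample data, `all_four` holds IDs 2, 3, and 4, and `zip5` holds IDs 3 and 4 with sum_expenditure of 210 and 250. Note that Zipcode in the merged row comes from the last dataset in the MERGE statement; if zip codes can change between years, the condition would need to say which year's value applies.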

Online Status	Offline
Date Last Visited	‎06-18-2021 08:10 AM

Re: How to change date character into numeric dates

How to change date character into numeric dates

Re: How to add all values of a variable within each ID by year?

Re: How to add all values of a variable within each ID by year?

Re: How to add all values of a variable within each ID by year?

How to add all values of a variable within each ID by year?

Adding variables within ID to create new variable

How to add values of a column for each ID only if they meet a conditio...
