About Jianan_luna

Jianan_luna · ‎12-26-2020

I am currently taking the course Statistics 1: Introduction to ANOVA, Regression, and Logistic Regression, Chapter 3: More Complex Linear Models. In the practice "Performing a Two-Way ANOVA", Q3, I cannot complete step 5 of the answer: I cannot add anything to the "Model effects" section. Here is what my SAS session looks like. Could you please help me figure it out? Thanks so much!

Jianan_luna · ‎11-28-2020

Thanks so much!

Jianan_luna · ‎11-28-2020

Thanks!

Jianan_luna · ‎11-28-2020

Thanks so much

Jianan_luna · ‎11-28-2020

Thanks so much

Jianan_luna · ‎11-27-2020

Thanks for your reply! I just changed my code like this:

data cus_tran;
  input ID AGE $5. GENDER $ SUB_NUM TRANSACTION_DATE :yymmdd10. QUANTITY Profit;
  format TRANSACTION_DATE yymmdd10.;
datalines;
1 10-19 F 0 2014-05-06 1 13
1 10-19 F 1 2014-05-06 1 13
1 10-19 F 1 2014-04-10 2 22
2 10-19 M 1 2014-04-11 2 10
3 30-40 F 1 2014-04-07 1 10
4 10-19 F 0 2014-04-08 1 10
3 30-40 F 0 2014-04-22 1 20
5 20-30 M 1 2014-04-30 1 20
2 10-19 M 0 2014-05-01 1 20
3 30-40 F 0 2014-05-06 1 10
4 10-19 F 1 2014-05-10 3 20
4 10-19 F 0 2014-04-30 1 20
5 20-30 M 1 2014-05-10 1 20
2 10-19 M 0 2014-05-10 1 30
3 30-40 F 0 2014-05-01 1 10
2 10-19 M 1 2014-04-01 1 10
3 30-40 F 1 2014-05-01 1 20
;
run;

proc sql;
  create table want as
  select ID,
         put(TRANSACTION_DATE, monyy7.) as month_year,
         sum(Profit) as totalProfit,
         count(QUANTITY) as totalQuantity,
         sum(SUB_NUM) as frequency,
         GENDER,
         AGE
  from cus_tran
  group by ID, month_year;
quit;

However, the outcome is like the following. Now the data is not grouped by each specific ID. Can you please help me fix this? I wonder whether I can use a duplicate-removal function, but that works on duplicate rows, not on specific duplicated cells. Can you help me? Thanks so much!

Jianan_luna · ‎11-27-2020

I just got a result table like this: However, I want to change my layout like this: one row for each specific customer, with total profit and total quantity for April, frequency in April, frequency in May, and the customer's gender and age. I tried TRANSPOSE, but I could not get it to work. Could you please help me fix this? My code follows the table below. Thanks so much!

ID | total_profit (only in April) | total quantity (only in April) | frequency (only in April) | frequency (only in May) | gender | age
1 | | | | | |
2 | | | | | |

data cus_tran;
  input ID AGE $5. GENDER $ SUB_NUM TRANSACTION_DATE :yymmdd10. QUANTITY Profit;
  format TRANSACTION_DATE yymmdd10.;
datalines;
1 10-19 F 0 2014-05-06 1 13
1 10-19 F 1 2014-05-06 1 13
1 10-19 F 1 2014-04-10 2 22
2 10-19 M 1 2014-04-11 2 10
3 30-40 F 1 2014-04-07 1 10
4 10-19 F 0 2014-04-08 1 10
3 30-40 F 0 2014-04-22 1 20
5 20-30 M 1 2014-04-30 1 20
2 10-19 M 0 2014-05-01 1 20
3 30-40 F 0 2014-05-06 1 10
4 10-19 F 1 2014-05-10 3 20
4 10-19 F 0 2014-04-30 1 20
5 20-30 M 1 2014-05-10 1 20
2 10-19 M 0 2014-05-10 1 30
3 30-40 F 0 2014-05-01 1 10
2 10-19 M 1 2014-04-01 1 10
3 30-40 F 1 2014-05-01 1 20
;
run;

proc sql;
  create table want as
  select ID,
         put(TRANSACTION_DATE, monyy7.) as month_year,
         sum(Profit) as totalProfit,
         count(QUANTITY) as totalQuantity,
         sum(SUB_NUM) as frequency
  from cus_tran
  group by ID, month_year;
quit;
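One possible way to reach the wide layout asked about above is PROC TRANSPOSE with a BY and an ID statement. The following is only a minimal sketch, not the course's or the thread's answer: the dataset names want_long and want_wide and the prefix freq_ are hypothetical, and the small input table is made up to mimic the shape of the PROC SQL output (one row per ID and month_year).

```sas
/* Hypothetical long table shaped like the PROC SQL output:
   one row per ID and month_year (values are illustrative only) */
data want_long;
  input ID month_year $ frequency;
datalines;
1 APR2014 1
1 MAY2014 1
2 APR2014 2
2 MAY2014 0
;
run;

/* BY-group processing needs the data sorted by the BY variable */
proc sort data=want_long;
  by ID;
run;

/* One output row per ID; each month_year value becomes a column
   named with the (hypothetical) prefix freq_, e.g. freq_APR2014 */
proc transpose data=want_long out=want_wide(drop=_name_) prefix=freq_;
  by ID;
  id month_year;
  var frequency;
run;
```

Only one VAR variable is transposed here; transposing several measures at once yields one output row per measure and ID, so a separate step per measure (followed by a merge) may be easier to read.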

Jianan_luna · ‎11-27-2020

Thanks so much, Sir! I want to get the corresponding gender and age for every ID. I added GENDER and AGE to the SELECT clause, but once I added them, the output was no longer grouped by ID, like this: Is there any way to get the corresponding age and gender for everyone while using GROUP BY? Thanks so much. Here is my code:

data cus_tran;
  input ID AGE $5. GENDER $ SUB_NUM TRANSACTION_DATE :yymmdd10. QUANTITY Profit;
  format TRANSACTION_DATE yymmdd10.;
datalines;
1 10-19 F 0 2014-05-06 1 13
1 10-19 F 1 2014-05-06 1 13
1 10-19 F 1 2014-04-10 2 22
2 10-19 M 1 2014-04-11 2 10
3 30-40 F 1 2014-04-07 1 10
4 10-19 F 0 2014-04-08 1 10
3 30-40 F 0 2014-04-22 1 20
5 20-30 M 1 2014-04-30 1 20
2 10-19 M 0 2014-05-01 1 20
3 30-40 F 0 2014-05-06 1 10
4 10-19 F 1 2014-05-10 3 20
4 10-19 F 0 2014-04-30 1 20
5 20-30 M 1 2014-05-10 1 20
2 10-19 M 0 2014-05-10 1 30
3 30-40 F 0 2014-05-01 1 10
2 10-19 M 1 2014-04-01 1 10
3 30-40 F 1 2014-05-01 1 20
;
run;

proc sql;
  create table want as
  select ID,
         put(TRANSACTION_DATE, monyy7.) as month_year,
         sum(Profit) as totalProfit,
         count(QUANTITY) as totalQuantity,
         sum(SUB_NUM) as frequency,
         GENDER,
         AGE
  from cus_tran
  group by ID, GENDER, AGE, month_year;
quit;

Jianan_luna · ‎11-27-2020

Thanks so much, but when I run the code, the log shows an error: TRANSACTION_DATE should be numeric, but it is actually a character variable. Could you please help me fix it? Thanks so much again!
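A character date such as 2014-05-06 can be turned into a numeric SAS date with the INPUT function and the YYMMDD10. informat. The following is a minimal sketch under assumptions: the dataset names dates_char and dates_num and the helper variable date_char are hypothetical, and the two sample rows only imitate the yyyy-mm-dd layout used in this thread.

```sas
/* Hypothetical sample with the date stored as character */
data dates_char;
  input ID TRANSACTION_DATE :$10.;
datalines;
1 2014-05-06
2 2014-04-11
;
run;

/* Rename the character column, then build a numeric SAS date
   under the original name and give it a readable format */
data dates_num;
  set dates_char(rename=(TRANSACTION_DATE=date_char));
  TRANSACTION_DATE = input(date_char, yymmdd10.);
  format TRANSACTION_DATE yymmdd10.;
  drop date_char;
run;
```

When the data come from DATALINES, reading the column directly with the :yymmdd10. informat in the INPUT statement avoids the conversion step altogether.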

Jianan_luna · ‎11-27-2020

I am using GROUP BY to aggregate some variables. My goal is to group by ID, then SUM "Profit" for April only, COUNT "Quantity" for April only, and SUM "SUB_NUM" for May only. I tried to find out how to use an IF condition inside an aggregation, but I could not find it. Could you please help me figure it out? Here is the dataset and my code.

data cus_tran;
  input ID AGE $5. GENDER $ SUB_NUM TRANSACTION_DATE $10. QUANTITY Profit;
datalines;
1 10-19 F 0 2014-05-06 1 13
1 10-19 F 1 2014-05-06 1 13
1 10-19 F 1 2014-04-10 2 22
2 10-19 M 1 2014-04-11 2 10
3 30-40 F 1 2014-04-07 1 10
4 10-19 F 0 2014-04-08 1 10
3 30-40 F 0 2014-04-22 1 20
5 20-30 M 1 2014-04-30 1 20
2 10-19 M 0 2014-05-01 1 20
3 30-40 F 0 2014-05-06 1 10
4 10-19 F 1 2014-05-10 3 20
4 10-19 F 0 2014-04-30 1 20
5 20-30 M 1 2014-05-10 1 20
2 10-19 M 0 2014-05-10 1 30
3 30-40 F 0 2014-05-01 1 10
2 10-19 M 1 2014-04-01 1 10
3 30-40 F 1 2014-05-01 1 20
;
run;

proc sql;
  select ID,
         sum(Profit) as totalProfit,
         count(QUANTITY) as totalQuantity,
         sum(SUB_NUM) as frequency
  from cus_tran
  group by ID;
quit;

This is my code, but it includes all data from both April and May. Can you please help me separate it? Thanks so much!
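A common way to put an "IF" inside an aggregate in PROC SQL is a CASE expression within SUM or COUNT, often called conditional aggregation. The following is a minimal sketch, not a confirmed answer from the thread. It assumes the date is read as a numeric SAS date (the post above stores it as $10.), and the names cus_tran_demo, want_demo, profit_apr, quantity_n_apr, and sub_num_may are hypothetical.

```sas
/* Small sample in the same layout as cus_tran; the date is read as
   a numeric SAS date here, which is an assumption of this sketch */
data cus_tran_demo;
  input ID AGE $5. GENDER $ SUB_NUM TRANSACTION_DATE :yymmdd10. QUANTITY Profit;
  format TRANSACTION_DATE yymmdd10.;
datalines;
1 10-19 F 0 2014-05-06 1 13
1 10-19 F 1 2014-04-10 2 22
2 10-19 M 1 2014-04-11 2 10
2 10-19 M 0 2014-05-01 1 20
;
run;

proc sql;
  create table want_demo as
  select ID,
         /* April rows contribute Profit, all other rows contribute 0 */
         sum(case when month(TRANSACTION_DATE) = 4 then Profit else 0 end)
           as profit_apr,
         /* COUNT skips missing values, so only April rows are counted */
         count(case when month(TRANSACTION_DATE) = 4 then QUANTITY else . end)
           as quantity_n_apr,
         /* May rows contribute SUB_NUM, all other rows contribute 0 */
         sum(case when month(TRANSACTION_DATE) = 5 then SUB_NUM else 0 end)
           as sub_num_may
  from cus_tran_demo
  group by ID;
quit;
```

Because no WHERE clause filters the rows, every ID keeps a row even when it has no April or May transactions. MONTH() ignores the year, which is fine only while all dates fall in a single year, as they do in the sample data.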

Jianan_luna · ‎11-27-2020

Thanks so much, here is the DATALINES step I created, please check:

data cus_tran;
  input ID AGE $5. GENDER $ SUB_NUM TRANSACTION_DATE $10. QUANTITY Profit;
datalines;
1 10-19 F 0 2014-05-06 1 13
1 10-19 F 1 2014-05-06 1 13
1 10-19 F 1 2014-04-10 2 22
2 10-19 M 1 2014-04-11 2 10
3 30-40 F 1 2014-04-07 1 10
4 10-19 F 0 2014-04-08 1 10
3 30-40 F 0 2014-04-22 1 20
5 20-30 M 1 2014-04-30 1 20
2 10-19 M 0 2014-05-01 1 20
3 30-40 F 0 2014-05-06 1 10
4 10-19 F 1 2014-05-10 3 20
4 10-19 F 0 2014-04-30 1 20
5 20-30 M 1 2014-05-10 1 20
2 10-19 M 0 2014-05-10 1 30
3 30-40 F 0 2014-05-01 1 10
2 10-19 M 1 2014-04-01 1 10
3 30-40 F 1 2014-05-01 1 20
;
run;

Jianan_luna · ‎11-27-2020

Thanks so much, Sir, I think I got it. I created code like this, please check it:

data cus_tran;
  input ID AGE $5. GENDER $ SUB_NUM TRANSACTION_DATE $10. QUANTITY Profit;
datalines;
1 10-19 F 0 2014-05-06 1 13
1 10-19 F 1 2014-05-06 1 13
1 10-19 F 1 2014-04-10 2 22
2 10-19 M 1 2014-04-11 2 10
3 30-40 F 1 2014-04-07 1 10
4 10-19 F 0 2014-04-08 1 10
3 30-40 F 0 2014-04-22 1 20
5 20-30 M 1 2014-04-30 1 20
2 10-19 M 0 2014-05-01 1 20
3 30-40 F 0 2014-05-06 1 10
4 10-19 F 1 2014-05-10 3 20
4 10-19 F 0 2014-04-30 1 20
5 20-30 M 1 2014-05-10 1 20
2 10-19 M 0 2014-05-10 1 30
3 30-40 F 0 2014-05-01 1 10
2 10-19 M 1 2014-04-01 1 10
3 30-40 F 1 2014-05-01 1 20
;
run;

Jianan_luna · ‎11-27-2020

I posted the attachment “cus_tran_sas.xlsx” under my question. Please check it.

Jianan_luna · ‎11-27-2020

Sorry, can you please let me know what kind of data type you are looking for? I didn’t understand.

Jianan_luna · ‎11-27-2020

Recently, I have been working on a prediction task for one dataset (for a store). I want to use GROUP BY to summarize some attributes. But my goal is a little complicated, because I want to separate the data into April and May. For each specific customer (CUSTOMER_ID), I want:

- Profit, AGE, and GENDER for each customer in April
- QUANTITY purchased by each customer in April
- a count of TRANSACTION_DATE, to find how many times each customer came to the store in April
- SUB_NUM, which indicates whether the customer purchased a certain product: 1 means the customer purchased it, 0 means they did not. Here, my goal is to know, for each customer, whether they purchased it in April and whether they purchased it in May, so basically it is two columns. This is the most confusing point.

My desired outcome is like this, with one row for each specific customer:

customer_id | gender | age | sub_num in April | sub_num in May | Quantity purchased in April | sum of "transaction_date" in April
1 | | | | | |
2 | | | | | |
3 | | | | | |

Here is the code I have:

proc sql;
  select customer_ID,
         sum(Profit) as totalProfit,
         count(QUANTITY) as totalQuantity,
         count(TRANSACTION_DATE) as frequency
  from cus_tran
  where TRANSACTION_DATE < "01MAY2014"d
  group by customer_ID;
quit;

I am confused, because I want to group by each specific customer, summarize several values for April only and one for May only, and combine those results in one table. I think a WHERE clause does not work here: if I focus only on April data, the data is likely to become incomplete, so I cannot get a row for each customer. I have spent a long time on this. Can you please help me fix it? Thanks so much!

Online Status	Offline
Date Last Visited	‎09-09-2021 08:50 AM

about n-way ANOVA

Re: About Group By

Re: About Group_by aggregation functions

Re: About Group_by aggregation functions

Re: About Transpose

Re: About Group_by aggregation functions

About Transpose

Re: About Group_by aggregation functions

Re: About Group_by aggregation functions

About Group_by aggregation functions

Re: About SQL

Re: About SQL

Re: About SQL

Re: About quiz

Re: About quiz

Re: About Group By

Re: About Group By

Re: About Group By

Re: About Group By

About Group By