Hello!
Please help me transpose this data set from Long to Wide.
Here is what I have:
DayofWeek | User | Website | Mean | Sum | Freq |
Saturday | 17306100 | MiKandi | 10 | 10 | 1 |
Friday | 17306100 | 75 | 75 | 1 | |
Sunday | 17306100 | Voice Search | 151.67 | 455 | 3 |
Saturday | 17306100 | icomfort | 30 | 30 | 1 |
Saturday | 17306243 | AOL | 87 | 261 | 3 |
Sunday | 17306243 | 120.65 | 7963 | 66 | |
Monday | 17306243 | Firefox | 536 | 536 | 1 |
Tuesday | 17306243 | 15 | 15 | 1 | |
Wednesday | 17306243 | IMDb | 202.5 | 405 | 2 |
Thursday | 17306243 | Landry's, Inc | 25 | 25 | 1 |
Thursday | 18114509 | Norton Mobile | 45 | 45 | 1 |
Thursday | 18114509 | Notes | 317.5 | 3175 | 10 |
Thursday | 18114509 | Our Groceries | 510.94 | 8175 | 16 |
Friday | 18072613 | Chase | 245 | 2450 | 10 |
Friday | 18072613 | Check | 374.38 | 2995 | 8 |
Friday | 18072613 | Craigslist | 55 | 55 | 1 |
Wednesday | 18062931 | Barcode Scanner | 5 | 5 | 1 |
Wednesday | 18062931 | BofA | 69.923 | 909 | 13 |
Wednesday | 18062931 | 343.94 | 35426 | 103 |
Here is what I need:
Essentially multiple each day of the week (7 days of the week) by each of the 3 variables (Mean, Sum, Freq) for a total of 21 new variables. I want to keep the User and Website columns constant and unchanged.
Could someone please help me attack this with a do loop and array, so that I am able to replicate it using the same procedure but for a WeekoftheYear variable (which holds 52 different values, one for each week of the year). Thank you!
User | Website | MeanSunday | MeanMonday | MeanTuesday | MeanWednesday | MeanThursday | MeanFriday | MeanSaturday | FreqSunday | FreqMonday | FreqTuesday | FreqWednesday | FreqThursday | FreqFriday | FreqSaturday | FreqSunday | SumMonday | SumTuesday | SumWednesday | SumThursday | SumFriday | SumSaturday |
17306100 | icomfort | |||||||||||||||||||||
17306100 | MiKandi | |||||||||||||||||||||
17306100 | ||||||||||||||||||||||
17306100 | Voice Search | |||||||||||||||||||||
17306243 | AOL | |||||||||||||||||||||
17306243 | ||||||||||||||||||||||
17306243 | Firefox | |||||||||||||||||||||
17306243 | ||||||||||||||||||||||
17306243 | IMDb | |||||||||||||||||||||
17306243 | Landry's, Inc | |||||||||||||||||||||
18062931 | Barcode Scanner | |||||||||||||||||||||
18062931 | BofA | |||||||||||||||||||||
18062931 | ||||||||||||||||||||||
18072613 | Chase | |||||||||||||||||||||
18072613 | Check | |||||||||||||||||||||
18072613 | Craigslist | |||||||||||||||||||||
18114509 | Norton Mobile | |||||||||||||||||||||
18114509 | Notes | |||||||||||||||||||||
18114509 | Our Groceries |
Do you have a date attached to that day of the week? How do you know which week that Saturday belongs to?
Yes that date is attached to a date, but the data has been summarized for each day of the week by Mean, Sum, and Freq. The numbers representing the Mean, Sum, and Freq variables is Time Spent on the website.
I should have mentioned that the DayofWeek variable is numeric, running from 1=Sunday to 7=Saturday, and is formatted to show the day of the week using the FORMAT downame.
So you'll only have one day of the week for each company?
EDIT: SAS Learning Module: Reshaping data long to wide using the data step
I will need to summarize for each day of the week by company but for this example you are correct, only one day of the week for each company
Here is another segment of the dataset where you will see Websites repeating for certain days of the week. A row is identified by a unique DayofWeek, User, and Website combination:
DayofWeek | User | Website | Mean | Sum | Freq |
Saturday | 243736 | Bonus Box Tops | 15 | 15 | 1 |
Saturday | 243736 | Flip Chip | 601.55 | 6617 | 11 |
Saturday | 243736 | MyChart | 100 | 100 | 1 |
Saturday | 243736 | PCH Sweeps | 35 | 35 | 1 |
Sunday | 243736 | Flip Chip | 158.6 | 793 | 5 |
Sunday | 243736 | PCH Sweeps | 107.33 | 322 | 3 |
Monday | 243736 | Bonus Box Tops | 10 | 20 | 2 |
Monday | 243736 | Cash Slots | 31.5 | 441 | 14 |
Monday | 243736 | Flip Chip | 273 | 2730 | 10 |
Monday | 243736 | Graphing Calculator | 669.33 | 2008 | 3 |
Monday | 243736 | Hot Light | 93.5 | 187 | 2 |
Monday | 243736 | PokerHouse | 507.5 | 1015 | 2 |
Monday | 243736 | Tri-Peaks Solitare | 493.5 | 987 | 2 |
Tuesday | 243736 | Battle Nations | 820.67 | 2462 | 3 |
Tuesday | 243736 | Hot Light | 15 | 15 | 1 |
Tuesday | 243736 | PCH Sweeps | 101 | 303 | 3 |
Tuesday | 243736 | Slots Universe | 741.67 | 2225 | 3 |
Tuesday | 243736 | Treasure Diving | 510.83 | 14814 | 29 |
Wednesday | 243736 | Bonus Box Tops | 30 | 30 | 1 |
Wednesday | 243736 | Cash Slots | 91.286 | 1917 | 21 |
Wednesday | 243736 | Castle Clash | 600.69 | 43250 | 72 |
Wednesday | 243736 | Lucktastic | 809 | 809 | 1 |
Wednesday | 243736 | PCH Sweeps | 25 | 50 | 2 |
Wednesday | 243736 | Slots Universe | 25 | 50 | 2 |
Wednesday | 243736 | Treasure Diving | 573.51 | 30396 | 53 |
Thursday | 243736 | Bonus Box Tops | 5 | 5 | 1 |
Thursday | 243736 | Flip Chip | 311.75 | 3741 | 12 |
Thursday | 243736 | Hot Light | 640.5 | 1281 | 2 |
Thursday | 243736 | Lucktastic | 121 | 121 | 1 |
Thursday | 243736 | MyChart | 237 | 237 | 1 |
Thursday | 243736 | PCH Sweeps | 155.5 | 622 | 4 |
Thursday | 243736 | Treasure Diving | 738.35 | 22889 | 31 |
Thursday | 243736 | Tri-Peaks Solitare | 238.25 | 953 | 4 |
Friday | 243736 | Flip Chip | 671.67 | 4030 | 6 |
Friday | 243736 | Hot Light | 982 | 982 | 1 |
Friday | 243736 | PCH Sweeps | 181 | 181 | 1 |
Friday | 243736 | Tri-Peaks Solitare | 164.5 | 329 | 2 |
Sunday | 398526 | Bloomberg | 23 | 23 | 1 |
Monday | 398526 | Blackhawks | 35 | 35 | 1 |
Saturday | 398942 | Spider Free | 2 | 2 | 1 |
Friday | 398942 | Spider Free | 30 | 30 | 1 |
Sunday | 429762 | DownloaderFREE | 189 | 189 | 1 |
Sunday | 429762 | autoshutter | 127 | 127 | 1 |
Monday | 429762 | Angry Birds Star Wa | 290 | 290 | 1 |
Monday | 429762 | Toysrus | 1 | 1 | 1 |
Tuesday | 429762 | Angry Birds Star Wa | 1588 | 1588 | 1 |
The link above gives a fairly detailed example on how to accomplish this.
SAS Learning Module: Reshaping data long to wide using the data step
So here is what I came up with, but because a row is identified by the combination of the UserID and AppName, it does not work. I also think there must be a better way to transpose this data.
PROC SORT DATA=WORK.wave1appdow;
BY AppName UserID;
RUN;
DATA transtest3;
SET wave1appdow;
BY AppName UserID;
KEEP AppName UserID
Mean_1-Mean_7
Sum_1-Sum_7
Freq_1-Freq_7
RETAIN Mean_1-Mean_7
Sum_1-Sum_7
Freq_1-Freq_7
ARRAY AppTimeMean (1:7) Mean_1-Mean_7;
ARRAY AppTimeSum (1:7) Sum_1-Sum_7;
ARRAY AppTimeFreq (1:7) N_1-N_7;
IF first.Appname THEN
DO;
DO i = 1 - 7;
AppTimeMean ( i ) = 0;
AppTimeFreq ( i ) = 0;
AppTimeMed ( i ) = 0;
END;
END;
AppTimeMean (D_DayofWeek) = Mean;
AppTimeSum (D_DayofWeek) = Sum;
AppTimeFreq (D_DayofWeek) = Freq;
IF last.Appname THEN OUTPUT;
RUN;
Any Suggestions?
DO i = 1 - 7;
That does not do what you think. You need DO I=1 TO 7;
Did you generate the statistics using SAS? If so then perhaps it would be easier if you generated them in a more vertical format.
DayofWeek | User | Website | Stat | Value |
Saturday | 17306100 | MiKandi | MEAN | 10 |
Friday | 17306100 | MEAN | 75 |
Then you can use PROC TRANSPOSE;
proc transpose data=have out=want ;
by user website ;
id stat dayofweek;
var value;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.