Hello,
I have a data from the last 12 months and need help on how to transpose the date in order, group by IDs and getting the total count. Please advise
Version - SAS M6
See sample dataset below:
data dataset; input id $1-3 yrmth 5-10 count 12; datalines; 001 201911 2 001 201912 1 001 201903 1 001 202004 5 001 202006 1 002 201908 1 002 201909 2 002 201909 1 002 201910 1 003 202001 4 003 202002 3 003 202005 2 003 202007 1; run;
EXPECTED RESULT:
ID 201908 201909 201910 201911 201912 202001 202002 202003 202004 202005 202006 202007 TOTAL
001 0 0 0 2 1 0 0 1 5 0 1 0 10
002 1 3 1 0 0 0 0 0 0 0 0 0 5
003 0 0 0 0 0 4 3 0 0 2 0 1 10
Your Proc REPORT is very close to what you want, just some minor changes:
You can send the output directly to Excel using ODS EXCEL, as it looks like you want to have it in Excel.
proc report data=two;
columns id name age count, yrmth count=total;
define id / group;
define name / group;
define age / group;
define yrmth / across " ";
define count / sum " ";
define total / sum "Total";
run;
Do you want the result to be a SAS data set, or do you want it to be a report?
I'm skeptical that creating a wide data set from a long data set is a good idea ... usually, it is not a good idea and makes your coding harder. So please explain further this "linkage on another table".
From my first data set I have the data from the last 12 months
Variable includes:
Then, I'll do a linkage between member info table that includes:
FINAL Report will include
ID, Member Demographics, YearMonth (12 months), Total Count
SAMPLE OUTPUT
ID Name Phone 201908 201909 201910 201911 201912 202001 2019...... TOTAL
001 Ana 888##### 0 1 0 3 0 0 0 4
So, again the idea of transposing to a wide data set seems unnecessary to me.
UNTESTED CODE
data dataset;
input id $1-3 yrmth 5-10 count 12;
datalines;
001 201911 2
001 201912 1
001 201903 1
001 202004 5
001 202006 1
002 201908 1
002 201909 2
002 201909 1
002 201910 1
003 202001 4
003 202002 3
003 202005 2
003 202007 1
;
proc sql;
create table two as select a.*,b.name,b.phone
from dataset a left join demographics b
on a.id=b.id;
quit;
proc report data=two;
columns id name phone yrmth,count count=total;
define id / group;
define name / group;
define phone / group;
define yrmth / across;
define count / sum;
define total / sum;
run;
Now you see the problem with wide datasets and how difficult they are to work with.
The code I provided ought to work, or something similar ought to work, kbut you don't explain why you don't like it.
@mrafael03 wrote:
so i’m looking for a result that will automatically populate the year month accordingly. It might be a combination of transpose + macro?
PROC REPORT will do this easily without macro and without a transpose. You still haven't specified what was wrong with the results from the PROC REPORT I showed, and its possible that a relatively minor change to the above code will do what you want.
@PaigeMiller wrote:
Now you see the problem with wide datasets and how difficult they are to work with.
The code I provided ought to work, or something similar ought to work, kbut you don't explain why you don't like it.
Hi, again I tried your codes and it's actually worked and closer to what I'm looking for. Please see attached file for sample output that I'm aiming for
I agree with @PaigeMiller about the discussion wide versus long.
But if you want to go this way, here is a sample that gets you close. Please note, SAS does not allow for variable names to start with a digit by default. You have to set SAS System Option VALIDVARNAME=ANY. This then also means you have to use a name literal for those new names like so '201909'n.
data dataset;
input id $1-3 yrmth 5-10 count 12;
datalines;
001 201911 2
001 201912 1
001 201903 1
001 202004 5
001 202006 1
002 201908 1
002 201909 2
002 201909 1
002 201910 1
003 202001 4
003 202002 3
003 202005 2
003 202007 1
;
/*
* create orderd lit of column names
*/
proc sql noprint;
select distinct
cats("'", yrmth, "'n")
into
:yrmth_list separated by " "
from
dataset
order by
1
;
quit;
%put NOTE: &=yrmth_list;
/*
* create an empty base table
* using the list from above
*/
data basetable;
length
id $ 3
&yrmth_list 8
;
stop;
run;
/*
* aggregate data, since some yrmth appear more tan once
*/
proc sql;
create table dataset_aggr as
select
id
, yrmth
, sum(count) as count
from
dataset
group by
id
, yrmth
;
quit;
/*
* transpose the data
*/
proc transpose
data=dataset_aggr
out=ds_trsp(drop=_name_)
;
by id;
id yrmth;
var count;
run;
/*
* append t base table
*/
proc append
base=basetable
data=ds_trsp
;
run;
@BrunoMueller thank you for confirming my point of view. Your code takes a lot more effort than PROC REPORT from a long data set.
Totally agree Proc REPORT would be far less code, but I guess it all depends on what type of output is needed, report, data, ...
There is a function for converting names into name literals. Of course you still need to convert the number into a string.
nliteral(cats(yrmth))
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.