Hello,
I have a data from the last 12 months and need help on how to transpose the date in order, group by IDs and getting the total count. Please advise
Version - SAS M6
See sample dataset below:
data dataset; input id $1-3 yrmth 5-10 count 12; datalines; 001 201911 2 001 201912 1 001 201903 1 001 202004 5 001 202006 1 002 201908 1 002 201909 2 002 201909 1 002 201910 1 003 202001 4 003 202002 3 003 202005 2 003 202007 1; run;
EXPECTED RESULT:
ID 201908 201909 201910 201911 201912 202001 202002 202003 202004 202005 202006 202007 TOTAL
001 0 0 0 2 1 0 0 1 5 0 1 0 10
002 1 3 1 0 0 0 0 0 0 0 0 0 5
003 0 0 0 0 0 4 3 0 0 2 0 1 10
Your Proc REPORT is very close to what you want, just some minor changes:
You can send the output directly to Excel using ODS EXCEL, as it looks like you want to have it in Excel.
proc report data=two;
columns id name age count, yrmth count=total;
define id / group;
define name / group;
define age / group;
define yrmth / across " ";
define count / sum " ";
define total / sum "Total";
run;
Do you want the result to be a SAS data set, or do you want it to be a report?
I'm skeptical that creating a wide data set from a long data set is a good idea ... usually, it is not a good idea and makes your coding harder. So please explain further this "linkage on another table".
From my first data set I have the data from the last 12 months
Variable includes:
Then, I'll do a linkage between member info table that includes:
FINAL Report will include
ID, Member Demographics, YearMonth (12 months), Total Count
SAMPLE OUTPUT
ID Name Phone 201908 201909 201910 201911 201912 202001 2019...... TOTAL
001 Ana 888##### 0 1 0 3 0 0 0 4
So, again the idea of transposing to a wide data set seems unnecessary to me.
UNTESTED CODE
data dataset;
input id $1-3 yrmth 5-10 count 12;
datalines;
001 201911 2
001 201912 1
001 201903 1
001 202004 5
001 202006 1
002 201908 1
002 201909 2
002 201909 1
002 201910 1
003 202001 4
003 202002 3
003 202005 2
003 202007 1
;
proc sql;
create table two as select a.*,b.name,b.phone
from dataset a left join demographics b
on a.id=b.id;
quit;
proc report data=two;
columns id name phone yrmth,count count=total;
define id / group;
define name / group;
define phone / group;
define yrmth / across;
define count / sum;
define total / sum;
run;
Now you see the problem with wide datasets and how difficult they are to work with.
The code I provided ought to work, or something similar ought to work, kbut you don't explain why you don't like it.
@mrafael03 wrote:
so i’m looking for a result that will automatically populate the year month accordingly. It might be a combination of transpose + macro?
PROC REPORT will do this easily without macro and without a transpose. You still haven't specified what was wrong with the results from the PROC REPORT I showed, and its possible that a relatively minor change to the above code will do what you want.
@PaigeMiller wrote:
Now you see the problem with wide datasets and how difficult they are to work with.
The code I provided ought to work, or something similar ought to work, kbut you don't explain why you don't like it.
Hi, again I tried your codes and it's actually worked and closer to what I'm looking for. Please see attached file for sample output that I'm aiming for
I agree with @PaigeMiller about the discussion wide versus long.
But if you want to go this way, here is a sample that gets you close. Please note, SAS does not allow for variable names to start with a digit by default. You have to set SAS System Option VALIDVARNAME=ANY. This then also means you have to use a name literal for those new names like so '201909'n.
data dataset;
input id $1-3 yrmth 5-10 count 12;
datalines;
001 201911 2
001 201912 1
001 201903 1
001 202004 5
001 202006 1
002 201908 1
002 201909 2
002 201909 1
002 201910 1
003 202001 4
003 202002 3
003 202005 2
003 202007 1
;
/*
* create orderd lit of column names
*/
proc sql noprint;
select distinct
cats("'", yrmth, "'n")
into
:yrmth_list separated by " "
from
dataset
order by
1
;
quit;
%put NOTE: &=yrmth_list;
/*
* create an empty base table
* using the list from above
*/
data basetable;
length
id $ 3
&yrmth_list 8
;
stop;
run;
/*
* aggregate data, since some yrmth appear more tan once
*/
proc sql;
create table dataset_aggr as
select
id
, yrmth
, sum(count) as count
from
dataset
group by
id
, yrmth
;
quit;
/*
* transpose the data
*/
proc transpose
data=dataset_aggr
out=ds_trsp(drop=_name_)
;
by id;
id yrmth;
var count;
run;
/*
* append t base table
*/
proc append
base=basetable
data=ds_trsp
;
run;
@BrunoMueller thank you for confirming my point of view. Your code takes a lot more effort than PROC REPORT from a long data set.
Totally agree Proc REPORT would be far less code, but I guess it all depends on what type of output is needed, report, data, ...
There is a function for converting names into name literals. Of course you still need to convert the number into a string.
nliteral(cats(yrmth))
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.