Hello,
I am having an issue running proc transpose on my data. Below are sample data:
ID |
Variable |
Variable_Num |
question_text_short |
answer |
entry_count |
1 |
Gen_Health |
1 |
General health |
Fair |
1 |
1 |
Gen_Health |
1 |
General health |
Fair |
2 |
1 |
Gen_Health |
1 |
General health |
Fair |
3 |
1 |
GAD |
83 |
Anxiety |
Several days |
1 |
1 |
GAD |
83 |
Anxiety |
Some days |
2 |
1 |
GAD |
83 |
Anxiety |
Every day |
3 |
1 |
PSS |
69 |
Stress |
Sometimes |
1 |
1 |
PSS |
69 |
Stress |
Sometimes |
2 |
1 |
PSS |
69 |
Stress |
Never |
3 |
As you can see, the variable_num corresponds to the Variable. Entry_count refers to the time of the entry. I want each Variable to be a separate column. In other words, I want the data structure to be the following:
ID |
entry_count |
Gen_Health |
GAD |
PSS |
1 |
1 |
Fair |
Several days |
Sometimes |
1 |
2 |
Fair |
Some days |
Sometimes |
1 |
3 |
Fair |
Every day |
Never |
My code is the following:
proc transpose data=data out=wide prefix=answer;
by entry_count;
id Variable;
var answer;
run;
But the log fills with errors such as "ERROR: The ID value "answerGAD" occurs twice in the same BY group." No dataset is produced. I tried both "Variable" and "Variable_Num" for the "id" in the code. Can someone assist? I even tried shortening the prefix to "a" after reading another message on this forum that the prefix was too long, but I still get the same error. Thanks.
Change the sort order to ID Entry_count and use the following:
proc transpose data=data out=wide ; by id entry_count; id Variable; idlabel question_text_short; var answer; run;
Change the sort order to ID Entry_count and use the following:
proc transpose data=data out=wide ; by id entry_count; id Variable; idlabel question_text_short; var answer; run;
EDITED: Never mind, I had forgotten a semicolon! ballardw's solution worked!
Hello,
Upon running the code, I get the following error:
ERROR: Variable IDLABEL not found.
It seems as though SAS does not recognize the IDLABEL command.
Show us the ENTIRE log for PROC TRANSPOSE. Please copy the log as text and paste it into the window that appears when you click on the </> icon. Always (that's ALWAYS) do this when you have an error in the log.
I can't post the entire thing. Communites.sas.com says it's too long; plus, I had to force clear the window in SAS because it was too long for my window.
Other than what I already pasted before, here's the other information:
ERROR: The ID value "a17" occurs twice in the same BY group. ERROR: The ID value "a17" occurs twice in the same BY group. ERROR: The ID value "a17" occurs twice in the same BY group. ERROR: The ID value "a17" occurs twice in the same BY group. ERROR: The ID value "a17" occurs twice in the same BY group. NOTE: The above message was for the following BY group: entry_count=9 ERROR: The ID value "a10" occurs twice in the same BY group. ERROR: The ID value "a10" occurs twice in the same BY group. ERROR: Too many bad BY groups. NOTE: The SAS System stopped processing this step because of errors. NOTE: There were 317016 observations read from the data set DATA. WARNING: The data set WORK.WIDE may be incomplete. When this step was stopped there were 0 observations and 0 variables. WARNING: Data set WORK.WIDE was not replaced because this step was stopped. NOTE: PROCEDURE TRANSPOSE used (Total process time): real time 1:07:21.06 cpu time 1:17.46
That the IDLABEL statement is mistaken for a variable name points to a missing semicolon for the preceding statement.
Oh wow I feel like a you-know-what! Haha thanks! This worked great.
I am trying to create a wide data set with a unique DUPERSID per line and the RXNDC and RXNAME variable values on the same row from a long data set with DUPERSID, RXNDC and RXNAME values in varying number of rows. Below are my SAS code including datalines and the SAS output.
data testdata; infile datalines truncover; input DUPERSID :$10. RXNDC :$11. RXNAME :$50.; datalines; 2320018102 -15 ANTICOAGULANTS 2320018102 -15 ANTICOAGULANTS 2320038101 -15 ANTICOAGULANTS 2320038101 -15 ANTICOAGULANTS 2320038101 -15 ANTICOAGULANTS 2320105101 -15 INSULIN 2320105101 -15 INSULIN 2320105101 -15 INSULIN 2320105101 -15 INSULIN 2320222101 73070010011 INSULIN ASPA 2320222101 73070010011 INSULIN ASPA 2320222101 73070010011 INSULIN ASPA 2320222101 73070010011 INSULIN ASPA 2320222101 73070010011 INSULIN ASPA 2320222101 73070010011 INSULIN ASPA 2320222101 73070010011 INSULIN ASPA 2320222101 73070010011 INSULIN ASPA ; proc print data=testdata; run; proc sort data=WORK.testdata; by DUPERSID; run; proc transpose data=testdata out=work.wide_IC prefix=RX; by DUPERSID ; var rxndc rxname; run; PROC PRINT DATA= WORK.wide_IC ; RUN;
Output
Obs DUPERSID _NAME_ RX1 RX2 RX3 RX4 RX5 RX6 RX7 RX8
1 2320018102 RXNDC -15 -15
2 2320018102 RXNAME ANTICOAGULANTS ANTICOAGULANTS
3 2320038101 RXNDC -15 -15 -15
4 2320038101 RXNAME ANTICOAGULANTS ANTICOAGULANTS ANTICOAGULANTS
5 2320105101 RXNDC -15 -15 -15 -15
6 2320105101 RXNAME INSULIN INSULIN INSULIN INSULIN
7 2320222101 RXNDC 73070010011 73070010011 73070010011 73070010011 73070010011 73070010011 73070010011 73070010011
8 2320222101 RXNAME INSULIN INSULIN INSULIN INSULIN INSULIN INSULIN INSULIN INSULINIssue: The wide data set has the RXNDC and RXNAME values in separate lines rather than the same line.
To further widen the the above output data, I used the following SAS code (with double transposition).
proc sort data=WORK.wide_IC; by DUPERSID _NAME_; run; proc transpose data=WORK.wide_IC out=work.narrow_IC ; by DUPERSID _NAME_ ; var RX:; run; proc sort data=WORK.narrow_IC; by DUPERSID; run; proc transpose data=WORK.narrow_IC out=work.wide2_IC; by DUPERSID ; var COL1; id _NAME_; run;
I am getting the following errors.
ERROR: The ID value "RXNAME" occurs twice in the same BY group.
ERROR: The ID value "RXNAME" occurs twice in the same BY group.
Question: What changes would I make to my code to create a wide data set
so that it will have the RXNDC and RXNAME values on same line, with
one DUPERSID per line?
Any help will be appreciated.
You should start you own thread, possibly referencing this one. Your issue is different than the original poster.
The data step shown results in values of "INSULIN" not "INSULIN ASPA".
Also what will you do with the resulting data set? Typically a wide set is much harder to work with.
Can you show exactly what you expect for output? You have so many repeated values in that example data I'm not sure that I understand what is needed or the utility of that many repeated values on an observation.
@pkm_edu wrote:
1) Sorry for not starting a new thread. Do you want me to create a new thread?
3) Your question/comment: Also what will you do with the resulting data set? Typically a wide set is much harder to work with.
My response: I want to count the number of prescription fills/refills at the respondent level and merge this wide data with other person-level data files.
Yes start a new thread, and your question appears to be a "how to count".
Do you need a data set with the count or a report that you can read. Very likely Proc Freq will handle either.
Consider if you want a count by person of each of your two variables this may be a start:
Proc freq data=work.testdata; /* person with separate table counts of rxndc and rxname*/ tables dupersid*(rxndc rxname) / norow nocol nopercent; /* or count of both at one time*/ tables dupersid * rxndc * rxname/ list nopercent; run;
You can create output data sets of either type to merge back on to your other data by Dupersid.
Excellent! Worked perfectly.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.