Hello everyone,
Thank you in advance for your help. I am working with SAS 9.4 version, and need help with transposing my data.
The data original format is as follows:
value | admin | Epi_value | Epi_metric | Year |
44 | Afghanistan | 1500 | epi_HIV Incidence | 2019 |
44 | Afghanistan | 50 | epi_infections Five year percent change | 2019 |
44 | Afghanistan | 36.4 | epi_infections compare 2015 percent change | 2019 |
44 | Afghanistan | 36 | epi_ART known treatment | 2019 |
44 | Afghanistan | epi_ART change_art_2015 | 2019 |
I would like to transpose my data so then the "Epi_metric" variables are their own columns, with the corresponding "Epi_value" values, per country and year. This source only contains data for the Year 2019, and the "Epi_metric" variable repeats for each country.
I'd appreciate any help.
Current Code:
* Research Objective: Exploratory data analysis for HIV Lab;
*______________________________________________________________;
*Import datasets and Merge;
Data HIV_Data; *HIV Policy Lab, including HIV incidence metrics;
INFILE "C:\Users\Briana.Thrift.ctr\Downloads\epi_data_all_values.csv" dsd firstobs=2 truncover ;
LENGTH admin $54 epi_metric $54;
INPUT value admin$ epi_value epi_metric$ Year
;
Rename admin=Country_Name;
Run;
Proc contents data=HIV_Data;
Run;
Proc Print data=HIV_Data (obs=2);
Run;
*Transpose the Data;
Proc sort data=HIV_Data;
by Country_Name;
Run;
Proc Transpose data=HIV_Data out=wide_EpiMetric prefix=epi_metric;
by Country_Name;
id epi_metric;
var epi_value;
Run;
Proc print data=wide_EpiMetric (obs=2);
Run;
Current SAS Error Message:
10559 Proc Transpose data=HIV_Data out=wide_EpiMetric prefix=epi_metric;
10560 by Country_Name;
10561 id epi_metric;
10562 var epi_value;
10563 Run;
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
Country_Name=Afghanistan
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
Country_Name=Albania
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
Country_Name=Algeria
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
Country_Name=Angola
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
Country_Name=Argentina
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
Country_Name=Armenia
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
Country_Name=Australia
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
Country_Name=Austria
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
Country_Name=Azerbaijan
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: Too many bad BY groups.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 116 observations read from the data set WORK.HIV_DATA.
WARNING: The data set WORK.WIDE_EPIMETRIC may be incomplete. When this step was stopped there were 0 observations and 0 variables.
WARNING: Data set WORK.WIDE_EPIMETRIC was not replaced because this step was stopped.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
Thank you!
Briana Thrift
Hint: Show some data with the repeats.
Then show how the output should look.
It may be that you want an additional variable or two on the by statement, perhaps Year.
I can't replicate the error with this small example data set.
Please provide a larger portion of your data that illustrates the error. Do not provide data as a screen capture. Provide the data following these instructions. Thank you.
I add that (depending on what you are trying to do), you might be MUCH better off leaving this data set long instead of wide. There are many operations in SAS that can be programmed and executed easier with a long data set instead of a wide data set (see Maxim 19).
Creating variables with these long names such as epi_infections_Five_year_percent_change and so on will be a nightmare to program.
So what are you going to do with this data?
SAS has a 32 character limit for variable names, so the contents of the ID variable have to be unique in the first 32 characters.
Based on experience, there is a more than fair chance that you are better off leaving the data in long format.
Maybe, since your code has a comment about "exploratory" a report of some kind is what you really want.
So perhaps instead of
Proc Transpose data=HIV_Data out=wide_EpiMetric prefix=epi_metric; by Country_Name; id epi_metric; var epi_value; Run;
Proc tabulate data=HIV_date; class country_name epi_metric; value epi_value; table country_name, epi_metric * epi_value*( min mean median max stddev) ; run;
Let's explore the data without making an ugly hard to use data set with practically impossible to remember and tiring to type variable names.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.