BookmarkSubscribeRSS Feed
brianathrift
Fluorite | Level 6

Hello everyone,

 

Thank you in advance for your help. I am working with SAS 9.4 version, and need help with transposing my data.

 

The data original format is as follows: 

 

value admin Epi_value Epi_metric Year
44 Afghanistan 1500 epi_HIV Incidence 2019
44 Afghanistan 50 epi_infections Five year percent change 2019
44 Afghanistan 36.4 epi_infections compare 2015 percent change 2019
44 Afghanistan 36 epi_ART known treatment 2019
44 Afghanistan epi_ART change_art_2015 2019

 

I would like to transpose my data so then the "Epi_metric" variables are their own columns, with the corresponding "Epi_value" values, per country and year. This source only contains data for the Year 2019, and the "Epi_metric" variable repeats for each country. 

 

I'd appreciate any help.

 

Current Code:


* Research Objective: Exploratory data analysis for HIV Lab;

*______________________________________________________________;

*Import datasets and Merge;

Data HIV_Data; *HIV Policy Lab, including HIV incidence metrics;
  	INFILE "C:\Users\Briana.Thrift.ctr\Downloads\epi_data_all_values.csv" dsd firstobs=2 truncover ;
	LENGTH admin $54 epi_metric $54;
  	INPUT value	admin$	epi_value	epi_metric$	Year
;

	Rename admin=Country_Name;
Run;

Proc contents data=HIV_Data;
Run;

Proc Print data=HIV_Data (obs=2);
Run;

*Transpose the Data;

Proc sort data=HIV_Data;
	by Country_Name;
Run;

Proc Transpose data=HIV_Data out=wide_EpiMetric prefix=epi_metric;
   by Country_Name;
   id epi_metric;
   var epi_value;
Run;

Proc print data=wide_EpiMetric (obs=2);
Run;

Current SAS Error Message: 

 


10559  Proc Transpose data=HIV_Data out=wide_EpiMetric prefix=epi_metric;
10560     by Country_Name;
10561     id epi_metric;
10562     var epi_value;
10563  Run;

ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
      Country_Name=Afghanistan
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
      Country_Name=Albania
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
      Country_Name=Algeria
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
      Country_Name=Angola
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
      Country_Name=Argentina
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
      Country_Name=Armenia
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
      Country_Name=Australia
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
      Country_Name=Austria
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
      Country_Name=Azerbaijan
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: The ID value "epi_metricepi_HIV_patients_on_AR" occurs twice in the same BY group.
ERROR: Too many bad BY groups.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 116 observations read from the data set WORK.HIV_DATA.
WARNING: The data set WORK.WIDE_EPIMETRIC may be incomplete.  When this step was stopped there were 0 observations and 0 variables.
WARNING: Data set WORK.WIDE_EPIMETRIC was not replaced because this step was stopped.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
      real time           0.02 seconds
      cpu time            0.01 seconds

Thank you!

Briana Thrift

5 REPLIES 5
ballardw
Super User

Hint: Show some data with the repeats.

Then show how the output should look.

 

It may be that you want an additional variable or two on the by statement, perhaps Year.

PaigeMiller
Diamond | Level 26

I can't replicate the error with this small example data set.

 

Please provide a larger portion of your data that illustrates the error. Do not provide data as a screen capture. Provide the data following these instructions. Thank you.

--
Paige Miller
PaigeMiller
Diamond | Level 26

I add that (depending on what you are trying to do), you might be MUCH better off leaving this data set long instead of wide. There are many operations in SAS that can be programmed and executed easier with a long data set instead of a wide data set (see Maxim 19).

 

Creating variables with these long names such as epi_infections_Five_year_percent_change and so on will be a nightmare to program.


So what are you going to do with this data?

--
Paige Miller
Kurt_Bremser
Super User

SAS has a 32 character limit for variable names, so the contents of the ID variable have to be unique in the first 32 characters.

 

Based on experience, there is a more than fair chance that you are better off leaving the data in long format.

ballardw
Super User

Maybe, since your code has a comment about "exploratory" a report of some kind is what you really want.

So perhaps instead of

Proc Transpose data=HIV_Data out=wide_EpiMetric prefix=epi_metric;
   by Country_Name;
   id epi_metric;
   var epi_value;
Run;
Proc tabulate data=HIV_date;
   class country_name epi_metric;
   value epi_value;
   table country_name,
           epi_metric * epi_value*( min mean median max stddev)
          ;
run;

Let's explore the data without making an ugly hard to use data set with practically impossible to remember and tiring to type variable names.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 753 views
  • 0 likes
  • 4 in conversation