Dear Sir/Friends,
I am Anil working in Some international research institute i have one question regarding an Proc transpose. That is there are variables called person ID, Item name, quantity, Value in dataset1 and there are person ID and some information on person family in dataset2. So in dataset2 contains 40 observations and dataset1 contains 1600 observations so i want to merge these two datasets but we can't because in dataset1 the ID repeats 40 times so i planned to transpose dataset1 like all itemnames as column names and using by variable as a Person ID.Here is the small sample of the dataset1.
proc transpose data=dataset1 out=dataset2;
id Item_name;
var Quantity Total value;
by Vdsid_hhid ;
run;
Vdsid_hhid Item_name Quantity Total value
INMHKAL100213000 Sorghum 66 528
INMHKAL100213000 Wheat 15 75
INMHKAL100213000 Rice 10 60
INMHKAL100213000 Pigeonpea 4 119
INMHKAL100213000 Chickpea 1 12
INMHKAL100213000 Matki/Cowpea 1 28
INMHKAL100213000 Edible oil 4 208
INMHKAL100213000 All types of vegetables 0 115
INMHKAL100213000 All types of fruits 0 30
INMHKAL100214000 PDS Rice 10 60
INMHKAL100214000 Sorghum 165 1155
INMHKAL100214000 Wheat 20 100
INMHKAL100214000 Pigeonpea 5 133
INMHKAL100214000 Chickpea 2 48
INMHKAL100214000 Matki/Cowpea 4 94.5
INMHKAL100214000 Groundnut oil 6 276
INMHKAL100214000 All types of vegetables 0 70
INMHKAL100214000 All types of fruits 216 216
so i need transpose like this
Vdsid_hhid Sorghum Wheat Rice Pigeonpea Chickpea
quantity value quantity value quantity value quantity value
I tried in many ways but i am failed to get these results. Please can anybody give some idea on this.
here i attached the data for your kind reference.
Thank you
Best Regards,
Anil
How about this?
Create test scenario.
DATA HAVE;
LENGTH VDSID_HHID $40. ITEM_NAME $40.;
INFILE DATALINES DELIMITER=",";
INPUT VDSID_HHID $ ITEM_NAME $ QUANTITY TOTAL_VALUE;
DATALINES;
INMHKAL100213000,SORGHUM,66,528
INMHKAL100213000,WHEAT,15,75
INMHKAL100213000,RICE,10,60
INMHKAL100213000,PIGEONPEA,4,119
INMHKAL100213000,CHICKPEA,1,12
INMHKAL100213000,MATKI/COWPEA,1,28
INMHKAL100213000,EDIBLE OIL,4,208
INMHKAL100213000,ALL TYPES OF VEGETABLES,0,115
INMHKAL100213000,ALL TYPES OF FRUITS,0,30
INMHKAL100214000,PDS RICE,10,60
INMHKAL100214000,SORGHUM,165,1155
INMHKAL100214000,WHEAT,20,100
INMHKAL100214000,PIGEONPEA,5,133
INMHKAL100214000,CHICKPEA,2,48
INMHKAL100214000,MATKI/COWPEA,4,94.5
INMHKAL100214000,GROUNDNUT OIL,6,276
INMHKAL100214000,ALL TYPES OF VEGETABLES,0,70
INMHKAL100214000,ALL TYPES OF FRUITS,216,216
RUN;
Summarise multiple observations with the same VDSID_HHID and ITEM_NAME.
PROC SUMMARY DATA=HAVE NWAY MISSING;
CLASS VDSID_HHID ITEM_NAME;
VAR QUANTITY TOTAL_VALUE;
OUTPUT OUT=HAVESUM (drop = _:) SUM=;
RUN;
Transpose data to have a single column called COL1 containing the values of both QUANTITY and TOTAL_VALUE and a variable called _NAME_ containing the variable names transposed into COL1.
PROC TRANSPOSE DATA=HAVESUM OUT=TRANSPOSE;
BY ITEM_NAME VDSID_HHID NOTSORTED;
VAR QUANTITY TOTAL_VALUE;
RUN;
Transpose the data a second time, using the ITEM_NAME and _NAME_ variables in the ID statement to name the created variables in a format similar to what you desired.
PROC TRANSPOSE DATA=TRANSPOSE OUT=TRANSPOSE1 (DROP=_NAME_) DELIMITER=_;
ID ITEM_NAME _NAME_;
VAR COL:;
BY VDSID_HHID NOTSORTED;
RUN;
You can now merge your 2 datasets.
You could do something similar in a datastep, however I am sure you know how to create a series of IF ELSE statements.
Regards,
Scott
Hi Anil,
From your description of the problem it seems like a simple DATA step merge would work, although if you could attack your dataset2 that would help to clarify things.
The following program should add further details from dataset2 onto each Product ID in dataset1.
data one;
input PID $1-16 NAME $ quantity value;
datalines;
INMHKAL100213000 Sorghum 66 528
INMHKAL100213000 Wheat 15 75
INMHKAL100213000 Chickpea 1 12
INMHKAL100213000 Matki/Cowpea 1 28
INMHKAL100213000 Edible 4 208
INMHKAL100213000 All_types_of 0 115
INMHKAL100213000 All_types_of 0 30
INMHKAL100213000 Milk/Cur 30 300
INMHKAL100213000 Sugar 5 100
INMHKAL100213000 Groundnut 3 75
INMHKAL100214000 PDS_Rice 10 60
INMHKAL100214000 Sorghum 165 1155
INMHKAL100214000 Wheat 20 100
INMHKAL100214000 Pigeonpea 5 133
INMHKAL100214000 Chickpea 2 48
;
data two;
input PID $ 1-16 Additional $17-35;
datalines;
INMHKAL100213000 Info on first PID
INMHKAL100214000 Info on second PID
;
data three;
merge one two;
by PID;
run;
PID | NAME | quantity | value | Additional |
INMHKAL100213000 | Sorghum | 66 | 528 | Info on first PID |
INMHKAL100213000 | Wheat | 15 | 75 | Info on first PID |
INMHKAL100213000 | Chickpea | 1 | 12 | Info on first PID |
INMHKAL100213000 | Matki/Co | 1 | 28 | Info on first PID |
INMHKAL100213000 | Edible | 4 | 208 | Info on first PID |
INMHKAL100213000 | All_type | 0 | 115 | Info on first PID |
INMHKAL100213000 | All_type | 0 | 30 | Info on first PID |
INMHKAL100213000 | Milk/Cur | 30 | 300 | Info on first PID |
INMHKAL100213000 | Sugar | 5 | 100 | Info on first PID |
INMHKAL100213000 | Groundnu | 3 | 75 | Info on first PID |
INMHKAL100214000 | PDS_Rice | 10 | 60 | Info on second PID |
INMHKAL100214000 | Sorghum | 165 | 1155 | Info on second PID |
INMHKAL100214000 | Wheat | 20 | 100 | Info on second PID |
INMHKAL100214000 | Pigeonpe | 5 | 133 | Info on second PID |
INMHKAL100214000 | Chickpea | 2 | 48 | Info on second PID |
Dear Murry,
Thanks a lot for your reply actually my problem is not merging i need to transpose the item names in to column names. Please give me an solution for this.
Actual results i need like this
PID Sorghum Wheat Chickpea PDS_Rice
quantity value quantity value quantity value quantity value
INMHKAL100213000 66 528 21 434 12 121
INMHKAL100214000
So for this i used sas codes like this
proc transpose data=dataset1 out=transpose;
id Item_name;
var Quantity Value;
by HHID;
run;
Please give me yourmail ID i will forward the sample dataset for your reference.
Thanks
Regards,
Anil
Looks like a report and not a transpose. Use PROC REPORT.
So dataset 2 is not relevant to this operation? I will have another look at this on my lunch break. Thanks for additional info.
Ya dataset 2 is not relevant i need only transpose for dataset 1 which i have sent by attachment earlier.
Thank you
Anil
To me your desired result looks like a report, not a database table.
You might want to look at PROC TABULATE or PROC REPORT. From these you could even store the result in an output table if you wish.
How about this?
Create test scenario.
DATA HAVE;
LENGTH VDSID_HHID $40. ITEM_NAME $40.;
INFILE DATALINES DELIMITER=",";
INPUT VDSID_HHID $ ITEM_NAME $ QUANTITY TOTAL_VALUE;
DATALINES;
INMHKAL100213000,SORGHUM,66,528
INMHKAL100213000,WHEAT,15,75
INMHKAL100213000,RICE,10,60
INMHKAL100213000,PIGEONPEA,4,119
INMHKAL100213000,CHICKPEA,1,12
INMHKAL100213000,MATKI/COWPEA,1,28
INMHKAL100213000,EDIBLE OIL,4,208
INMHKAL100213000,ALL TYPES OF VEGETABLES,0,115
INMHKAL100213000,ALL TYPES OF FRUITS,0,30
INMHKAL100214000,PDS RICE,10,60
INMHKAL100214000,SORGHUM,165,1155
INMHKAL100214000,WHEAT,20,100
INMHKAL100214000,PIGEONPEA,5,133
INMHKAL100214000,CHICKPEA,2,48
INMHKAL100214000,MATKI/COWPEA,4,94.5
INMHKAL100214000,GROUNDNUT OIL,6,276
INMHKAL100214000,ALL TYPES OF VEGETABLES,0,70
INMHKAL100214000,ALL TYPES OF FRUITS,216,216
RUN;
Summarise multiple observations with the same VDSID_HHID and ITEM_NAME.
PROC SUMMARY DATA=HAVE NWAY MISSING;
CLASS VDSID_HHID ITEM_NAME;
VAR QUANTITY TOTAL_VALUE;
OUTPUT OUT=HAVESUM (drop = _:) SUM=;
RUN;
Transpose data to have a single column called COL1 containing the values of both QUANTITY and TOTAL_VALUE and a variable called _NAME_ containing the variable names transposed into COL1.
PROC TRANSPOSE DATA=HAVESUM OUT=TRANSPOSE;
BY ITEM_NAME VDSID_HHID NOTSORTED;
VAR QUANTITY TOTAL_VALUE;
RUN;
Transpose the data a second time, using the ITEM_NAME and _NAME_ variables in the ID statement to name the created variables in a format similar to what you desired.
PROC TRANSPOSE DATA=TRANSPOSE OUT=TRANSPOSE1 (DROP=_NAME_) DELIMITER=_;
ID ITEM_NAME _NAME_;
VAR COL:;
BY VDSID_HHID NOTSORTED;
RUN;
You can now merge your 2 datasets.
You could do something similar in a datastep, however I am sure you know how to create a series of IF ELSE statements.
Regards,
Scott
Dear Scott,
Thanks a lot for your idea now it's working. i will keep in touch with you.
thanks a lot.
My mail id - anilgvdbm@gmail.com
Glad that I could help. Good luck with the rest of your task.
Maybe I'm asking something that everyone already knows the answer to, so I apologize if I'm asking out-of-turn.
Is your data ("dataset1") already in a SAS dataset? Or, is it in the form of text as per your "Real data.txt.zip"?
dataset1 is in the form of text which i attached in Zip file as a text file. Dataset2 i forgot to attach. even i need transpose for dataset 1. so in this operation dataset2 is not required.
Ok - so the first thing you need to do is to turn your text file into a SAS dataset. Your text file appears to be tab delimited. Try the following code. Once you have a SAS dataset, then try PROC TRANSPOSE.
(Second try - code didn't show up the first time . . . )
proc import datafile='some path\real data.txt'
out=datafile1
dbms=dlm
replace;
delimiter='09'x;
run;
I forgot to mention that your variable names will not show up as you expect them to when you use the proc import code.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Use this tutorial as a handy guide to weigh the pros and cons of these commonly used machine learning algorithms.
Find more tutorials on the SAS Users YouTube channel.