BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
overky
Calcite | Level 5

I am reading in data to create X dataset.  The raw data comes as a seperate observation/line for each ItemCode.  I am wanting to combine the IDs into one Observation and have multiple ItemCodes strung out horizontally instread of verticle.  I have tried a proc transpose but that does not combine IDs into one observation.

 

proc sort data=Raw nodup;
by ID ItemCode;
run;

data Raw;
set Raw;
counter=_N_;
_N_=1;
run;

PROC SORT DATA=Raw;
BY ID ItemCode;
RUN;
DATA NewRaw1;
ARRAY ??? (Unknown number of observations)
RETAIN ?
BY ID;
IF FIRST.ID THEN
DO I = 1 TO 1;
DO J = 1 TO 2;
CS[I,J] = .;
END;
END;
CS[ID, ItemCode] = ResponseCode;
IF LAST.ID THEN OUTPUT;
KEEP ID CS1-CS;
RUN;

1 ACCEPTED SOLUTION

Accepted Solutions
collinelliot
Barite | Level 11

Try the following and see if that works. I added an ID for testing.

 

data have;
input ID Item Response;
datalines;
1 200 10000
1 220 500
1 510 600000
1 400 750000
1 230 34000
2 330 25
2 400 47
2 600 62
2 788 1000
2 900 500000
2 9910 10010
;

proc transpose data = have out = want prefix = item_;
    by id;
    var response;
    id item;
run;

View solution in original post

9 REPLIES 9
collinelliot
Barite | Level 11

Can you provide a small sample of your data? Based on your description, proc transpose should do what want. That is:

 

Id code

1 A

1 B

1 C

 

Could easily become:

Id Code1 Code2 Code3

1 A         B           C

 

 

collinelliot
Barite | Level 11

Simple example:

 

data have;
    do id = 1 to 3;
        do item_code = 'A', 'B', 'C';
        output;
        end;
    end;
run;

proc transpose data = have out = want prefix = code;
    by id;
    var item_code;
run;
overky
Calcite | Level 5
IDItemResponse
120010000
1220500
1510600000
1400750000
123034000
133025
140047
160062
17881000
1900500000
1991010010

 

I have something like this but I need a new dataset in this format:

 

ID2002205104002303304006007889009910
11000050060000075000034000254762100050000010010
21100055066000082500037400285268110055000011011
31210060572600090750041140305775121060500012112
41331066679860099825045254336383133166550013323
514641732878460109807549779376991146473205014656
6161058059663061207883547574076100161180525516121
71771688610629371328671602334483110177288578117733
81948797411692301461538662564992121194997435919507
92143610721286153160769272882541011332144107179421457
102357911791414769176846180170591111462358117897423603

 

One trouble is that ItemCodes vary by ID.  There are 135 possible ItemCodes from 220-9912.  Sometimes when I read in the datafile there maybe 5 IDS next time maybe 60.

 

 

collinelliot
Barite | Level 11

Try the following and see if that works. I added an ID for testing.

 

data have;
input ID Item Response;
datalines;
1 200 10000
1 220 500
1 510 600000
1 400 750000
1 230 34000
2 330 25
2 400 47
2 600 62
2 788 1000
2 900 500000
2 9910 10010
;

proc transpose data = have out = want prefix = item_;
    by id;
    var response;
    id item;
run;
collinelliot
Barite | Level 11

Note that you cannot (should not?) have just the numbers as the variable names in the transpose, so in the example I've added "item_" as a prefix. 

Reeza
Super User

The long format is better for storing the data. How are you going to handle the multiple records, ie ID 1 Item 400 is repeated twice in this list. Do you need to summarize this first somehow?

 

Now that I've said that, this is a straight PROC TRANSPOSE procedure if you want a SAS dataset. Especially if the data is dynamic. 

You may need to sort the dataset by ID first. 

 

 

proc transpose data=have out=want prefix=Item_;
by ID;
id Item;
var response;
run;

if you want a report - HTML, PDF or EXCEL PROC TABULATE is actually probably a better procedure, especially if you also need to summarize, since it will do that in one step.

collinelliot
Barite | Level 11

What is the count of distinct item code values? That is how many columns you'll end up with in the transposed data.

 

Variation by ID is not an issue as long at there are not duplicates within an ID.

overky
Calcite | Level 5

Thank you.  The Proc Transpose worked.  I do not need to summarize the ItemCode 400 twice was a mistake in my example.  There will never be duplicate itemcodes.

Reeza
Super User

@overky Please mark @collinelliot solution as correct then 🙂

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 777 views
  • 1 like
  • 3 in conversation