BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
superbug
Quartz | Level 8

My data has 1,201,250 rows, part of the data is as below. 

 In each form, each examinee take the same 250 item.  There are a few same itemID across different forms. e.g. item pp012 in both AAA and BBB, pp012 and pp024 in both BBB and CCC.  there are 1350 unique items across all forms. I want to transpose this data back to wide to make a SPARSE matrix in terms of raw. The desired SPARSE matrix should have 1350 columns in terms of raw score. SPARSE matrix means there are no data in a certain column after transposing.

DATA have ; 
  INPUT form examineeID _NAME_   itemID  raw; 
CARDS ; 
AAA 111 raw_1  pp012   0 
AAA 111 raw_2  pp014   1
AAA 111 raw_3  pp017   1 
AAA 111 raw_4  pp015   0 
..
AAA 444 raw_1  pp012   1 
AAA 444 raw_2  pp014   0 
AAA 444 raw_3  pp017   1 
AAA 444 raw_4  pp015   1
...
BBB 777  raw_1 pp012   0
BBB 777  raw_2 pp024   1
BBB 777  raw_3 pp027   0
BBB 777  raw_4 pp025   1
...
BBB 999  raw_1 pp012   1
BBB 999  raw_2 pp024   1
BBB 999  raw_3 pp027   0
BBB 999  raw_4 pp025   0
....
CCC 666  raw_1 pp012   0
CCC 666  raw_2 pp024   1
CCC 666  raw_3 pp037   0
CCC 666  raw_4 pp035   1
...
CCC 888  raw_1 pp012   1
CCC 888  raw_2 pp024   1
CCC 888  raw_3 pp037   0
CCC 888  raw_4 pp035   0
...

 I used the following code

proc sort data=have; by examineeid; run;
proc transpose data=have  out=want (drop=_NAME_) delim=_;
  by examineeid;
  id itemid;
  var raw;
run;

The resulted matrix is 4805x1350, which is the correct dimension. But from left to right, the columns SHOULD be in the ascending order of itemID. How can I get that order? I tried to sort "have" data by itemID examineeID before transposing, but the resulted matrix is not the correct dimension I wanted. I am stucked there for a while, much thanks to your help!

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Please test your data step code before posting, so you can be sure it works and creates valid data.

 

Create a pattern dataset of all items, and prepend that before transposing; filter out the empty observation after the transpose:

data have; 
input form $ examineeID $ _NAME_ $ itemID $ raw; 
datalines; 
AAA 111 raw_1  pp012   0 
AAA 111 raw_2  pp014   1
AAA 111 raw_3  pp017   1 
AAA 111 raw_4  pp015   0 
AAA 444 raw_1  pp012   1 
AAA 444 raw_2  pp014   0 
AAA 444 raw_3  pp017   1 
AAA 444 raw_4  pp015   1
BBB 777  raw_1 pp012   0
BBB 777  raw_2 pp024   1
BBB 777  raw_3 pp027   0
BBB 777  raw_4 pp025   1
BBB 999  raw_1 pp012   1
BBB 999  raw_2 pp024   1
BBB 999  raw_3 pp027   0
BBB 999  raw_4 pp025   0
CCC 666  raw_1 pp012   0
CCC 666  raw_2 pp024   1
CCC 666  raw_3 pp037   0
CCC 666  raw_4 pp035   1
CCC 888  raw_1 pp012   1
CCC 888  raw_2 pp024   1
CCC 888  raw_3 pp037   0
CCC 888  raw_4 pp035   0
;

proc sort
  data=have (keep=itemid)
  out=pattern
  nodupkey
;
by itemid;
run;

proc sort data=have;
by examineeid;
run;

data pretrans;
set
  pattern
  have
;
run;

proc transpose
  data=pretrans
  out=want (drop=_NAME_ where=(examineeid ne ""))
  delim=_
;
by examineeid;
id itemid;
var raw;
run;

View solution in original post

2 REPLIES 2
Kurt_Bremser
Super User

Please test your data step code before posting, so you can be sure it works and creates valid data.

 

Create a pattern dataset of all items, and prepend that before transposing; filter out the empty observation after the transpose:

data have; 
input form $ examineeID $ _NAME_ $ itemID $ raw; 
datalines; 
AAA 111 raw_1  pp012   0 
AAA 111 raw_2  pp014   1
AAA 111 raw_3  pp017   1 
AAA 111 raw_4  pp015   0 
AAA 444 raw_1  pp012   1 
AAA 444 raw_2  pp014   0 
AAA 444 raw_3  pp017   1 
AAA 444 raw_4  pp015   1
BBB 777  raw_1 pp012   0
BBB 777  raw_2 pp024   1
BBB 777  raw_3 pp027   0
BBB 777  raw_4 pp025   1
BBB 999  raw_1 pp012   1
BBB 999  raw_2 pp024   1
BBB 999  raw_3 pp027   0
BBB 999  raw_4 pp025   0
CCC 666  raw_1 pp012   0
CCC 666  raw_2 pp024   1
CCC 666  raw_3 pp037   0
CCC 666  raw_4 pp035   1
CCC 888  raw_1 pp012   1
CCC 888  raw_2 pp024   1
CCC 888  raw_3 pp037   0
CCC 888  raw_4 pp035   0
;

proc sort
  data=have (keep=itemid)
  out=pattern
  nodupkey
;
by itemid;
run;

proc sort data=have;
by examineeid;
run;

data pretrans;
set
  pattern
  have
;
run;

proc transpose
  data=pretrans
  out=want (drop=_NAME_ where=(examineeid ne ""))
  delim=_
;
by examineeid;
id itemid;
var raw;
run;
superbug
Quartz | Level 8

@Kurt_Bremser 

THANK YOU SO MUCH!

A double thumb up to your help!

Catch up on SAS Innovate 2026

Dive into keynotes, announcements and breakthroughs on demand.

Explore Now →
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1370 views
  • 0 likes
  • 2 in conversation