BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
superbug
Quartz | Level 8

My data has 1,201,250 rows, part of the data is as below. 

 In each form, each examinee take the same 250 item.  There are a few same itemID across different forms. e.g. item pp012 in both AAA and BBB, pp012 and pp024 in both BBB and CCC.  there are 1350 unique items across all forms. I want to transpose this data back to wide to make a SPARSE matrix in terms of raw. The desired SPARSE matrix should have 1350 columns in terms of raw score. SPARSE matrix means there are no data in a certain column after transposing.

DATA have ; 
  INPUT form examineeID _NAME_   itemID  raw; 
CARDS ; 
AAA 111 raw_1  pp012   0 
AAA 111 raw_2  pp014   1
AAA 111 raw_3  pp017   1 
AAA 111 raw_4  pp015   0 
..
AAA 444 raw_1  pp012   1 
AAA 444 raw_2  pp014   0 
AAA 444 raw_3  pp017   1 
AAA 444 raw_4  pp015   1
...
BBB 777  raw_1 pp012   0
BBB 777  raw_2 pp024   1
BBB 777  raw_3 pp027   0
BBB 777  raw_4 pp025   1
...
BBB 999  raw_1 pp012   1
BBB 999  raw_2 pp024   1
BBB 999  raw_3 pp027   0
BBB 999  raw_4 pp025   0
....
CCC 666  raw_1 pp012   0
CCC 666  raw_2 pp024   1
CCC 666  raw_3 pp037   0
CCC 666  raw_4 pp035   1
...
CCC 888  raw_1 pp012   1
CCC 888  raw_2 pp024   1
CCC 888  raw_3 pp037   0
CCC 888  raw_4 pp035   0
...

 I used the following code

proc sort data=have; by examineeid; run;
proc transpose data=have  out=want (drop=_NAME_) delim=_;
  by examineeid;
  id itemid;
  var raw;
run;

The resulted matrix is 4805x1350, which is the correct dimension. But from left to right, the columns SHOULD be in the ascending order of itemID. How can I get that order? I tried to sort "have" data by itemID examineeID before transposing, but the resulted matrix is not the correct dimension I wanted. I am stucked there for a while, much thanks to your help!

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Please test your data step code before posting, so you can be sure it works and creates valid data.

 

Create a pattern dataset of all items, and prepend that before transposing; filter out the empty observation after the transpose:

data have; 
input form $ examineeID $ _NAME_ $ itemID $ raw; 
datalines; 
AAA 111 raw_1  pp012   0 
AAA 111 raw_2  pp014   1
AAA 111 raw_3  pp017   1 
AAA 111 raw_4  pp015   0 
AAA 444 raw_1  pp012   1 
AAA 444 raw_2  pp014   0 
AAA 444 raw_3  pp017   1 
AAA 444 raw_4  pp015   1
BBB 777  raw_1 pp012   0
BBB 777  raw_2 pp024   1
BBB 777  raw_3 pp027   0
BBB 777  raw_4 pp025   1
BBB 999  raw_1 pp012   1
BBB 999  raw_2 pp024   1
BBB 999  raw_3 pp027   0
BBB 999  raw_4 pp025   0
CCC 666  raw_1 pp012   0
CCC 666  raw_2 pp024   1
CCC 666  raw_3 pp037   0
CCC 666  raw_4 pp035   1
CCC 888  raw_1 pp012   1
CCC 888  raw_2 pp024   1
CCC 888  raw_3 pp037   0
CCC 888  raw_4 pp035   0
;

proc sort
  data=have (keep=itemid)
  out=pattern
  nodupkey
;
by itemid;
run;

proc sort data=have;
by examineeid;
run;

data pretrans;
set
  pattern
  have
;
run;

proc transpose
  data=pretrans
  out=want (drop=_NAME_ where=(examineeid ne ""))
  delim=_
;
by examineeid;
id itemid;
var raw;
run;

View solution in original post

2 REPLIES 2
Kurt_Bremser
Super User

Please test your data step code before posting, so you can be sure it works and creates valid data.

 

Create a pattern dataset of all items, and prepend that before transposing; filter out the empty observation after the transpose:

data have; 
input form $ examineeID $ _NAME_ $ itemID $ raw; 
datalines; 
AAA 111 raw_1  pp012   0 
AAA 111 raw_2  pp014   1
AAA 111 raw_3  pp017   1 
AAA 111 raw_4  pp015   0 
AAA 444 raw_1  pp012   1 
AAA 444 raw_2  pp014   0 
AAA 444 raw_3  pp017   1 
AAA 444 raw_4  pp015   1
BBB 777  raw_1 pp012   0
BBB 777  raw_2 pp024   1
BBB 777  raw_3 pp027   0
BBB 777  raw_4 pp025   1
BBB 999  raw_1 pp012   1
BBB 999  raw_2 pp024   1
BBB 999  raw_3 pp027   0
BBB 999  raw_4 pp025   0
CCC 666  raw_1 pp012   0
CCC 666  raw_2 pp024   1
CCC 666  raw_3 pp037   0
CCC 666  raw_4 pp035   1
CCC 888  raw_1 pp012   1
CCC 888  raw_2 pp024   1
CCC 888  raw_3 pp037   0
CCC 888  raw_4 pp035   0
;

proc sort
  data=have (keep=itemid)
  out=pattern
  nodupkey
;
by itemid;
run;

proc sort data=have;
by examineeid;
run;

data pretrans;
set
  pattern
  have
;
run;

proc transpose
  data=pretrans
  out=want (drop=_NAME_ where=(examineeid ne ""))
  delim=_
;
by examineeid;
id itemid;
var raw;
run;
superbug
Quartz | Level 8

@Kurt_Bremser 

THANK YOU SO MUCH!

A double thumb up to your help!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 391 views
  • 0 likes
  • 2 in conversation