BookmarkSubscribeRSS Feed
hermanchen
Fluorite | Level 6
Hi -  I have a wide flat dataset with column response rates of binomial questions:
 
sample#, col1a, col1b, col2a, col2b...colna, colnb
 
1, 50%,50%,20%,80%,...,0%,100%
 
2, 0%,100%,80%,20%,...,50%,50%
 
 
 
I would like to run proc freq /chisq to compare samples 1 & 2. Prefer to use proc sql to process two columns at a time (q# A and B) but difficult to get proc freq to accept wide flat dataset because proc freq prefers a transpose format like below?
 
1,col1a,50%
 
1,col1b,50%
 
2,col2a,20%
 
2,col2b,80%
 
...
 
n,colna,0%
 
n,colnb,100%
 
 
 
my pseudo code with syntax errors below - please help? thanks!
 
 
 
/*compute counts from columns A/B ; save headers for loop*/
 
proc contents data=work.widefile18 Out=work.headers order=VARNUM;
run;
 
/*process first question: COLA and COLB*/
proc sql; 
CREATE TABLE HEADERNAME AS SELECT VARNUM, NAME FROM work.headers 
WHERE (VARNUM in (4,5 ));
%let resultcount=&sqlobs;
QUIT;
 
 
/*compute counts*/
%MACRO LOOP2();
proc sql;
select NAME into :COLA from HEADERNAME where VARNUM=1;
select NAME into :COLB from HEADERNAME where VARNUM=2;
quit;
data collab;
set work.widefile18;
count=1*vvaluex(COLA); output;
count=1*vvaluex(COLB); output;
run;%MEND;
 
 
/*loop COLA to COLB datastep */
%global i;
do i = 1 to 2;
DATA HEADERNAME;
SET HEADERNAME;
  
CALL EXECUTE("%LOOP2("||TRIM("sample &i",&COLA,&COLB)||")");
proc freq data=work.collab;
by count;
tables sample*COLA /chisq;
run; 
 
%let i=&i+1;
 
end;*/
RUN;

 

9 REPLIES 9
PaigeMiller
Diamond | Level 26

Prefer to use proc sql to process two columns at a time (q# A and B) but difficult to get proc freq to accept wide flat dataset because proc freq prefers a transpose format like below?

 

Use PROC TRANSPOSE. It is the right tool for the job. Then you don't have to use  complications like macros and CALL EXECUTE.

--
Paige Miller
hermanchen
Fluorite | Level 6

Hi,

the flatwide dataset (2n)+1 columns.  after proc transpose, the narrow dataset is now (2n) x 3 columns like below.  COL2 also transposed to have multi-values from just binomial value.  How do I specify only 2x2 crosstabulation (COL1 and COL2) in my code:


proc freq data=narrowdata;
weight COL3;
tables COL1*COL2 / noPercent noRow nocol ;

run;

 

 

transposed dataset:

1,col1a,50%
 
1,col1b,50%
 
2,col1a,20%
 
2,col1b,80%
 
...
 
n,colna,0%
 
n,colnb,100%
PaigeMiller
Diamond | Level 26

What's wrong with the code you show here?

--
Paige Miller
hermanchen
Fluorite | Level 6

the problem is transposed COL2 has now multi-values containing the flatwide file's headers: question 1A and 2B, question 2A and 2B,..., question 255A and 255B.

Then the chisq output is a 255x255 table rather then 255 2x2 tables that I am looking for.

I am looking for proc freq code that would do something like:

 

for i = 1 to n questions

  create itable select row i, i+1 from narrowdata where sampleno in (1,2)

  proc freq data=itable ; tables itables[COL1]*itable[COL2] /weight=COL3 chisq;

end;

 

I understand the tables statements can process 2x2 tables however narrowdata is in nxn dataset and proc freq has no option to tell it to process every 2 rows by COL1 id, therefore, loop is required?

PaigeMiller
Diamond | Level 26

It would be very helpful if you could provide your data using these instructions:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

 

--
Paige Miller
hermanchen
Fluorite | Level 6

Ok but how do I create a dummy variable as a class to create 2x2 tables of my narrowdata - syntax error in data steps , incorrectly, creates empty qtype column?

 

/*create dummy variable by n question type yes/no proportions*/
data narrowfile2;
set narrowfile1;
by sampleno;
retain qtype 1 ;
retain rowcount 1;
if (mod(qtype,2)=1) then do qtype=rowcount; output; end; else do qtype=qtype; rowcount+1; output; end;
run;

hermanchen
Fluorite | Level 6

Hi Mr. Miller - I ran the macro to generate my example data but the macro didn't quite run/convert my data.  I edited but the example should still apply - how do I transpose and create a dummy variable (for the multi-valued tranposed columns) as a class variable?

Note: will need to DROP RATE 

 

Thanks!

 

 

data WORK.mydata;
infile datalines dsd truncover dlm=',';
input DBN:$6. Name:$6. Rate:BEST. _1a_:BEST. E:BEST. _1b_:BEST. G:BEST. _1c_:BEST. I:BEST. _1d_:BEST. K:BEST. _1e_:BEST. M:BEST.;
label DBN="DBN" Name="Name" Rate="Total Response Rate" _1a_="1a. How many" E="E" _1b_="1b. How many" G="G" _1c_="1c. How many" I="I" _1d_="1d. How many" K="K" _1e_="1e. How many" M="M";
datalines;
01M539,TECHNO,95,41,59,6,94,29,71,22,78,6,94
30Q300,THE_30,100,19,81,12,88,15,85,23,77,4,96
;

art297
Opal | Level 21

@hermanchen  Based on what you've posted, it's difficult to understand exactly what you're trying to accomplish.

 

However, that said, I think that the Untranspose macro a group of us published last year will get you on the right track. You can download the paper and macro from: http://www.sascommunity.org/wiki/An_Easier_and_Faster_Way_to_Untranspose_a_Wide_File

 

Using that macro, given how I interpreted what you might be looking for, I ran the following code:

 

data have;
  infile cards dlm=',';
  informat col1a col1b col2a col2b col3a col3b percent4.;
  input sample_no col1a col1b col2a col2b col3a col3b;
  cards;
1, 50%,50%,20%,80%,0%,100%
2, 0%,100%,80%,20%,50%,50%
;

%untranspose(data=have, out=want, by=sample_no, id=type,
 id_informat=$1.,var=col1-col3, makelong=yes)

proc sort data=want;
  by sample_no _name_;
run;

proc freq data=want ;
  tables type*_value_/chisq;
  by sample_no _name_;
run;

Hope that helps,

Art, CEO, AnalystFinder.com

 

hermanchen
Fluorite | Level 6

Hi Art, this looks promising - will look into untranspose

Thanks!

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 9 replies
  • 3941 views
  • 0 likes
  • 3 in conversation