DATA Step, Macro, Functions and more

Data manage

Accepted Solution Solved
Reply
Contributor
Posts: 20
Accepted Solution

Data manage

 

Dear expert,

I got an uneasy task. I have two columns of data say X1 and X2 each in 3 digits but assigned as characters. Each column has over 6000 rows. I need to physically combine X1 and X2 and make them six digits. First X1 with X2 and then X2 with X1. I could do that by CATT function, and the product is like X12 and X21. This is how it becomes:

X1      X2      X12          X21

361   362   361362    362361

845  846    845846    846845

846  845    846845    845846

845  846    845846    846845

......

 

Now the problem is, I need to count X12 or X21 in such a way that 845846 and 846845 are considerd exactly same. So in the above example, the entry 845846 should count 3 not 2 or 1. PROC FREQ for X12 or X21 would count only 2 or 1 respectively. I tried defining like:

if X12=845846 and X21=846845 then X21=X12;

it works with PROC FREQ of X21, but to write for each of the 6000 entries is almost impossible, if not impossible.

 

I would be very grateful if any kind soul can help me out with an better easier solution.

 

Thank you,

Sijansap


Accepted Solutions
Solution
‎08-22-2017 09:01 AM
Respected Advisor
Posts: 4,936

Re: Data manage

Create a sorted version of your concatenated variable:

 

if X1 < X2 then sX12 = catt(X1, X2);
else sX12 = catt(X2, X1); 

then do your freq count on sX12.

 

PG

View solution in original post


All Replies
Solution
‎08-22-2017 09:01 AM
Respected Advisor
Posts: 4,936

Re: Data manage

Create a sorted version of your concatenated variable:

 

if X1 < X2 then sX12 = catt(X1, X2);
else sX12 = catt(X2, X1); 

then do your freq count on sX12.

 

PG
Contributor
Posts: 20

Re: Data manage

Thank you very much PGStats. The only problem could be - even though they are digits but they are considered as characters since some of them start like 001, 020 etc. Anyway, I like your idea of sorted CATT and the comparing the values between X1 and X2. I certainly appreciate your suggestion and will try it out.

Thanks again.

Contributor
Posts: 20

Re: Data manage

Thanks again PGStats.

I switched X1 and X2 in the inequality to preseve zeroes, and worked perfectly.

I truly appreciate your help.

 

Sijansap

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 99 views
  • 0 likes
  • 2 in conversation