BookmarkSubscribeRSS Feed
binhle50
Obsidian | Level 7

Dear SAS users,

 

I have two columns with values separated by comma as below:

column 1     

row1: HB,SF,IVF,SVF,HB,SF,SVF,SF,SVF,HB,SF,HB,SVF

row2: IVF,SF,HB,SF,SVF

row3: SF,HB,IVF

row3: SF,HB,IVF

column 2

row1: 0.0,1,2,0,0,0,,1,0,0,0,2,1

row2: 1,2,0,0,0

row3: 1,1,0

row4: 1,0,0

Where column 1 contains the code of a lab test: there are 4 type: HB, SF, SVF, and IVF and only IVF have only 1 test, the 3 others will have multiple tests

And column 2 contains the test results: 0 =negative, 1=positive, 2=ambiguous, blank=missing results

The test and the result will exactly order the same, but the tests are not in any sequential.

I need to create 2 variables to show the test results using those 2 columns. The first variable is to show to test result of all tests that happened before IVF test. The second variables is to show the test result after IVF test.

I want,

in row 1, there are 2 tests before IVF, and they are all negative so the VAR1=0. There are 10 tests after IVF and have at least 1 test=1 so VAR2=1.

In row 2, there is no test before IVF, so VAR1=null. There are 4 tests after IVF and none=1 but one test=2 so VAR2=2;

in row 3, there are 2 tests before IVF and both=1 so VAR1=1 and VAR2=blank (no test after IVF)

in row 4, there are 2 tests before IVF and only one test=1 so VAR1=0 and VAR2=blank (no test after IVF)

what SAS procedure should be the best to handle this. I tried scan, array, countw but none seems working properly.

Any help from you all will be highly appreciated.

 

Best,

BL

5 REPLIES 5
ballardw
Super User

Try using these Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.

 

 

It really does not make much since to say "column 1 " has tests and "column 2" when those should be variable names. And why to you have two "row 3" tests???

 

Actually probably pretty trivial with actual data.

 

Also, "Isn't working" is awful vague.

Are there errors in the log?: Post the code and log in a code box opened with the "</>" to maintain formatting of error messages.

No output? Post any log in a code box.

Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon or attached as text to show exactly what you have and that we can test code against.

binhle50
Obsidian | Level 7
I am sorry the second row3 of column 1 should be row 4
binhle50
Obsidian | Level 7
And below is my sample dataset
data sample;
input id:10. column1:$200. column2:$200.;
datalines;
1 HB,SF,IVF,SVF,HB,SF,SVF,SF,SVF,HB,SF,HB,SVF 0.0,1,2,0,0,0,,1,0,0,0,2,1
2 IVF,SF,HB,SF,SVF 1,2,0,0,0
3 SF,HB,IVF 1,1,0
4 SF,HB,IVF 1,0,0
5 HLB,SF,HLB 0,0,0
run;

for raw 5, I want VAR1=null and VAR2=null because missing IVF
ballardw
Super User

You need to state RULES, not examples of values for the VAR1 and VAR2 calculations. I am not sure I understand everything you said for those.

 

Here is one way to pick values apart based on the position, if any of IVF in the data.

data want;
   set sample;
   array t(20) $ 5;
   array r(20) ;
   array temp(20) ;
   do i=1 to countw(column1,',');
      t(i) = scan(column1,i,',');
      r(i) = input (scan(column2,i,','),1.);
   end;
   IVFpos = whichc('IVF',of t(*));
   if IVFPOS> 0 then do;
      /* before IVF copy some values to the TEMP array*/
      do i=1 to (IVFpos -1);
         temp[i]=r[i];
      end;
      /* logic for assigning VAR1*/ 
      /* this just picks the largest*/
      var1 = max(of temp[*] );
      /* reset temp array to empty and copy values after*/
      call missing( of temp(*));
      do i=(IVFpos +1) to dim(r);
         temp[i]=r[i];
      end;
      /* apply the ogic for VAR2*/
      var2 = max(of temp[*] );
   end;
   drop i t1-t20 r1-r20 temp1-temp20 ivfpos;
run;

Arrays are a handy way to keep similar values together. There a some functions such as WHICHC and WHICHN that when coupled with arrays make it easy to identify if any of the variables have a specific value. WHICHC is for character values and is used above to get which test, if any is IVF. WHICHN finds numeric values. The result is the number of the item found or 0 if not found.

 

As I mentioned, I'm not sure of all of your rules. You can use WHICHN with the Temp array to search for specific values.

Maybe something like this.

If whichn(1,of temp(*))>0 then var1=1;

else if whichn(2,of temp(*))=>0 then var1=2;

else var1=0.

 

 

binhle50
Obsidian | Level 7

Thanks so much for your great help! using whichc and whichn is really helpful and is exactly what I am looking for.

Best,

Le

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 861 views
  • 0 likes
  • 2 in conversation