turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Programming
- /
- Syntax for creating a new dataset based on codes f...

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-14-2017 04:09 PM - edited 12-14-2017 04:10 PM

I am creating a database from another and I want to keep observations that have codes in certain variables. It will look something like this:

data new;

set old;

if ((4100 or 4101) in (DX1-DX25)) and (100 or 101 or 105 in (DXCC1-DXCC25))) or

((4200 or 4201) in (DX1-DX25)) and (200 or 201 or 205 in (DXCC1-DXCC25)));

run;

Is the syntax correct for what I want to do? I want to make sure before I run because it is a large dataset and will take a long time.

Accepted Solutions

Solution

12-14-2017
05:12 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Melk

12-14-2017 04:55 PM

Just create 2 do loops. Something like this:

data new (drop = i);

set old;

array DX{*} DX1-DX25;

array DXCC{*} DXCC1-DXCC14;

do i = 1 to dim(DX);

if DX[i] in (4100, 4101) then dx1 = 1;

if DX[i] in (4200, 4201) then dx2 = 1;

end;

do i = 1 to dim(DXCC);

if DXCC[i] in (100,101,105) then dxcc1=1;

if DXCC[i] in (200,201,205) then dxcc2=1;

end;

if sum(dx1,dxcc1) = 2 or sum(dx2,dxcc2) = 2;

run;

I would suggest to test your code first on a small subset

All Replies

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Melk

12-14-2017 04:27 PM - edited 12-14-2017 05:13 PM

If you want to test code simply, you can do that:

data want;

set have (obs=100);

** add whatever you want to do;

run;

Here, your syntax is far from working. This would be a reasonable approach:

data want;

set have;

array dx {25};

array dxcc {23};

do k=1 to 25 until;

if dx{k} in (4100, 4101) then flag_group1=1;

else if dx{k} in (4200, 4201) then flag_group2=1;

end;

if flag_group1=1 then do k=1 to 23 until (flag=1);

if dxcc{k} in (100, 101, 105) then flag=1;

end;

if flag_group2=1 and flag ne 1 then do k=1 to 23 until (flag=1);

if dxcc{k} in (200, 201, 205) then flag=1;

end;

if flag=1;

drop k flag flag_group1 flag_group2;

run;

And do test it using OBS= before running against the entire data set.

***** EDITED: changes added to allow for different number of elements in each array.

Also Edited to match your AND / OR logic.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

12-14-2017 04:34 PM

my dxcc actually has a difference length, I just noticed. Will this still work if I use the longest length of the 2 in my k-1 to x?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Melk

12-14-2017 04:39 PM

I'm not sure what you are asking. Do you have some other number of elements (not 25)? Are you talking about the lengths of character variables (which requires a slightly different syntax)?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

12-14-2017 04:44 PM

Ah sorry - yes the number if elements in the arrays are different for DX and DXCC.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Melk

12-14-2017 04:45 PM

I think this code will work, with the assumption you have equal number of variables both for DX and DXCC.

data new (drop = i);

set old;

array DX{*} DX1-DX25;

array DXCC{*} DXCC1-DXCC25;

do i = 1 to dim(DX);

if DX[i] in (4100, 4101) then dx1 = 1;

if DXCC[i] in (100,101,105) then dxcc1=1;

if DX[i] in (4200, 4201) then dx2 = 1;

if DXCC[i] in (200,201,205) then dxcc2=1;

end;

if sum(dx1,dxcc1) = 2 or sum(dx2,dxcc2) = 2;

run;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to skyvalley81

12-14-2017 04:51 PM

Thanks for your response. What if the number of vars are different for DX and DXCC?

Solution

12-14-2017
05:12 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Melk

12-14-2017 04:55 PM

Just create 2 do loops. Something like this:

data new (drop = i);

set old;

array DX{*} DX1-DX25;

array DXCC{*} DXCC1-DXCC14;

do i = 1 to dim(DX);

if DX[i] in (4100, 4101) then dx1 = 1;

if DX[i] in (4200, 4201) then dx2 = 1;

end;

do i = 1 to dim(DXCC);

if DXCC[i] in (100,101,105) then dxcc1=1;

if DXCC[i] in (200,201,205) then dxcc2=1;

end;

if sum(dx1,dxcc1) = 2 or sum(dx2,dxcc2) = 2;

run;

I would suggest to test your code first on a small subset