Hi SAS Experts,
Today I want to pick specific observations for each country from a dataset named "wins_sample" to create a new dataset called "treatment".
I have attached the dataset wins_sample.
Here are a few observations from the dataset wins_sample:
TYPE    ENAME       GEOGN      YEAR  acc_pay  ACC_PAY_TUR   ACC_STA
134495  DYCASA 'B'  ARGENTINA  1994     7445  .             Local standards
134495  DYCASA 'B'  ARGENTINA  1995    10099  8.7216142271
134495  DYCASA 'B'  ARGENTINA  1996     6277  8.0189301417  Local standards
134495  DYCASA 'B'  ARGENTINA  1997     8419  11.732035928  Local standards
134495  DYCASA 'B'  ARGENTINA  1998    15387  7.5126438713  Local standards
/* type, year, acc_sta, geogn: character variables
   acc_pay, acc_pay_tur: numeric variables */
The dataset wins_sample here has 3 countries: Argentina, Australia, and Austria.
For this sample, I want to create a dataset named "treatment" that contains all observations with GEOGN="AUSTRALIA" and YEAR in 2001, 2002, or 2004 to 2008 (that is, 2001 to 2008 excluding 2003), plus all observations with GEOGN="AUSTRIA" and YEAR in 2004, 2005, or 2007 to 2011 (that is, 2004 to 2011 excluding 2006).
Many thanks in advance and warm regards.
where
(geogn = "AUSTRIA" and 2004 le year le 2011 and year ne 2006) or
(/* add similar for AUSTRALIA */)
;
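For reference, the WHERE fragment above could be completed along the following lines. This is only a sketch under two assumptions that are not stated in the reply: YEAR is the character variable described in the question, and its values are left-aligned four-digit strings, so that character comparison orders them like numbers. If YEAR were numeric, the quotes around the year values would be dropped.

```sas
data treatment;
  set wins_sample;
  /* Assumption: YEAR is character and holds left-aligned 4-digit values, */
  /* so '2004' le year le '2011' compares the strings in year order.      */
  where (geogn = "AUSTRIA"   and '2004' le year le '2011' and year ne '2006') or
        (geogn = "AUSTRALIA" and '2001' le year le '2008' and year ne '2003');
run;
```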
Hi @Phil_NZ,
If you store the year values in a numeric variable, then you don't need the conversion using the INPUT function as shown below:
data treatment(drop=yr);
  set wins_sample;
  /* YEAR is character: convert to numeric; ?? suppresses notes for invalid values */
  yr=input(year, ?? 32.);
  if geogn="AUSTRALIA" & yr in (2001:2008) & yr~=2003
   | geogn="AUSTRIA" & yr in (2004:2011) & yr~=2006
  ;
run;
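To illustrate the remark about numeric year values, here is a minimal sketch in which YEAR is already numeric, so no INPUT conversion is needed. The dataset wins_numeric is hypothetical and is generated only to make the example self-contained; it is not the attached wins_sample.

```sas
/* Hypothetical input: one row per country and year, with numeric YEAR */
data wins_numeric;
  length geogn $ 12;
  do geogn = "ARGENTINA", "AUSTRALIA", "AUSTRIA";
    do year = 2000 to 2012;
      output;
    end;
  end;
run;

data treatment;
  set wins_numeric;
  /* YEAR is numeric, so it can be tested directly without INPUT */
  if geogn="AUSTRALIA" & year in (2001:2008) & year~=2003
   | geogn="AUSTRIA"   & year in (2004:2011) & year~=2006;
run;
```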
I am not sure whether the "|" symbol in your code means "or".
Sorry to ask, but I could not confirm this symbol from Google.
I ask because the documentation page on operators lists several forms of "OR".
Many thanks!
Yes, the pipe character (|) is the OR operator (see second row in the table "Logical Operators"). But feel free to use the mnemonics or, and, ne, etc. if you're more familiar with them or if the dependence on the operating environment (see footnotes under the table "Logical Operators") is a concern.
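As a sketch of the mnemonic alternative, the same subsetting IF can be written with and, or, eq, and ne in place of the symbols. The extra parentheses around each country's condition are optional and are added here only for readability.

```sas
data treatment(drop=yr);
  set wins_sample;
  yr = input(year, ?? 32.);
  /* Same logic as the symbol version: and is evaluated before or */
  if (geogn eq "AUSTRALIA" and yr in (2001:2008) and yr ne 2003)
  or (geogn eq "AUSTRIA"   and yr in (2004:2011) and yr ne 2006);
run;
```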
What if I want to pick observations for the country GEOGN="BRAZIL" with years 1991 to 1997 (excluding 1993) or 2007 to 2017? Can you suggest how to fill in the code below?
data control;
set wins.winsorize;
yr=input(year, ?? 32.);
if geogn="UNITEDS" & yr in (1999:2017)
|geogn="BRAZIL" & yr in (1991:1997) & yr~=1993 /*and what else*/
;
run;
Can we do:
geogn="BRAZIL" & ((yr in (1991:1997) & yr~=1993)or yr in (2007:2017))
Warm regards.
@Phil_NZ wrote:
What if I want to pick observations for the country GEOGN="BRAZIL" with years 1991 to 1997 (excluding 1993) or 2007 to 2017? Can you suggest how to fill in the code below?
(...)
Can we do:
geogn="BRAZIL" & ((yr in (1991:1997) & yr~=1993)or yr in (2007:2017))
Yes, this is correct. The inner parentheses around yr in (1991:1997) & yr~=1993 are redundant, because & is evaluated before or, but they may help to understand the logic. I would insert a blank between ")" and "or" for better readability:
geogn="BRAZIL" & ((yr in (1991:1997) & yr~=1993) or yr in (2007:2017))
If you're unsure about a logical condition like this, you can also create a small test dataset with the relevant variables (geogn, yr), let yr run from, say, 1990 to 2020 (DO loop), apply the IF condition and check the result.
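The test suggested above could look roughly like this. The dataset names test and check are made up for the illustration; only geogn and yr are needed.

```sas
/* Hypothetical test data: one row per year for BRAZIL, 1990 to 2020 */
data test;
  geogn = "BRAZIL";
  do yr = 1990 to 2020;
    output;
  end;
run;

/* Apply the condition under test as a subsetting IF */
data check;
  set test;
  if geogn="BRAZIL" & ((yr in (1991:1997) & yr~=1993) or yr in (2007:2017));
run;

/* Inspect which years survive */
proc print data=check noobs;
run;
```

If the condition is right, the listing should contain 1991, 1992, 1994 to 1997, and 2007 to 2017 (17 rows), with 1993 and 1998 to 2006 absent.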
data treatment;
  set have.wins_sample;
  /* YEAR is a character variable, so the year values are quoted */
  where (GEOGN = "AUSTRALIA" and YEAR in ('2001','2002','2004','2005','2006','2007','2008')) or
        (GEOGN = "AUSTRIA" and YEAR in ('2004','2005','2007','2008','2009','2010','2011'));
run;
Did you try this?