Hi Every one,
I am trying to convert my postgre sql code to SAS where I declared a record. What is the best way that I can declare a record in sas. How to implement the below functionality in sas either using Data step or proc sql. Any help is highly appreciated.
Below is my postgre sql code. It will declare a record and then iterate through each record at a time in the for loop. It uses the execute format to dynamically populate the values using all and year.
DO
$do$
DECLARE
rec RECORD;
BEGIN
/* MAKE test1 TABLE */
/* MAKE TABLE */
CREATE TABLE test1 (Name varchar(15),year varchar(4),PRIMARY KEY (Name,year));
/* POPULATE test1 WITH APPROPRIATE Name */
INSERT INTO test1 (Name,year) SELECT b.Name, c.year FROM (SELECT * FROM test2) b, (SELECT DISTINCT year FROM test3) c;
/* POPULATE test1 WITH 'ALL' ROWS */
FOR rec IN SELECT DISTINCT year FROM test3 LOOP
EXECUTE format('INSERT INTO test1 (Name,year) VALUES ($1,$2)')
USING 'ALL', rec.year;
END LOOP;
END
$do$;
What does "declare a record" even mean? Can you explain what that code is doing?
Looks a lot like you are trying to do this:
data test1;
length Name $15 year $4 value 8 ;
set test2 (keep=break year rename=(break=name))
test3 (in=in3 keep=year)
;
if in3 then break='ALL';
run;
But it is not clear why you created a variable name VALUE if you didn't plan to give it any values.
Hi Tom,
Based on your query, I edited my post. Please let me know if that is clear.
Thanks,
Ravi.
@rkarr5 wrote:
Hi Tom,
Based on your query, I edited my post. Please let me know if that is clear.
Thanks,
Ravi.
So it looks like I was right.
Why did you create the variable VALUE if you aren't going to give it any values.
Only thing I didn't include was adding an index (why would you need that?) or selecting distinct (why does your source data have duplicates)?
I removed the value. I had a query with multiple loops and that value is used for the next loop. But I don't want to make it complex so I removed the value variable. Coming to the second part that you asked, there will be definitely duplicated values of years in my table. That's the reason I used the distinct. If you ask why do you have duplicate then I can't answer that as my data structure is like that which has duplicate records of year. For example a student who joined in multiple courses in the same year will have multiple records with same year.
Don’t do conversion line by line. Instead figure out what a specific process is doing and explain it with a data example, showing the input and output and explain the logic. This way, we can help point you to the best SAS solution. By groups may make the the need for loops unnecessary for starters.
@rkarr5 wrote:
I removed the value. I had a query with multiple loops and that value is used for the next loop. But I don't want to make it complex so I removed the value variable. Coming to the second part that you asked, there will be definitely duplicated values of years in my table. That's the reason I used the distinct. If you ask why do you have duplicate then I can't answer that as my data structure is like that which has duplicate records of year. For example a student who joined in multiple courses in the same year will have multiple records with same year.
StudentIdNo Region Age agebrk region_desc subregion_desc demo_age gender race year
1 1A 12 12 to 17 Central Chicago age12to17 male white_rac 2013
2 1B 14 12 to 17 Northern Aurora age12to17 female white_rac 2014
3 1C 7 6 to 11 Southern Dallas age6to11 male hispanic 2015
4 1D 3 Under 3 Central Chicago age0to2 female white_rac 2016
5 1D 5 3 to 5 Northern Aurora age3to5 female white_rac 2017
So this is my data. The code I pasted is trying to break this data into different segments based on age,gender, race,year. So there will be a different combinations for different variables. The last for loop is trying to create a ALL value with different years. So the last for loop should append ALL and 2012, ALL,2013,ALL,2014....so on....
I want to dynamically filter the dataset.
Then you likely want a macro, here are some resources on macros.
If you show what fields you want to control and how you want to pass the values it becomes very easy in SAS to do this, IMO. You switch your 'records' to macro variables basically and convert the filter to a SAS process.
UCLA introductory tutorial on macro variables and macros
https://stats.idre.ucla.edu/sas/seminars/sas-macros-introduction/
Tutorial on converting a working program to a macro
This method is pretty robust and helps prevent errors and makes it much easier to debug your code. Obviously biased, because I wrote it 🙂 https://github.com/statgeek/SAS-Tutorials/blob/master/Turning%20a%20program%20into%20a%20macro.md
Examples of common macro usage
https://communities.sas.com/t5/SAS-Communities-Library/SAS-9-4-Macro-Language-Reference-Has-a-New-Ap...
will look into that. Can you also let me know how to start converting the above code to macros. That would be very helpful.
Thanks for the video. I am a base programmer. But here my requirement is I need to create a table with two columns Name and year which gets it value from two different tables. I used proc sql to create the table with two variables and then inserted the data using insert into table_name(value,year) select b.value,c.year from(select * from test2)b,(select distinct year from test3)c. So that query is creating a table and inserting values into the value. Then I created another table which has distinct values of year and then set it to new dataset in data step and then assigned value to 'ALL' for all the distinct years. This serves my purpose but I want to know if there is any other alternative that will do it in single go.
Seems trivial to do in SAS. First lets convert your posted listing into an actual dataset.
data have;
infile datalines dsd dlm='|' truncover;
input StudentIdNo $
Region $
Age
agebrk :$10.
region_desc :$20.
subregion_desc :$20.
demo_age :$10.
gender :$7.
race :$20.
year
;
datalines;
1|1A|12|12 to 17|Central|Chicago|age12to17|male|white_rac|2013
2|1B|14|12 to 17|Northern|Aurora|age12to17|female|white_rac|2014
3|1C|7|6 to 11|Southern|Dallas|age6to11|male|hispanic|2015
4|1D|3|Under 3|Central|Chicago|age0to2|female|white_rac|2016
5|1D|5|3 to 5|Northern|Aurora|age3to5|female|white_rac|2017
;
Then to get the break down by region and year just use PROC SUMMARY.
proc summary data=have ;
class region year ;
output out=want;
run;
proc print data=want;
run;
Obs Region year _TYPE_ _FREQ_ 1 . 0 5 2 2013 1 1 3 2014 1 1 4 2015 1 1 5 2016 1 1 6 2017 1 1 7 1A . 2 1 8 1B . 2 1 9 1C . 2 1 10 1D . 2 2 11 1A 2013 3 1 12 1B 2014 3 1 13 1C 2015 3 1 14 1D 2016 3 1 15 1D 2017 3 1
Look at the documentation of PROC SUMMARY (also known as PROC MEANS) to what else it can do for you.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.