You keep hearing me talk about SAS, but you may
be thinking to yourself, “I don’t even know what
SAS is.” Well, in this video you’ll find out. I’ll start with an overview of SAS software, and then touch on its capabilities and uses. You’ll also see a complete SAS program and
some complete reports generated by SAS
programs. Over the next few weeks you’ll learn how to
create this, and other similar, programs. Finally, by the end of the course you’ll be able to
write your own SAS programs, and present your results in the form of an issue
brief written for general audiences – who may or may not have public health expertise. But before we discuss SAS, let’s discuss
something even more basic – data. Here’s a question for you: What is data? Data is a collection of symbols, numbers, and
letters recorded and stored so that, together, they represent some characteristics of an
object or group of objects. Here we have a table, which is a common way
of organizing data. Each box is called a cell. Moving from left to right across the table are
columns. SAS uses the term “variables”. SAS recognizes only two types of variables:
character and numeric. Numeric variables hold numbers. Character variables can hold letters, special
characters, and numerals. Numerals look like numbers, but they have no
inherent numerical value to SAS. Generally speaking, we will try to minimize the
use of character variables in our SAS data sets. More on this later. The information contained in the first cell of each
column is called the variable name. Variable names can be no longer than 32
characters and must begin with a letter or an underscore. The remaining characters in the name may be
letters, digits, or the underscore character. Moving from top to bottom across the table are
rows, which are sometimes referred to as records. SAS uses the term “observations”. The contents of each cell are called a value. You should now be up to speed on some basic
terminology used by SAS, as well as other analytic, database, and
spreadsheet programs. These terms will be used repeatedly throughout
the course. But you still haven’t told us: what is SAS? Well, SAS is a company that was founded in the
1970s. The acronym originally stood for Statistical
Analysis System. But you know it as a computer program with its
own programming language. And once you learn to speak that language, SAS can help you do all kinds of useful things. Not that it’s very important for this class, but for the sake of completeness I should
probably also point out that SAS is not really one
piece of software, but many different pieces of software. However, in this class we will primarily focus on
Base SAS. And by the end of this course you will be able to
independently use SAS to access data, manage data, analyze data, and present data. Let’s quickly take a closer look at each of these
capabilities. So what do we mean by “access data”? Well, individuals and organizations store their
data using different computer programs that store
the data in different file types. Some common examples are database files, spreadsheets, raw data files, and SAS data sets. No matter how the data is stored, you can’t do
anything with it until you can get it into SAS, in a form that SAS can use, and in a location that you can reach. In other words, you have to access your data first. Therefore, among our first tasks in this course
will be to access data. Next, we’ll talk about managing data. Managing data can include validating and
cleaning data, subsetting data, creating new variables, or combining data. Basically, this is just manipulating data so that you
can analyze it. Next is analyzing data. This is probably the capability you most closely
associate with SAS. There is no doubt that SAS is a powerful tool for
analyzing data. However, in this class we won’t go much beyond
using SAS to calculate basic descriptive
statistics. Finally, there is presenting data. The ultimate goal is typically to present your
findings in some form or another. In SAS you can create many types of reports - both tabular and graphical. In this class you will learn how to create several
of these reports, and then learn how to enhance them. You will also learn how to use SAS in conjunction
with other software, such as Microsoft Word, to create publishable
manuscripts. I’m going to show you some examples shortly. But first, here’s a question for you: which of
these SAS capabilities is most important? Well, it’s hard to say. They are probably all equally important. But in my experience, you will probably spend
more time on managing data than any other task. Additionally, errors made during data management
tend to be the most problematic, and hardest to
detect. In this demonstration I’m going to introduce you to
some of the reports you can create using SAS. Here we have a SAS program called “01_First
Demonstration”, which is open in an enhanced editor window. This program creates six reports and one
chart. The reports and chart are based on North Texas
Regional Health Department data for laboratory
employees. For now, don’t worry about any of the specifics
of the program, or of the different boxes open on the
screen. That information will all come in due time. I just want you to get a feel for what a SAS
program “does”. The first report is in html format. This report results from the print procedure, and it
simply lists your data. At the top of the screen you see the report title,
and below the title is a table. The first column in the table contains the
observation, or obs, number, which SAS includes automatically by default. Following the obs column are the variables
employee_id, job_title, salary, gender, birth_date,
and manager_id. As you glance over the table, notice that it
includes numbers, letters, and dates. Some of these values are pretty self-explanatory. Others, like the values of job_title, are
meaningless unless you are already familiar with
the data. In the second report you can see that the values
are formatted, and much easier to interpret. The data itself hasn’t changed, but the report
now shows “Laboratory Assistant I” instead of
simply showing “I”, “Laboratory Assistant II” instead of “II” and so on. You can also see that salaries are now more
explicit, and instead of simply showing “F” and “M”, the report now shows the words “Female” and
“Male”. Looking over this report, what variable do you
think the report is grouped by? The report is grouped by gender. It first lists all of the females, and then all of the
males. The third report shows similar information, but it is
displayed in the output window. This type of output is referred to as listing output. The fourth report is a frequency distribution of
laboratory employees by job title. In this report, which job title represents the
highest percentage of the total? Ten people have the job title “Laboratory Assistant
I”. This makes up roughly 56% of all laboratory
employees. The fifth report gives us some simple summary
statistics about the salaries of laboratory
employees. What is the average salary of laboratory
employees with the job title “Laboratory Assistant
II”? The average salary of laboratory employees with
the job title, “Laboratory Assistant II” is $27,295
per year. The sixth report our program created shows the
results of the univariate procedure in html format. Later, we will discuss this report in greater detail. Finally, our program created a three dimensional
bar chart in html format. What does the height of the bars in this chart
represent? The height of each bar in this chart represents the
number of men and women, grouped separately,
with each job title. This concludes your brief look at some sample
reports in SAS. You saw html and listing reports, frequency reports, statistical reports, and a chart. In this demonstration we’re just going to open
SAS in the Windows operating system. Now there’s more than one way to open SAS, but for now I’m going to show you how to open
SAS using the start menu. Watch as I click on the start menu, then on all programs, then on SAS, and then on SAS 9.4. If you are using a version of SAS other than 9.4,
don’t worry about it. It should open the exact same way. When SAS opens, you will see the interactive
SAS windowing environment. This is the interface we will use throughout this
course. Notice the results and explorer windows on the
left, and the log and enhanced editor windows on the
right. The output window is also open, but it is behind
the log and editor windows. Generally, this is the interface you will use when
you write SAS programs to perform programming
tasks. This interface provides tools for programming,
including windows for entering and editing code, checking the log and debugging programs, viewing and managing output, and viewing and managing SAS files. Each time you start SAS, you see some
messages in the log window, including copyright notes and the version of SAS
you are using. One potentially important piece of information is
the site number. If you contact SAS technical support, they will
request this site number. What is the site number in the log window shown
here? The site number in the log window shown here is
70080564. Here’s your first practice opportunity. Good luck!
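
If it helps to see the ideas from this video in code before you try the practice, here is a minimal sketch of a small SAS program along the same lines as the demonstration. It is not the “01_First Demonstration” program itself. The data set name work.lab_staff, the format name $gender_fmt, and every value in the data lines are made up for illustration. Only the variable names employee_id, job_title, salary, and gender come from the reports shown earlier.

```sas
/* Hypothetical data set: the name and all values are illustrative only. */
data work.lab_staff;
    /* A $ after a name makes it a character variable;       */
    /* names without a $ are numeric variables.              */
    input employee_id job_title $ salary gender $;
    datalines;
101 I 25000 F
102 II 27000 M
103 I 24500 F
104 III 31000 M
;
run;

/* Hypothetical character format that displays F and M as words. */
/* It changes how values are shown, not the stored data.         */
proc format;
    value $gender_fmt "F" = "Female"
                      "M" = "Male";
run;

title "Laboratory Employees (illustrative data)";

/* List the observations; the Obs column is included by default. */
proc print data=work.lab_staff;
    format gender $gender_fmt. salary dollar10.;
run;

/* Frequency distribution of employees by job title. */
proc freq data=work.lab_staff;
    tables job_title;
run;

/* Basic descriptive statistics for salary within each job title. */
proc means data=work.lab_staff mean min max;
    class job_title;
    var salary;
run;

/* Clear the title so it does not carry over to later output. */
title;
```

Each variable name here follows the naming rules from earlier: it starts with a letter, uses only letters, digits, or underscores, and is well under 32 characters. Submitting a program like this should give you a listing, a frequency table, and a summary statistics table, although the exact look depends on your SAS version and output settings.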