# SAS Notes

## The Data Step

### Subsetting IF

In a Data Step you can exclude observations from the dataset with a subsetting if statement: only observations for which the condition is true are kept.

```
data tornados_1980s;
infile FileName;
* city is a character variable, so it needs a character informat;
input year city :$20. damages;
* this limits input data to 1980s data;
if 1980 <= year <= 1989;
run;
```

```
if year in (1980, 1981, 1982);
```

```
if year = 1980 and city = 'Baltimore';
```
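To make the effect of a subsetting if easy to see, here is a minimal self-contained sketch. The dataset name tornados_demo and the four data rows are made up for illustration (they are not real tornado records), and the `:$20.` informat is an assumption so that city names longer than 8 characters are not truncated.

```
* Made-up illustration values, not real tornado records;
data tornados_demo;
  input year city :$20. damages;
  * subsetting if: keep only rows where the condition is true;
  if 1980 <= year <= 1989;
  datalines;
1979 Tulsa 2.5
1984 Baltimore 1.2
1987 Wichita 4.0
1991 Dallas 3.1
;
run;

* print the rows that survived the filter;
proc print data=tornados_demo;
run;
```

With these rows, only the 1984 and 1987 observations remain in tornados_demo.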

### infile 'filename'

In the data step, read raw data from an external file with the infile statement. The input statement that follows names the variables to read.

```
data tornados;
infile 'tornados.dat';
* city is a character variable, so it needs a character informat;
input year city :$20. cost;
run;
```

### Set

The set statement reads observations from an existing SAS dataset. The following creates a dataset of 1980s tornado data from the larger set of tornado data.

```
data tornados_1980s;
set tornados;
* keep only observations from the 1980s;
if 1980 <= year <= 1989;
run;
```

## The PROC Step

### PROC SORT

```
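* sort by year, then by city within each year;
proc sort data=tornados;
by year city;
run;
* print one group per year (by-group printing needs sorted data);
proc print data=tornados;
by year;
run;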
```

### PROC Univariate

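PROC Univariate generates detailed descriptive statistics for numeric variables and, with the histogram statement, a histogram of the named variable.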

```
proc univariate data=tornados;
histogram year;
run;
```

### PROC means

Use proc means when you are only interested in basic descriptive statistics.
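A minimal sketch of proc means, assuming the tornados dataset created earlier with a numeric cost variable. The statistic keywords listed on the proc statement (n, mean, std, min, max) are one possible choice, not a requirement.

```
* assumes the tornados dataset with a numeric cost variable already exists;
proc means data=tornados n mean std min max;
  * var limits the summary to the listed variables;
  var cost;
run;
```

Leaving out the var statement summarizes every numeric variable in the dataset.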

### PROC freq
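PROC freq produces frequency tables for the variables named in a tables statement. The following is a minimal sketch, assuming the tornados dataset created earlier with year and city variables.

```
* assumes the tornados dataset with year and city already exists;
proc freq data=tornados;
  * one frequency table per listed variable;
  tables year city;
run;
```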

### PROC gplot

```
proc gplot data=tornados;
* plot vertical*horizontal: year on the vertical axis, cost on the horizontal axis;
plot year*cost;
title 'Year by Cost tornados';
run;
```

### PROC corr

PROC corr computes the pairwise correlations among the variables listed in the var statement.

```
proc corr data=grades;
var exam1 exam2 hwscore;
run;
```

### PROC reg

```
proc reg data=grades;
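* p: predicted values, r: residual analysis, cli and clm: 95 percent confidence limits for an individual prediction and for the mean response;
model final=exam1 hwscore / p r cli clm;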
plot final*hwscore;
run;
```

## Multiple Regression Analysis

### Variable Selection

PROC reg offers several methods for selecting variables, chosen with the selection= option of the model statement.

```
proc reg data=cdi;
model y = x1-x8 /selection=rsquare best=1;
model y = x1-x8 /selection=adjrsq best=5;
model y = x1-x8 /selection=cp best=10;
model y = x1-x8 /selection=forward slentry=0.10;
model y = x1-x8 /selection=stepwise slentry=0.10 slstay=0.10;
model y = x1-x8 /selection=backward slstay=0.10;
run;
```

