Using Joins in SAS

Other topics

Remarks:

As mentioned in the introduction, we can also use Mergeinside a data step which will be discussed under a separate topic. Joins play a very important role to blend and unify data according to the requirement.

Vertical Joining

Vertical join appends dataset B to dataset A providing both of them have similar variables. For example, we have sales for the month of Jan'17 in dataset A and sales for Feb'17 in dataset B. To create a dataset C that has sales of both Jan and Feb we use Vertical Join.

PROC SQL;
CREATE TABLE C AS
SELECT *
FROM A
UNION 
SELECT *
FROM B;
QUIT;

Now dataset C has observations from both A and B and is appended vertically.

Inner Join

Inner join creates a dataset that contains records that have matching values from both the tables. For example, we have a dataset A that contains customer information and a dataset B that contains credit card details. To get the credit card details of customers in dataset A, let us create dataset C

PROC SQL;
CREATE TABLE C AS
SELECT A.*, B.CC_NUM
FROM CUSTOMER A, CC_DETAILS B
WHERE A.CUSTOMERID=B.CUSTOMERID
QUIT;

Dataset C will have only matching observations from both the datasets.

Left Join

Left join returns all the observations in the left data set regardless of their key values but only observations with matching key values from the right data set. Considering the same example as above,

PROC SQL;
CREATE TABLE C AS
SELECT A.*, B.CC_NUMBER, B.START_DATE
FROM CUSTOMER A LEFT JOIN CC_DETAILS B
ON A.CUSTOMERID=B.CUSTOMERID
QUIT;

Dataset C contains all the values from the left table, plus matched values from the right table or missing values in the case of no match.

Right join

Like left join, right join selects all the observations from the right dataset and the matched records from the left table.

PROC SQL;
CREATE TABLE C AS
SELECT A.*, B.CC_NUMBER, B.START_DATE
FROM CUSTOMER A RIGHT JOIN CC_DETAILS B
ON A.CUSTOMERID=B.CUSTOMERID
QUIT;

Dataset C contains all the values from the right table, plus matched values from the left table or missing values in the case of no match.

Full Join

Full join selects all the observations from both data sets but there are missing values where the key value in each observation is found in one table only.

PROC SQL;
CREATE TABLE C AS
SELECT A.*, B.CC_NUMBER, B.START_DATE
FROM CUSTOMER A FULL JOIN CC_DETAILS B 
ON A.CUSTOMERID=B.CUSTOMERID
QUIT;

Dataset C will contain all records from both the tables and fill in . for missing matches on either side.

Parameters:

Type of joinOutput
Proc SqlSQL procedure inside SAS
Create TableCreates a SAS dataset
SelectSelects required variables from respective datasets
WhereSpecifies particular condition
QuitEnd the procedure

Contributors

Topic Id: 9900

Example Ids: 30452,30453,30454,30455,30456

This site is not affiliated with any of the contributors.