Quantcast

Documentation Center

  • Trial Software
  • Product Updates

outerjoin

Outer join between two tables

Syntax

Description

example

C = outerjoin(A,B) creates the table, C, as the outer join between the tables A and B by matching up rows using all the variables with the same name as key variables.

The outer join includes the rows that match between A and B, and also unmatched rows from either A or B, all with respect to the key variables. C contains all variables from both A and B, including the key variables.

example

C = outerjoin(A,B,Name,Value) performs the outer-join operation with additional options specified by one or more Name,Value pair arguments.

example

[C,ia,ib] = outerjoin(___) also returns index vectors, ia and ib, indicating the correspondence between rows in C and those in A and B respectively. You can use this syntax with any of the input arguments in the previous syntaxes.

Examples

expand all

Outer-Join Operation of Tables with One Variable in Common

Create a table, A.

A = table([5;12;23;2;15;6],...
    {'cheerios';'pizza';'salmon';'oreos';'lobster';'pizza'},...
    'VariableNames',{'Age','FavoriteFood'},...
    'RowNames',{'Amy','Bobby','Holly','Harry','Marty','Sally'})
A = 

             Age    FavoriteFood
             ___    ____________

    Amy       5     'cheerios'  
    Bobby    12     'pizza'     
    Holly    23     'salmon'    
    Harry     2     'oreos'     
    Marty    15     'lobster'   
    Sally     6     'pizza'     

Create a table, B, with one variable in common with A, called FavoriteFood.

B = table({'cheerios';'oreos';'pizza';'salmon';'cake'},...
    [110;160;140;367;243],...
    {'A-';'D';'B';'B';'C-'},...
    'VariableNames',{'FavoriteFood','Calories','NutritionGrade'})
B = 

    FavoriteFood    Calories    NutritionGrade
    ____________    ________    ______________

    'cheerios'      110         'A-'          
    'oreos'         160         'D'           
    'pizza'         140         'B'           
    'salmon'        367         'B'           
    'cake'          243         'C-'          

Use the outerjoin function to create a new table, C, with data from tables A and B.

C = outerjoin(A,B)
C = 

    Age    FavoriteFood_A    FavoriteFood_B    Calories    NutritionGrade
    ___    ______________    ______________    ________    ______________

    NaN    ''                'cake'            243         'C-'          
      5    'cheerios'        'cheerios'        110         'A-'          
     15    'lobster'         ''                NaN         ''            
      2    'oreos'           'oreos'           160         'D'           
     12    'pizza'           'pizza'           140         'B'           
      6    'pizza'           'pizza'           140         'B'           
     23    'salmon'          'salmon'          367         'B'           

Table C contains a separate variable for the key variable from A, called FavoriteFood_A, and the key variable from B, called FavoriteFood_B.

Merge Key Variable Pair to Single Variable

Create a table, A.

A = table({'a' 'b' 'c' 'e' 'h'}',[1 2 3 11 17]',...
    'VariableNames',{'Key1' 'Var1'})
A = 

    Key1    Var1
    ____    ____

    'a'      1  
    'b'      2  
    'c'      3  
    'e'     11  
    'h'     17  

Create a table, B, with common values in the variable Key1 between tables A and B, but also containing rows with values of Key1 not present in A.

B = table({'a','b','d','e'}',[4;5;6;7],...
    'VariableNames',{'Key1' 'Var2'})
B = 

    Key1    Var2
    ____    ____

    'a'     4   
    'b'     5   
    'd'     6   
    'e'     7   

Use the outerjoin function to create a new table, C, with data from tables A and B. Merge the key values into a single variable in the output table, C.

C = outerjoin(A,B,'MergeKeys',true)
C = 

    Key1    Var1    Var2
    ____    ____    ____

    'a'       1       4 
    'b'       2       5 
    'c'       3     NaN 
    'd'     NaN       6 
    'e'      11       7 
    'h'      17     NaN 

Variables in table C that came from A contain null values in the rows that have no match from B. Similarly, variables in C that came from B contain null values in those rows that had no match from A.

Outer-Join Operation of Tables and Indices to Values

Create a table, A.

A = table({'a' 'b' 'c' 'e' 'h'}',[1 2 3 11 17]',...
    'VariableNames',{'Key1' 'Var1'})
A = 

    Key1    Var1
    ____    ____

    'a'      1  
    'b'      2  
    'c'      3  
    'e'     11  
    'h'     17  

Create a table, B, with common values in the variable Key1 between tables A and B, but also containing rows with values of Key1 not present in A.

B = table({'a','b','d','e'}',[4;5;6;7],...
    'VariableNames',{'Key1' 'Var2'})
B = 

    Key1    Var2
    ____    ____

    'a'     4   
    'b'     5   
    'd'     6   
    'e'     7   

Use the outerjoin function to create a new table, C, with data from tables A and B. Match up rows with common values in the key variable, Key1, but also retain rows whose key values don't have a match.

Also, return index vectors, ia and ib indicating the correspondence between rows in C and rows in A and B respectively.

[C,ia,ib] = outerjoin(A,B)
C = 

    Key1_A    Var1    Key1_B    Var2
    ______    ____    ______    ____

    'a'         1     'a'         4 
    'b'         2     'b'         5 
    'c'         3     ''        NaN 
    ''        NaN     'd'         6 
    'e'        11     'e'         7 
    'h'        17     ''        NaN 


ia =

     1
     2
     3
     0
     4
     5


ib =

     1
     2
     0
     3
     4
     0

The index vectors ia and ib contain zeros to indicate the rows in table C that do not correspond to rows in tables A or B, respectively.

Left Outer-Join Operation of Tables and Indices to Values

Create a table, A.

A = table({'a' 'b' 'c' 'e' 'h'}',[1 2 3 11 17]',...
    'VariableNames',{'Key1' 'Var1'})
A = 

    Key1    Var1
    ____    ____

    'a'      1  
    'b'      2  
    'c'      3  
    'e'     11  
    'h'     17  

Create a table, B, with common values in the variable Key1 between tables A and B, but also containing rows with values of Key1 not present in A.

B = table({'a','b','d','e'}',[4;5;6;7],...
    'VariableNames',{'Key1' 'Var2'})
B = 

    Key1    Var2
    ____    ____

    'a'     4   
    'b'     5   
    'd'     6   
    'e'     7   

Use the outerjoin function to create a new table, C, with data from tables A and B. Ignore rows in B whose key values do not match any rows in A.

Also, return index vectors, ia and ib indicating the correspondence between rows in C and rows in A and B respectively.

[C,ia,ib] = outerjoin(A,B,'Type','left')
C = 

    Key1_A    Var1    Key1_B    Var2
    ______    ____    ______    ____

    'a'        1      'a'         4 
    'b'        2      'b'         5 
    'c'        3      ''        NaN 
    'e'       11      'e'         7 
    'h'       17      ''        NaN 


ia =

     1
     2
     3
     4
     5


ib =

     1
     2
     0
     4
     0

All values of ia are nonzero indicating that all rows in C have corresponding rows in A.

Input Arguments

expand all

A,B — Input tablestables

Input tables, specified as tables.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'Keys',2 uses the second variable in A and the second variable in B as key variables.

'Keys' — Variables to use as keyspositive integer | vector of positive integers | variable name | cell array of variable names | logical vector

Variables to use as keys, specified as the comma-separated pair consisting of 'Keys' and a positive integer, vector of positive integers, variable name, cell array of variable names, or logical vector.

You cannot use the 'Keys' name-value pair argument with the 'LeftKeys' and 'RightKeys' name-value pair arguments.

Example: 'Keys',[1 3] uses the first and third variables in A and B as a key variables.

'LeftKeys' — Variables to use as keys in Apositive integer | vector of positive integers | variable name | cell array of variable names | logical vector

Variables to use as keys in A, specified as the comma-separated pair consisting of 'LeftKeys' and a positive integer, vector of positive integers, variable name, cell array of variable names, or logical vector.

You must use the 'LeftKeys' name-value pair argument in conjunction with the 'RightKeys' name-value pair argument. 'LeftKeys' and 'RightKeys' both must specify the same number of key variables. outerjoin pairs key values based on their order.

Example: 'LeftKeys',1 uses only the first variable in A as a key variable.

'RightKeys' — Variables to use as keys in Bpositive integer | vector of positive integers | variable name | cell array of variable names | logical vector

Variables to use as keys in B, specified as the comma-separated pair consisting of 'RightKeys' and a positive integer, vector of positive integers, variable name, cell array of variable names, or logical vector.

You must use the 'RightKeys' name-value pair argument in conjunction with the 'LeftKeys' name-value pair argument. 'LeftKeys' and 'RightKeys' both must specify the same number of key variables. outerjoin pairs key values based on their order.

Example: 'RightKeys',3 uses only the third variable in B as a key variable.

'MergeKeys' — Merge keys flagfalse (default) | true | 0 | 1

Merge keys flag, specified as the comma-separated pair consisting of 'MergeKeys' and either false, true, 0 or 1.

false

outerjoin includes two separate variables in the output table, C, for each key variable pair from tables A and B.

This is the default behavior.

true

outerjoin includes a single variable in the output table, C, for each key variable pair from tables A and B.

outerjoin creates the single variable by merging the key values from A and B, taking values from A where a corresponding row exists in A, and taking values from B otherwise.

If you specify, 'MergeKeys',true, then outerjoin includes all key variables in the output table, C, and overrides the inclusion or exclusion of key variables specified via the 'LeftVariables' and 'RightVariables' name-value pair arguments.

'LeftVariables' — Variables from A to include in Cpositive integer | vector of positive integers | variable name | cell array containing one or more variable names | logical vector

Variables from A to include in C, specified as the comma-separated pair consisting of 'LeftVariables' and a positive integer, vector of positive integers, variable name, cell array of variable names, or logical vector.

You can use 'LeftVariables' to include or exclude key variables as well as nonkey variables from the output, C.

By default, outerjoin includes all variables from A.

'RightVariables' — Variables from B to include in Cpositive integer | vector of positive integers | variable name | cell array containing one or more variable names | logical vector

Variables from B to include in C, specified as the comma-separated pair consisting of 'RightVariables' and a positive integer, vector of positive integers, variable name, cell array of variable names, or logical vector.

You can use 'RightVariables' to include or exclude key variables as well as nonkey variables from the output, C.

By default, outerjoin includes all the variables from B.

'Type' — Type of outer join operation'full' (default) | 'left' | 'right'

Type of outer-join operation, specified as the comma-separated pair consisting of 'Type' and either 'full', 'left', or 'right'.

  • For a left outer join, C contains rows corresponding to key values in A that do not match any values in B, but not vice-versa.

  • For a right outer join, C contains rows corresponding to key values in B that do not match any values in A, but not vice-versa.

By default, outerjoin does a full outer join and includes unmatched rows from both A and B.

Output Arguments

expand all

C — Outer join from A and Btable

Outer join from A and B, returned as a table. The output table, C, contains one row for each pair of rows in tables A and B that share the same combination of key values. If A and B contain variables with the same name, outerjoin adds a unique suffix to the corresponding variable names in C. Variables in C that came from A contain null values in those rows that had no match from B. Similarly, variables in C that came from B contain null values in those rows that had no match from A.

In general, if there are m rows in table A and n rows in table B that all contain the same combination of values in the key variables, table C contains m*n rows for that combination. C also contains rows corresponding to key value combinations in one input table that do not match any row the other input table.

C contains the horizontal concatenation of A(ia,LeftVars) and B(ib,RightVars) sorted by the values in the key variables. By default, LeftVars consists of all the variables of A, and RightVars consists of all the from B. Otherwise, LefttVars consists of the variables specified by the 'LeftVariables' name-value pair argument, and RightVars consists of the variables specified by the 'RightVariables' name-value pair argument.

You can store additional metadata such as descriptions, variable units, variable names, and row names in the table. For more information, see Table Properties.

ia — Index to Acolumn vector

Index to A, returned as a column vector. Each element of ia identifies the row in table A that corresponds to that row in the output table, C. The vector ia contains zeros to indicate the rows in C that do not correspond to rows in A.

ib — Index to Bcolumn vector

Index to B, returned as a column vector. Each element of ib identifies the row in table B that corresponds to that row in the output table, C. The vector ib contains zeros to indicate the rows in C that do not correspond to rows in B.

More About

expand all

Key Variable

Variable used to match and combine data between the input tables, A and B.

See Also

|

Was this topic helpful?