Join-Object

Wouldn’t it be nice if you join two object arrays just like you join tables in SQL. A common task where would already be useful is updating a stored object list with a newly retrieved one. I am convinced that a Join-Object cmdlet would be used very often if was provided as a standard cmdlet. Unfortunately due to the overhead and uncertainly of a homemade cmdlet everybody tries to build their own specific implementation dealing with the related complexity and performance.

Cmdlet

Syntax

<Object[]> | InnerJoin|LeftJoin|RightJoin|FullJoin <Object[]> [-On <String>|<Array>|<ScriptBlock>] [-Merge <HashTable>|<ScriptBlock>] [-Eq <String>]

InnerJoin|LeftJoin|RightJoin|FullJoin <Object[]>,<Object[]> [-On <String>|<Array>|<ScriptBlock>] [-Merge <HashTable>|<ScriptBlock>] [-Eq <String>]

InnerJoin|LeftJoin|RightJoin|FullJoin -LeftTable <Object[]> -RightTable <Object[]> [-On <String>|<Array>|<ScriptBlock>] [-Merge <HashTable>|<ScriptBlock>] [-Eq <String>]

Commands

The Join-Object (alias Join) function is one function with several aliases that joins two tables (each consisting out of an array of PSCustomObjects) similar to the respective SQL Join instructions. The default join type is an InnerJoin.

  • InnerJoin-Object (alias InnerJoin)
    Returns records that have matching values in both tables.
  • LeftJoin-Object (alias LeftJoin)
    Returns all records from the left table and the matched records from the right table.
  • RightJoin-Object (alias RightJoin)
    Returns all records from the right table and the matched records from the right table.
  • FullJoin-Object (alias FullJoin)
    Returns all records when there is a match in either left or right table.

All Join commands are compatible with PowerShell version 2 and higher.

Parameters

-LeftTable <Object[]> and -RightTable <Object[]>

The LeftTable and RightTable parameters define the left – and right table to be joined. There are three possible syntaxis to supply the tables:

  • Using the PowerShell pipeline: <LeftTable> | Join <RightTable>
  • Supplying both tables in an array (separated by a comma) at the first argument position: Join <LeftTable>,<RightTable>
  • Supplying both tables with named arguments: Join -Left <LeftTable> -Right <RightTable>

If only one table is supplied (Join <Table>), a self self-join will be performed on the table.

-On <String>|<Array>|<ScriptBlock> and -Equals <String>

The -On (alias Using) parameter defines the condition that specify how to join tables and which rows to include in the (inner) result set. The -On parameter supports the following formats:

String [-Equals <String>]
If the -On value is a String and the -Equals <String> parameters is supplied, the property in the left column defined by the -On value requires to be equal to the property in the right column defined by the -Equals value to be included in the (inner) result set.

String or Array
If the value is a String or Array the -On parameter is similar to the SQL using clause. This means that all the listed properties require to be equal (at the left and right side) to be included in the (inner) result set. The listed properties will output a single value by default (see also -Expressions).

ScriptBlock
Any conditional expression where $Left defines the left row, $Right defines the right row.
The ScriptBlock type has the most complex comparison possibilities but is considerable slower than the other types.

If the -On parameter is omitted or an unknown type, a cross-join will be performed.

-Merge <HashTable>|<ScriptBlock>

Defines how the specific columns with the same name should be merged. The -Merge parameter accepts to types: a HashTable containing the specific merge expression for each column or ScriptBlock containing the default merge expression for all columns that have no merge expression defined.
Where in the expression:

  • $_ holds each column name.
  • $Left holds the left row and $Right holds the right row.
  • $Left.$_ holds each left value and $Right.$_ holds each right value.
  • $LeftIndex holds the current left row index and $RightIndex holds the current right row index.

Notes:

Expressions are only executed if both left value (Left.$_) and right value (Left.$_) are existing (including values that are $Null) otherwise just the exiting value is returned.

If no expression is defined for a column the expression {$Left.$_, $Right.$_} is used. This means that both values are assigned (in an array) to the current property.
The default expression for columns defined by the -On <String>, -Equals <String> and -On <Array> is: {$Left.$_} and can only be overruled by a column specific expression defined in a hash table. This means that a single value (either $Left or $Right which is not equal to $Null) is assigned to the current property.
To use column specific expressions and define a default expression use a zero length key name for the default expression, e.g. -Merge @{"" = {$Left.$_}; "Column Name" = {$Right.$_}}

Examples

Given the following two tables, $Employee and $Department:

InnerJoin on Department = Name

LeftJoin using country
(selecting Department.Name and Department.Country)

InnerJoin on Employee.Department = Department.Name and Employee.Country = Department.Country
(returning only the left name and – country)

Cross Join

Update service list
(replace existing services on name and add new ones)

Import-CSV .\Svc.csv | LeftJoin (Get-Service) Name {$Right.$_} | Export-CSV .\Svc.csv

Update process list and only insert processes with a higher CPU

Import-CSV .\CPU.csv | LeftJoin (Get-Process) ID {If ($Left.CPU -gt $Right.CPU) {$Left.$_} Else {$Right.$_}} | Export-CSV .\CPU.csv

 

The Join-Object cmdlet was originally published at StackOverflow