Set Operations

Intersection
The intersection operator is implemented as a special case of join where the join condition is equality on all of the fields.

Cross Product
The cross product is implemented as a special case of join where there is no join condition, so every tuple of one relation is paired with every tuple of the other.
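Both special cases can be sketched with one generic join. The following is a minimal illustration, not a production operator: it assumes relations are in-memory lists of duplicate-free tuples, and `nested_loop_join`, `cross_product`, and `intersection` are hypothetical names chosen for this example.

```python
def nested_loop_join(r, s, condition=None):
    """Pair each tuple of r with each tuple of s.

    `condition` is an optional predicate on (r_tuple, s_tuple);
    when it is None, every pair qualifies.
    """
    return [(a, b) for a in r for b in s
            if condition is None or condition(a, b)]


def cross_product(r, s):
    # Join with no join condition: keep every pair, concatenating the fields.
    return [a + b for a, b in nested_loop_join(r, s)]


def intersection(r, s):
    # Join whose condition is equality on all fields; since both sides of a
    # matching pair are identical, one copy is enough for the result.
    return [a for a, b in nested_loop_join(r, s, lambda a, b: a == b)]


R = [(1, "a"), (2, "b")]
S = [(2, "b"), (3, "c")]
pairs = cross_product(R, S)    # 4 tuples, each with 4 fields
common = intersection(R, S)    # only (2, "b") is in both relations
```

A real system would normally choose a hash or sort-merge join for the intersection, since the equality condition makes those algorithms applicable; the nested loop is used here only because it handles both cases with the same code.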

Union
There are two valid implementations of the union operator: one based on sorting and one based on hashing.

Sorting

 * 1) Sort $$R$$ using the combination of all the fields; similarly, sort $$S$$
 * 2) Scan the sorted $$R$$ and $$S$$ in parallel and merge them, eliminating duplicates

As a refinement, we can generate sorted runs of $$R$$ and $$S$$ and merge these runs in parallel, rather than fully sorting each relation first.
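The two sorting steps can be sketched as follows. This is an illustrative in-memory version: Python's built-in `sorted` stands in for an external sort, and `sort_union` is a hypothetical name, not something defined in the text.

```python
def sort_union(r, s):
    # Step 1: sort both relations on the combination of all fields.
    r_sorted = sorted(r)
    s_sorted = sorted(s)

    # Step 2: scan both sorted inputs in parallel, always taking the
    # smaller head tuple, and skip anything equal to the last tuple emitted.
    result = []
    i = j = 0
    while i < len(r_sorted) or j < len(s_sorted):
        take_r = j >= len(s_sorted) or (
            i < len(r_sorted) and r_sorted[i] <= s_sorted[j]
        )
        if take_r:
            t = r_sorted[i]
            i += 1
        else:
            t = s_sorted[j]
            j += 1
        if not result or result[-1] != t:
            result.append(t)
    return result


union_sorted = sort_union([(3, "c"), (1, "a"), (1, "a")], [(2, "b"), (3, "c")])
```

Because the merged stream is in sorted order, equal tuples are always adjacent, so comparing against the last emitted tuple is enough to eliminate duplicates.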

Hashing

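 * 1) Partition $$R$$ and $$S$$ on disk by applying the same hash function to all fields, so that identical tuples fall into the corresponding partitions $$R_l$$ and $$S_l$$
 * 2) For each partition $$l$$, build an in-memory hash table for $$S_l$$
 * 3) Scan $$R_l$$. If a tuple is already in the hash table, discard it; otherwise add it to the hash table
 * 4) When the scan of $$R_l$$ is finished, write the contents of the hash table to disk as the result for that partition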
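A minimal sketch of these steps, under simplifying assumptions: Python lists stand in for on-disk partitions, a `set` stands in for the in-memory hash table, and `hash_union` and `num_partitions` are hypothetical names introduced for the example.

```python
def hash_union(r, s, num_partitions=4):
    # Step 1: partition both relations with the same hash function, so equal
    # tuples from R and S always land in the same-numbered partition.
    r_parts = [[] for _ in range(num_partitions)]
    s_parts = [[] for _ in range(num_partitions)]
    for t in r:
        r_parts[hash(t) % num_partitions].append(t)
    for t in s:
        s_parts[hash(t) % num_partitions].append(t)

    result = []
    for r_l, s_l in zip(r_parts, s_parts):
        # Step 2: build the in-memory table for S_l (duplicates collapse).
        table = set(s_l)
        # Step 3: scan R_l, adding only tuples that are not already present.
        for t in r_l:
            if t not in table:
                table.add(t)
        # Step 4: the table now holds the union for this partition.
        result.extend(table)
    return result


union_hashed = hash_union([(1, "a"), (2, "b")], [(2, "b"), (3, "c")])
```

The output order depends on the hash values, so callers that need a particular order would have to sort the result afterwards.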

Difference
There are two valid implementations of the difference operator, $$R - S$$. They closely follow the union implementations described above.

Sorting

 * 1) Sort $$R$$ using the combination of all fields; similarly, sort $$S$$
 * 2) Scan the sorted $$R$$ and $$S$$ in parallel, writing a tuple of $$R$$ to the result only if it does not appear in $$S$$
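The sort-based difference can be sketched in the same style as the union. The version below is an in-memory illustration that assumes $$R$$ contains no duplicate tuples; `sort_difference` is a hypothetical name.

```python
def sort_difference(r, s):
    # Step 1: sort both relations on the combination of all fields.
    r_sorted = sorted(r)
    s_sorted = sorted(s)

    # Step 2: walk R in order while advancing a cursor through S.
    result = []
    j = 0
    for t in r_sorted:
        # Skip S tuples that sort before t; they cannot match t or any
        # later tuple of R.
        while j < len(s_sorted) and s_sorted[j] < t:
            j += 1
        # Emit t only when the S cursor is not sitting on an equal tuple.
        if j < len(s_sorted) and s_sorted[j] == t:
            continue
        result.append(t)
    return result


diff_sorted = sort_difference([(3,), (1,), (2,)], [(2,), (4,)])
```

The cursor into $$S$$ only moves forward, so each relation is scanned once after sorting.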

Hashing

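 * 1) Partition $$R$$ and $$S$$ on disk by applying the same hash function to all fields, producing corresponding partitions $$R_l$$ and $$S_l$$
 * 2) For each partition $$l$$, build an in-memory hash table for $$S_l$$
 * 3) Scan $$R_l$$. If a tuple is not in the hash table, write it out to the result; otherwise discard it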
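As a rough sketch of the hash-based difference, assuming in-memory lists in place of disk partitions and a duplicate-free $$R$$ (`partition` and `hash_difference` are hypothetical helper names):

```python
def partition(relation, num_partitions):
    # Route each tuple to a bucket chosen by hashing all of its fields.
    parts = [[] for _ in range(num_partitions)]
    for t in relation:
        parts[hash(t) % num_partitions].append(t)
    return parts


def hash_difference(r, s, num_partitions=4):
    result = []
    # Step 1: partition R and S with the same hash function.
    for r_l, s_l in zip(partition(r, num_partitions),
                        partition(s, num_partitions)):
        # Step 2: in-memory hash table for S_l.
        table = set(s_l)
        # Step 3: a tuple of R_l survives only if the probe of S_l fails.
        result.extend(t for t in r_l if t not in table)
    return result


diff_hashed = hash_difference([(1,), (2,), (3,)], [(2,), (4,)])
```

Unlike the union, the hash table is never modified during the scan of $$R_l$$; it is only probed, and the output comes from $$R_l$$ rather than from the table.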