Imprints for Joining

From MonetDB
  • Use same binning method as in Column Imprints paper
  • Perform an initial join on two tables R and S. Build the "join imprint" on the smaller table, R.
  • For each cache line of data in R, create an imprint over the bins of S, with a bit set for every bin of S that contains a value joining with that cache line.
  • Compress and construct dictionary as in Column Imprints paper
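
The construction steps above can be sketched roughly as follows. This is a minimal illustration, not MonetDB code: it assumes an equi-join on a single column, a fixed number of values per cache line, and bin borders given as a sorted list. The names `bin_of`, `column_imprint`, `join_imprint`, and `VALUES_PER_LINE` are hypothetical, and the set lookup merely stands in for the initial join.

```python
from bisect import bisect_right

VALUES_PER_LINE = 4  # hypothetical: values per cache line, kept small for illustration


def bin_of(value, borders):
    """Return the bin index of value; borders is a sorted list of split points."""
    return bisect_right(borders, value)


def column_imprint(column, borders, per_line=VALUES_PER_LINE):
    """One bit vector per cache line: bit b is set if the line holds a value in bin b."""
    imprints = []
    for start in range(0, len(column), per_line):
        bits = 0
        for v in column[start:start + per_line]:
            bits |= 1 << bin_of(v, borders)
        imprints.append(bits)
    return imprints


def join_imprint(r_column, s_column, borders, per_line=VALUES_PER_LINE):
    """One bit vector per cache line of R: bit b is set if some value in the
    line joins with a value of S that falls into bin b of S."""
    s_values = set(s_column)  # assumption: stands in for the initial equi-join
    imprints = []
    for start in range(0, len(r_column), per_line):
        bits = 0
        for v in r_column[start:start + per_line]:
            if v in s_values:
                # For an equi-join the matching S value equals v, so it lies in bin_of(v).
                bits |= 1 << bin_of(v, borders)
        imprints.append(bits)
    return imprints


borders = [10, 20, 30]                      # four bins over the values of S
s = [5, 7, 12, 18, 22, 25, 31, 35]
r = [5, 6, 12, 99, 31, 101, 102, 103]
print(column_imprint(s, borders))           # [3, 12]
print(join_imprint(r, s, borders))          # [3, 8]
```

The bins are taken from S in both functions, so a join imprint of R can be compared bit by bit against the column imprint of S.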

Performing a Join Operation

  • Join the "join imprint" of R with the column imprint of S, treating two imprints as matching when their bitwise AND is non-zero, i.e. when they share at least one set bin.
  • Use the result as input to the final join.
  • Open question: is this cheaper overall than joining R and S directly?
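
A rough sketch of the imprint-level join described above, under the assumption that a cache line of R and a cache line of S form a candidate pair when their two bit vectors share at least one set bit (bitwise AND non-zero). The function name `candidate_line_pairs` is hypothetical, and the nested loop is only the simplest way to show the test; it is not meant as an efficient evaluation strategy.

```python
def candidate_line_pairs(join_imprints_r, column_imprints_s):
    """Return (r_line, s_line) index pairs whose imprints overlap.

    join_imprints_r:   one int bit vector per cache line of R (the join imprint)
    column_imprints_s: one int bit vector per cache line of S (the column imprint)
    """
    pairs = []
    for i, r_bits in enumerate(join_imprints_r):
        if r_bits == 0:
            continue  # no value in this cache line of R joins with S at all
        for j, s_bits in enumerate(column_imprints_s):
            if r_bits & s_bits:  # assumption: overlap test is a non-zero AND
                pairs.append((i, j))
    return pairs


# Only the surviving cache-line pairs need to be passed to the final join.
print(candidate_line_pairs([3, 8, 0], [3, 12]))  # [(0, 0), (1, 1)]
```

Because imprints are approximate, the surviving pairs can contain false positives, so the final join on the actual values is still required.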

Expected Benefits

  • If the tables are nicely correlated, the imprint should be compressible
  • Size of index should be much less than even the size of R
  • It should be possible to efficiently combine the original column imprints with the "join imprint" in select + join operations
  • Updatability: as simple as for column imprints.
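
To illustrate the compressibility claim: when the tables are well correlated, consecutive cache lines of R tend to produce identical join imprints, and runs of identical bit vectors can be stored once with a count. The sketch below uses plain run-length encoding as a simplified stand-in for the dictionary scheme of the Column Imprints paper; it is not that scheme, and the name `run_length_compress` is hypothetical.

```python
def run_length_compress(imprints):
    """Collapse consecutive identical imprints into (bits, count) pairs."""
    runs = []
    for bits in imprints:
        if runs and runs[-1][0] == bits:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([bits, 1])    # start a new run
    return [(bits, count) for bits, count in runs]


# Six imprints collapse to three entries when neighbouring lines agree.
print(run_length_compress([3, 3, 3, 8, 8, 0]))  # [(3, 3), (8, 2), (0, 1)]
```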