3 Comments

You could retrain the two embedding models jointly by minimizing the L2 distance between the embeddings of the same input and maximizing it for different inputs.

This training can be achieved using a contrastive loss.
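A minimal sketch of that idea, assuming PyTorch and hypothetical model names (model_a, model_b): matching rows from the two models are treated as positive pairs whose L2 distance is minimized, while mismatched rows are pushed apart up to a margin.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z_a, z_b, margin=1.0):
    """z_a, z_b: (batch, dim) embeddings of the SAME batch of inputs,
    one from each embedding model. Diagonal pairs are positives;
    all off-diagonal pairs are negatives."""
    dists = torch.cdist(z_a, z_b, p=2)                      # (batch, batch) L2 distances
    pos = torch.eye(z_a.size(0), dtype=torch.bool, device=z_a.device)

    pos_loss = dists[pos].pow(2).mean()                     # pull same-input embeddings together
    neg_loss = F.relu(margin - dists[~pos]).pow(2).mean()   # push different inputs apart, up to margin
    return pos_loss + neg_loss

# Hypothetical joint fine-tuning loop over the two embedding models:
# opt = torch.optim.Adam(list(model_a.parameters()) + list(model_b.parameters()), lr=1e-5)
# for batch in loader:
#     loss = contrastive_loss(model_a(batch), model_b(batch))
#     opt.zero_grad(); loss.backward(); opt.step()
```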


That's an interesting point... Thank you, AVI CHAWLA!


Would you do dimensionality reduction, say PCA, before or after concatenation?
