Local Entities Similarity

  • Matches parsed command components (e.g., action, location, item) to the robot's context.
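  • Embeds each component and each known entity with an embedding model, then compares the embeddings by cosine similarity to find the closest semantic match.
  • Example: "kitchen" is resolved to "kitchen_table" when "kitchen_table" and "office" are the only entities in the context.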
  • Benchmarking:
      • Multiple embedding models were evaluated on accuracy, average embedding time, and peak GPU memory.
      • The OpenAI text-embedding-3-small and text-embedding-3-large models were used as a reference.
      • Synonyms generated with GPT-4 were used to compute the evaluation metrics.
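
The matching step described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `embed` is a hypothetical stand-in that builds a bag of character trigrams so the example runs without downloading a model, and `best_match` is a name introduced here. In practice `embed` would call one of the embedding models from the benchmark; only the cosine-similarity comparison is taken from the description.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for an embedding model: bag of character trigrams."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, known_entities):
    """Return the known entity closest to the query, with its score."""
    query_vec = embed(query)
    scored = [(e, cosine_similarity(query_vec, embed(e))) for e in known_entities]
    return max(scored, key=lambda pair: pair[1])


# The robot's context only knows these two entities.
match, score = best_match("kitchen", ["kitchen_table", "office"])
print(match, round(score, 3))
```

With a real embedding model the scores would reflect meaning and not spelling, so a synonym with no shared characters could still be matched.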

Embedding models benchmark

Model                               Accuracy (%)  Avg. embedding time (s)  Peak GPU memory (MB)
paraphrase-TinyBERT-L6-v2           78.49         0.009                    264.63
all-MiniLM-L6-v2                    78.49         0.008                    95.40
all-MiniLM-L12-v2                   76.88         0.009                    135.39
all-mpnet-base-v2                   67.74         0.011                    428.76
multi-qa-mpnet-base-cos-v1          80.65         0.012                    426.85
paraphrase-MiniLM-L6-v2             78.49         0.011                    95.40
paraphrase-distilroberta-base-v1    72.04         0.011                    322.62
stsb-roberta-large                  76.88         0.012                    1365.72
roberta-large-nli-stsb-mean-tokens  76.88         0.014                    1365.72
paraphrase-albert-small-v2          60.75         0.013                    54.08
all-roberta-large-v1                46.77         0.015                    1365.72
all-distilroberta-v1                56.45         0.015                    323.90
multi-qa-distilbert-cos-v1          77.96         0.014                    263.86
paraphrase-albert-small-v2          60.75         0.014                    52.95
paraphrase-MiniLM-L3-v2             79.03         0.013                    74.72
text-embedding-3-small              73.12         0.331                    -
text-embedding-3-large              78.49         0.348                    -
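
One way the accuracy and timing columns could be measured is sketched below. It assumes accuracy means top-1 matching accuracy over (synonym, expected entity) pairs and that time is averaged per query embedding; neither definition is confirmed by the source. `toy_embed`, `benchmark`, and the two test cases are hypothetical and only keep the example self-contained. Peak GPU memory is not measured here.

```python
import math
import time


def toy_embed(text):
    """Hypothetical stand-in for a real model: 26-dim letter-frequency vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def benchmark(embed_fn, cases, entities):
    """Return (top-1 accuracy in %, average seconds per query embedding)."""
    if not cases:
        raise ValueError("cases must not be empty")
    # Known entities are embedded once up front and are not timed.
    entity_vecs = {e: embed_fn(e) for e in entities}
    correct = 0
    total_time = 0.0
    for query, expected in cases:
        start = time.perf_counter()
        query_vec = embed_fn(query)
        total_time += time.perf_counter() - start
        predicted = max(entity_vecs, key=lambda e: cosine(query_vec, entity_vecs[e]))
        correct += predicted == expected
    return 100.0 * correct / len(cases), total_time / len(cases)


# Made-up test cases for illustration; the real evaluation used GPT-4 synonyms.
entities = ["kitchen_table", "office"]
cases = [("kitchn", "kitchen_table"), ("ofice", "office")]
accuracy, avg_time = benchmark(toy_embed, cases, entities)
print(f"accuracy={accuracy:.2f}%  avg_time={avg_time:.6f}s")
```

Passing a different `embed_fn` for each model keeps the comparison on identical cases, which is what makes the rows of a table like this comparable.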