
Local Entities Similarity

  • Matches parsed command components (e.g., action, location, item) to the robot's context.
  • Uses embedding models to compute semantic similarity; candidates are ranked by cosine similarity of their embeddings.
  • Example: "kitchen" inferred as "kitchen_table" if only "kitchen_table" and "office" exist.
  • Benchmarking:
      ◦ Multiple embedding models evaluated for specific tasks.
      ◦ OpenAI text-embedding-3-small/large models used as a reference.
      ◦ Synonyms generated using GPT-4 for the evaluation set.
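The matching step above can be sketched in a few lines. This is a minimal illustration, not the project's actual implementation: the hand-written vectors stand in for real model embeddings (in practice they would come from one of the benchmarked models), and `match_entity` is a hypothetical helper name.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def match_entity(query_vec, entities):
    # Return the known entity whose embedding is closest to the query.
    return max(entities, key=lambda name: cosine_similarity(query_vec, entities[name]))

# Toy 3-dimensional vectors standing in for real embeddings (assumption:
# a real system would embed the strings with a sentence-embedding model).
embeddings = {
    "kitchen_table": [0.9, 0.1, 0.0],
    "office": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of the word "kitchen"
print(match_entity(query, embeddings))  # → kitchen_table
```

Because "kitchen" lands far closer to "kitchen_table" than to "office" in embedding space, the parsed location resolves to `kitchen_table`, mirroring the example above.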

Embedding models benchmark

| Model | Accuracy (%) | Average embedding time (s) | Peak GPU memory (MB) |
|---|---|---|---|
| paraphrase-TinyBERT-L6-v2 | 78.49 | 0.009 | 264.63 |
| all-MiniLM-L6-v2 | 78.49 | 0.008 | 95.40 |
| all-MiniLM-L12-v2 | 76.88 | 0.009 | 135.39 |
| all-mpnet-base-v2 | 67.74 | 0.011 | 428.76 |
| multi-qa-mpnet-base-cos-v1 | 80.65 | 0.012 | 426.85 |
| paraphrase-MiniLM-L6-v2 | 78.49 | 0.011 | 95.40 |
| paraphrase-distilroberta-base-v1 | 72.04 | 0.011 | 322.62 |
| stsb-roberta-large | 76.88 | 0.012 | 1365.72 |
| roberta-large-nli-stsb-mean-tokens | 76.88 | 0.014 | 1365.72 |
| paraphrase-albert-small-v2 | 60.75 | 0.013 | 54.08 |
| all-roberta-large-v1 | 46.77 | 0.015 | 1365.72 |
| all-distilroberta-v1 | 56.45 | 0.015 | 323.90 |
| multi-qa-distilbert-cos-v1 | 77.96 | 0.014 | 263.86 |
| paraphrase-albert-small-v2 | 60.75 | 0.014 | 52.95 |
| paraphrase-MiniLM-L3-v2 | 79.03 | 0.013 | 74.72 |
| text-embedding-3-small | 73.12 | 0.331 | - |
| text-embedding-3-large | 78.49 | 0.348 | - |

The text-embedding-3-small/large models are served through the OpenAI API, so peak local GPU memory does not apply.