Sampling strategy for FewShotClassifier #986

BenjaminBossan · 2023-06-26T13:18:37Z

We just released the FewShotClassifier class in #979.

One feature we thought could be interesting to implement is a way to change the sampling strategy, i.e. how the examples used for few-shot learning are chosen from the training data.

Right now, the sampling is hard-coded and tries to include each label at least once. This seems reasonable, but there are situations where other strategies could make sense. Therefore, I would like to see a feature that allows setting the sampling strategy as a parameter. Options that come to mind:

Stratified sampling: roughly what we have now, but not quite
Fully random sampling: sample regardless of label
Similarity-based sampling: use the current sample to find similar samples from the training data
Custom sampling: Allow users to pass a callable that performs the sampling
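
To make the options above more concrete, here is a minimal sketch of how they could sit behind a single parameter. It is not the current skorch implementation: the function names, the `strategy` and `query` parameters, and the shared `(X, y, n, rng, query)` signature are all assumptions made for illustration, and the string similarity from `difflib` is only a stand-in for whatever similarity measure would actually be used.

```python
import random
from collections import defaultdict
from difflib import SequenceMatcher


def sample_stratified(X, y, n, rng, query=None):
    """Round-robin over labels so every label is used before any repeats."""
    by_label = defaultdict(list)
    for idx, label in enumerate(y):
        by_label[label].append(idx)
    for indices in by_label.values():
        rng.shuffle(indices)
    chosen = []
    while len(chosen) < n and any(by_label.values()):
        for label in list(by_label):
            if by_label[label] and len(chosen) < n:
                chosen.append(by_label[label].pop())
    return chosen


def sample_random(X, y, n, rng, query=None):
    """Sample indices uniformly, ignoring the labels."""
    return rng.sample(range(len(X)), min(n, len(X)))


def sample_similar(X, y, n, rng, query=None):
    """Pick the n training texts most similar to the sample being classified.

    Uses plain string similarity as a placeholder (an assumption).
    """
    scores = [SequenceMatcher(None, query, x).ratio() for x in X]
    ranked = sorted(range(len(X)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Hypothetical registry mapping strategy names to sampling functions.
STRATEGIES = {
    "stratified": sample_stratified,
    "random": sample_random,
    "similarity": sample_similar,
}


def get_examples(X, y, n, strategy="stratified", query=None, seed=0):
    """Return n (text, label) pairs chosen by a named strategy or a callable."""
    rng = random.Random(seed)
    fn = strategy if callable(strategy) else STRATEGIES[strategy]
    indices = fn(X, y, n, rng, query)
    return [(X[i], y[i]) for i in indices]


X = ["good movie", "bad movie", "great film", "awful film", "nice plot"]
y = ["pos", "neg", "pos", "neg", "pos"]

stratified = get_examples(X, y, n=2, strategy="stratified")
similar = get_examples(X, y, n=1, strategy="similarity", query="great film")
custom = get_examples(X, y, n=1, strategy=lambda X, y, n, rng, query: [0])
```

Giving every strategy the same signature means a user-supplied callable can be passed in place of a strategy name without any special casing, which is one possible way to cover the custom sampling option.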

BenjaminBossan · 2023-06-29T12:42:00Z

Closing this in favor of #989, which collects all TODOs in relation to LLM classification.

BenjaminBossan added the enhancement label Jun 26, 2023

BenjaminBossan closed this as completed Jun 29, 2023
