
Question about the continued instruction-tuning phase #6

Open
Jiaxin-Wen opened this issue Jun 1, 2024 · 5 comments

Comments

@Jiaxin-Wen

In Section 2.5, the models undergo continued fine-tuning on several open-source instruction-tuning datasets, including the training sets of GSM8K and MATH.

I'm wondering whether, after continued fine-tuning, the models are still evaluated with few-shot prompting or with zero-shot prompting.
For example, suppose the model is fine-tuned on GSM8K with the following data format:
`Question:\n{question}\nAnswer:\n{answer}`
At inference time, do you still prepend multiple Q-A pairs to the input, or provide just the question (which would be consistent with the continued fine-tuning stage)? A sketch of the two formats follows below.
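For concreteness, here is a minimal sketch (my own illustration, not code from the paper) of the two evaluation formats, reusing the data format quoted above; `demos` stands in for whatever in-context Q-A pairs the harness would use:

```python
# Minimal sketch of zero-shot vs. few-shot prompt construction,
# reusing the fine-tuning data format quoted above. Illustrative only.
TEMPLATE = "Question:\n{question}\nAnswer:\n{answer}"

def zero_shot_prompt(question: str) -> str:
    # Matches the continued fine-tuning format: a single question, no demos.
    return f"Question:\n{question}\nAnswer:\n"

def few_shot_prompt(question: str, demos: list[tuple[str, str]]) -> str:
    # Prepends k solved Q-A pairs before the test question.
    context = "\n\n".join(TEMPLATE.format(question=q, answer=a) for q, a in demos)
    return f"{context}\n\nQuestion:\n{question}\nAnswer:\n"
```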

@Jiaxin-Wen
Author

Moreover, I find that simply fine-tuning a SOTA LM (e.g., Llama-3-8B) on the original GSM8K training set yields essentially no improvement over its few-shot performance:

| Llama-3-8B | GSM8K accuracy (%) |
| --- | --- |
| few-shot prompting | 55.57 |
| fine-tuning | 55.79 |

I would like to know whether this matches your experimental results. If so, could you also share the data you used for continued instruction-tuning? That would be very helpful for reproducing the results in the paper.

@xiangyue9607
Collaborator

Thanks! We used few-shot prompting for the evaluation. You can find more details in our evaluation code and in the implementation details in the paper.

@Jiaxin-Wen
Author

Do you add an EOS token during pre-training or continued fine-tuning?

@Jiaxin-Wen
Author

Since all of the training data is in a one-shot format, I'm wondering whether I should remove the EOS token during pre-training or fine-tuning to adapt the model to few-shot evaluation; the sketch below illustrates the mismatch I have in mind.
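To make the concern concrete, here is a minimal sketch of the two tokenization choices (my own, with an assumed tokenizer checkpoint, not necessarily the authors' setup):

```python
# Illustrative sketch of the EOS question above; the checkpoint name is an
# assumption, not necessarily what the paper used.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

example = "Question:\nWhat is 2 + 3?\nAnswer:\n2 + 3 = 5. The answer is 5."

# Variant A: append EOS after each one-shot example. The model learns to
# emit EOS right after an answer, so greedy decoding stops cleanly -- but
# the few-shot demonstrations seen at evaluation time contain no EOS,
# creating a train/inference mismatch.
with_eos = tokenizer(example + tokenizer.eos_token, add_special_tokens=False)

# Variant B: drop EOS between examples. This matches the few-shot prompt
# format, but the model may keep generating a new "Question:" block instead
# of stopping, so the harness has to truncate the output itself.
without_eos = tokenizer(example, add_special_tokens=False)
```

(In practice, many few-shot evaluation harnesses side-step this by cutting the generation at a stop string such as `\nQuestion:`, regardless of how EOS was handled in training.)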

@Jiaxin-Wen
Author

Or is there any other trick that you used to adapt to few-shot evaluation?
