Replies: 6 comments 10 replies
-
The training scripts save the models to a default directory, but you can change that with the corresponding setting. I don't have any particular experience training on Colab, but I will say that training on CPU is slow compared to GPU.
-
I don't think the layout has changed in the past year. There are a couple of places to look for documentation: https://stanfordnlp.github.io/stanza/training.html and https://github.com/stanfordnlp/stanza-train/ . Ultimately it looks for the data processed into CoNLL format in the expected data directory.
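As a quick sanity check on the CoNLL data before training, a minimal sketch (the sample data and the helper are illustrative only, not Stanza's actual loader) that splits CoNLL-U text into sentence blocks on blank lines:

```python
# Minimal CoNLL-U sanity check: split text into sentence blocks.
# SAMPLE is a tiny illustrative snippet, not real training data.

SAMPLE = """\
# text = Hello world
1\tHello\thello\tINTJ\t_\t_\t0\troot\t_\t_
2\tworld\tworld\tNOUN\t_\t_\t1\tobj\t_\t_

# text = Bye
1\tBye\tbye\tINTJ\t_\t_\t0\troot\t_\t_
"""

def split_sentences(conllu_text):
    """Split CoNLL-U text into sentences (lists of lines) on blank lines."""
    sentences, current = [], []
    for line in conllu_text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences

sentences = split_sentences(SAMPLE)
print(len(sentences))  # 2
```

If this kind of splitting produces an unexpected sentence count on your file, the blank-line separation is likely off somewhere.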
-
I agree that is weird. I will note that the error is complaining about "Old English" with no underscore, whereas the long name was written as "Old_English". Perhaps that is the source of the problem. If not, please include a full stack trace if possible.
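For what it's worth, a mismatch like that can often be avoided by normalizing the language name before passing it along. A hypothetical helper (not part of Stanza) showing the idea:

```python
def normalize_lang_name(name):
    """Replace spaces with underscores so 'Old English' matches 'Old_English'."""
    return name.strip().replace(" ", "_")

print(normalize_lang_name("Old English"))  # Old_English
```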
-
Well, that does tell you exactly where it's looking... are the data files where it expects, or do you have them somewhere else?
-
Hi, thanks for bearing with me. I have managed to sort the issue, thanks for your help. When running the command
I'm getting the following error:
However, my dataset already presents that information for each sentence. I'm attaching an excerpt from my dataset, and another one from the dummy data in the repo.
Any ideas on how to sort this out? I'm guessing it has to do with the formatting of my dataset. It seems that in VSCode, the spacing is slightly different between the first and second instances. However, the tabs and spaces are exactly the same. Thanks for your help!
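One way to narrow this down is to scan the data and report any sentence block that is missing a `# text` comment. A rough sketch, assuming standard CoNLL-U blank-line sentence separation (the SAMPLE string stands in for your dataset file; in practice you would read it from disk):

```python
# Report 1-based indices of sentence blocks lacking a "# text" comment.
# SAMPLE is illustrative; the second sentence deliberately omits the comment.

SAMPLE = """\
# text = Good sentence
1\tGood\tgood\tADJ\t_\t_\t0\troot\t_\t_

1\tBad\tbad\tADJ\t_\t_\t0\troot\t_\t_
"""

def find_missing_text(conllu_text):
    """Return 1-based indices of sentences that have no '# text' comment."""
    missing, idx, has_text, in_sentence = [], 0, False, False
    # Append a trailing blank line so the last sentence is also flushed.
    for line in conllu_text.splitlines() + [""]:
        if line.strip():
            if not in_sentence:
                idx += 1
                in_sentence, has_text = True, False
            if line.startswith("# text"):
                has_text = True
        elif in_sentence:
            if not has_text:
                missing.append(idx)
            in_sentence = False
    return missing

print(find_missing_text(SAMPLE))  # [2]
```

Running something like this over the real file should point at exactly which sentence trips the error.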
-
I'm actually going to push back on the idea that each sentence already has
"# text" - perhaps there is a different format for some sentence, as you
suggest. I added a bit more information to the error message to hopefully
help you narrow down where it's happening. You can install Stanza with
the extra information in the error message as follows:
pip install --no-deps --upgrade --force -i https://test.pypi.org/simple/ stanza==1.6.1.1
-
Hi all!
I'm about to start training a new language model, and I have some hardware restrictions. I'm researching Google Colab and how to train models there, and I was wondering whether someone has already done this, and what the best approach would be.
I would also like to ask how to save the model architecture once training has finished. As far as I know, the model will be trained with the PyTorch library. Should I add the corresponding lines to the script for each of the processors? Is there a better way? Or is it already implemented in the code?
Thanks for your attention, and have a nice day!