How to use the library on multiple GPUs #1036
Replies: 4 comments 4 replies
-
No part of the package uses multiple GPUs at the moment. However, training
an NER model takes about an hour on a 2080ti, which doesn't seem too long.
How long is it taking you?
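For what it's worth, a quick way to confirm how many GPUs PyTorch can see, and to pin the run to a single card, is a check like the sketch below. This is generic PyTorch rather than anything Stanza-specific, and the choice of GPU 0 is just an example.

```python
import os

# Restrict the process to a single card before CUDA is initialized
# (only applies if the variable isn't already set). "0" is an arbitrary
# example; pick whichever GPU you want the training run to use.
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0")

import torch

print("visible GPUs:", torch.cuda.device_count())
if torch.cuda.is_available():
    print("training would run on:", torch.cuda.get_device_name(0))
else:
    print("no GPU visible; training would fall back to CPU")
```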
…On Tue, May 24, 2022, 5:39 AM Chunontherocks ***@***.***> wrote:
Hi,
I have recently been testing Stanza 1.3.0 and have a few questions about it.
First, are there any methods or examples showing how to use the library on
multiple GPUs with Stanza 1.3.0 and torch 1.11.0+cu113?
If that is not available, I would also like to know how we can reduce the
NER model training time.
Thanks!
-
Those sizes are in bytes? As in, 1.3MB? It seems very problematic that it
would take a week to train a model of that size! Is this a public dataset
or something you can share with us so we can take a look? I wonder if
there's something degenerate happening which we need to check for, such as
everything merged into one long sentence, for example.
For reference, one of the datasets I recently trained in an hour or two
is Germeval2014, which is 4MB as a .bio file or 20MB in our .json format.
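If you want to rule out the merged-sentence problem yourself, a quick look at the sentence lengths in the training file usually makes it obvious. The sketch below assumes the .json file is a JSON list of sentences, each sentence being a list of token entries; adjust the loading if your data is laid out differently.

```python
import json
import sys

# Path is just an example; point it at your own train.json
path = sys.argv[1] if len(sys.argv) > 1 else "train.json"
with open(path, encoding="utf-8") as fin:
    sentences = json.load(fin)

lengths = sorted((len(sentence) for sentence in sentences), reverse=True)
print(f"{len(sentences)} sentences")
print(f"average length: {sum(lengths) / len(lengths):.1f} tokens")
print("ten longest:", lengths[:10])
# A handful of enormous "sentences" here is a strong sign that sentence
# splitting failed and large chunks of text were merged together.
```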
…On Wed, May 25, 2022 at 5:30 AM Chunontherocks ***@***.***> wrote:
Hi there,
I spent more than a week training the model on an A5000, and when I checked,
GPU utilization was only around 30%.
Before training, I prepared the data: dev.json (411K), test.json (411K), and
train.json (1.3M).
I don't know whether that training time is reasonable, or whether the data is
large enough that training simply takes that long.
Since we can't use multiple GPUs, I would like to know if there are other
ways to reduce the training time, and what I should check again.
Thanks.
-
Without being able to see what's going wrong, I'm just guessing. However, one guess that comes to mind is that unusually long sentences might cause a slowdown. The model itself would train the same as long as the sentences fit on the GPU, but it would loop over the same data many more times before hitting the triggers that check for the end of training. There should be a block of output printed when you first start training.
What does it say for your dataset? Another possibility is that you could send me the dataset offline, and I'll delete it after taking a look.
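To make the "loop over the same data many more times" point concrete, here is a rough back-of-envelope sketch. The batch size and check interval below are made-up numbers for illustration, not the actual values in the NER trainer: with the same text packed into far fewer, much longer sentences, each pass over the data contains far fewer optimizer steps, so many more passes are needed before the periodic check that can stop training.

```python
# Illustrative arithmetic only; the real batch size and stopping check in
# the NER trainer may differ from these made-up numbers.
batch_size = 32            # sentences per optimizer step (hypothetical)
check_every_steps = 500    # how often training checks whether to stop (hypothetical)

def passes_before_check(num_sentences):
    steps_per_pass = max(1, num_sentences // batch_size)
    return check_every_steps / steps_per_pass

print(passes_before_check(10_000))  # well-split data: ~1.6 passes per check
print(passes_before_check(200))     # same text merged into long sentences: ~83 passes per check
```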
-
You should be able to see it on my profile
…On Sun, May 29, 2022 at 8:01 PM Chunontherocks ***@***.***> wrote:
It was similar to what was shown on the command line when training started.
I would like to know which email address I can send the datasets to. Thanks!
-
Hi,
I have recently been testing Stanza 1.3.0 and have a few questions about it.
First, are there any methods or examples showing how to use the library on multiple GPUs with Stanza 1.3.0 and torch 1.11.0+cu113?
If that is not available, I would also like to know how we can reduce the NER model training time.
Thanks!