Also download and tokenize datasets in pure C #230
matiasdelellis
started this conversation in
Ideas
Replies: 1 comment
That's just because we're initializing from OpenAI GPT-2 weights, and we're using Python to download and write them conveniently.
"There is no need for 245MB of PyTorch or 107MB of cPython"...?
I loved this statement, but when I went to install the dependencies, it seems that several gigabytes of Python dependencies are needed just to download the datasets. 😞
I guess this could also be implemented in pure C. Of course, I say this without understanding how it works 😅, but your projects are great, and I think this would be a good goal in line with the project. 😬