Dataset help #1
Comments
I am so glad to see the datasets you've found. I'm still interested in this project and am still looking for a way to implement the training strategy in TensorFlow graph execution mode. I have two strategies for solving this problem.
I haven't yet decided which would be the best option for me. Thanks
As I understand the paper, all of the inputs are tokenized, which implies that the main transformer block sees a constant shape at its input. I would envision the implementation as a "dataloader" that combines the tokenization and embedding parts of the process (Sections 2.1 and 2.2 of the paper). The first sentence of Section 2.2 hints at that: The dataloader would be conditional on the modality of the input data (vision vs. environment observations vs. text) and would apply the corresponding tokenization and embedding functions. In the end, though, the input to the transformer is a simple stream of tokens!
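To make the idea concrete, here is a minimal sketch of such a modality-conditional tokenizing step. It is an illustration under stated assumptions, not the paper's or this repository's implementation: the function names, the CRC-based word tokenizer (a stand-in for a real subword tokenizer), and the constants (vocabulary size, bin count, mu-law parameters) are all hypothetical choices made for the example. Image patches would need their own embedding function and are left out here.

```python
import math
import zlib
from typing import Callable, Dict, List, Sequence, Tuple

# All constants are illustrative assumptions, not values confirmed in this thread.
TEXT_VOCAB_SIZE = 32000  # assumed size of the text vocabulary
NUM_BINS = 1024          # assumed number of bins for continuous values
MU = 100.0               # assumed mu-law parameter
M = 256.0                # assumed mu-law scale


def tokenize_text(text: str) -> List[int]:
    """Toy stand-in for a subword tokenizer: one id per whitespace-separated word."""
    return [zlib.crc32(w.encode("utf-8")) % TEXT_VOCAB_SIZE for w in text.split()]


def mu_law(x: float) -> float:
    """Compress a float with a mu-law style curve (sign-preserving)."""
    return math.copysign(math.log(abs(x) * MU + 1.0) / math.log(M * MU + 1.0), x)


def tokenize_continuous(values: Sequence[float]) -> List[int]:
    """Mu-law encode, clip to [-1, 1], bin uniformly, then shift past the text ids."""
    tokens = []
    for v in values:
        y = max(-1.0, min(1.0, mu_law(float(v))))
        bin_index = min(NUM_BINS - 1, int((y + 1.0) / 2.0 * NUM_BINS))
        tokens.append(TEXT_VOCAB_SIZE + bin_index)
    return tokens


# Hypothetical registry: the "dataloader" picks a tokenizer based on the modality.
TOKENIZERS: Dict[str, Callable[..., List[int]]] = {
    "text": tokenize_text,
    "continuous": tokenize_continuous,
}


def build_stream(items: Sequence[Tuple[str, object]]) -> List[int]:
    """Turn (modality, payload) pairs into one flat stream of integer tokens."""
    stream: List[int] = []
    for modality, payload in items:
        if modality not in TOKENIZERS:
            raise ValueError(f"unsupported modality: {modality!r}")
        stream.extend(TOKENIZERS[modality](payload))
    return stream
```

Calling `build_stream([("text", "pick up the block"), ("continuous", [0.0, -0.3])])` returns a single list of integers, so whatever consumes it downstream never needs to know which modality each token came from. Keeping the text and continuous ids in disjoint ranges is one simple way to let a shared embedding table cover both.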
Hence, we must pre-train the ViT for image patch tokens before building the stream of tokens.
Hey @OrigamiDream!
I am also looking into implementing the Gato paper. In order to complement your work, I decided to start looking into the datasets mentioned in the paper first. I started investigating to see if we could access the vision/language datasets used and if we could train SOTA agents to generate the control datasets: torch-gato/datasets
From my initial investigation, we have open-source access to about 83.62% of the datasets/environments used during training (according to the sample weight). If we use similar open-source variants of the private language/vision datasets, this number climbs to 94.32%. There is still a lot of work to do in order to build a training infrastructure for all the SOTA or near-SOTA agents to generate the control environment expert data.
Let me know if you are still interested in this implementation and how we could collaborate!
Cheers,
Thomas