Corrected Attention docstring #893

louis-rf · 2024-02-05T11:19:33Z

TLDR: Corrected the Attention docstring, which was missing the head dimension for the arguments mask and nonbatched_bias.

The Attention module is only called in four places, with the following shapes:

- TriangleAttention:           [q_data: (B, Q, ..); m_data: (B, K, ..); mask: (B, H=1, Q=1, K); nonbatched_bias: (H, Q, K)]
- TemplateEmbedding:           [q_data: (B, Q, ..); m_data: (B, K, ..); mask: (B, H=1, Q=1, K)]
- MSARowAttentionWithPairBias: [q_data: (B, Q, ..); m_data: (B, K, ..); mask: (B, H=1, Q=1, K); nonbatched_bias: (H, Q, K)]
- MSAColumnAttention:          [q_data: (B, Q, ..); m_data: (B, K, ..); mask: (B, H=1, Q=1, K)]

The mask always has new size-1 axes as its 2nd and 3rd dimensions (the head and query axes). In all four cases the call is wrapped in alphafold.model.mapping.inference_subbatch, but this only changes the sizes of the axes, not the number of dimensions.
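As a minimal sketch of what "new axes" means here, assume a caller starts from a two-axis mask of shape (B, K) and adds the head and query axes by indexing with None. The array name `seq_mask` and the sizes are hypothetical and only serve the illustration; the actual call sites may build the mask differently.

```python
import jax.numpy as jnp

B, K = 2, 4  # arbitrary illustrative sizes

# Hypothetical per-key mask with one row per batch element.
seq_mask = jnp.ones((B, K), dtype=bool)

# Insert size-1 head and query axes: (B, K) -> (B, H=1, Q=1, K).
mask = seq_mask[:, None, None, :]

print(mask.shape)
```

The two size-1 axes let the same per-key mask broadcast over every head and every query.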

The docstring gives incorrect shapes for the arguments mask and nonbatched_bias. Based on the usage, it should be:

      mask: A mask for the attention, shape [batch_size, N_heads, N_queries, N_keys].
      nonbatched_bias: Shared bias, shape [N_heads, N_queries, N_keys].

instead of:

      mask: A mask for the attention, shape [batch_size, N_queries, N_keys].
      nonbatched_bias: Shared bias, shape [N_queries, N_keys].

This is clear when looking at where mask is used:

...

    logits = jnp.einsum('bqhc,bkhc->bhqk', q, k)
    if nonbatched_bias is not None:
      logits += jnp.expand_dims(nonbatched_bias, axis=0)
    logits = jnp.where(mask, logits, _SOFTMAX_MASK)
...

since the output of the einsum has shape bhqk, i.e. [batch_size, N_heads, N_queries, N_keys].
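The excerpt can be exercised on small random arrays to check the shapes. This is a sketch under assumptions, not the module itself: the sizes are arbitrary, and `SOFTMAX_MASK` is a stand-in value for the module's `_SOFTMAX_MASK` constant.

```python
import jax
import jax.numpy as jnp

SOFTMAX_MASK = -1e9  # assumed stand-in for the module's _SOFTMAX_MASK

B, Q, K, H, C = 2, 3, 4, 5, 6  # arbitrary illustrative sizes
key_q, key_k, key_b = jax.random.split(jax.random.PRNGKey(0), 3)
q = jax.random.normal(key_q, (B, Q, H, C))
k = jax.random.normal(key_k, (B, K, H, C))
nonbatched_bias = jax.random.normal(key_b, (H, Q, K))

# Mask with the shape the callers pass: (B, H=1, Q=1, K); last key masked out.
mask = jnp.ones((B, 1, 1, K), dtype=bool).at[:, :, :, -1].set(False)

# Same three steps as the excerpt above.
logits = jnp.einsum('bqhc,bkhc->bhqk', q, k)                # (B, H, Q, K)
logits = logits + jnp.expand_dims(nonbatched_bias, axis=0)  # (1, H, Q, K) broadcasts
logits = jnp.where(mask, logits, SOFTMAX_MASK)              # (B, 1, 1, K) broadcasts
weights = jax.nn.softmax(logits)                            # softmax over the key axis

# The three-axis shape from the old docstring does not broadcast against
# (B, H, Q, K) when B != H: it is right-aligned as (1, B, Q, K).
try:
    jnp.broadcast_shapes(logits.shape, (B, Q, K))
    old_mask_shape_ok = True
except (ValueError, TypeError):
    old_mask_shape_ok = False

print(logits.shape, old_mask_shape_ok)
```

Note that a two-axis bias of shape (Q, K) would still broadcast after expand_dims, so for nonbatched_bias the docstring mismatch is with what the callers pass rather than with what the arithmetic accepts.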

I believe some implementations of attention won't have a head dimension in the mask. Since a per-head mask is not used in AlphaFold, it might be worth removing that dimension from the mask where attention is called (and including an expand_dims for the head dimension within the Attention module). But changing only the docstring is easier, and the current code is still a valid implementation of attention, so I think that is the way to go.
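For comparison, the alternative could look roughly like the sketch below. It is not part of this pull request: `add_head_axis` is a hypothetical helper, and the masking value and sizes are placeholders chosen for illustration.

```python
import jax.numpy as jnp

def add_head_axis(mask):
  """Hypothetical helper: [batch, N_queries, N_keys] -> [batch, 1, N_queries, N_keys]."""
  return jnp.expand_dims(mask, axis=1)

B, H, Q, K = 2, 5, 3, 4  # arbitrary illustrative sizes

# Callers would pass a mask without a head axis (query axis still size 1).
mask_3d = jnp.ones((B, 1, K), dtype=bool)
logits = jnp.zeros((B, H, Q, K))

# Inside the attention module, the head axis is added just before masking.
mask_4d = add_head_axis(mask_3d)
masked_logits = jnp.where(mask_4d, logits, -1e9)  # -1e9 is a placeholder value

print(mask_4d.shape, masked_logits.shape)
```

This would touch every call site as well as the module, which is the extra work the docstring-only change avoids.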

google-cla · 2024-02-05T11:19:36Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

louis-rf mentioned this pull request Feb 5, 2024

Attention docstring missing head dimension for arguments mask and nonbatched_bias #894

Open

Corrected Attention docstring

d530c9f

Corrected Attention docstring missing head dimension for arguments mask and nonbatched_bias.

louis-rf force-pushed the main branch from cb98745 to d530c9f Compare February 5, 2024 11:52
