TechTalks Newsletter

A gentle introduction to transformer networks

Ben Dickson
May 2

In recent years, the transformer model has become one of the most important advances in deep learning and deep neural networks. It is mainly used for advanced applications in natural language processing: Google uses it to enhance its search engine results, and OpenAI has used transformers to create its famous GPT-2 and GPT-3 models.

Since its debut in 2017, the transformer architecture has evolved and branched out into many different variants, expanding beyond language tasks into other areas. Transformers have been used for time series forecasting. They are the key innovation behind AlphaFold, DeepMind’s protein structure prediction model. Codex, OpenAI's source code–generation model, is based on transformers. More recently, transformers have found their way into computer vision, where they are slowly replacing convolutional neural networks (CNNs) in many complicated tasks.

Researchers are still exploring ways to improve transformers and use them in new applications.
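At the core of every transformer variant mentioned above is the same building block: scaled dot-product self-attention, where each token weighs every other token in the sequence. As a rough sketch (not the full architecture, which also includes multi-head projections, feed-forward layers, and positional encodings), the mechanism can be written in a few lines of NumPy:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (seq, seq) pairwise similarity
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights

# Toy example: 4 tokens, each an 8-dimensional embedding
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out, w = scaled_dot_product_attention(x, x, x)  # self-attention: Q = K = V
```

Each row of `w` tells you how much one token "attends" to every other token; the output mixes token embeddings according to those weights. This parallel, sequence-wide mixing is what lets transformers scale so well compared to recurrent networks.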

Read a brief explainer on TechTalks about what makes transformers exciting and how they work.

For more explainers:

  • What is neural architecture search?

  • What are graph neural networks (GNN)?

  • Demystifying deep reinforcement learning

  • What are membership inference attacks?



© 2022 Ben Dickson