GPT-NeoX-20B
Family: GPT
Pretraining Architecture: Decoder
Pretraining Task: LM
Extension: Similar to GPT-3, but with rotary positional embeddings instead of learned positional embeddings, parallel attention and feed-forward layers, a different initialization scheme, and all-dense layers instead of alternating dense/sparse layers (see the sketch after this entry)
Application: Same as GPT-3
Date (of first known publication): 04/2022
Num. Params: 20B
Corpus: The Pile, an 840 GB open-source text dataset that combines 22 pre-existing datasets
License: Open, Apache-2.0
Lab: EleutherAI
Last updated:
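
The sketch below illustrates the two architectural points noted in the Extension field: rotary position embeddings applied to queries and keys, and a transformer block whose attention and feed-forward sublayers are computed in parallel and summed into a single residual update (versus GPT-3's sequential layout). It is a minimal, self-contained illustration only; the class and parameter names are hypothetical and do not come from EleutherAI's gpt-neox repository.

```python
# Minimal sketch (not EleutherAI's code): a GPT-NeoX-style block with
# parallel attention + feed-forward sublayers and rotary position embeddings.
import torch
import torch.nn as nn


def rotary_embed(x, base=10000):
    """Apply rotary position embeddings to x of shape (batch, heads, seq, head_dim)."""
    b, h, t, d = x.shape
    half = d // 2
    freqs = base ** (-torch.arange(0, half, dtype=torch.float32) / half)
    angles = torch.arange(t, dtype=torch.float32)[:, None] * freqs[None, :]  # (t, half)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)


class ParallelBlock(nn.Module):
    """Attention and MLP each get their own LayerNorm and are summed into
    one residual update, so the two sublayers can run concurrently."""

    def __init__(self, dim, heads, mlp_ratio=4):
        super().__init__()
        self.heads, self.head_dim = heads, dim // heads
        self.ln_attn = nn.LayerNorm(dim)
        self.ln_mlp = nn.LayerNorm(dim)
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, mlp_ratio * dim), nn.GELU(), nn.Linear(mlp_ratio * dim, dim)
        )

    def attention(self, x):
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        shape = (b, t, self.heads, self.head_dim)
        q, k, v = (z.view(shape).transpose(1, 2) for z in (q, k, v))
        q, k = rotary_embed(q), rotary_embed(k)  # rotary instead of learned positions
        mask = torch.triu(torch.full((t, t), float("-inf")), diagonal=1)  # causal mask
        att = torch.softmax(q @ k.transpose(-2, -1) / self.head_dim**0.5 + mask, dim=-1)
        out = (att @ v).transpose(1, 2).reshape(b, t, d)
        return self.proj(out)

    def forward(self, x):
        # GPT-3 (sequential):  x = x + attn(ln1(x)); x = x + mlp(ln2(x))
        # GPT-NeoX (parallel): a single residual sum of both sublayer outputs
        return x + self.attention(self.ln_attn(x)) + self.mlp(self.ln_mlp(x))


if __name__ == "__main__":
    block = ParallelBlock(dim=64, heads=4)
    print(block(torch.randn(2, 16, 64)).shape)  # torch.Size([2, 16, 64])
```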