[Paper Review] Using the Output Embedding to Improve Language Models

In this post I review Using the Output Embedding to Improve Language Models, a paper that studies weight tying: sharing the same weight matrix between a model's input embedding and its output embedding. The paper is cited in Attention Is All You Need, which introduced the Transformer and refers to it when describing how the Transformer's embedding layers share the same weights. That made me curious enough to read it. The link to the original paper is below. Using the Output Embedding to Improve Language Models We study the topmost weight matrix of neural network l..

Original post: [Paper Review] Using the Output Embedding to Improve Language Models - Weight tying
