[Short Paper Review] IMAGEBIND: One Embedding Space to Bind Them All

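I read a recently published paper and wrote up a short summary. If anything is missing or wrong, please leave a comment.

ImageBind is a model in which six modalities (image, text, audio, depth, thermal, IMU) share a single embedding space. To achieve this, data paired with images is sufficient.

Background

It is already well known that a single image can carry information about several other modalities. One example that combines image and sound is as follows. Beyond this, it is now fairly common to train on pairs of two modalities, such as image and text. However, training data of this kind is clearly limited (labe..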
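The claim that image-paired data is sufficient can be made concrete with a toy sketch. It assumes a symmetric InfoNCE-style contrastive loss between image embeddings and the embeddings of one other modality. The excerpt above does not spell out the objective, so the function names, the temperature value, and the random "embeddings" here are all hypothetical and purely illustrative.

```python
import numpy as np

def l2_normalize(x):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def info_nce(img, other, temperature=0.07):
    """Symmetric InfoNCE loss; row i of `img` is paired with row i of `other`."""
    img = l2_normalize(img)
    other = l2_normalize(other)
    logits = img @ other.T / temperature  # (batch, batch) similarity matrix

    def cross_entropy_on_diagonal(l):
        # Numerically stable log-softmax over each row; the true pair sits on the diagonal.
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the image-to-other and other-to-image directions.
    return 0.5 * (cross_entropy_on_diagonal(logits) + cross_entropy_on_diagonal(logits.T))

# Hypothetical data: 8 paired samples with 16-dimensional embeddings.
rng = np.random.default_rng(0)
images = rng.normal(size=(8, 16))
audio_aligned = images + 0.01 * rng.normal(size=(8, 16))  # nearly matches its image
audio_random = rng.normal(size=(8, 16))                   # unrelated to the images

loss_aligned = info_nce(images, audio_aligned)
loss_random = info_nce(images, audio_random)
print(f"aligned pairs: {loss_aligned:.4f}, unrelated pairs: {loss_random:.4f}")
```

In this toy setting the aligned pairs should give a much lower loss than the unrelated ones. Applying the same objective to each (image, other modality) pairing is one way every modality could be pulled toward the image embeddings, and therefore toward each other, without direct pairs between the non-image modalities.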

Original post: [Short Paper Review] IMAGEBIND: One Embedding Space to Bind Them All
