
[Paper Short Review] Do sequence-to-sequence VAEs learn global features of sentences

Keypoints

  • VAE architecture.
  • A classification task, but with a quite interesting training setup.
  • It is again unclear how the latent code acts.
  • It uses the $\delta$-VAE free-bits formulation, but why? Apparently just to prevent posterior collapse.

Questions and Answers

Which model does it use?

A VAE built on a seq2seq LSTM autoencoder.
$$
\text{$L$ words: } x = (x_1, x_2, \cdots, x_L) \\
\text{$L$ embedded vectors: } (e_1, \cdots, e_L) \\
h_1, \cdots, h_L = \mathbf{LSTM}(e_1, \cdots, e_L)
$$

Next, the latent code is generated from the last hidden state $h_L$:

$$
\mu = \mathit{L_1}h_L \\
\sigma^2 = \exp{(\mathit{L_2}h_L)} \\
q_\phi(z|x) = \mathcal{N}(z|\mu, \mathbf{diag}(\sigma^2))
$$
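
A minimal PyTorch sketch of the encoder side (tokens → embeddings → LSTM → $\mu$, $\sigma^2$ → sampled $z$). The class and dimension names (`Encoder`, `emb_dim`, `hid_dim`, `lat_dim`) are illustrative assumptions, not taken from the paper's code.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512, lat_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.to_mu = nn.Linear(hid_dim, lat_dim)       # L_1 in the equations above
        self.to_logvar = nn.Linear(hid_dim, lat_dim)   # L_2 in the equations above

    def forward(self, x):
        # x: (batch, L) token ids -> embedded vectors (e_1, ..., e_L)
        e = self.embed(x)
        # h_1, ..., h_L = LSTM(e_1, ..., e_L); keep only the final hidden state h_L
        _, (h_L, _) = self.lstm(e)
        h_L = h_L.squeeze(0)                           # (batch, hid_dim)
        mu = self.to_mu(h_L)
        logvar = self.to_logvar(h_L)                   # log(sigma^2)
        # reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return z, mu, logvar
```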

Then the decoding step:
$$
h_0', \cdots, h_L' = \mathbf{LSTM}([e_{BOS};z], [e_1;z], \cdots, [e_L;z])
$$

Finally, the next-word distribution is
$$
p_\theta(x_{i+1}|x_1,\cdots, x_i, z) = \mathrm{softmax}(wh_i'+b)
$$
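
A matching sketch of the decoder, again assuming PyTorch: $z$ is concatenated to every input embedding (teacher forcing with a prepended BOS token), and a linear layer plus softmax gives the next-word distribution. `bos_id` and the dimensions are assumptions.

```python
class Decoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512, lat_dim=32, bos_id=1):
        super().__init__()
        self.bos_id = bos_id
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # the latent code is concatenated to every input embedding: [e_i ; z]
        self.lstm = nn.LSTM(emb_dim + lat_dim, hid_dim, batch_first=True)
        self.proj = nn.Linear(hid_dim, vocab_size)       # w, b in the softmax equation

    def forward(self, x, z):
        # teacher forcing: prepend BOS so step i sees x_1..x_i when predicting x_{i+1}
        bos = torch.full((x.size(0), 1), self.bos_id, dtype=torch.long, device=x.device)
        e = self.embed(torch.cat([bos, x], dim=1))       # (batch, L+1, emb_dim)
        z_rep = z.unsqueeze(1).expand(-1, e.size(1), -1) # repeat z at every time step
        h, _ = self.lstm(torch.cat([e, z_rep], dim=-1))  # h'_0, ..., h'_L
        return self.proj(h)                              # logits; softmax -> p(x_{i+1} | x_1..x_i, z)
```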

The objective function is the evidence lower bound (ELBO) on the marginal log-likelihood:

$$
\mathrm{ELBO}(x, \theta, \phi) = -D_{KL}(q_\phi(z|x) \,\|\, p(z)) + \mathbb{E}_{q_\phi} [\log{p_\theta}(x|z)]
$$
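
For concreteness, a sketch of the ELBO as a training loss (negated, so it is minimized), assuming the `Encoder`/`Decoder` sketches above: the KL term against $\mathcal{N}(0, I)$ is computed in closed form and the reconstruction term via cross-entropy. `pad_id` is an assumption.

```python
import torch.nn.functional as F

def neg_elbo(encoder, decoder, x, pad_id=0):
    z, mu, logvar = encoder(x)
    logits = decoder(x, z)                             # (batch, L+1, vocab)
    # reconstruction: h'_i predicts x_{i+1}; the trailing EOS prediction is dropped for brevity
    rec = F.cross_entropy(logits[:, :-1].reshape(-1, logits.size(-1)),
                          x.reshape(-1), ignore_index=pad_id, reduction='sum')
    # closed-form KL( q_phi(z|x) || N(0, I) )
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl                                    # minimizing this maximizes the ELBO
```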

What data is used?

Four small versions of labeled datasets with topic or sentiment labels ($\sim$70MB):

  • AG News
  • Amazon
  • Yahoo
  • Yelp

How does it deal with posterior collapse?

The objective function is modified using the free-bits formulation of the $\delta$-VAE [2]. For a desired rate $\lambda$:

$$
\max(D_{KL}(q_\phi(z|x) \,\|\, p(z)), \lambda) - \mathbb{E}_{q_\phi}[\log{p_\theta(x|z)}]
$$
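
A sketch of the free-bits modification on top of the loss above: the KL term is clamped from below at the target rate $\lambda$, so once it falls under $\lambda$ there is no gradient pushing it further toward zero (the mechanism behind posterior collapse).

```python
def neg_elbo_free_bits(encoder, decoder, x, lam=8.0, pad_id=0):
    z, mu, logvar = encoder(x)
    logits = decoder(x, z)
    rec = F.cross_entropy(logits[:, :-1].reshape(-1, logits.size(-1)),
                          x.reshape(-1), ignore_index=pad_id, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # max(KL, lambda): below the target rate the KL contributes only a constant
    return rec + torch.clamp(kl, min=lam)
```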

Contribution

  • Measures which words benefit most from the latent information (a rough sketch of one way to do this follows below).
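
An illustrative sketch under my own assumptions, not necessarily the paper's exact measurement protocol: compare per-token log-likelihoods when decoding with the inferred latent code versus a sample from the prior; tokens with a large gap are the ones that benefit most from $z$.

```python
def per_word_latent_benefit(encoder, decoder, x):
    _, mu, _ = encoder(x)
    z_prior = torch.randn_like(mu)                     # z ~ p(z) = N(0, I)
    logp_post = F.log_softmax(decoder(x, mu), dim=-1)  # decode with the inferred code (posterior mean)
    logp_prior = F.log_softmax(decoder(x, z_prior), dim=-1)
    idx = x.unsqueeze(-1)                              # pick out log-probs of the actual tokens
    gap = logp_post[:, :-1].gather(-1, idx) - logp_prior[:, :-1].gather(-1, idx)
    return gap.squeeze(-1)                             # (batch, L): per-word benefit from z
```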

Experiments

References

[1] Tom Bosc and Pascal Vincent. 2020. Do sequence-to-sequence VAEs learn global features of sentences? In Empirical Methods in Natural Language Processing (EMNLP).

[2] Ali Razavi, Aaron van den Oord, Ben Poole, and Oriol Vinyals. 2019. Preventing Posterior Collapse with delta-VAEs. In International Conference on Learning Representations (ICLR).