Yes, you are right. LSTMs worked better on shorter sequences, but they too suffered from vanishing and exploding gradients on longer sequences. This led to transformers, which are the basis of modern-day NLP breakthroughs.

Written by Ramjee Rajasekaran

NLP Engineer, AI ML practitioner, Problem Solver
