Problem description: I want to use this multi-label classifier for Google BERT:
https://medium.com/huggingface/multi-label-text-classification-using-bert-the-mighty-transformer-69714fa3fb3d

However, by default, when Google BERT converts a document to features, it uses a maximum sequence length of 512 WordPiece tokens and truncates any article longer than that.
The SQuAD classifier for BERT actually implements a sliding-window solution for longer articles. I tried to splice it into the multi-label classifier but didn't get it right.
Deliverable: a solution for ingesting long articles (more than 512 WordPiece tokens) into Google BERT, with code in a Jupyter notebook. For example, a 1024-token article would be split using the doc_stride approach into 2x512-token sequences; classification would then be run on both sequences, and the arg max of the combined predictions would be returned.
Comments and documentation explaining how you built the solution would also be appreciated.
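To make the requested approach concrete, here is a minimal sketch of the two pieces involved: splitting a long token list into overlapping windows (mirroring how BERT's SQuAD code uses `doc_stride`), and combining the per-window predictions with an arg max. The tokenizer and model are deliberately left out; `sliding_windows` and `aggregate_logits` are hypothetical helper names, and the aggregation shown (averaging logits across windows) is one reasonable choice, not the only one.

```python
def sliding_windows(tokens, max_len=512, doc_stride=256):
    """Split a list of WordPiece tokens into overlapping windows.

    Each window holds up to `max_len` tokens; successive windows start
    `doc_stride` tokens apart, as in BERT's SQuAD preprocessing.
    """
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the article
        start += doc_stride
    return windows


def aggregate_logits(per_window_logits):
    """Average each label's logit across windows, then return the arg max.

    `per_window_logits` is a list of equal-length logit lists, one per window.
    """
    n = len(per_window_logits)
    mean = [sum(label_col) / n for label_col in zip(*per_window_logits)]
    return mean.index(max(mean))


# A 1024-token article with doc_stride equal to max_len yields the
# two non-overlapping 512-token sequences described above.
article = list(range(1024))
windows = sliding_windows(article, max_len=512, doc_stride=512)
print(len(windows))  # -> 2
```

In a notebook, each window would be converted to features and passed through the classifier separately, and the resulting logits fed to `aggregate_logits`. Note that for a genuinely multi-label setup you would typically threshold per-label sigmoid scores rather than take a single arg max; the arg max version is shown because that is what the brief asks for.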
About the recruiter: Krishna, from Maharashtra, India. Member since Jul 2, 2017.