Area: Information Retrieval using machine learning
As the volume of data grows day by day, the size of the indexes built over it grows as well. Loading these indexes from secondary memory into primary memory for searching therefore takes too much time (page faults).
WBTC (Word-Based Tagged Code) is a compression technique used to minimize the size of text (in number of bits), and a WT (wavelet tree) is used to create the index. The WT is a space-efficient data structure, so by combining it with a compression scheme (WBTC or any other) I want to minimize space complexity. The disadvantage of the WT is that index construction takes a lot of time, so I want a model that builds the WT in parallel on a multicore computer architecture.
I want a model that applies WBTC compression and a WT to any corpus (set of documents) and constructs the WT in parallel.
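To make the parallel part concrete, here is a minimal sketch of one possible approach, not a finished solution. It assumes a level-wise wavelet tree (WT) over integer word ids, where each level's bitmap depends only on the original sequence and can therefore be built as an independent task. The helper names (`encode_words`, `level_bitmap`, `build_levels`, `access`) are hypothetical, plain fixed-width word ids stand in for WBTC codewords, and bitmaps are plain Python lists without rank/select support.

```python
from concurrent.futures import Executor


def encode_words(text):
    """Map each word to an integer id (a stand-in for WBTC codewords)."""
    words = text.split()
    vocab = sorted(set(words))
    ids = {w: i for i, w in enumerate(vocab)}
    # Bits needed to represent the largest id (at least 1).
    nbits = max(1, (len(vocab) - 1).bit_length())
    return [ids[w] for w in words], vocab, nbits


def level_bitmap(seq, level, nbits):
    """Build the bitmap of one WT level, independently of all other levels.

    At level `level` the symbols are grouped by their top `level` bits
    (stable sort keeps the original order inside each node); the bitmap
    stores the next bit of every symbol.
    """
    shift = nbits - level
    ordered = sorted(seq, key=lambda s: s >> shift)  # Python's sort is stable
    return [(s >> (shift - 1)) & 1 for s in ordered]


def build_levels(seq, nbits, executor: Executor):
    """Submit one task per level and collect the bitmaps in level order."""
    futures = [executor.submit(level_bitmap, seq, lvl, nbits)
               for lvl in range(nbits)]
    return [f.result() for f in futures]


def access(levels, i):
    """Recover the symbol at position i by walking the levels top-down.

    Counting is done naively with list.count; a real index would use
    rank structures over compact bit vectors instead.
    """
    lo, hi = 0, len(levels[0])  # current node covers positions [lo, hi)
    sym = 0
    for bits in levels:
        b = bits[i]
        zeros = bits[lo:hi].count(0)  # size of the left child
        if b == 0:
            i = lo + bits[lo:i].count(0)
            hi = lo + zeros
        else:
            i = lo + zeros + bits[lo:i].count(1)
            lo = lo + zeros
        sym = (sym << 1) | b
    return sym
```

Passing a `concurrent.futures.ProcessPoolExecutor` to `build_levels` spreads the levels over several cores; a `ThreadPoolExecutor` also works but is limited by the GIL in standard CPython builds. Per-level parallelism is capped by the number of levels (the bit width of the ids), so a large corpus would probably also need the sequence split into chunks.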
Skills: Python, parallel processing, machine learning
Number of days for completion: 4 days
About the recruiter: Aprido Cafrianu, member since Mar 14, 2020,
from Daejeon, South Korea