Manara - Qatar Research Repository
An Effective Scholarly Search by Combining Inverted Indices and Structured Search With Citation Networks Analysis

journal contribution
Submitted on 2024-09-12, 06:34 and posted on 2024-09-12, 06:34. Authored by Shah Khalid, Shengli Wu, Abdul Wahid, Aftab Alam, Irfan Ullah

The rapid growth in the number of scholarly documents on the Web and on other digital platforms makes it challenging for researchers to find the research publications most relevant to their information needs. Major scholarly retrieval systems such as Google Scholar, Semantic Scholar, PubMed, and CiteSeerX have mitigated this challenge to a large extent, and their success lies in advances in ranking approaches. However, existing studies suggest that these methods are still far from their effectiveness ceiling, leaving ample room for improvement in meeting the scholarly needs of users. Existing methods adopt different approaches: some use classical Information Retrieval (IR), while others use semantics-aware methods, including Knowledge Graphs (KGs), to support scholarly search. We hypothesize that combining the best of both worlds can further improve search relevance. To this end, this work uses an inverted index from classical IR, with BM25 as the weighting scheme, combined with Citation Networks Analysis (CNA) to produce the baseline search results. These results are then re-ranked by passing selected entities from the top-k initial search results as the search query to the KG. In this way, both the textual content and the structural semantics of the research publications are exploited in the retrieval process. The goal is to exploit IR and KG-based retrieval techniques to gain insight into the behavior of both textual and structured information in the ranking of scholarly articles. The proposed solution has been evaluated on the ACL Anthology Network (AAN) dataset. The results show that the proposed technique can improve retrieval performance in terms of Normalized Discounted Cumulative Gain (nDCG) and precision.
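
As a rough illustration of two building blocks named in the abstract, the sketch below scores documents with BM25 over an inverted index and evaluates a ranking with nDCG. It is not the authors' implementation. The whitespace tokenizer, the parameter values `k1=1.2` and `b=0.75`, the smoothed IDF variant, the linear-gain nDCG formulation, and all function names are assumptions made for the example. The CNA and KG re-ranking stages are not shown.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted index (term -> {doc_id: term frequency}) and doc lengths."""
    index = defaultdict(dict)
    lengths = {}
    for doc_id, text in docs.items():
        tokens = text.lower().split()  # assumption: naive whitespace tokenization
        lengths[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index, lengths


def bm25_scores(query, index, lengths, k1=1.2, b=0.75):
    """Score documents for a query with BM25 (k1 and b are assumed defaults)."""
    if not lengths:
        return {}
    n_docs = len(lengths)
    avg_len = sum(lengths.values()) / n_docs
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue
        df = len(postings)
        # Smoothed IDF variant that stays non-negative for very common terms.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for doc_id, tf in postings.items():
            norm = tf + k1 * (1 - b + b * lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    return dict(scores)


def ndcg_at_k(ranked_ids, relevance, k):
    """nDCG@k with linear gains and a log2 position discount."""
    def dcg(gains):
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

    gains = [relevance.get(doc_id, 0) for doc_id in ranked_ids[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy corpus and relevance judgments (hypothetical, for illustration only).
docs = {
    "d1": "neural machine translation",
    "d2": "statistical parsing of treebank",
    "d3": "machine translation evaluation with neural models",
}
index, lengths = build_index(docs)
scores = bm25_scores("neural machine translation", index, lengths)
ranking = sorted(scores, key=scores.get, reverse=True)
relevance = {"d1": 2, "d3": 1}
print(ranking, ndcg_at_k(ranking, relevance, k=10))
```

In a pipeline like the one described, a ranking of this kind would serve only as the text-based component of the baseline; the citation-network signal and the KG-based re-ranking would then adjust the order before evaluation.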

Other Information

Published in: IEEE Access
License: https://creativecommons.org/licenses/by/4.0/
See article on publisher's website: https://dx.doi.org/10.1109/access.2021.3107939

Funding

Open Access funding provided by the Qatar National Library.

Language

  • English

Publisher

IEEE

Publication Year

  • 2021

License statement

This Item is licensed under the Creative Commons Attribution 4.0 International License.

Institution affiliated with

  • Hamad Bin Khalifa University
  • College of Science and Engineering - HBKU