Semantic Scholar: Searching Scholarly Literature Using AI Technology

Semantic Scholar is a free search engine created by the Allen Institute for Artificial Intelligence in Seattle and launched in 2015. The “trick” of the service is its deep semantic analysis of data, which significantly expands the capabilities of a conventional electronic database. Semantic Scholar’s search algorithm can not only return a list of results matching specified parameters, but also “compress” the meaning of a text into a few sentences, which makes it easier to select relevant information.
Let’s look at how this algorithm works and how to use it as efficiently as possible.

How it works and terms of access

The database indexes the metadata of ~200 million publications, considerably more than Web of Science (WoS) or Scopus. AI then “enriches” the data, i.e. extracts everything needed from the document in order to present it to the user in as much detail as possible: the PDF file, a bibliographic description in different citation styles, information about authors and sources, and a brief description. The output is a knowledge graph, i.e. a semantic network that stores information about different entities and the relationships between them. Unlike The Lens and Dimensions, which operate under the freemium model, Semantic Scholar is a non-commercial project, which means that all of its features are free to use.
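To make the idea of a knowledge graph more concrete, here is a minimal sketch in Python. It is only an illustration of the data structure, not Semantic Scholar’s actual implementation: the class name `KnowledgeGraph`, the relation labels (`written_by`, `published_in`, `cites`) and the sample entities are all hypothetical.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy knowledge graph: entities are strings, relations are labeled edges."""

    def __init__(self):
        # Maps (subject, relation) to the set of related objects.
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj):
        """Store one fact, e.g. ("paper:B", "cites", "paper:A")."""
        self._edges[(subject, relation)].add(obj)

    def objects(self, subject, relation):
        """Return all entities linked to `subject` by `relation`, sorted."""
        return sorted(self._edges.get((subject, relation), set()))


# Hypothetical sample data: two papers, two authors, one venue.
graph = KnowledgeGraph()
graph.add("paper:A", "written_by", "author:Smith")
graph.add("paper:A", "published_in", "venue:Journal X")
graph.add("paper:B", "cites", "paper:A")
graph.add("paper:B", "written_by", "author:Smith")
graph.add("paper:B", "written_by", "author:Lee")

print(graph.objects("paper:B", "written_by"))
print(graph.objects("paper:B", "cites"))
```

Storing facts as (entity, relation, entity) triples is what lets a single record answer very different questions: who wrote a paper, where it appeared, and what it cites.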

Searching Semantic Scholar

To start a search, enter a keyword or phrase in the search field. The system automatically suggests articles that may be of interest: the semantic analysis algorithm generates suggestions based on how frequently the phrase occurs in the text. To see the full list of relevant sources, click “search”.

Search results can be narrowed by field of study, date range, author, full-text availability, and the journal in which the text was published. You can also sort documents by relevance, citation count, influence, or recency.
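The same kind of filtering can also be done programmatically. The sketch below assumes the public Semantic Scholar Academic Graph API and its paper search endpoint; the parameter names (`query`, `year`, `fieldsOfStudy`, `fields`, `limit`) reflect that API as I understand it and should be checked against the current documentation. The helper names `build_search_url` and `sort_by_citations` and the sample records are hypothetical. The code only builds a URL and sorts sample data; it does not send a request.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public Academic Graph API (verify before use).
BASE_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_search_url(query, year=None, fields_of_study=None, limit=10):
    """Build a paper-search URL with optional year and field-of-study filters."""
    params = {
        "query": query,
        "limit": limit,
        # Which attributes of each paper to return.
        "fields": "title,year,citationCount,influentialCitationCount",
    }
    if year:
        params["year"] = year  # a single year or a range such as "2018-2022"
    if fields_of_study:
        params["fieldsOfStudy"] = ",".join(fields_of_study)
    return f"{BASE_URL}?{urlencode(params)}"


def sort_by_citations(papers):
    """Sort paper records (dicts) from most cited to least cited."""
    return sorted(papers, key=lambda p: p.get("citationCount", 0), reverse=True)


url = build_search_url(
    "knowledge graph", year="2018-2022", fields_of_study=["Computer Science"]
)
print(url)

# Hypothetical records shaped like the fields requested above.
sample = [
    {"title": "Paper one", "citationCount": 12},
    {"title": "Paper two", "citationCount": 340},
    {"title": "Paper three"},
]
print([p["title"] for p in sort_by_citations(sample)])
```

Sorting on the client side, as in `sort_by_citations`, is a simple way to mimic the “most cited” ordering once the records have been retrieved.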

A summary of an article is traditionally found in its abstract, but if the abstract seems too long, you can rely on the AI-generated summary, which condenses the text into a couple of short sentences.

The bibliographic description of a document can be copied or exported to a reference manager via the “cite” button.

Advanced document data

From the search results you can open the detailed page for a particular document. On the right side, the most important quantitative indicators are displayed to help you evaluate the “weight” of the publication. In addition to the total number of citations in the works of other scholars, the system also counts the highly influential citations.

A great advantage of Semantic Scholar is the ability to follow cross-references. On the document page, separate tabs list the works cited by the author of this publication (the list of references) and the works whose authors cited this publication. Thus, by following the chain of citations, the user can trace the continuity of ideas in their field.
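Following a chain of citations amounts to walking a graph. As a rough sketch, assuming the reference lists have already been collected into a plain dictionary, a breadth-first traversal shows how far back each earlier work sits from the starting paper. The function name `trace_citation_chain` and the sample papers are hypothetical.

```python
from collections import deque


def trace_citation_chain(references, start, max_depth=2):
    """Return {paper: depth} for works reachable from `start` via reference lists.

    `references` maps each paper to the list of papers it cites.
    Depth 0 is the starting paper, depth 1 its references, and so on.
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        paper = queue.popleft()
        if depths[paper] == max_depth:
            continue  # do not expand beyond the requested depth
        for cited in references.get(paper, []):
            if cited not in depths:  # skip works already visited
                depths[cited] = depths[paper] + 1
                queue.append(cited)
    return depths


# Hypothetical reference lists: D cites C and B, C cites B, B cites A.
references = {
    "paper:D": ["paper:C", "paper:B"],
    "paper:C": ["paper:B"],
    "paper:B": ["paper:A"],
    "paper:A": [],
}

chain = trace_citation_chain(references, "paper:D", max_depth=2)
print(chain)
```

Running the same traversal over a “cited by” mapping instead of the reference lists would trace the ideas forward in time rather than backward.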

A separate tab also shows similar works selected by the AI. Each document comes with a brief description, so you do not need to open every publication to judge its relevance.

Search by author

Enter the author’s surname and first name in the search field and click the person you are looking for in the results. You can use filters to refine your search.

The author’s profile and affiliation data give the most complete picture of the researcher’s relations with other members of the academic community: separate tabs list the authors the researcher has cited, the authors who have cited the researcher, and co-authors. The list of publications can be searched in the same way as the database as a whole, as if the author’s profile were a separate database. Conveniently, you can single out the most influential works and quickly get an idea of a specialist’s scientific activity.

One glance at a publication is enough to roughly estimate its “weight”: the number of citations and the most significant references are displayed under the title.

On the left, under the researcher’s name, you will find their ORCID and the most important quantitative indicators of their work according to Semantic Scholar: number of publications, h-index (Hirsch index), total number of citations, and number of highly influential citations over their career. You can follow the work of top experts in your field by setting up alerts: notifications about the researcher’s new publications will be sent to your e-mail.
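For readers unfamiliar with the Hirsch index: it is the largest number h such that the author has h publications with at least h citations each. The sketch below computes it from a list of citation counts using the standard definition; the function name `h_index` and the sample numbers are illustrative, and this is not a claim about how Semantic Scholar computes the value internally.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(ranked, start=1):
        if citations >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for an author's five publications.
print(h_index([10, 8, 5, 4, 3]))
```

Because the index depends on both the number of publications and their citations, one very highly cited paper raises it no more than one moderately cited paper does.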