For some time now, publishers and other distributors of academic research have been harnessing natural language processing (NLP) technology to get more of their content into the hands of the global academic community. Most research content and discovery services are building their own AI-driven solutions or licensing this technology to connect and expose more of their content for their readers in the following ways:
- Recommending related content
- Clustering large collections of literature under subject categories
- Indexing and linking collections
- Extracting citation context and other structured information from published articles and manuscripts
NLP technology is now widely used at the pre-consumption stage of the publication cycle by authors, reviewers, and publishers to enhance content creation, discovery, and dissemination. But there’s huge untapped potential for knowledge extraction technology to benefit not only the producers but also the consumers of this content.
Ways AI is making researchers of us all
Putting AI into the hands of content consumers
So, how can AI help the wide-ranging consumers of scholarly research?
Several groups of readers are currently poorly served by traditional publishing formats:
- Neurodiverse readers and those with specific learning or cognitive needs
- Readers with visual impairment
- Those with English as a second or third language who need to read papers and books predominantly published in English
- Non-academics, including journalists, patient groups, and lay readers who need to understand primary research literature
Advances in publishing technology mean that these readers have more research at their fingertips than ever and can refine their searches and uncover connections between published articles in ever more sophisticated ways. But when it comes to reading and understanding this literature, readers are still impeded by the volume, density, and language of published research. Consumers of research, from students to science journalists, face the common challenge of drawing insight and knowledge from long, complex articles. Although the industry has begun to address this with graphical and video abstracts [1] and selected plain language summaries [2], this curated approach is hard to scale.
Let’s take a look at how recent developments in natural language processing are being applied in four new ways to make primary research more accessible to a wider community of readers.
1. Helping students with reading and learning
A well-reported frustration of educators is that students don’t read their learning materials. The reasons for this are wide-ranging and are often incorrectly attributed to apathy. Many students, particularly those just starting higher education courses, either don’t know how to identify and extract key information from the literature in front of them or aren’t yet adept at doing so. They may also lack the background knowledge to really understand a dense piece of writing on a subject that’s new to them [3].
Others feel they are given too much to read and don’t have enough time to absorb it all, which can leave them feeling frustrated and demotivated. The volume of reading required for a new course can make it difficult for students to find and focus on the important information [4].
By breaking down reading materials into bite-sized chunks, providing links to background reading, and automatically highlighting key information, Scholarcy is helping students to keep up with their reading and encouraging them to read more. Often something as simple as having the key terms in a paper or chapter highlighted and defined can help orient someone who’s new to a subject. And presenting significant statements from a text as a set of short bullet points can serve as a more user-friendly route into the text.
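To make the idea of highlighting key terms and surfacing significant statements more concrete, here is a minimal sketch in Python. It is not Scholarcy's method: it assumes a simple word-frequency heuristic, a tiny illustrative stopword list, and naive sentence splitting, and the function names (`key_terms`, `key_sentences`) are hypothetical. Production tools typically rely on far more sophisticated language models.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption; real tools use larger lists).
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "was",
    "were", "with",
}


def split_sentences(text):
    """Naively split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Return lowercase words, dropping stopwords and very short tokens."""
    return [
        w for w in re.findall(r"[a-z]+", text.lower())
        if w not in STOPWORDS and len(w) > 2
    ]


def key_terms(text, n=5):
    """Return the n most frequent content words as candidate key terms."""
    counts = Counter(tokenize(text))
    return [term for term, _ in counts.most_common(n)]


def key_sentences(text, n=2):
    """Return the n highest-scoring sentences, kept in their original order.

    A sentence's score is the average document-wide frequency of its
    content words, so sentences dense in recurring terms rank highest.
    """
    sentences = split_sentences(text)
    counts = Counter(tokenize(text))

    def score(sentence):
        words = tokenize(sentence)
        return sum(counts[w] for w in words) / len(words) if words else 0.0

    top = sorted(sentences, key=score, reverse=True)[:n]
    return [s for s in sentences if s in top]


if __name__ == "__main__":
    sample = (
        "Sleep improves memory. "
        "Memory consolidation happens during deep sleep. "
        "The weather was nice."
    )
    print("Key terms:", key_terms(sample, n=2))
    for sentence in key_sentences(sample, n=2):
        print("-", sentence)
```

The key terms could be shown as highlighted words, and the selected sentences as the short bullet points described above. A frequency heuristic like this is easy to follow but crude; it is only meant to show the shape of the approach.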
2. Making research accessible to those with a range of learning needs
In the US and UK, around 13% of higher education (HE) students self-report a disability. Among the 45,000 disabled students starting university in the UK each year, those with a specific learning difference (SpLD), such as dyslexia, dyscalculia, attention deficit disorder, or learning differences resulting from autistic-spectrum disorder, form the largest group (around 38%) [5].
Disabled students are eligible for additional support via the Disabled Students’ Allowance (DSA), such as laptops and assistive software that helps with document reading, mind-mapping, speech-to-text, and text-to-speech. Recently, Scholarcy has been approved by DSA as software to support the needs of disabled students, and we are seeing increasing demand from this user base, with some students telling us:
By breaking down learning materials into bite-sized chunks, tools like Scholarcy are helping these readers approach complex texts in a different, more manageable way. Tackling a research paper layer by layer (key concepts, important statements and findings, section summaries), rather than attempting to read the full text from the opening sentence, means the reader can acquire knowledge of the subject in a more systematic and ordered way, which can be particularly useful for those with an SpLD.
3. Helping non-native English speakers overcome the research language barrier
Recent analysis has projected ‘growth of nearly 200% in global higher education enrolments through 2040’ [6]. With a significant increase in college enrolment rates in Latin America and East Asia in the past 20 years and total HE enrolment in Africa expected to ‘triple from 7.4 million students as of 2015 to nearly 22 million by 2040’, students whose first language isn’t English will make up most of the global college population. In the UK alone, 500,000 students (21% of the HE total) are studying English as a second language. And yet, ‘more than three-quarters of scientific papers today are published in English’ [7].
If native speakers commonly find academic texts impenetrable, imagine how much harder it is for those who have English as their second or third language. The need for this barrier to be broken down is greater than ever if students and researchers worldwide are to be put on a level playing field in terms of access to high-quality research and be able to collaborate on solutions to global challenges.
Here again, extracting important terms, statements, and sections from a book or research paper and presenting them in a clearer, more visually appealing format can go a long way to breaking down the language barrier. By starting with a set of keywords, short sentences, and summaries, those who don’t have English as a first language can get a clearer idea of a text before deciding whether to attempt it in full. Integration with translation tools also makes it more likely that non-native English speakers will read primary research rather than opt for more accessible but less reliable web articles and secondary sources.
4. Helping non-academics become consumers of scholarly information
It’s not only those in an academic setting who need to access, read, and understand primary research literature. The rise in misinformation, and the fact that it’s getting harder to differentiate between material that’s evidence-based and material that isn’t, means easy access to peer-reviewed research has never been more important. Journalists, patient groups, and other lay readers regularly need to screen primary research and translate it into working knowledge, either for themselves or to share with others, but the way that material is written often makes this prohibitively difficult. Finding this information is only part of the challenge. Reading and understanding it is what many struggle with.
NLP technology is starting to make a real impact here by generating plain-language synopses from primary research. At Scholarcy, we recently created an experimental AI that writes an overview of a paper in the style of a science journalist. The overview covers the ‘who, what, and why’, along with information on how the study builds on previous work, what the limitations were and what future work needs to be done.
This means that for the first time, lay readers can get an accurate, easy-to-read summary of a research paper that not only makes the purpose and significance of the work easy to grasp but validates this with evidence and data. Here’s an example research summary generated by Scholarcy’s Smart Synopses from a scientific study about how weather affects pain:
So we’ve looked at four ways that NLP can help a range of consumers get the most from their research materials. Perhaps there are other ways that we’ve missed. If you have any suggestions, let us know!