Can summarization tech help authors draft their manuscripts?
Scholarcy’s Manuscript Pilot with Future Science Group
Background
We recently ran a pilot with Future Science Group (FSG) to evaluate how well Scholarcy’s knowledge extraction and summarisation technology performed in comparison to author-generated abstracts, summary points and keywords. The pilot was based on post-acceptance, pre-publication papers supplied as un-copyedited Word files to FSG’s Editorial Director, Laura Dormer, and included:
- Population-based assessment of the National Comprehensive Cancer Network recommendations for baseline imaging of rectal cancer (Abdel-Rahman and Cheung, 2019)
- Medicaid insurance status predicts postoperative mortality after total knee arthroplasty in State Inpatient Databases (Maman et al., 2019)
- Indicator of access to medicines in relation to the multiple dimensions of access (Garcia et al., 2019)
As part of Future Science Group’s submission guidelines, the authors had included a set of recommended keywords and summary points for each paper. In this study we took the three raw manuscripts and removed the abstract, keywords and author highlights from each document. We then ran these versions of the manuscripts through Scholarcy and compared the output (structured summary, highlights and keywords) with the originals.
Hypothesis
Our hypothesis was that Scholarcy could save authors time at the manuscript drafting and submission stage by using knowledge extraction and summarisation technology to auto-generate a set of key points and keywords from the paper that were equivalent to, or of a higher quality than, the originals. We also wanted to explore how useful Scholarcy’s highlights and structured summaries could be in helping authors draft informative abstracts in less time – a challenge widely acknowledged in the academic research community:
“What rarely gets covered in [abstracts] are the actual key findings of the article. Readers are normally left to guess what the researcher’s ‘bottom line’ conclusion or academic ‘value-added’ is, still less what key ‘take-away points’ the author would ideally want readers to remember.
Abstracts […] tend to be rather casually written, perhaps at the beginning of writing when authors don’t yet really know what they want to say, or perhaps as a rushed afterthought just before submission to a journal or a conference. Some academics actually seem to start writing their abstract only after they have begun the online submission process, and so just clutch at a few, random straws to fill up some of the wordspace allotted to them. Others discover that their earlier or conference-vintage abstract is over-long, and so have to edit it down on the spot to fit within the journal’s precise word limit.” (Writing for Research, 2020)
Results
We ran the three manuscripts through Scholarcy’s API, which converts any document, in any format, into a unified structured summary of the original, and presented the results to FSG as summary flashcards in Scholarcy Library. We also presented side-by-side comparisons of the authors’ abstracts, summary points and keywords versus Scholarcy’s structured summaries, article highlights and keywords.
Scholarcy vs original manuscript
The following is a direct comparison of the keywords and summary points from one of the papers in the case study (Indicator of access to medicines in relation to the multiple dimensions of access) versus Scholarcy’s keywords and highlights:
Author Keywords vs Scholarcy Keywords
Original manuscript: “access”, “essential medicines”, “indicator”.
Scholarcy output: “co payment”, “public health”, “essential medicine”, “middle income country”, “diabetes”, “LMIC”, “arthritis”, “Chronic Medicines Dispensing and Distribution”, “lower and middle income countries”, “health care”, “rheumatoid arthritis”, “gross domestic product”, “Central and Eastern European”, “Ethiopia”
Author Key Points vs Scholarcy Highlights
Feedback
Survey responses from Laura Dormer, Future Science Group:
Qualitative evaluation of study
Talking in more detail about the quality of Scholarcy’s output for all three papers in comparison to the original manuscripts, Laura said:
“My overall feeling was that the Scholarcy summaries were more detailed than the authors’* which would be useful for initial manuscript assessment. The key points were generally good too, so would be useful if authors omitted this section (or as a tool for authors to use themselves).”
“My main suggestions are that it would be good for the Scholarcy abstract to clearly pull out the aim of the research, which wasn’t always the case. The Scholarcy summaries seemed to pull out a lot of information in the methods section regarding the demographics of the cohort (which would be useful for manuscript assessment); however, it did omit some detail on what the authors did with that cohort. Also, the keywords were good, but were pretty long lists, so it would be good to rank these in some way.”
*authors are restricted by our author guidelines to try and keep the abstract to 120 words, so this inevitably leads to very brief overviews.
Paper 1: Population-based assessment of the National Comprehensive Cancer Network recommendations for baseline imaging of rectal cancer.
“In this example, I found the Scholarcy summary included more information on the cohort itself (male vs female, racial demographics, etc.), but less on what was actually done with the cohort.
Scholarcy’s keywords were generally good and overall I found the Scholarcy summary points to be better than the authors’ original. The only omission that would be needed was the Scholarcy summary didn’t include the overall conclusion, i.e., that clinicians could consider omitting chest and abdominal imaging.”
Paper 2: Medicaid insurance status predicts postoperative mortality after total knee arthroplasty in State Inpatient Databases.
“The Scholarcy version is useful in providing more background, but it would be good to include a succinct aim of the research too (i.e., specifically mentioning the comparison of outcomes between Medicaid and privately insured individuals).
Overall I found the Scholarcy conclusion section better, as it included study limitations, which the author version didn’t. The Scholarcy summary points were good, although there was some redundancy in the bullet points. But they would be useful to edit if the author hadn’t provided this section.”
Paper 3: Indicator of access to medicines in relation to the multiple dimensions of access.
“I thought the Scholarcy summary points were better written than the authors’ original – they summarised all the key points, and were more succinct.”
Developing Scholarcy to meet the needs of authors and editors
This study has shown that whilst Scholarcy doesn’t always produce better results than the original abstract and summary points, it can provide a useful basis for authors to draft these aspects of their manuscript in less time. By identifying the most salient keywords and highlighting the main findings, Scholarcy also facilitates post-publication promotion and discoverability of research. And whilst we have not yet developed the technology to fully automate abstract-writing, we are currently working on ways for our API to get authors 90% of the way there. The added implication of this is that authors will soon be able not only to draft an informative abstract that fully reflects their paper in significantly less time, but also to produce a useful lay summary that they and the publisher can use to promote their work.
We’re continually working to improve our knowledge extraction and summarisation technology at Scholarcy based on the results of pilots such as this one with Future Science Group. Since this study we’ve added relevance ranking scores to keywords in our API, so we now show only the top 10 most relevant keywords for any published article or manuscript. We’ve also run a larger manuscript summarisation evaluation across five subject domains and will report on these results in our next blog post.
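To illustrate the idea of relevance-ranked keywords, here is a minimal Python sketch of keeping only the top 10 keywords by score. It is not Scholarcy’s implementation: the function name `top_keywords`, the (keyword, score) input shape and the scores themselves are all assumptions made up for this example.

```python
from typing import List, Tuple


def top_keywords(scored: List[Tuple[str, float]], limit: int = 10) -> List[str]:
    """Return up to `limit` keywords, highest relevance score first.

    `scored` is a list of (keyword, score) pairs. This input shape is a
    hypothetical stand-in, not the format returned by Scholarcy's API.
    """
    # sorted() is stable, so keywords with equal scores keep their input order.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [keyword for keyword, _ in ranked[:limit]]


# Keywords taken from the pilot output above; the scores are invented
# purely for illustration and are not real relevance scores.
example = [
    ("co payment", 0.41),
    ("essential medicine", 0.93),
    ("public health", 0.77),
    ("diabetes", 0.35),
    ("health care", 0.62),
]

print(top_keywords(example, limit=3))
```

With a ranking step like this, a long keyword list can be trimmed to the most relevant entries before it is shown to an author or editor, which is the kind of change the pilot feedback asked for.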
Bibliography
- Abdel-Rahman, O. and Cheung, W. (2019). Population-based assessment of the National Comprehensive Cancer Network recommendations for baseline imaging of rectal cancer. Journal of Comparative Effectiveness Research, 8(14), pp.1167-1172.
- Garcia, M., Barbosa, M., Silva, R., Reis, E., Alvares, J., Assis Acurcio, F., Godman, B. and Guerra Junior, A. (2019). Indicator of access to medicines in relation to the multiple dimensions of access. Journal of Comparative Effectiveness Research, 8(12), pp.1027-1041.
- Maman, S., Andreae, M., Gaber-Baylis, L., Turnbull, Z. and White, R. (2019). Medicaid insurance status predicts postoperative mortality after total knee arthroplasty in state inpatient databases. Journal of Comparative Effectiveness Research, 8(14), pp.1213-1228.
- Writing for Research. (2020). Writing informative abstracts for journal articles. [online] Available at: https://blogs.lse.ac.uk/writingforresearch/2014/02/16/writing-informative-abstracts-for-journal-articles/ [Accessed 17 Feb. 2020].