Text mining to support abstract screening for knowledge syntheses: A workflow approach

Human and computing resources required for implementing the workflow

This project was part of an ongoing collaboration between a knowledge synthesis team and a laboratory for systems, software and semantics.1,2 The project team is interdisciplinary, including reviewers, review coordinators, review methodologists, data analysts, and research scientists in semantic computing and knowledge engineering. All analyses were conducted in R.3 One author (JJ) initially developed the R code for a workshop for doctoral students (all material available online).4 Another author (BP) adapted the R code to the case studies, with problem-solving support from other team members through 1-hour weekly meetings. The code was initially written on a laptop (Intel Core i3-4000M CPU @ 2.4 GHz, 4 GB RAM, 32-bit operating system) in RStudio,5 and run on a server (Linux Ubuntu, 64-bit operating system) through remote-communication freeware between the laptop and the server (PuTTY and Xming; see the references for the related URLs),6,7 using parallel computing.8 The initial investment from the systematic review (SR) team was the time and effort needed to foster collaboration with researchers with text mining/machine learning (TM/ML) expertise, to arrange access to computing power, and to provide a supportive environment for SR automation.
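As an illustration of the parallel computing step mentioned above, the following is a minimal sketch of how a doSNOW cluster can be set up in R. It is not the code used in the case studies: the toy `abstracts` vector, the word-count task, and the choice of 2 workers are all hypothetical, and the sketch assumes the `foreach` and `doSNOW` packages are installed.

```r
# Minimal sketch (illustrative assumptions only, not the project's code).
# Assumes: install.packages(c("foreach", "doSNOW")) has been run.
library(foreach)
library(doSNOW)

# Hypothetical toy data standing in for a set of abstracts
abstracts <- c(
  "Text mining supports abstract screening",
  "Machine learning ranks citations for reviewers",
  "Parallel computing shortens model training time",
  "Systematic reviews need many screening hours"
)

# Start a local socket cluster; 2 workers is an arbitrary choice
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

# Count the words in each abstract, one abstract per parallel task;
# results are combined into a vector in the original order
word_counts <- foreach(txt = abstracts, .combine = c) %dopar% {
  length(strsplit(txt, "\\s+")[[1]])
}

# Always release the workers when finished
stopCluster(cl)

print(word_counts)
```

On a multi-core server, the number of workers would typically be raised to match the available cores, and the loop body replaced by the more expensive step (for example, fitting one model per parameter setting).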

Resources are required to integrate the implemented workflow into a review team, including access to TM/ML expertise and high-performance computing (through a collaboration as described above), and the acquisition of TM skills by a team member. With TM tools increasingly accessible to non-specialists, the dedicated member could consider taking an introductory short course in TM or learning the relevant topics online. Our experience suggests that any reviewer with some interest in data analysis could learn to use the sample R code to conduct the TM/ML analysis within a short period (e.g., 3 months).

  1. KT. Knowledge Translation Program. https://knowledgetranslation.net/.
  2. LS3. Laboratory for Systems, Software and Semantics. http://ls3.rnet.ryerson.ca/.
  3. R Core Team. R: A language and environment for statistical computing. 2013; http://www.R-project.org/.
  4. Jovanović J. Society for Learning Analytics Research. 2018; https://solaresearch.org/events/lasi/lasi-2018/lasi18-workshops/.
  5. RStudio. RStudio.
  6. PuTTY. https://www.putty.org/.
  7. Xming. https://sourceforge.net/projects/xming/.
  8. Zhou M. High performance computing in R using doSNOW package. 2014.