literature reviews in qualitative research

Archer Library

Qualitative research: literature review


Exploring the literature review

Literature review model: 6 steps.

Adapted from The Literature Review, Machi & McEvoy (2009, p. 13).

Your Literature Review

1. Select a Topic

"All research begins with curiosity" (Machi & McEvoy, 2009, p. 14)

Your professor supervises (and approves) the selection of a topic and of a fully defined research interest and question. Tips for crafting your topic include:

Be specific. Take time to define your interest.
Topic Focus. Fully describe and sufficiently narrow the focus for research.
Academic Discipline. Learn more about your area of research & refine the scope.
Avoid Bias. Be aware of bias that you (as a researcher) may have.
Document your research. Use Google Docs to track your research process.
Research apps. Consider using Evernote or Zotero to track your research.

Consider Purpose

What will your topic and research address?

In The Literature Review: A Step-by-Step Guide for Students, Ridley (2008, pp. 16-17) notes that literature reviews serve several purposes. These include the following:

Historical background for the research;
Overview of current field provided by "contemporary debates, issues, and questions;"
Theories and concepts related to your research;
Introduce "relevant terminology" - or academic language - being used in the field;
Connect to existing research - does your work "extend or challenge [this] or address a gap;"
Provide "supporting evidence for a practical problem or issue" that your research addresses.

★ Schedule a research appointment

At this point in your literature review, take time to meet with a librarian. Why? Understanding the subject terminology used in databases can be challenging. Archer Librarians can help you structure a search, preparing you for step two. How? Contact a librarian directly or use the online form to schedule an appointment. Details are provided in the adjacent Schedule an Appointment box.

2. Search the Literature

Collect & Select Data: Preview, select, and organize

AU Library is your go-to resource for this step in your literature review process. The literature search will include books and ebooks, scholarly and practitioner journals, theses and dissertations, and indexes. You may also choose to include web sites, blogs, open access resources, and newspapers. This library guide provides access to resources needed to complete a literature review.

Books & eBooks: Archer Library & OhioLINK

Books

Databases: Scholarly & Practitioner Journals

Review the Library Databases tab on this library guide; it provides links to recommended databases for Education & Psychology, Business, and General & Social Sciences.

Expand your journal search; a complete listing of available AU Library and OhioLINK databases is available on the Databases A to Z list. Search the list by subject, type, or name, or use the search box for a general title search. The A to Z list also includes open access resources and select internet sites.

Databases: Theses & Dissertations

Review the Library Databases tab on this guide; it includes Theses & Dissertation resources. AU Library also has AU student-authored theses and dissertations available in print; search the library catalog for these titles.

Did you know? If you are looking for particular chapters within a dissertation that is not fully available online, it is possible to submit an ILL article request. Do this instead of requesting the entire dissertation.

Newspapers: Databases & Internet

Consider current literature in your academic field. AU Library's database collection includes The Chronicle of Higher Education and The Wall Street Journal. The Internet Resources tab in this guide provides links to newspapers and online journals such as Inside Higher Ed, COABE Journal, and Education Week.

The Chronicle of Higher Education has the nation’s largest newsroom dedicated to covering colleges and universities. It is a source of news, information, and jobs for college and university faculty members and administrators.

The Chronicle features complete contents of the latest print issue; daily news and advice columns; current job listings; archive of previously published content; discussion forums; and career-building tools such as online CV management and salary databases. Dates covered: 1970-present.

Search Strategies & Boolean Operators

There are three basic boolean operators: AND, OR, and NOT.

Used with your search terms, Boolean operators either expand or limit results by defining the relationship between the terms. For example, the operator AND combines the terms, narrowing the search to items that contain both. In some databases, and in Google, the operator AND may be implied.

Overview of boolean terms


AND	OR	NOT
Search results will contain all of the terms.	Search results will contain any of the search terms.	Search results exclude the specified search term.
Search for Adult learning AND online education; you will find items that contain both terms.	Search for Adult learning OR online education; you will find items that contain either term.	Search for Adult learning NOT online education; you will find items that contain Adult learning but not online education.
AND connects terms, limits the search, and will reduce the number of results returned.	OR redefines connection of the terms, expands the search, and increases the number of results returned.	NOT excludes results from the search term and reduces the number of results.

Adult learning AND online education: (result count not preserved)	Adult learning OR online education: (result count not preserved)	Adult learning NOT online education: (result count not preserved)

About the example: Boolean searches were conducted on November 4, 2019; result numbers may vary at a later date. No additional database limiters were set to further narrow search returns.
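
The effect of the three operators can be sketched with set operations. The following is a minimal illustration only: the four records and their subject terms are invented for the example, and no real database works on such a small index.

```python
# Toy index: each hypothetical record ID maps to the subject terms it carries.
records = {
    1: {"adult learning", "online education"},
    2: {"adult learning"},
    3: {"online education"},
    4: {"adult learning", "workplace training"},
}


def matching(term):
    """Return the IDs of all records that carry the given term."""
    return {rid for rid, terms in records.items() if term in terms}


a = matching("adult learning")
b = matching("online education")

and_results = a & b  # AND: both terms must appear (narrows the search)
or_results = a | b   # OR: either term may appear (expands the search)
not_results = a - b  # NOT: the first term without the second (narrows)

print(sorted(and_results))  # [1]
print(sorted(or_results))   # [1, 2, 3, 4]
print(sorted(not_results))  # [2, 4]
```

AND and NOT can never return more items than the first term alone, while OR can never return fewer.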

Database Search Limiters

Database strategies for targeted search results.

Most databases include limiters, or additional parameters, that you may use to strategically focus search results. EBSCO databases, such as Education Research Complete & Academic Search Complete, provide options to:

Limit results to full text;
Limit results to scholarly journals and to items with references available;
Limit results by source type: journals, magazines, conference papers, reviews, and newspapers;
Limit results by publication date.

Keep in mind that these tools are defined as limiters for a reason; adding them to a search will limit the number of results returned. This can be a double-edged sword. How?

If limiting results to full-text only, you may miss an important piece of research that could change the direction of your research. Interlibrary loan is available to students, free of charge. Request articles that are not available in full-text; they will be sent to you via email.
If narrowing publication date, you may eliminate significant historical - or recent - research conducted on your topic.
Limiting resource type to a specific type of material may cause bias in the research results.

Use limiters with care. When starting a search, consider opting out of limiters until the initial literature screening is complete. The second or third time through your research may be the ideal time to focus on specific time periods or material (scholarly vs newspaper).
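
To see why limiters are a double-edged sword, consider this small sketch. It assumes a hypothetical result list; the Item fields and the apply_limiters function are illustrative names, not part of any database's interface.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    year: int
    source_type: str
    full_text: bool


# Hypothetical search results, invented for illustration.
results = [
    Item("Hypothetical study A", 1998, "journal", False),
    Item("Hypothetical study B", 2015, "journal", True),
    Item("Hypothetical report C", 2018, "newspaper", True),
    Item("Hypothetical study D", 2021, "journal", False),
]


def apply_limiters(items, full_text_only=False, min_year=None, source_type=None):
    """Apply each requested limiter in turn; every limiter can only shrink the list."""
    kept = list(items)
    if full_text_only:
        kept = [i for i in kept if i.full_text]
    if min_year is not None:
        kept = [i for i in kept if i.year >= min_year]
    if source_type is not None:
        kept = [i for i in kept if i.source_type == source_type]
    return kept


unlimited = apply_limiters(results)
limited = apply_limiters(results, full_text_only=True, min_year=2010, source_type="journal")

# Titles that the limiters silently removed from view.
missed = [i.title for i in unlimited if i not in limited]

print(len(unlimited), len(limited))  # 4 1
print(missed)
```

In this toy case three of the four items disappear, including an older study and a recent one that is not available in full text, which is the kind of loss the cautions above describe.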

★ Truncating Search Terms

Expanding your search term at the root.

Truncating is often referred to as 'wildcard' searching. Databases may have their own specific wildcard elements; however, the most commonly used are the asterisk (*) and the question mark (?). When used within your search, they will expand returned results.

Asterisk (*) Wildcard

Using the asterisk wildcard will return words that share the truncated root but have different endings. In the following example, the search term education was truncated after the letter "t."

Original Search	Truncated Search
adult education	adult educat*
	Results included: educate, education, educator, educators, educators', educating, and educational
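
The way a truncated term expands can be approximated with a regular expression. This is a rough sketch only: it assumes the asterisk stands for zero or more word characters, and truncation_to_regex is a hypothetical helper name. Actual wildcard rules vary by database, so check the help pages for the one you are using.

```python
import re


def truncation_to_regex(term):
    """Turn a truncated term such as 'educat*' into a whole-word pattern.

    Assumption: '*' means "zero or more word characters".
    """
    escaped = re.escape(term).replace(r"\*", r"\w*")
    return re.compile(rf"\b{escaped}\b", re.IGNORECASE)


pattern = truncation_to_regex("educat*")

words = [
    "educate", "education", "educator", "educators",
    "educating", "educational", "edict", "school",
]
matches = [w for w in words if pattern.fullmatch(w)]

print(matches)
# ['educate', 'education', 'educator', 'educators', 'educating', 'educational']
```

Words that do not begin with the root, such as "edict" and "school", are not matched.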

Explore these database help pages for additional information on crafting search terms.

EBSCO Connect: Searching with Wildcards and Truncation Symbols
EBSCO Connect: Searching with Boolean Operators
EBSCO Connect: EBSCOhost Search Tips
EBSCO Connect: Basic Searching with EBSCO
ProQuest Help: Search Tips
ERIC: How does ERIC search work?

★ EBSCO Databases & Google Drive

Tips for saving research directly to Google Drive.

Researching in an EBSCO database?

It is possible to save articles (PDF and HTML) and abstracts in EBSCOhost databases directly to Google Drive. Select the Google Drive icon and authenticate using a Google account; an EBSCO folder will then be created in your account. This is a great option for managing your research. If you are documenting your research in a Google Doc, consider linking the information to the actual articles saved in Drive.

EBSCOHost Databases & Google Drive: Managing your Research

This video features an overview of how to use Google Drive with EBSCO databases to help manage your research. It presents information for connecting an active Google account to EBSCO and steps needed to provide permission for EBSCO to manage a folder in Drive.

About the Video: Closed captioning is available, select CC from the video menu. If you need to review a specific area on the video, view on YouTube and expand the video description for access to topic time stamps. A video transcript is provided below.

EBSCOhost Databases & Google Scholar

Defining Literature Review

What is a literature review?

A definition from the Online Dictionary for Library and Information Science.

A literature review is "a comprehensive survey of the works published in a particular field of study or line of research, usually over a specific period of time, in the form of an in-depth, critical bibliographic essay or annotated list in which attention is drawn to the most significant works" (Reitz, 2014).

A systematic review is "a literature review focused on a specific research question, which uses explicit methods to minimize bias in the identification, appraisal, selection, and synthesis of all the high-quality evidence pertinent to the question" (Reitz, 2014).

About this page

EBSCO Connect [Discovery and Search]. (2022). Searching with boolean operators. Retrieved May 3, 2022, from https://connect.ebsco.com/s/?language=en_US

EBSCO Connect [Discovery and Search]. (2022). Searching with wildcards and truncation symbols. Retrieved May 3, 2022, from https://connect.ebsco.com/s/?language=en_US

Machi, L.A. & McEvoy, B.T. (2009). The literature review. Thousand Oaks, CA: Corwin Press.

Reitz, J.M. (2014). Online dictionary for library and information science. ABC-CLIO, Libraries Unlimited. Retrieved from https://www.abc-clio.com/ODLIS/odlis_A.aspx

Ridley, D. (2008). The literature review: A step-by-step guide for students. Thousand Oaks, CA: Sage Publications, Inc.

Archer Librarians

Schedule an appointment.

Contact a librarian directly (email), or submit a request form. If you have worked with someone before, you can request them on the form.

★ Archer Library Help • Online Request Form
Carrie Halquist • Reference & Instruction
Jessica Byers • Reference & Curation
Don Reams • Corrections Education & Reference
Diane Schrecker • Education & Head of the IRC
Tanaya Silcox • Technical Services & Business
Sarah Thomas • Acquisitions & ATS Librarian
Last Updated: Jun 27, 2024 11:14 AM
URL: https://libguides.ashland.edu/qualitative

Archer Library • Ashland University © Copyright 2023. An Equal Opportunity/Equal Access Institution.

How to Write a Literature Review | Guide, Examples, & Templates

Published on January 2, 2023 by Shona McCombes. Revised on September 11, 2023.

What is a literature review? A literature review is a survey of scholarly sources on a specific topic. It provides an overview of current knowledge, allowing you to identify relevant theories, methods, and gaps in the existing research that you can later apply to your paper, thesis, or dissertation topic.

There are five key steps to writing a literature review:

Search for relevant literature
Evaluate sources
Identify themes, debates, and gaps
Outline the structure
Write your literature review

A good literature review does more than summarize sources: it analyzes, synthesizes, and critically evaluates them to give a clear picture of the state of knowledge on the subject.

What is the purpose of a literature review
Examples of literature reviews
Step 1 – Search for relevant literature
Step 2 – Evaluate and select sources
Step 3 – Identify themes, debates, and gaps
Step 4 – Outline your literature review’s structure
Step 5 – Write your literature review
Free lecture slides
Other interesting articles
Frequently asked questions

When you write a thesis, dissertation, or research paper, you will likely have to conduct a literature review to situate your research within existing knowledge. The literature review gives you a chance to:

Demonstrate your familiarity with the topic and its scholarly context
Develop a theoretical framework and methodology for your research
Position your work in relation to other researchers and theorists
Show how your research addresses a gap or contributes to a debate
Evaluate the current state of research and demonstrate your knowledge of the scholarly debates around your topic.

Writing literature reviews is a particularly important skill if you want to apply for graduate school or pursue a career in research. We’ve written a step-by-step guide that you can follow below.

Writing literature reviews can be quite challenging! A good starting point could be to look at some examples, depending on what kind of literature review you’d like to write.

Example literature review #1: “Why Do People Migrate? A Review of the Theoretical Literature” (Theoretical literature review about the development of economic migration theory from the 1950s to today.)
Example literature review #2: “Literature review as a research methodology: An overview and guidelines” (Methodological literature review about interdisciplinary knowledge acquisition and production.)
Example literature review #3: “The Use of Technology in English Language Learning: A Literature Review” (Thematic literature review about the effects of technology on language acquisition.)
Example literature review #4: “Learners’ Listening Comprehension Difficulties in English Language Learning: A Literature Review” (Chronological literature review about how the concept of listening skills has changed over time.)

Before you begin searching for literature, you need a clearly defined topic.

If you are writing the literature review section of a dissertation or research paper, you will search for literature related to your research problem and questions.

Make a list of keywords

Start by creating a list of keywords related to your research question. Include each of the key concepts or variables you’re interested in, and list any synonyms and related terms. You can add to this list as you discover new keywords in the process of your literature search.

Social media, Facebook, Instagram, Twitter, Snapchat, TikTok
Body image, self-perception, self-esteem, mental health
Generation Z, teenagers, adolescents, youth

Search for relevant sources

Use your keywords to begin searching for sources. Some useful databases to search for journals and articles include:

Your university’s library catalogue
Google Scholar
Project Muse (humanities and social sciences)
Medline (life sciences and biomedicine)
EconLit (economics)
Inspec (physics, engineering and computer science)

You can also use boolean operators to help narrow down your search.
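
One common way to turn a keyword list into a search string is to join the synonyms within each concept with OR and then join the concepts with AND. The sketch below applies that pattern to the example keywords listed earlier. The build_query and quote helpers are hypothetical, and wrapping multi-word terms in double quotes is an assumption about phrase searching that you should verify in each database.

```python
# Keyword groups from the example above: one list per key concept.
concept_groups = [
    ["social media", "Facebook", "Instagram", "Twitter", "Snapchat", "TikTok"],
    ["body image", "self-perception", "self-esteem", "mental health"],
    ["Generation Z", "teenagers", "adolescents", "youth"],
]


def quote(term):
    """Wrap multi-word terms in double quotes (assumed phrase-search syntax)."""
    return f'"{term}"' if " " in term else term


def build_query(groups):
    """OR the synonyms inside each group, then AND the groups together."""
    clauses = ["(" + " OR ".join(quote(t) for t in group) + ")" for group in groups]
    return " AND ".join(clauses)


query = build_query(concept_groups)
print(query)
```

Adding a synonym to a group widens the search, while adding a whole new concept group narrows it.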

Make sure to read the abstract to find out whether an article is relevant to your question. When you find a useful book or article, you can check the bibliography to find other relevant sources.

You likely won’t be able to read absolutely everything that has been written on your topic, so it will be necessary to evaluate which sources are most relevant to your research question.

For each publication, ask yourself:

What question or problem is the author addressing?
What are the key concepts and how are they defined?
What are the key theories, models, and methods?
Does the research use established frameworks or take an innovative approach?
What are the results and conclusions of the study?
How does the publication relate to other literature in the field? Does it confirm, add to, or challenge established knowledge?
What are the strengths and weaknesses of the research?

Make sure the sources you use are credible, and make sure you read any landmark studies and major theories in your field of research.

Take notes and cite your sources

As you read, you should also begin the writing process. Take notes that you can later incorporate into the text of your literature review.

It is important to keep track of your sources with citations to avoid plagiarism. It can be helpful to make an annotated bibliography, where you compile full citation information and write a paragraph of summary and analysis for each source. This helps you remember what you read and saves time later in the process.

To begin organizing your literature review’s argument and structure, be sure you understand the connections and relationships between the sources you’ve read. Based on your reading and notes, you can look for:

Trends and patterns (in theory, method or results): do certain approaches become more or less popular over time?
Themes: what questions or concepts recur across the literature?
Debates, conflicts and contradictions: where do sources disagree?
Pivotal publications: are there any influential theories or studies that changed the direction of the field?
Gaps: what is missing from the literature? Are there weaknesses that need to be addressed?

This step will help you work out the structure of your literature review and (if applicable) show how your own research will contribute to existing knowledge.

Most research has focused on young women.
There is an increasing interest in the visual aspects of social media.
But there is still a lack of robust research on highly visual platforms like Instagram and Snapchat—this is a gap that you could address in your own research.

There are various approaches to organizing the body of a literature review. Depending on the length of your literature review, you can combine several of these strategies (for example, your overall structure might be thematic, but each theme is discussed chronologically).

Chronological

The simplest approach is to trace the development of the topic over time. However, if you choose this strategy, be careful to avoid simply listing and summarizing sources in order.

Try to analyze patterns, turning points and key debates that have shaped the direction of the field. Give your interpretation of how and why certain developments occurred.

Thematic

If you have found some recurring central themes, you can organize your literature review into subsections that address different aspects of the topic.

For example, if you are reviewing literature about inequalities in migrant health outcomes, key themes might include healthcare policy, language barriers, cultural attitudes, legal status, and economic access.

Methodological

If you draw your sources from different disciplines or fields that use a variety of research methods, you might want to compare the results and conclusions that emerge from different approaches. For example:

Look at what results have emerged in qualitative versus quantitative research
Discuss how the topic has been approached by empirical versus theoretical scholarship
Divide the literature into sociological, historical, and cultural sources

Theoretical

A literature review is often the foundation for a theoretical framework. You can use it to discuss various theories, models, and definitions of key concepts.

You might argue for the relevance of a specific theoretical approach, or combine various theoretical concepts to create a framework for your research.

Like any other academic text, your literature review should have an introduction, a main body, and a conclusion. What you include in each depends on the objective of your literature review.

The introduction should clearly establish the focus and purpose of the literature review.

Depending on the length of your literature review, you might want to divide the body into subsections. You can use a subheading for each theme, time period, or methodological approach.

As you write, you can follow these tips:

Summarize and synthesize: give an overview of the main points of each source and combine them into a coherent whole
Analyze and interpret: don’t just paraphrase other researchers — add your own interpretations where possible, discussing the significance of findings in relation to the literature as a whole
Critically evaluate: mention the strengths and weaknesses of your sources
Write in well-structured paragraphs: use transition words and topic sentences to draw connections, comparisons and contrasts

In the conclusion, you should summarize the key findings you have taken from the literature and emphasize their significance.

When you’ve finished writing and revising your literature review, don’t forget to proofread thoroughly before submitting.

This article has been adapted into lecture slides that you can use to teach your students about writing a literature review.

Scribbr slides are free to use, customize, and distribute for educational purposes.

A literature review is a survey of scholarly sources (such as books, journal articles, and theses) related to a specific topic or research question.

It is often written as part of a thesis, dissertation, or research paper, in order to situate your work in relation to existing knowledge.

There are several reasons to conduct a literature review at the beginning of a research project:

To familiarize yourself with the current state of knowledge on your topic
To ensure that you’re not just repeating what others have already done
To identify gaps in knowledge and unresolved problems that your research can address
To develop your theoretical framework and methodology
To provide an overview of the key findings and debates on the topic

Writing the literature review shows your reader how your work relates to existing research and what new insights it will contribute.

The literature review usually comes near the beginning of your thesis or dissertation. After the introduction, it grounds your research in a scholarly field and leads directly to your theoretical framework or methodology.

A literature review is a survey of credible sources on a topic, often used in dissertations, theses, and research papers. Literature reviews give an overview of knowledge on a subject, helping you identify relevant theories and methods, as well as gaps in existing research. Literature reviews are set up similarly to other academic texts, with an introduction, a main body, and a conclusion.

An annotated bibliography is a list of source references that has a short description (called an annotation) for each of the sources. It is often assigned as part of the research process for a paper.

Cite this Scribbr article

McCombes, S. (2023, September 11). How to Write a Literature Review | Guide, Examples, & Templates. Scribbr. Retrieved June 24, 2024, from https://www.scribbr.com/dissertation/literature-review/

Chapter 9. Reviewing the Literature

What Is a “Literature Review”?

No researcher ever comes up with a research question that is wholly novel. Someone, somewhere, has asked the same thing. Academic research is part of a larger community of researchers, and it is your responsibility, as a member of this community, to acknowledge others who have asked similar questions and to put your particular research into this greater context. It is not simply a convention or custom to begin your study with a review of previous literature (the “lit review”) but an important responsibility you owe the scholarly community.

Too often, new researchers pursue a topic to study and then write something like, “No one has ever studied this before” or “This area is underresearched.” It may be that no one has studied this particular group or setting, but it is highly unlikely no one has studied the foundational phenomenon of interest. And that comment about an area being underresearched? Be careful. The statement may simply signal to others that you haven’t done your homework. Rubin (2021) refers to this as “free soloing,” and it is not appreciated in academic work:

The truth of the matter is, academics don’t really like when people free solo. It’s really bad form to omit talking about the other people who are doing or have done research in your area. Partly, I mean we need to cite their work, but I also mean we need to respond to it—agree or disagree, clarify or extend. It’s also really bad form to talk about your research in a way that does not make it understandable to other academics.…You have to explain to your readers what your story is really about in terms they care about. This means using certain terminology, referencing debates in the literature, and citing relevant works—that is, in connecting your work to something else. (51–52)

A literature review is a comprehensive summary of previous research on a topic. It includes both articles and books—and in some cases reports—relevant to a particular area of research. Ideally, one’s research question follows from the reading of what has already been produced. For example, you are interested in studying sports injuries related to female gymnasts. You read everything you can find on sports injuries related to female gymnasts, and you begin to get a sense of what questions remain open. You find that there is a lot of research on how coaches manage sports injuries and much about cultures of silence around treating injuries, but you don’t know what the gymnasts themselves are thinking about these issues. You look specifically for studies about this and find several, which then pushes you to narrow the question further. Your literature review then provides the road map of how you came to your very specific question, and it puts your study in the context of studies of sports injuries. What you eventually find can “speak to” all the related questions as well as your particular one.

In practice, the process is often a bit messier. Many researchers, and not simply those starting out, begin with a particular question and have a clear idea of who they want to study and where they want to conduct their study but don’t really know much about other studies at all. Although backward, we need to recognize this is pretty common. Telling students to “find literature” after the fact can seem like a purposeless task or just another hurdle for completing a thesis or dissertation. It is not! Even if you were not motivated by the literature in the first place, acknowledging similar studies and connecting your own research to those studies are important parts of building knowledge. Acknowledgment of past research is a responsibility you owe the discipline to which you belong.

Literature reviews can also signal theoretical approaches and particular concepts that you will incorporate into your own study. For example, let us say you are doing a study of how people find their first jobs after college, and you want to use the concept of social capital. There are competing definitions of social capital out there (e.g., Bourdieu vs. Burt vs. Putnam). Bourdieu’s notion is of one form of capital, or durable asset, of a “network of more or less institutionalized relationships of mutual acquaintance or recognition” (1984:248). Burt emphasizes the “brokerage opportunities” in a social network as social capital (1997:355). Putnam’s social capital is all about “facilitating coordination and cooperation for mutual benefit” (2001:67). Your literature review can adjudicate among these three approaches, or it can simply refer to the one that is animating your own research. If you include Bourdieu in your literature review, readers will know “what kind” of social capital you are talking about as well as what kind of social scientist you yourself are. They will likely understand that you are interested more in how some people are advantaged by their social capital relative to others rather than being interested in the mechanics of how social networks operate.

The literature review thus does two important things for you: firstly, it allows you to acknowledge previous research in your area of interest, thereby situating you within a discipline or body of scholars, and, secondly, it demonstrates that you know what you are talking about. If you present the findings of your research study without including a literature review, it can be like singing into the wind. It sounds nice, but no one really hears it, or if they do catch snippets, they don’t know where it is coming from.

Examples of Literature Reviews

To help you get a grasp of what a good literature review looks like and how it can advance your study, let’s take a look at a few examples.

Reader-Friendly Example: The Power of Peers

The first is by Janice McCabe (2016) and is from an article on peer networks in the journal Contexts. Contexts presents articles in a relatively reader-friendly format, with the goal of reaching a large audience for interesting sociological research. Read this example carefully and note how easily McCabe is able to convey the relevance of her own work by situating it in the context of previous studies:

Scholars who study education have long acknowledged the importance of peers for students’ well-being and academic achievement. For example, in 1961, James Coleman argued that peer culture within high schools shapes students’ social and academic aspirations and successes. More recently, Judith Rich Harris has drawn on research in a range of areas—from sociological studies of preschool children to primatologists’ studies of chimpanzees and criminologists’ studies of neighborhoods—to argue that peers matter much more than parents in how children “turn out.” Researchers have explored students’ social lives in rich detail, as in Murray Milner’s book about high school students, Freaks, Geeks, and Cool Kids , and Elizabeth Armstrong and Laura Hamilton’s look at college students, Paying for the Party . These works consistently show that peers play a very important role in most students’ lives. They tend, however, to prioritize social over academic influence and to use a fuzzy conception of peers rather than focusing directly on friends—the relationships that should matter most for student success. Social scientists have also studied the power of peers through network analysis, which is based on uncovering the web of connections between people. Network analysis involves visually mapping networks and mathematically comparing their structures (such as the density of ties) and the positions of individuals within them (such as how central a given person is within the network). As Nicholas Christakis and James Fowler point out in their book Connected , network structure influences a range of outcomes, including health, happiness, wealth, weight, and emotions. Given that sociologists have long considered network explanations for social phenomena, it’s surprising that we know little about how college students’ friends impact their experiences. 
In line with this network tradition, I focus on the structure of friendship networks, constructing network maps so that the differences we see across participants are due to the underlying structure, including each participant’s centrality in their friendship group and the density of ties among their friends. ( 23 )

What did you notice? In her very second sentence, McCabe uses “for example” to introduce a study by Coleman, thereby indicating that she is not going to tell you every single study in this area but is going to tell you that (1) there is a lot of research in this area, (2) it has been going on since at least 1961, and (3) it is still relevant (i.e., recent studies are still being done now). She ends her first paragraph by summarizing the body of literature in this area (after giving you a few examples) and then telling you what may have been (so far) left out of this research. In the second paragraph, she shifts to a separate interesting focus that is related to the first but is also quite distinct. Lit reviews very often include two (or three) distinct strands of literature, the combination of which nicely backgrounds this particular study . In the case of our female gymnast study (above), those two strands might be (1) cultures of silence around sports injuries and (2) the importance of coaches. McCabe concludes her short and sweet literature review with one sentence explaining how she is drawing from both strands of the literature she has succinctly presented for her particular study. This example should show you that literature reviews can be readable, helpful, and powerful additions to your final presentation.

Authoritative Academic Journal Example: Working Class Students’ College Expectations

The second example is more typical of academic journal writing. It is an article published in the British Journal of Sociology of Education by Wolfgang Lehmann ( 2009 ):

Although this increase in post-secondary enrolment and the push for university is evident across gender, race, ethnicity, and social class categories, access to university in Canada continues to be significantly constrained for those from lower socio-economic backgrounds (Finnie, Lascelles, and Sweetman 2005). Rising tuition fees coupled with an overestimation of the cost and an underestimation of the benefits of higher education has put university out of reach for many young people from low-income families (Usher 2005). Financial constraints aside, empirical studies in Canada have shown that the most important predictor of university access is parental educational attainment. Having at least one parent with a university degree significantly increases the likelihood of a young person to attend academic-track courses in high school, have high educational and career aspirations, and ultimately attend university (Andres et al. 1999, 2000; Lehmann 2007a). Drawing on Bourdieu’s various writing on habitus and class-based dispositions (see, for example, Bourdieu 1977, 1990), Hodkinson and Sparkes (1997) explain career decisions as neither determined nor completely rational. Instead, they are based on personal experiences (e.g., through employment or other exposure to occupations) and advice from others. Furthermore, they argue that we have to understand these decisions as pragmatic, rather than rational. They are pragmatic in that they are based on incomplete and filtered information, because of the social context in which the information is obtained and processed. New experiences and information can, however, also be allowed into one’s world, where they gradually or radically transform habitus, which in turn creates the possibility for the formation of new and different dispositions. Encountering a supportive teacher in elementary or secondary school, having ambitious friends, or chance encounters can spark such transformations. 
Transformations can be confirming or contradictory, they can be evolutionary or dislocating. Working-class students who enter university most certainly encounter such potentially transformative situations. Granfield (1991) has shown how initially dislocating feelings of inadequacy and inferiority of working-class students at an elite US law school were eventually replaced by an evolutionary transformation, in which the students came to dress, speak and act more like their middle-class and upper-class peers. In contrast, Lehmann (2007b) showed how persistent habitus dislocation led working-class university students to drop out of university. Foskett and Hemsley-Brown (1999) argue that young people’s perceptions of careers are a complex mix of their own experiences, images conveyed through adults, and derived images conveyed by the media. Media images of careers, perhaps, are even more important for working-class youth with high ambitions as they offer (generally distorted) windows into a world of professional employment to which they have few other sources of access. It has also been argued that working-class youth who do continue to university still face unique, class-specific challenges, evident in higher levels of uncertainty (Baxter and Britton 2001; Lehmann 2004, 2007a; Quinn 2004), their higher education choices (Ball et al. 2002; Brooks 2003; Reay et al. 2001) and fears of inadequacy because of their cultural outsider status (Aries and Seider 2005; Granfield 1991). Although the number of working-class university students in Canada has slowly increased, that of middle-class students at university has risen far more steeply (Knighton and Mizra 2002). 
These different enrolment trajectories have actually widened the participation gap, which in turn explains our continued concerns with the potential outsider status. Indeed, in a study comparing first-generation working-class and traditional students who left university without graduating, Lehmann (2007b) found that first-generation working-class students were more likely to leave university very early, in some cases within the first two months of enrollment. They were also more likely to leave university despite solid academic performance. Not “fitting in,” not “feeling university,” and not being able to “relate to these people” were key reasons for eventually withdrawing from university. From the preceding review of the literature, a number of key research questions arise: How do working-class university students frame their decision to attend university? How do they defy the considerable odds documented in the literature to attend university? What are the sources of information and various images that create dispositions to study at university? What role does their social-class background- or habitus play in their transition dispositions and how does this translate into expectations for university? ( 139 )

What did you notice here? How is this different from (and similar to) the first example? Note that rather than provide you with one or two illustrative examples of similar types of research, Lehmann provides abundant source citations throughout. He includes theory and concepts too. Like McCabe, Lehmann is weaving through multiple literature strands: the class gap in higher education participation in Canada, class-based dispositions, and obstacles facing working-class college students. Note how he concludes the literature review by placing his research questions in context.

Find other articles of interest and read their literature reviews carefully. I’ve included two more for you at the end of this chapter. Just as you learned how to diagram a sentence in elementary school (hopefully!), try diagramming the literature reviews. What are the “different strands” of research being discussed? How does the author connect these strands to their own research questions? Where is theory in the lit review, and how is it incorporated (e.g., is it a separate strand of its own, or is it inextricably linked with previous research in this area)?

One model of how to structure your literature review can be found in table 9.1. More tips, hints, and practices will be discussed later in the chapter.

Table 9.1. Model of Literature Review, Adapted from Calarco (2020:166)

What we know about some issue	Lays the foundation for your
What we don't know about that issue	Lays foundation for your
Why that unanswered question is important to ask	Hints at of your study
What existing research tells us about the best way to answer that unanswered question	Lays foundation for justifying your
What existing research might predict as the answer to the question	Justifies your "hypothesis" or

Embracing Theory

A good research study will, in some form or another, use theory. Depending on your particular study (and possibly the preferences of the members of your committee), theory may be built into your literature review. Or it may form its own section in your research proposal/design (e.g., “literature review” followed by “theoretical framework”). In my own experience, I see a lot of graduate students grappling with the requirement to “include theory” in their research proposals. Things get a little squiggly here because there are different ways of incorporating theory into a study (Are you testing a theory? Are you generating a theory?), and based on these differences, your literature review proper may include works that describe, explain, and otherwise set forth theories, concepts, or frameworks you are interested in, or it may not do this at all. Sometimes a literature review sets forth what we know about a particular group or culture totally independent of what kinds of theoretical framework or particular concepts you want to explore. Indeed, the big point of your study might be to bring together a body of work with a theory that has never been applied to it previously. All this is to say that there is no one correct way to approach the use of theory and the writing about theory in your research proposal.

Students are often scared of embracing theory because they do not exactly understand what it is. Sometimes, it seems like an arbitrary requirement. You’re interested in a topic; maybe you’ve even done some research in the area and you have findings you want to report. And then a committee member reads over what you have and asks, “So what?” This question is a good clue that you are missing theory, the part that connects what you have done to what other researchers have done and are doing. You might stumble upon this rather accidentally and not know you are embracing theory, as in a case where you seek to replicate a prior study under new circumstances and end up finding that a particular correlation between behaviors only happens when mediated by something else. There’s theory in there, if you can pull it out and articulate it. Or it might be that you are motivated to do more research on racial microaggressions because you want to document their frequency in a particular setting, taking for granted the kind of critical race theoretical framework that has done the hard work of defining and conceptualizing “microaggressions” in the first place. In that case, your literature review could be a review of Critical Race Theory, specifically related to this one important concept. That’s the way to bring your study into a broader conversation while also acknowledging (and honoring) the hard work that has preceded you.

Rubin ( 2021 ) classifies ways of incorporating theory into case study research into four categories, each of which might be discussed somewhat differently in a literature review or theoretical framework section. The first, the least theoretical, is where you set out to study a “configurative idiographic case” ( 70 ). This is where you set out to describe a particular case, leaving yourself pretty much open to whatever you find. You are not expecting anything based on previous literature. This is actually pretty weak as far as research design goes, but it is probably the default for novice researchers. Your committee members should probably help you situate this in previous literature in some way or another. If they cannot, and it really does appear you are looking at something fairly new that no one else has bothered to research before, and you really are completely open to discovery, you might try using a Grounded Theory approach, which is a methodological approach that foregrounds the generation of theory. In that case, your “theory” section can be a discussion of “Grounded Theory” methodology (confusing, yes, but if you take some time to ponder, you will see how this works). You will still need a literature review, though, ideally one that describes other studies that have looked at anything remotely like what you are looking at: parallel cases that have been researched.

The second approach is the “disciplined configurative case,” in which theory is applied to explain a particular case or topic. You are not trying to test the theory but rather assuming the theory is correct, as in the case of exploring microaggressions in a particular setting. In this case, you really do need to have a separate theory section in addition to the literature review, one in which you clearly define the theoretical framework, including any of its important concepts. You can use this section to discuss how other researchers have used the concepts and note any discrepancies in definitions or operationalization of those concepts. This way you will be sure to design your study so that it speaks to and with other researchers. If everyone who is writing about microaggressions has a different definition of them, it is hard for others to compare findings or make any judgments about their prevalence (or any number of other important characteristics). Your literature review section may then stand alone and describe previous research in the particular area or setting, irrespective of the kinds of theory underlying those studies.

The third approach is “heuristic,” one in which you seek to identify new variables, hypotheses, mechanisms, or paths not yet explained by a theory or theoretical framework. In a way, you are generating new theory, but it is probably more accurate to say that you are extending or deepening preexisting theory. In this case, having a single literature review that is focused on the theory and the ways the theory has been applied and understood (with all its various mechanisms and pathways) is probably your best option. The focus of the literature reviewed is less on the case and more on the theory you are seeking to extend.

The final approach is “theory testing,” which is much rarer in qualitative studies than in quantitative, where this is the default approach. Theory-testing cases are those where a particular case is used to see if an existing theory is accurate or accurate under particular circumstances. As with the heuristic approach, your literature review will probably draw heavily on previous uses of the theory, but you may end up having a special section specifically about cases very close to your own . In other words, the more your study approaches theory testing, the more likely there is to be a set of similar studies to draw on or even one important key study that you are setting your own study up in parallel to in order to find out if the theory generated there operates here.

If we wanted to get very technical, it might be useful to distinguish theoretical frameworks properly from conceptual frameworks. The latter are a bit looser and, given the nature of qualitative research, often fit exploratory studies. Theoretical frameworks rely on specific theories and are essential for theory-testing studies. Conceptual frameworks can pull in specific concepts or ideas that may or may not be linked to particular theories. Think about it this way: A theory is a story of how the world works. Concepts don’t presume to explain the whole world but instead are ways to approach phenomena to help make sense of them. Microaggressions are a concept linked to Critical Race Theory. One could contextualize one’s study within Critical Race Theory and then draw various concepts, such as that of microaggressions, from the overall theoretical framework. Or one could bracket out the master theory or framework and employ the concept of microaggression more opportunistically as a phenomenon of interest. If you are unsure of what theory you are using, you might want to frame a more practical conceptual framework in your review of the literature.

Helpful Tips

How to Maintain Good Notes on What You Read

Over the years, I have developed various ways of organizing notes on what I read. At first, I used a single sheet of full-size paper with a preprinted list of questions and points clearly addressed on the front side, leaving the second side for more reflective comments and free-form musings about what I read, why it mattered, and how it might be useful for my research. Later, I developed a system in which I use a single 4″ × 6″ note card for each book I read. I try only to use the front side (and write very small), leaving the back for comments that are about not just this reading but things to do or examine or consider based on the reading. These notes often mean nothing to anyone else picking up the card, but they make sense to me. I encourage you to find an organizing system that works for you. Then when you set out to compose a literature review, instead of staring at five to ten books or a dozen articles, you will have ten neatly printed pages or notecards or files that have distilled what is important to know about your reading.

It is also a good idea to store this data digitally, perhaps through a reference manager. I use RefWorks, but I also recommend EndNote or any other system that allows you to search institutional databases. Your campus library will probably provide access to one of these or another system. Most systems will allow you to export references from another manager if and when you decide to move to another system. Reference managers allow you to sort through all your literature by descriptor, author, year, and so on. Even so, I personally like to have the ability to manually sort through my index cards, recategorizing things I have read as I go. I use RefWorks to keep a record of what I have read, with proper citations, so I can create bibliographies more easily, and I do add in a few “notes” there, but the bulk of my notes are kept in longhand.

What kinds of information should you include from your reading? Here are some bulleted suggestions from Calarco ( 2020:113–114 ), with my own emendations:

Citation . If you are using a reference manager, you can import the citation and then, when you are ready to create a bibliography, you can use a provided menu of citation styles, which saves a lot of time. If you’ve originally formatted in Chicago Style but the journal you are writing for wants APA style, you can change your entire bibliography in less than a minute. When using a notecard for a book, I include author, title, date as well as the library call number (since most of what I read I pull from the library). This is something RefWorks is not able to do, and it helps when I categorize.

I begin each notecard with an “intro” section, where I record the aims, goals, and general point of the book/article as explained in the introductory sections (which might be the preface, the acknowledgments, or the first two chapters). I then draw a bold line underneath this part of the notecard. Everything after that should be chapter specific. Included in this intro section are things such as the following, recommended by Calarco ( 2020 ):

Key background . “Two to three short bullet points identifying the theory/prior research on which the authors are building and defining key terms.”
Data/methods . “One or two short bullet points with information about the source of the data and the method of analysis, with a note if this is a novel or particularly effective example of that method.” I use [M] to signal methodology on my notecard, which might read, “[M] Int[erview]s (n=35), B[lack]/W[hite] voters” (I need shorthand to fit on my notecard!).
Research question . “Stated as briefly as possible.” I always provide page numbers so I can go back and see exactly how this was stated (sometimes, in qualitative research, there are multiple research questions, and they cannot be stated simply).
Argument/contributions . “Two to three short bullet points briefly describing the authors’ answer to the central research question and its implication for research, theory, and practice.” I use [ARG] to signify the argument, and I make sure this is prominently visible on my notecard. I also provide page numbers here.

For me, all of this fits in the “intro” section, which, if this is a theoretically rich, methodologically sound book, might take up a third or even half of the front page of my notecard. Beneath the bold underline, I report specific findings or particulars of the book as they emerge chapter by chapter. Calarco’s ( 2020 ) next step is the following:

Key findings . “Three to four short bullet points identifying key patterns in the data that support the authors’ argument.”

All that remains is writing down thoughts that occur upon finishing the article/book. I use the back of the notecard for these kinds of notes. Often, they reach out to other things I have read (e.g., “Robinson reminds me of Crusoe here in that both are looking at the effects of social isolation, but I think Robinson makes a stronger argument”). Calarco ( 2020 ) concludes similarly with the following:

Unanswered questions . “Two to three short bullet points that identify key limitations of the research and/or questions the research did not answer that could be answered in future research.”

As I mentioned, when I first began taking notes like this, I preprinted pages with prompts for “research question,” “argument,” and so on. This was a great way to remind myself to look for these things in particular. You can do the same, adding whatever preprinted sections make sense to you, given what you are studying and the important aspects of your discipline. The other nice thing about the preprinted forms is that it keeps your writing to a minimum—you cannot write more than the allotted space, even if you might want to, preventing your notes from spiraling out of control. This can be helpful when we are new to a subject and everything seems worth recording!
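If you keep your notes digitally rather than on paper, the same prompts can be turned into a simple template. The following is only a minimal sketch in Python: the class name `ReadingNote`, its field names, and the sample entry are all hypothetical, chosen to mirror the prompts listed above, and are not part of Calarco's or any reference manager's system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReadingNote:
    """One digital 'notecard' per book or article (field names are illustrative)."""
    citation: str
    key_background: List[str] = field(default_factory=list)
    data_methods: List[str] = field(default_factory=list)
    research_question: str = ""
    argument: List[str] = field(default_factory=list)
    key_findings: List[str] = field(default_factory=list)
    unanswered_questions: List[str] = field(default_factory=list)
    reflections: str = ""  # the "back of the card": free-form thoughts

    def missing_sections(self) -> List[str]:
        """Return the names of the prompts that are still blank."""
        return [name for name, value in vars(self).items() if not value]


# A made-up entry, for illustration only (not a real source).
note = ReadingNote(
    citation="Hypothetical Author (2020). Hypothetical Title.",
    research_question="How do first-year students form study groups? (p. 4)",
    argument=["[ARG] Study groups form around shared classes (p. 12)"],
)

# Lists the prompts you have not filled in yet.
print(note.missing_sections())
```

Like a preprinted form, a fixed set of fields reminds you what to look for, and `missing_sections` shows at a glance which prompts you skipped.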

After years of discipline, I have finally settled on my notecard approach. I have thousands of notecards, organized in several index card filing boxes stacked in my office. On the top right of each card is a note of the month/day I finished reading the item. I can remind myself what I read in the summer of 2010 if the need or desire ever arose to do so…those invaluable notecards are like a memento of what my brain has been up to!

Where to Start Looking for Literature

Your university library should provide access to one of several searchable databases for academic books and articles. My own preference is JSTOR, a service of ITHAKA, a not-for-profit organization that works to advance and preserve knowledge and to improve teaching and learning through the use of digital technologies. JSTOR allows you to search by several keywords and to narrow your search by type of material (articles or books). For many disciplines, the “literature” of the literature review is expected to be peer-reviewed “articles,” but some disciplines will also value books and book chapters. JSTOR is particularly useful for article searching. You can submit several keywords and see what is returned, and you can also narrow your search by a particular journal or discipline. If your discipline has one or two key journals (e.g., the American Journal of Sociology and the American Sociological Review are key for sociology), you might want to go directly to those journals’ websites and search for your topic area. There is an art to when to cast your net widely and when to refine your search, and you may have to tack back and forth to ensure that you are getting all that is relevant but not getting bogged down in all studies that might have some marginal relevance.

Some articles will carry more weight than others, and you can use applications like Google Scholar to see which articles have made and are continuing to make larger impacts on your discipline. Find these articles and read them carefully; use their literature review and the sources cited in those articles to make sure you are capturing what is relevant. This is actually a really good way of finding relevant books—only the most impactful will make it into the citations of journals. Over time, you will notice that a handful of articles (or books) are cited so often that when you see, say, Armstrong and Hamilton ( 2015 ), you know exactly what book this is without looking at the full cite. This is when you know you are in the conversation.

You might also approach a professor whose work is broadly in the area of your interest and ask them to recommend one or two “important” foundational articles or books. You can then use the references cited in those recommendations to build up your literature. Just be careful: some older professors’ knowledge of the literature (and I reluctantly add myself here) may be a bit outdated! It is best that the article or book whose references and sources you use to build your body of literature be relatively current.

Keep a List of Your Keywords

When using searchable databases, it is a good idea to keep a list of all the keywords you use as you go along so that (1) you do not needlessly duplicate your efforts and (2) you can more easily adjust your search as you get a better sense of what you are looking for. I suggest you keep a separate file or even a small notebook for this and you date your search efforts.

Here’s an example:

Table 9.2. Keep a List of Your Keywords


	JSTOR search: “literature review” + “qualitative research” limited to “after 1/1/2000” and “articles” in abstracts only	5 results: go back and search titles? Change up keywords? Take out qualitative research term?
	JSTOR search: “literature review” + and “articles” in abstracts only	37,113 results – way too many!!!!
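A keyword list like this can also be kept as a small dated file. The sketch below, in Python, is one possible way to do it and rests on assumptions: the function names `log_search` and `to_csv`, the column names, and the date are hypothetical, and the single entry simply restates the first search in Table 9.2.

```python
import csv
import io
from datetime import date

FIELDS = ["date", "database", "keywords", "limits", "results", "next_step"]


def log_search(log, database, keywords, limits, results, next_step, when=None):
    """Append one dated search to the log; skip it if the same search was already run."""
    entry = {
        "date": (when or date.today()).isoformat(),
        "database": database,
        "keywords": " + ".join(keywords),
        "limits": limits,
        "results": results,
        "next_step": next_step,
    }
    already_run = any(
        (row["database"], row["keywords"], row["limits"])
        == (entry["database"], entry["keywords"], entry["limits"])
        for row in log
    )
    if not already_run:
        log.append(entry)
    return not already_run


def to_csv(log):
    """Render the log as CSV text that can be saved or pasted into a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(log)
    return buffer.getvalue()


search_log = []
first_search = dict(
    database="JSTOR",
    keywords=['"literature review"', '"qualitative research"'],
    limits="after 1/1/2000; articles; abstracts only",
    results=5,
    next_step="Search titles? Change up keywords?",
    when=date(2024, 1, 15),  # hypothetical date, for illustration
)
added = log_search(search_log, **first_search)
repeat = log_search(search_log, **first_search)  # same search again: not re-logged

print(to_csv(search_log))
```

The duplicate check serves the first purpose named above (not repeating a search you have already run), and the `next_step` column serves the second (adjusting the search as you go).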

Think Laterally

How do you find the various strands of literature to combine? Don’t get stuck on finding the exact same research topic you think you are interested in. In the female gymnast example, I recommended that my student consider looking for studies of ballerinas, who also suffer sports injuries and around whom there is a similar culture of silence. It turned out that there was in fact research about my student’s particular questions, just not about the subjects she was interested in. You might do something similar. Rather than looking only for literature that matches your topic directly, think about the broader phenomenon of interest or about analogous cases.

Read Outside the Canon

Some scholars’ work gets cited by everyone all the time. To some extent, this is a very good thing, as it helps establish the discipline. For example, there are a lot of “Bourdieu scholars” out there (myself included) who draw ideas, concepts, and quoted passages from Bourdieu. This makes us recognizable to one another and is a way of sharing a common language (e.g., where “cultural capital” has a particular meaning to those versed in Bourdieusian theory). There are empirical studies that get cited over and over again because they are excellent studies but also because there is an “echo chamber effect” going on, where knowing to cite this study marks you as part of the club, in the know, and so on. But here’s the problem with this: there are hundreds if not thousands of excellent studies out there that fail to get appreciated because they are crowded out by the canon. Sometimes this happens because they are published in “lower-ranked” journals and are never read by a lot of scholars who don’t have time to read anything other than the “big three” in their field. Other times this happens because the author falls outside of the dominant social networks in the field and thus is unmentored and fails to get noticed by those who publish a lot in those highly ranked and visible spaces. Scholars who fall outside the dominant social networks and who publish outside of the top-ranked journals are in no way less insightful than their peers, and their studies may be just as rigorous and relevant to your work, so it is important for you to take some time to read outside the canon. Due to how a person’s race, gender, and class operate in the academy, there is also a matter of social justice and ethical responsibility involved here: “When you focus on the most-cited research, you’re more likely to miss relevant research by women and especially women of color, whose research tends to be under-cited in most fields. 
You’re also more likely to miss new research, research by junior scholars, and research in other disciplines that could inform your work. Essentially, it is important to read and cite responsibly, which means checking that you’re not just reading and citing the same white men and the same old studies that everyone has cited before you” ( Calarco 2020:112 ).

Consider Multiple Uses for Literature

Throughout this chapter, I’ve referred to the literature of interest in a rather abstract way, as what is relevant to your study. But there are many different ways previous research can be relevant to your study. The most basic use of the literature is the “findings”—for example, “So-and-so found that Canadian working-class students were concerned about ‘fitting in’ to the culture of college, and I am going to look at a similar question here in the US.” But the literature may be of interest not for its findings but theoretically—for example, employing concepts that you want to employ in your own study. Bourdieu’s definition of social capital may have emerged in a study of French professors, but it can still be relevant in a study of, say, how parents make choices about what preschools to send their kids to (also a good example of lateral thinking!).

If you are engaged in some novel methodological form of data collection or analysis, you might look for previous literature that has attempted that. I would not recommend this for undergraduate research projects, but for graduate students who are considering “breaking the mold,” find out if anyone has been there before you. Even if their study has absolutely nothing else in common with yours, it is important to acknowledge that previous work.

Describing Gaps in the Literature

First, be careful! Although it is common to explain how your research adds to, builds upon, and fills in gaps in the previous research (see all four literature review examples in this chapter for this), there is a fine line between describing the gaps and misrepresenting previous literature by failing to conduct a thorough review of the literature. A little humility can make a big difference in your presentation. Instead of “This is the first study that has looked at how firefighters juggle childcare during forest fire season,” say, “I use the previous literature on how working parents juggle childcare and the previous ethnographic studies of firefighters to explore how firefighters juggle childcare during forest fire season.” You can even add, “To my knowledge, no one has conducted an ethnographic study in this specific area, although what we have learned from X about childcare and from Y about firefighters would lead us to expect Z here.” Read more literature review sections to see how others have described the “gaps” they are filling.

Use Concept Mapping

Concept mapping is a helpful tool for getting your thoughts in order and is particularly helpful when thinking about the “literature” foundational to your particular study. Concept maps are also known as mind maps, which is a delightful way to think about them. Your brain is probably abuzz with competing ideas in the early stages of your research design. Write/draw them on paper, and then try to categorize and move the pieces around into “clusters” that make sense to you. Going back to the gymnasts example, my student might have begun by jotting down random words of interest: gymnasts * sports * coaches * female gymnasts * stress * injury * don’t complain * women in sports * bad coaching * anxiety/stress * careers in sports * pain. She could then have begun clustering these into relational categories (bad coaching, don’t complain culture) and simple “event” categories (injury, stress). This might have led her to think about reviewing literature in these two separate aspects and then literature that put them together. There is no correct way to draw a concept map, as they are wonderfully specific to your mind. There are many examples you can find online.
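If you would rather keep your map in a file than on paper, the clustering step can be sketched in a few lines of Python. This is only an illustration, not a recommended tool: the terms and the two cluster types ("relational" and "event") come from the gymnasts example, while the function name `build_concept_map` and the "unsorted" label for terms not yet placed are assumptions made for this sketch.

```python
from collections import defaultdict

# Terms jotted down in the gymnasts example.
terms = [
    "gymnasts", "sports", "coaches", "female gymnasts", "stress", "injury",
    "don't complain", "women in sports", "bad coaching", "anxiety/stress",
    "careers in sports", "pain",
]

# First pass at clustering: only the terms the example explicitly places.
cluster_of = {
    "bad coaching": "relational",
    "don't complain": "relational",
    "injury": "event",
    "stress": "event",
}


def build_concept_map(terms, cluster_of, default="unsorted"):
    """Group terms by cluster; terms without a cluster go under `default`."""
    concept_map = defaultdict(list)
    for term in terms:
        concept_map[cluster_of.get(term, default)].append(term)
    return dict(concept_map)


concept_map = build_concept_map(terms, cluster_of)
for cluster, members in concept_map.items():
    print(f"{cluster}: {', '.join(members)}")
```

Whatever is left under "unsorted" is a prompt to keep moving pieces around: either the term belongs in an existing cluster, or it points to a body of literature you have not yet named.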

Ask Yourself, “How Is This Sociology (or Political Science or Public Policy, Etc.)?”

Rubin (2021:82) suggests asking this question, rather than the “So what?” question, to get you thinking about what bridges there are between your study and the body of research in your particular discipline. This is particularly helpful for thinking about theory. Rubin further suggests that if you are really stumped, ask yourself, “What is the really big question that all [fill in your discipline here] care about?” For sociology, it might be “inequality,” which would then help you think about theories of inequality that might be helpful in framing your study on whatever it is you are studying—OnlyFans? Childcare during COVID? Aging in America? I can think of some interesting ways to frame questions about inequality for any of those topics. You can further narrow it by focusing on particular aspects of inequality (Gender oppression? Racial exclusion? Heteronormativity?). If your discipline is public policy, the big questions there might be, How does policy get enacted, and what makes a policy effective? You can then take whatever your particular policy interest is—tax reform, student debt relief, cap-and-trade regulations—and apply those big questions. Doing so would give you a handle on what is otherwise an intolerably vague subject (e.g., What about student debt relief?).

Sometimes finding you are in new territory means you’ve hit the jackpot, and sometimes it means you’ve traveled out of bounds for your discipline. The jackpot scenario is wonderful. You are doing truly innovative research that is combining multiple literatures or is addressing a new or under-examined phenomenon of interest, and your research has the potential to be groundbreaking. Congrats! But that’s really hard to do, and it might be more likely that you’ve traveled out of bounds, by which I mean, you are no longer in your discipline. It might be that no one has written about this thing—at least within your field—because no one in your field actually cares about this topic. (Rubin 2021:83; emphases added)

Don’t Treat This as a Chore

Don’t treat the literature review as a chore that has to be completed, but see it for what it really is—you are building connections to other researchers out there. You want to represent your discipline or area of study fairly and adequately. Demonstrate humility and your knowledge of previous research. Be part of the conversation.

Supplement: Two More Literature Review Examples

Elites by Harvey (2011)

In the last two decades, there has been a small but growing literature on elites. In part, this has been a result of the resurgence of ethnographic research such as interviews, focus groups, case studies, and participant observation but also because scholars have become increasingly interested in understanding the perspectives and behaviors of leaders in business, politics, and society as a whole. Yet until recently, our understanding of some of the methodological challenges of researching elites has lagged behind our rush to interview them.

There is no clear-cut definition of the term elite, and given its broad understanding across the social sciences, scholars have tended to adopt different approaches. Zuckerman (1972) uses the term ultraelites to describe individuals who hold a significant amount of power within a group that is already considered elite. She argues, for example, that US senators constitute part of the country’s political elite but that among them are the ultraelites: a “subset of particularly powerful or prestigious influentials” (160). She suggests that there is a hierarchy of status within elite groups. McDowell (1998) analyses a broader group of “professional elites” who are employees working at different levels for merchant and investment banks in London. She classifies this group as elite because they are “highly skilled, professionally competent, and class-specific” (2135). Parry (1998:2148) uses the term hybrid elites in the context of the international trade of genetic material because she argues that critical knowledge exists not in traditional institutions “but rather as increasingly informal, hybridised, spatially fragmented, and hence largely ‘invisible,’ networks of elite actors.” Given the undertheorization of the term elite, Smith (2006) recognizes why scholars have shaped their definitions to match their respondents. However, she is rightly critical of the underlying assumption that those who hold professional positions necessarily exert as much influence as initially perceived. Indeed, job titles can entirely misrepresent the role of workers and therefore are by no means an indicator of elite status (Harvey 2010).

Many scholars have used the term elite in a relational sense, defining them either in terms of their social position compared to the researcher or compared to the average person in society (Stephens 2007). The problem with this definition is there is no guarantee that an elite subject will necessarily translate this power and authority in an interview setting. Indeed, Smith (2006) found that on the few occasions she experienced respondents wanting to exert their authority over her, it was not from elites but from relatively less senior workers. Furthermore, although business and political elites often receive extensive media training, they are often scrutinized by television and radio journalists and therefore can also feel threatened in an interview, particularly in contexts that are less straightforward to prepare for such as academic interviews. On several occasions, for instance, I have been asked by elite respondents or their personal assistants what they need to prepare for before the interview, which suggests that they consider the interview as some form of challenge or justification for what they do.

In many cases, it is not necessarily the figureheads or leaders of organizations and institutions who have the greatest claim to elite status but those who hold important social networks, social capital, and strategic positions within social structures because they are better able to exert influence (Burt 1992; Parry 1998; Smith 2005; Woods 1998). An elite status can also change, with people both gaining and losing theirs over time. In addition, it is geographically specific, with people holding elite status in some but not all locations. In short, it is clear that the term elite can mean many things in different contexts, which explains the range of definitions. The purpose here is not to critique these other definitions but rather to highlight the variety of perspectives.

When referring to my research, I define elites as those who occupy senior-management- and board-level positions within organizations. This is a similar scope of definition to Zuckerman’s (1972) but focuses on a level immediately below her ultraelite subjects. My definition is narrower than McDowell’s (1998) because it is clear in the context of my research that these people have significant decision-making influence within and outside of the firm and therefore present a unique challenge to interview. I deliberately use the term elite more broadly when drawing on examples from the theoretical literature in order to compare my experiences with those who have researched similar groups.

“Changing Dispositions among the Upwardly Mobile” by Curl, Lareau, and Wu (2018)

There is growing interest in the role of cultural practices in undergirding the social stratification system. For example, Lamont et al. (2014) critically assess the preoccupation with economic dimensions of social stratification and call for more developed cultural models of the transmission of inequality. The importance of cultural factors in the maintenance of social inequality has also received empirical attention from some younger scholars, including Calarco (2011, 2014) and Streib (2015). Yet questions remain regarding the degree to which economic position is tied to cultural sensibilities and the ways in which these cultural sensibilities are imprinted on the self or are subject to change. Although habitus is a core concept in Bourdieu’s theory of social reproduction, there is limited empirical attention to the precise areas of the habitus that can be subject to change during upward mobility as well as the ramifications of these changes for family life.

In Bourdieu’s (1984) highly influential work on the importance of class-based cultural dispositions, habitus is defined as a “durable system of dispositions” created in childhood. The habitus provides a “matrix of perceptions” that seems natural while also structuring future actions and pathways. In many of his writings, Bourdieu emphasized the durability of cultural tastes and dispositions and did not consider empirically whether these dispositions might be changed or altered throughout one’s life (Swartz 1997). His theoretical work does permit the possibility of upward mobility and transformation, however, through the ability of the habitus to “improvise” or “change” due to “new experiences” (Friedman 2016:131). Researchers have differed in opinion on the durability of the habitus and its ability to change (King 2000). Based on marital conflict in cross-class marriages, for instance, Streib (2015) argues that cultural dispositions of individuals raised in working-class families are deeply embedded and largely unchanging. In a somewhat different vein, Horvat and Davis (2011:152) argue that young adults enrolled in an alternative educational program undergo important shifts in their self-perception, such as “self-esteem” and their “ability to accomplish something of value.” Others argue there is variability in the degree to which habitus changes dependent on life experience and personality (Christodoulou and Spyridakis 2016). Recently, additional studies have investigated the habitus as it intersects with lifestyle through the lens of meaning making (Ambrasat et al. 2016). There is, therefore, ample discussion of class-based cultural practices in self-perception (Horvat and Davis 2011), lifestyle (Ambrasat et al. 2016), and other forms of taste (Andrews 2012; Bourdieu 1984), yet researchers have not sufficiently delineated which aspects of the habitus might change through upward mobility or which specific dimensions of life prompt moments of class-based conflict.

Bourdieu (1999:511; 2004) acknowledged simmering tensions between the durable aspects of habitus and those aspects that have been transformed—that is, a “fractured” or “cleft” habitus. Others have explored these tensions as a “divided” or “fragmented” habitus (Baxter and Britton 2001; Lee and Kramer 2013). Each of these conceptions of the habitus implies that changes in cultural dispositions are possible but come with costs. Exploration of the specific aspects of one’s habitus that can change and generate conflict contributes to this literature.

Scholars have also studied the costs associated with academic success for working-class undergraduates (Hurst 2010; Lee and Kramer 2013; London 1989; Reay 2017; Rondini 2016; Stuber 2011), but we know little about the lasting effects on adults. For instance, Lee and Kramer (2013) point to cross-class tensions as family and friends criticize upwardly mobile individuals for their newly acquired cultural dispositions. Documenting the tension many working-class students experience with their friends and families of origin, they find that the source of their pain or struggle is “shaped not only by their interactions with non-mobile family and friends but also within their own minds, by their own assessments of their social positions, and by how those positions are interpreted by others” (Lee and Kramer 2013:29). Hurst (2010) also explores the experiences of undergraduates who have been academically successful and the costs associated with that success. She finds that decisions about “class allegiance and identity” are required aspects of what it means to “becom[e] educated” (4) and that working-class students deal with these cultural changes differently. Jack (2014, 2016) also argues that there is diversity among lower-income students, which yields varied college experiences. Naming two groups, the “doubly disadvantaged” and the “privileged poor,” he argues that previous experience with “elite environments” (2014:456) prior to college informs students’ ability to take on dominant cultural practices, particularly around engagement, such as help seeking or meeting with professors (2016). These studies shed light on the role college might play as a “lever for mobility” (2016:15) and discuss the pain and difficulty associated with upward mobility among undergraduates, but the studies do not illuminate how these tensions unfold in adulthood. 
Neither have they sufficiently addressed potential enduring tensions with extended family members as well as the specific nature of the difficulties.

Some scholars point to the positive outcomes upwardly mobile youth (Lehmann 2009) and adults (Stuber 2005) experience when they maintain a different habitus than their newly acquired class position, although, as Jack (2014, 2016) shows, those experiences may vary depending on one’s experience with elite environments in their youth. Researchers have not sufficiently explored the specific aspects of the habitus that upwardly mobile adults change or the conflicts that emerge with family and childhood friends as they reach adulthood and experience colliding social worlds. We contribute to this scholarship with clear examples of self-reported changes to one’s cultural dispositions in three specific areas: “horizons,” food and health, and communication. We link these changes to enduring tension with family members, friends, and colleagues and explore varied responses to this tension based on race.

Libraries | Research Guides

Literature Reviews

What Is a Literature Review?

A literature review is a review and synthesis of existing research on a topic or research question. A literature review is meant to analyze the scholarly literature, make connections across writings and identify strengths, weaknesses, trends, and missing conversations. A literature review should address different aspects of a topic as it relates to your research question. A literature review goes beyond a description or summary of the literature you have read.

Learning More about How to Do a Literature Review

Sage Research Methods Core Collection: SAGE Research Methods supports research at all levels by providing material to guide users through every step of the research process. SAGE Research Methods is the ultimate methods library with more than 1000 books, reference works, journal articles, and instructional videos by world-leading academics from across the social sciences, including the largest collection of qualitative methods books available online from any scholarly publisher. – Publisher

Last Updated: May 2, 2024 10:39 AM
URL: https://libguides.northwestern.edu/literaturereviews

UCLA Library

Systematic Reviews

Types of Literature Reviews

What Makes a Systematic Review Different from Other Types of Reviews?


Reproduced from Grant, M. J. and Booth, A. (2009), A typology of reviews: an analysis of 14 review types and associated methodologies. Health Information & Libraries Journal, 26: 91–108. doi:10.1111/j.1471-1842.2009.00848.x


| Label | Description | Search | Appraisal | Synthesis | Analysis |
| --- | --- | --- | --- | --- | --- |
| Critical review | Aims to demonstrate writer has extensively researched literature and critically evaluated its quality. Goes beyond mere description to include degree of analysis and conceptual innovation. Typically results in hypothesis or model | Seeks to identify most significant items in the field | No formal quality assessment. Attempts to evaluate according to contribution | Typically narrative, perhaps conceptual or chronological | Significant component: seeks to identify conceptual contribution to embody existing or derive new theory |
| Literature review | Generic term: published materials that provide examination of recent or current literature. Can cover wide range of subjects at various levels of completeness and comprehensiveness. May include research findings | May or may not include comprehensive searching | May or may not include quality assessment | Typically narrative | Analysis may be chronological, conceptual, thematic, etc. |
| Mapping review/systematic map | Map out and categorize existing literature from which to commission further reviews and/or primary research by identifying gaps in research literature | Completeness of searching determined by time/scope constraints | No formal quality assessment | May be graphical and tabular | Characterizes quantity and quality of literature, perhaps by study design and other key features. May identify need for primary or secondary research |
| Meta-analysis | Technique that statistically combines the results of quantitative studies to provide a more precise effect of the results | Aims for exhaustive, comprehensive searching. May use funnel plot to assess completeness | Quality assessment may determine inclusion/exclusion and/or sensitivity analyses | Graphical and tabular with narrative commentary | Numerical analysis of measures of effect assuming absence of heterogeneity |
| Mixed studies review/mixed methods review | Refers to any combination of methods where one significant component is a literature review (usually systematic). Within a review context it refers to a combination of review approaches for example combining quantitative with qualitative research or outcome with process studies | Requires either very sensitive search to retrieve all studies or separately conceived quantitative and qualitative strategies | Requires either a generic appraisal instrument or separate appraisal processes with corresponding checklists | Typically both components will be presented as narrative and in tables. May also employ graphical means of integrating quantitative and qualitative studies | Analysis may characterise both literatures and look for correlations between characteristics or use gap analysis to identify aspects absent in one literature but missing in the other |
| Overview | Generic term: summary of the [medical] literature that attempts to survey the literature and describe its characteristics | May or may not include comprehensive searching (depends whether systematic overview or not) | May or may not include quality assessment (depends whether systematic overview or not) | Synthesis depends on whether systematic or not. Typically narrative but may include tabular features | Analysis may be chronological, conceptual, thematic, etc. |
| Qualitative systematic review/qualitative evidence synthesis | Method for integrating or comparing the findings from qualitative studies. It looks for ‘themes’ or ‘constructs’ that lie in or across individual qualitative studies | May employ selective or purposive sampling | Quality assessment typically used to mediate messages not for inclusion/exclusion | Qualitative, narrative synthesis | Thematic analysis, may include conceptual models |
| Rapid review | Assessment of what is already known about a policy or practice issue, by using systematic review methods to search and critically appraise existing research | Completeness of searching determined by time constraints | Time-limited formal quality assessment | Typically narrative and tabular | Quantities of literature and overall quality/direction of effect of literature |
| Scoping review | Preliminary assessment of potential size and scope of available research literature. Aims to identify nature and extent of research evidence (usually including ongoing research) | Completeness of searching determined by time/scope constraints. May include research in progress | No formal quality assessment | Typically tabular with some narrative commentary | Characterizes quantity and quality of literature, perhaps by study design and other key features. Attempts to specify a viable review |
| State-of-the-art review | Tend to address more current matters in contrast to other combined retrospective and current approaches. May offer new perspectives | Aims for comprehensive searching of current literature | No formal quality assessment | Typically narrative, may have tabular accompaniment | Current state of knowledge and priorities for future investigation and research |
| Systematic review | Seeks to systematically search for, appraise and synthesise research evidence, often adhering to guidelines on the conduct of a review | Aims for exhaustive, comprehensive searching | Quality assessment may determine inclusion/exclusion | Typically narrative with tabular accompaniment | What is known; recommendations for practice. What remains unknown; uncertainty around findings, recommendations for future research |
| Systematic search and review | Combines strengths of critical review with a comprehensive search process. Typically addresses broad questions to produce ‘best evidence synthesis’ | Aims for exhaustive, comprehensive searching | May or may not include quality assessment | Minimal narrative, tabular summary of studies | What is known; recommendations for practice. Limitations |
| Systematized review | Attempt to include elements of systematic review process while stopping short of systematic review. Typically conducted as postgraduate student assignment | May or may not include comprehensive searching | May or may not include quality assessment | Typically narrative with tabular accompaniment | What is known; uncertainty around findings; limitations of methodology |
| Umbrella review | Specifically refers to review compiling evidence from multiple reviews into one accessible and usable document. Focuses on broad condition or problem for which there are competing interventions and highlights reviews that address these interventions and their results | Identification of component reviews, but no search for primary studies | Quality assessment of studies within component reviews and/or of reviews themselves | Graphical and tabular with narrative commentary | What is known; recommendations for practice. What remains unknown; recommendations for future research |

Last Updated: Apr 17, 2024 2:02 PM
URL: https://guides.library.ucla.edu/systematicreviews

An official website of the United States government

The .gov means it’s official. Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

The site is secure. The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.


A Guide to Writing a Qualitative Systematic Review Protocol to Enhance Evidence-Based Practice in Nursing and Health Care

Affiliations.

1 PhD candidate, School of Nursing and Midwifery, Monash University, and Clinical Nurse Specialist, Adult and Pediatric Intensive Care Unit, Monash Health, Melbourne, Victoria, Australia.
2 Lecturer, School of Nursing and Midwifery, Monash University, Melbourne, Victoria, Australia.
3 Senior Lecturer, School of Nursing and Midwifery, Monash University, Melbourne, Victoria, Australia.
PMID: 26790142
DOI: 10.1111/wvn.12134

Background: The qualitative systematic review is a rapidly developing area of nursing research. In order to present trustworthy, high-quality recommendations, such reviews should be based on a review protocol to minimize bias and enhance transparency and reproducibility. Although there are a number of resources available to guide researchers in developing a quantitative review protocol, very few resources exist for qualitative reviews.

Aims: To guide researchers through the process of developing a qualitative systematic review protocol, using an example review question.

Methodology: The key elements required in a systematic review protocol are discussed, with a focus on application to qualitative reviews: Development of a research question; formulation of key search terms and strategies; designing a multistage review process; critical appraisal of qualitative literature; development of data extraction techniques; and data synthesis. The paper highlights important considerations during the protocol development process, and uses a previously developed review question as a working example.

Implications for research: This paper will assist novice researchers in developing a qualitative systematic review protocol. By providing a worked example of a protocol, the paper encourages the development of review protocols, enhancing the trustworthiness and value of the completed qualitative systematic review findings.

Linking evidence to action: Qualitative systematic reviews should be based on well planned, peer reviewed protocols to enhance the trustworthiness of results and thus their usefulness in clinical practice. Protocols should outline, in detail, the processes which will be used to undertake the review, including key search terms, inclusion and exclusion criteria, and the methods used for critical appraisal, data extraction and data analysis to facilitate transparency of the review process. Additionally, journals should encourage and support the publication of review protocols, and should require reference to a protocol prior to publication of the review results.

Keywords: guidelines; meta synthesis; qualitative; systematic review protocol.
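As a rough illustration of what it means to write these protocol elements down before searching, the sketch below records a review question, search concepts, and inclusion and exclusion criteria in one Python object and derives a Boolean search string from the concepts. It is a sketch under stated assumptions, not the method described in the paper: the review question, search terms, and criteria are hypothetical placeholders, and `ReviewProtocol` and `search_string` are illustrative names only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReviewProtocol:
    """Minimal record of the decisions a protocol fixes before searching."""
    question: str
    concepts: List[List[str]]  # one inner list of synonyms per concept
    include_if: List[str] = field(default_factory=list)
    exclude_if: List[str] = field(default_factory=list)

    def search_string(self) -> str:
        """Join synonyms with OR inside a concept and concepts with AND."""
        groups = []
        for synonyms in self.concepts:
            # Quote multi-word terms so they are searched as phrases.
            quoted = [f'"{term}"' if " " in term else term for term in synonyms]
            groups.append("(" + " OR ".join(quoted) + ")")
        return " AND ".join(groups)


# Hypothetical example values, not taken from the paper.
protocol = ReviewProtocol(
    question="How do nurses experience working in intensive care?",
    concepts=[
        ["nurses", "nursing staff"],
        ["experiences", "perceptions"],
        ["intensive care", "ICU"],
    ],
    include_if=["qualitative design", "peer-reviewed"],
    exclude_if=["quantitative-only design"],
)

print(protocol.search_string())
```

Keeping the criteria next to the search terms in one place is the point of the exercise: anyone repeating the review can see what was decided in advance, which is what the abstract means by transparency and reproducibility.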


University Libraries

University of Arkansas

Literature Reviews

Qualitative or Quantitative?

Qualitative researchers TEND to:

Researchers using qualitative methods tend to:

think that social sciences cannot be well-studied with the same methods as natural or physical sciences
feel that human behavior is context-specific; therefore, behavior must be studied holistically, in situ, rather than being manipulated
employ an 'insider's' perspective; research tends to be personal and thereby more subjective.
do interviews, focus groups, field research, case studies, and conversational or content analysis.

Image: reasons to do a qualitative study. From https://www.editage.com/insights/qualitative-quantitative-or-mixed-methods-a-quick-guide-to-choose-the-right-design-for-your-research?refer-type=infographics

Qualitative Research (an operational definition)

Qualitative Research: an operational description

Purpose: explain; gain insight and understanding of phenomena through intensive collection and study of narrative data

Approach: inductive; value-laden/subjective; holistic, process-oriented

Hypotheses: tentative, evolving; based on the particular study

Lit. Review: limited; may not be exhaustive

Setting: naturalistic, when and as much as possible

Sampling: for the purpose; not necessarily representative; for in-depth understanding

Measurement: narrative; ongoing

Design and Method: flexible, specified only generally; based on non-intervention, minimal disturbance, such as historical, ethnographic, or case studies

Data Collection: document collection, participant observation, informal interviews, field notes

Data Analysis: raw data is words; ongoing; involves synthesis

Data Interpretation: tentative, reviewed on ongoing basis, speculative

Qualitative research with more structure and less subjectivity
Increased application of both strategies to the same study ("mixed methods")
Evidence-based practice emphasized in more fields (nursing, social work, education, and others).

Some Other Guidelines

Guide for formatting Graphs and Tables
Critical Appraisal Checklist for an Article On Qualitative Research

Quantitative researchers TEND to:

Researchers using quantitative methods tend to:

think that both natural and social sciences strive to explain phenomena with confirmable theories derived from testable assumptions
attempt to reduce social reality to variables, in the same way as with physical reality
try to tightly control the variable(s) in question to see how the others are influenced.
do experiments, have control groups, use blind or double-blind studies; use measures or instruments.

Image: reasons to do a quantitative study. From https://www.editage.com/insights/qualitative-quantitative-or-mixed-methods-a-quick-guide-to-choose-the-right-design-for-your-research?refer-type=infographics

Quantitative Research (an operational definition)

Quantitative research: an operational description

Purpose: explain, predict or control phenomena through focused collection and analysis of numerical data

Approach: deductive; tries to be value-free/has objectives/ is outcome-oriented

Hypotheses : Specific, testable, and stated prior to study

Lit. Review: extensive; may significantly influence a particular study

Setting: controlled to the degree possible

Sampling: uses largest manageable random/randomized sample, to allow generalization of results to larger populations

Measurement: standardized, numerical; "at the end"

Design and Method: Strongly structured, specified in detail in advance; involves intervention, manipulation and control groups; descriptive, correlational, experimental

Data Collection: via instruments, surveys, experiments, semi-structured formal interviews, tests or questionnaires

Data Analysis: raw data is numbers; at end of study, usually statistical

Data Interpretation: formulated at end of study; stated as a degree of certainty

This page on qualitative and quantitative research has been adapted and expanded from a handout by Suzy Westenkirchner. Used with permission.

Images from https://www.editage.com/insights/qualitative-quantitative-or-mixed-methods-a-quick-guide-to-choose-the-right-design-for-your-research?refer-type=infographics.

Last Updated: Jan 8, 2024 2:51 PM
URL: https://uark.libguides.com/litreview

BMC Med Res Methodol

Methods for the synthesis of qualitative research: a critical review

Elaine Barnett-Page

1 Evidence for Policy and Practice Information and Co-ordinating (EPPI-) Centre, Social Science Research Unit, 18 Woburn Square, London WC1H 0NS, UK

James Thomas


In recent years, a growing number of methods for synthesising qualitative research have emerged, particularly in relation to health-related research. There is a need for both researchers and commissioners to be able to distinguish between these methods and to select which method is the most appropriate to their situation.

A number of methodological and conceptual links between these methods were identified and explored, while contrasting epistemological positions explained differences in approaches to issues such as quality assessment and extent of iteration. Methods broadly fall into 'realist' or 'idealist' epistemologies, which partly accounts for these differences.

Methods for qualitative synthesis vary across a range of dimensions. Commissioners of qualitative syntheses might wish to consider the kind of product they want and select their method – or type of method – accordingly.

The range of different methods for synthesising qualitative research has been growing over recent years [ 1 , 2 ], alongside an increasing interest in qualitative synthesis to inform health-related policy and practice [ 3 ]. While the terms 'meta-analysis' (a statistical method to combine the results of primary studies), or sometimes 'narrative synthesis', are frequently used to describe how quantitative research is synthesised, far more terms are used to describe the synthesis of qualitative research. This profusion of terms can mask some of the basic similarities in approach that the different methods share, and also lead to some confusion regarding which method is most appropriate in a given situation. This paper does not argue that the various nomenclatures are unnecessary, but rather seeks to draw together and review the full range of methods of synthesis available to assist future reviewers in selecting a method that is fit for their purpose. It also represents an attempt to guide the reader through some of the varied terminology to spring up around qualitative synthesis. Other helpful reviews of synthesis methods have been undertaken in recent years with slightly different foci to this paper. Two recent studies have focused on describing and critiquing methods for the integration of qualitative research with quantitative [ 4 , 5 ] rather than exclusively examining the detail and rationale of methods for the synthesis of qualitative research. Two other significant pieces of work give practical advice for conducting the synthesis of qualitative research, but do not discuss the full range of methods available [ 6 , 7 ]. We begin our Discussion by outlining each method of synthesis in turn, before comparing and contrasting characteristics of these different methods across a range of dimensions. 
Readers who are more familiar with the synthesis methods described here may prefer to turn straight to the 'dimensions of difference' analysis in the second part of the Discussion.

Overview of synthesis methods

Meta-ethnography

In their seminal work of 1988, Noblit and Hare proposed meta-ethnography as an alternative to meta-analysis [ 8 ]. They cited Strike and Posner's [ 9 ] definition of synthesis as an activity in which separate parts are brought together to form a 'whole'; this construction of the whole is essentially characterised by some degree of innovation, so that the result is greater than the sum of its parts. They also borrowed from Turner's theory of social explanation [ 10 ], a key tenet of which was building 'comparative understanding' [[ 8 ], p22] rather than aggregating data.

To Noblit and Hare, synthesis provided an answer to the question of 'how to "put together" written interpretive accounts' [[ 8 ], p7], where mere integration would not be appropriate. Noblit and Hare's early work synthesised research from the field of education.

Three different methods of synthesis are used in meta-ethnography. One involves the 'translation' of concepts from individual studies into one another, thereby evolving overarching concepts or metaphors. Noblit and Hare called this process reciprocal translational analysis (RTA). Refutational synthesis involves exploring and explaining contradictions between individual studies. Lines-of-argument (LOA) synthesis involves building up a picture of the whole (i.e. culture, organisation etc) from studies of its parts. The authors conceptualised this latter approach as a type of grounded theorising.

Britten et al [ 11 ] and Campbell et al [ 12 ] have both conducted evaluations of meta-ethnography and claim to have succeeded, by using this method, in producing theories with greater explanatory power than could be achieved in a narrative literature review. While both these evaluations used small numbers of studies, more recently Pound et al [ 13 ] conducted both an RTA and an LOA synthesis using a much larger number of studies (37) on resisting medicines. These studies demonstrate that meta-ethnography has evolved since Noblit and Hare first introduced it. Campbell et al claim to have applied the method successfully to non-ethnographical studies. Based on their reading of Schutz [ 14 ], Britten et al have developed both second and third order constructs in their synthesis (Noblit and Hare briefly allude to the possibility of a 'second level of synthesis' [[ 8 ], p28] but do not demonstrate or further develop the idea).

In a more recent development, Sandelowski & Barroso [ 15 ] write of adapting RTA by using it to 'integrate findings interpretively, as opposed to comparing them interpretively' (p204). The former would involve looking to see whether the same concept, theory etc exists in different studies; the latter would involve the construction of a bigger picture or theory (i.e. LOA synthesis). They also talk about comparing or integrating imported concepts (e.g. from other disciplines) as well as those evolved 'in vivo'.

Grounded theory

Kearney [ 16 ], Eaves [ 17 ] and Finfgeld [ 18 ] have all adapted grounded theory to formulate a method of synthesis. Key methods and assumptions of grounded theory, as originally formulated and subsequently refined by Glaser and Strauss [ 19 ] and Strauss and Corbin [ 20 , 21 ], include: simultaneous phases of data collection and analysis; an inductive approach to analysis, allowing the theory to emerge from the data; the use of the constant comparison method; the use of theoretical sampling to reach theoretical saturation; and the generation of new theory. Eaves cited grounded theorists Charmaz [ 22 ] and Chesler [ 23 ], as well as Strauss and Corbin [ 20 ], as informing her approach to synthesis.

Glaser and Strauss [ 19 ] foresaw a time when a substantive body of grounded research should be pushed towards a higher, more abstract level. As a piece of methodological work, Eaves undertook her own synthesis of the synthesis methods used by these authors to produce her own clear and explicit guide to synthesis in grounded formal theory. Kearney stated that 'grounded formal theory', as she termed this method of synthesis, 'is suited to study of phenomena involving processes of contextualized understanding and action' [[ 24 ], p180] and, as such, is particularly applicable to nurses' research interests.

As Kearney suggested, the examples examined here were largely dominated by research in nursing. Eaves synthesised studies on care-giving in rural African-American families for elderly stroke survivors; Finfgeld on courage among individuals with long-term health problems; Kearney on women's experiences of domestic violence.

Kearney explicitly chose 'grounded formal theory' because it matches 'like' with 'like': that is, it applies the same methods that have been used to generate the original grounded theories included in the synthesis – produced by constant comparison and theoretical sampling – to generate a higher-level grounded theory. The wish to match 'like' with 'like' is also implicit in Eaves' paper. This distinguishes grounded formal theory from more recent applications of meta-ethnography, which have sought to include qualitative research using diverse methodological approaches [ 12 ].

Thematic Synthesis

Thomas and Harden [ 25 ] have developed an approach to synthesis which they term 'thematic synthesis'. This combines and adapts approaches from both meta-ethnography and grounded theory. The method was developed out of a need to conduct reviews that addressed questions relating to intervention need, appropriateness and acceptability – as well as those relating to effectiveness – without compromising on key principles developed in systematic reviews. They applied thematic synthesis in a review of the barriers to, and facilitators of, healthy eating amongst children.

Free codes of findings are organised into 'descriptive' themes, which are then further interpreted to yield 'analytical' themes. This approach shares characteristics with later adaptations of meta-ethnography, in that the analytical themes are comparable to 'third order interpretations' and that the development of descriptive and analytical themes using coding invoke reciprocal 'translation'. It also shares much with grounded theory, in that the approach is inductive and themes are developed using a 'constant comparison' method. A novel aspect of their approach is the use of computer software to code the results of included studies line-by-line, thus borrowing another technique from methods usually used to analyse primary research.

Textual Narrative Synthesis

Textual narrative synthesis is an approach which arranges studies into more homogenous groups. Lucas et al [ 26 ] comment that it has proved useful in synthesising evidence of different types (qualitative, quantitative, economic etc). Typically, study characteristics, context, quality and findings are reported on according to a standard format and similarities and differences are compared across studies. Structured summaries may also be developed, elaborating on and putting into context the extracted data [ 27 ].

Lucas et al [ 26 ] compared thematic synthesis with textual narrative synthesis. They found that 'thematic synthesis holds most potential for hypothesis generation' whereas textual narrative synthesis is more likely to make transparent heterogeneity between studies (as does meta-ethnography, with refutational synthesis) and issues of quality appraisal. This is possibly because textual narrative synthesis makes clearer the context and characteristics of each study, while the thematic approach organises data according to themes. However, Lucas et al found that textual narrative synthesis is 'less good at identifying commonality' (p2); the authors do not make explicit why this should be, although it may be that organising according to themes, as the thematic approach does, is comparatively more successful in revealing commonality.

Meta-study

Paterson et al [ 28 ] have evolved a multi-faceted approach to synthesis, which they call 'meta-study'. The sociologist Zhao [ 29 ], drawing on Ritzer's work [ 30 ], outlined three components of analysis, which they proposed should be undertaken prior to synthesis. These are meta-data-analysis (the analysis of findings), meta-method (the analysis of methods) and meta-theory (the analysis of theory). Collectively, these three elements of analysis, culminating in synthesis, make up the practice of 'meta-study'. Paterson et al pointed out that the different components of analysis may be conducted concurrently.

Paterson et al argued that primary research is a construction; secondary research is therefore a construction of a construction. There is need for an approach that recognises this, and that also recognises research to be a product of its social, historical and ideological context. Such an approach would be useful in accounting for differences in research findings. For Paterson et al, there is no such thing as 'absolute truth'.

Meta-study was developed to study the experiences of adults living with a chronic illness. Meta-data-analysis was conceived of by Paterson et al in similar terms to Noblit and Hare's meta-ethnography (see above), in that it is essentially interpretive and seeks to reveal similarities and discrepancies among accounts of a particular phenomenon. Meta-method involves the examination of the methodologies of the individual studies under review. Part of the process of meta-method is to consider different aspects of methodology such as sampling, data collection, research design etc, similar to procedures others have called 'critical appraisal' (CASP [ 31 ]). However, Paterson et al take their critique to a deeper level by establishing the underlying assumptions of the methodologies used and the relationship between research outcomes and methods used. Meta-theory involves scrutiny of the philosophical and theoretical assumptions of the included research papers; this includes looking at the wider context in which new theory is generated. Paterson et al described meta-synthesis as a process which creates a new interpretation which accounts for the results of all three elements of analysis. The process of synthesis is iterative and reflexive and the authors were unwilling to oversimplify the process by 'codifying' procedures for bringing all three components of analysis together.

Meta-narrative

Greenhalgh et al [ 32 ]'s meta-narrative approach to synthesis arose out of the need to synthesise evidence to inform complex policy-making questions and was assisted by the formation of a multi-disciplinary team. Their approach to review was informed by Thomas Kuhn's The Structure of Scientific Revolutions [ 33 ], in which he proposed that knowledge is produced within particular paradigms which have their own assumptions about theory, about what is a legitimate object of study, about what are legitimate research questions and about what constitutes a finding. Paradigms also tend to develop through time according to a particular set of stages, central to which is the stage of 'normal science', in which the particular standards of the paradigm are largely unchallenged and seen to be self-evident. As Greenhalgh et al pointed out, Kuhn saw paradigms as largely incommensurable: 'that is, an empirical discovery made using one set of concepts, theories, methods and instruments cannot be satisfactorily explained through a different paradigmatic lens' [[ 32 ], p419].

Greenhalgh et al synthesised research from a wide range of disciplines; their research question related to the diffusion of innovations in health service delivery and organisation. They thus identified a need to synthesise findings from research which contains many different theories arising from many different disciplines and study designs.

Based on Kuhn's work, Greenhalgh et al proposed that, across different paradigms, there were multiple – and potentially mutually contradictory – ways of understanding the concept at the heart of their review, namely the diffusion of innovation. Bearing this in mind, the reviewers deliberately chose to select key papers from a number of different research 'paradigms' or 'traditions', both within and beyond healthcare, guided by their multidisciplinary research team. They took as their unit of analysis the 'unfolding "storyline" of a research tradition over time' [[ 32 ], p417] and sought to understand diffusion of innovation as it was conceptualised in each of these traditions. Key features of each tradition were mapped: historical roots, scope, theoretical basis; research questions asked and methods/instruments used; main empirical findings; historical development of the body of knowledge (how have earlier findings led to later findings); and strengths and limitations of the tradition. The results of this exercise led to maps of 13 'meta-narratives' in total, from which seven key dimensions, or themes, were identified and distilled for the synthesis phase of the review.

Critical Interpretive Synthesis

Dixon-Woods et al [ 34 ] developed their own approach to synthesising multi-disciplinary and multi-method evidence, termed 'critical interpretive synthesis', while researching access to healthcare by vulnerable groups. Critical interpretive synthesis is an adaptation of meta-ethnography, as well as borrowing techniques from grounded theory. The authors stated that they needed to adapt traditional meta-ethnographic methods for synthesis, since these had never been applied to quantitative as well as qualitative data, nor had they been applied to a substantial body of data (in this case, 119 papers).

Dixon-Woods et al presented critical interpretive synthesis as an approach to the whole process of review, rather than to just the synthesis component. It involves an iterative approach to refining the research question and searching and selecting from the literature (using theoretical sampling) and defining and applying codes and categories. It also has a particular approach to appraising quality, using relevance – i.e. likely contribution to theory development – rather than methodological characteristics as a means of determining the 'quality' of individual papers [ 35 ]. The authors also stress, as a defining characteristic, critical interpretive synthesis's critical approach to the literature in terms of deconstructing research traditions or theoretical assumptions as a means of contextualising findings.

Dixon-Woods et al rejected reciprocal translational analysis (RTA) as this produced 'only a summary in terms that have already been used in the literature' [[ 34 ], p5], which was seen as less helpful when dealing with a large and diverse body of literature. Instead, Dixon-Woods et al adopted a lines-of-argument (LOA) synthesis, in which – rejecting the difference between first, second and third order constructs – they instead developed 'synthetic constructs' which were then linked with constructs arising directly from the literature.

The influence of grounded theory can be seen in particular in critical interpretive synthesis's inductive approach to formulating the review question and to developing categories and concepts, rejecting a 'stage' approach to systematic reviewing, and in selecting papers using theoretical sampling. Dixon-Woods et al also claim that critical interpretive synthesis is distinct in its 'explicit orientation towards theory generation' [[ 34 ], p9].

Ecological Triangulation

Jim Banning is the author of 'ecological triangulation' or 'ecological sentence synthesis', applying this method to the evidence for what works for youth with disabilities. He borrows from Webb et al [ 36 ] and Denzin [ 37 ] the concept of triangulation, in which phenomena are studied from a variety of vantage points. His rationale is that building an 'evidence base' of effectiveness requires the synthesis of cumulative, multi-faceted evidence in order to find out 'what intervention works for what kind of outcomes for what kind of persons under what kind of conditions' [[ 38 ], p1].

Ecological triangulation unpicks the mutually interdependent relationships between behaviour, persons and environments. The method requires that, for data extraction and synthesis, 'ecological sentences' are formulated following the pattern: 'With this intervention, these outcomes occur with these population foci and within these grades (ages), with these genders ... and these ethnicities in these settings' [[ 39 ], p1].
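The sentence pattern quoted above can be treated as a fill-in template for data extraction. The following is a minimal sketch only: the class name `EcologicalSentence`, its field names and the example values are invented for illustration and do not come from Banning's work, and the portion elided in the quoted pattern is left out rather than guessed.

```python
from dataclasses import dataclass


@dataclass
class EcologicalSentence:
    """One extraction record; field names are hypothetical and mirror the quoted pattern."""
    intervention: str
    outcomes: str
    population: str
    grades_or_ages: str
    genders: str
    ethnicities: str
    settings: str

    def render(self) -> str:
        # Only the slots named in the quoted pattern are filled in.
        return (
            f"With {self.intervention}, {self.outcomes} occur with "
            f"{self.population} within {self.grades_or_ages}, with "
            f"{self.genders} and {self.ethnicities} in {self.settings}."
        )


# Invented example values, not taken from any reviewed study.
example = EcologicalSentence(
    intervention="a peer-mentoring programme",
    outcomes="gains in attendance and grades",
    population="youth with learning disabilities",
    grades_or_ages="grades 9 to 12",
    genders="male and female students",
    ethnicities="students of mixed ethnicities",
    settings="urban public schools",
)
print(example.render())
```

Recording every study in the same slots makes it straightforward to line up sentences and compare which interventions are reported to work for whom and in which settings.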

Framework Synthesis

Brunton et al [ 40 ] and Oliver et al [ 41 ] have applied a 'framework synthesis' approach in their reviews. Framework synthesis is based on framework analysis, which was outlined by Pope, Ziebland and Mays [ 42 ], and draws upon the work of Ritchie and Spencer [ 43 ] and Miles and Huberman [ 44 ]. Its rationale is that qualitative research produces large amounts of textual data in the form of transcripts, observational fieldnotes etc. The sheer wealth of information poses a challenge for rigorous analysis. Framework synthesis offers a highly structured approach to organising and analysing data (e.g. indexing using numerical codes, rearranging data into charts etc).

Brunton et al applied the approach to a review of children's, young people's and parents' views of walking and cycling; Oliver et al to an analysis of public involvement in health services research. Framework synthesis is distinct from the other methods outlined here in that it utilises an a priori 'framework' – informed by background material and team discussions – to extract and synthesise findings. As such, it is largely a deductive approach although, in addition to topics identified by the framework, new topics may be developed and incorporated as they emerge from the data. The synthetic product can be expressed in the form of a chart for each key dimension identified, which may be used to map the nature and range of the concept under study and find associations between themes and exceptions to these [ 40 ].
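The charting step described above can be sketched in a few lines. This is an illustrative assumption about how such a chart might be held in code, not the procedure used by Brunton et al or Oliver et al; the function name `chart_findings`, the topic labels and the example findings are all invented.

```python
def chart_findings(framework, coded_findings):
    """Group coded findings under framework topics.

    framework: list of a priori topic labels (hypothetical).
    coded_findings: iterable of (study, topic, finding_text) tuples.
    Returns (chart, emergent_topics), where emergent_topics lists topics
    that were not in the a priori framework but arose from the data.
    """
    chart = {topic: [] for topic in framework}
    emergent_topics = []
    for study, topic, text in coded_findings:
        if topic not in chart:
            # A new topic emerging from the data is added to the chart.
            chart[topic] = []
            emergent_topics.append(topic)
        chart[topic].append((study, text))
    return chart, emergent_topics


# Invented example data.
framework = ["safety", "convenience"]
coded = [
    ("study_1", "safety", "Parents worry about traffic."),
    ("study_2", "convenience", "Walking takes too long."),
    ("study_2", "cost", "Bicycles are expensive."),
    ("study_3", "safety", "Children fear busy roads."),
]
chart, emergent = chart_findings(framework, coded)
for topic, entries in chart.items():
    print(topic, entries)
print("emergent topics:", emergent)
```

Starting from a fixed list of topics reflects the largely deductive character of the approach, while the `emergent_topics` list keeps track of anything the framework did not anticipate.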

'Fledgling' approaches

There are three other approaches to synthesis which have not yet been widely used. One is an approach using content analysis [ 45 , 46 ] in which text is condensed into fewer content-related categories. Another is 'meta-interpretation' [ 47 ], featuring the following: an ideographic rather than pre-determined approach to the development of exclusion criteria; a focus on meaning in context; interpretations as raw data for synthesis (although this feature doesn't distinguish it from other synthesis methods); an iterative approach to the theoretical sampling of studies for synthesis; and a transparent audit trail demonstrating the trustworthiness of the synthesis.

In addition to the synthesis methods discussed above, Sandelowski and Barroso propose a method they call 'qualitative metasummary' [ 15 ]. It is mentioned here as a new and original approach to handling a collection of qualitative studies but is qualitatively different to the other methods described here since it is aggregative; that is, findings are accumulated and summarised rather than 'transformed'. Metasummary is a way of producing a 'map' of the contents of qualitative studies and – according to Sandelowski and Barroso – 'reflect[s] a quantitative logic' [[ 15 ], p151]. The frequency of each finding is determined and the higher the frequency of a particular finding, the greater its validity. The authors even discuss the calculation of 'effect sizes' for qualitative findings. Qualitative metasummaries can be undertaken as an end in themselves or may serve as a basis for a further synthesis.
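As a rough illustration of the 'quantitative logic' of metasummary, the sketch below counts how many study reports contain each finding and divides by the total number of reports. The report labels, the findings and the function name `finding_frequencies` are invented for illustration, and this proportion is only one simple reading of a frequency-based 'effect size'; it is not necessarily the exact calculation Sandelowski and Barroso specify.

```python
from collections import Counter


def finding_frequencies(reports):
    """Return {finding: share of reports that contain it}.

    reports maps a report label to the set of findings extracted from it.
    All labels are hypothetical placeholders.
    """
    if not reports:
        return {}
    counts = Counter()
    for findings in reports.values():
        # Converting to a set counts each finding at most once per report.
        counts.update(set(findings))
    total = len(reports)
    return {finding: n / total for finding, n in counts.items()}


# Invented extraction from four study reports.
example_reports = {
    "report_A": {"fear of side effects", "distrust of prescriber"},
    "report_B": {"fear of side effects"},
    "report_C": {"fear of side effects", "cost"},
    "report_D": {"cost"},
}

frequencies = finding_frequencies(example_reports)
for finding, share in sorted(frequencies.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{finding}: {share:.2f}")
```

Findings that appear in more reports receive a higher share, which matches the aggregative idea that frequency is read as an indicator of validity.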

Dimensions of difference

Having outlined the range of methods identified, we now turn to an examination of how they compare with one another. It is clear that they have come from many different contexts and have different approaches to understanding knowledge, but what do these differences mean in practice? Our framework for this analysis is shown in Additional file 1 : dimensions of difference [ 48 ]. We have examined the epistemology of each of the methods and found that, to some extent, this explains the need for different methods and their various approaches to synthesis.

Epistemology

The first dimension that we will consider is that of the researchers' epistemological assumptions. Spencer et al [ 49 ] outline a range of epistemological positions, which might be organised into a spectrum as follows:

Subjective idealism : there is no shared reality independent of multiple alternative human constructions

Objective idealism : there is a world of collectively shared understandings

Critical realism : knowledge of reality is mediated by our perceptions and beliefs

Scientific realism : it is possible for knowledge to approximate closely an external reality

Naïve realism : reality exists independently of human constructions and can be known directly [ 49 , 45 , 46 ].

Thus, at one end of the spectrum we have a highly constructivist view of knowledge and, at the other, an unproblematized 'direct window onto the world' view.

Nearly all of the positions along this spectrum are represented in the range of methodological approaches to synthesis covered in this paper. The originators of meta-narrative synthesis, critical interpretive synthesis and meta-study all articulate what might be termed a 'subjective idealist' approach to knowledge. Paterson et al [ 28 ] state that meta-study shies away from creating 'grand theories' within the health or social sciences and assume that no single objective reality will be found. Primary studies, they argue, are themselves constructions; meta-synthesis, then, 'deals with constructions of constructions' (p7). Greenhalgh et al [ 32 ] also view knowledge as a product of its disciplinary paradigm and use this to explain conflicting findings: again, the authors neither seek, nor expect to find, one final, non-contestable answer to their research question. Critical interpretive synthesis is similar in seeking to place literature within its context, to question its assumptions and to produce a theoretical model of a phenomenon which – because highly interpretive – may not be reproducible by different research teams at alternative points in time [[ 34 ], p11].

Methods used to synthesise grounded theory studies in order to produce a higher level of grounded theory [ 24 ] appear to be informed by 'objective idealism', as does meta-ethnography. Kearney argues for the near-universal applicability of a 'ready-to-wear' theory across contexts and populations. This approach is clearly distinct from one which recognises multiple realities. The emphasis is on examining commonalities amongst, rather than discrepancies between, accounts. This emphasis is similarly apparent in most meta-ethnographies, which are conducted either according to Noblit and Hare's 'reciprocal translational analysis' technique or to their 'lines-of-argument' technique and which seek to provide a 'whole' which has a greater explanatory power. Although Noblit and Hare also propose 'refutational synthesis', in which contradictory findings might be explored, there are few examples of this having been undertaken in practice, and the aim of the method appears to be to explain and explore differences due to context, rather than multiple realities.

Despite an assumption of a reality which is perhaps less contestable than those of meta-narrative synthesis, critical interpretive synthesis and meta-study, both grounded formal theory and meta-ethnography place a great deal of emphasis on the interpretive nature of their methods. This still supposes a degree of constructivism. Although less explicit about how their methods are informed, it seems that both thematic synthesis and framework synthesis – while also involving some interpretation of data – share an even less problematized view of reality and a greater assumption that their synthetic products are reproducible and correspond to a shared reality. This is also implicit in the fact that such products are designed directly to inform policy and practice, a characteristic shared by ecological triangulation. Notably, ecological triangulation, according to Banning, can be either realist or idealist. Banning argues that the interpretation of triangulation can either be one in which multiple viewpoints converge on a point to produce confirming evidence (i.e. one definitive answer to the research question) or an idealist one, in which the complexity of multiple viewpoints is represented. Thus, although ecological triangulation views reality as complex, the approach assumes that it can be approximately knowable (at least when the realist view of ecological triangulation is adopted) and that interventions can and should be modelled according to the products of its syntheses.

While pigeonholing different methods into specific epistemological positions is a problematic process, we do suggest that the contrasting epistemologies of different researchers are one way of explaining why we have – and need – different methods for synthesis.

Iteration

Variation in terms of the extent of iteration during the review process is another key dimension. All synthesis methods include some iteration but the degree varies. Meta-ethnography, grounded theory and thematic synthesis all include iteration at the synthesis stage; both framework synthesis and critical interpretive synthesis involve iterative literature searching – in the case of critical interpretive synthesis, it is not clear whether iteration occurs during the rest of the review process. Meta-narrative also involves iteration at every stage. Banning does not mention iteration in outlining ecological triangulation and neither do Lucas or Thomas and Harden for textual narrative synthesis.

It seems that the more idealist the approach, the greater the extent of iteration. This might be because a large degree of iteration does not sit well with a more 'positivist' ideal of procedural objectivity; in particular, the notion that the robustness of the synthetic product depends in part on the reviewers stating up front in a protocol their searching strategies, inclusion/exclusion criteria etc, and being seen not to alter these at a later stage.

Quality assessment

Another dimension along which we can look at different synthesis methods is that of quality assessment. When the approaches to the assessment of the quality of studies retrieved for review are examined, there is again a wide methodological variation. It might be expected that the further towards the 'realism' end of the epistemological spectrum a method of synthesis falls, the greater the emphasis on quality assessment. In fact, this is only partially the case.

Framework synthesis, thematic narrative synthesis and thematic synthesis – methods which might be classified as sharing a 'critical realist' approach – all have highly specified approaches to quality assessment. The review in which framework synthesis was developed applied ten quality criteria: two on quality and reporting of sampling methods, four to the quality of the description of the sample in the study, two to the reliability and validity of the tools used to collect data and one on whether studies used appropriate methods for helping people to express their views. Studies which did not meet a certain number of quality criteria were excluded from contributing to findings. Similarly, in the example review for thematic synthesis, 12 criteria were applied: five related to reporting aims, context, rationale, methods and findings; four relating to reliability and validity; and three relating to the appropriateness of methods for ensuring that findings were rooted in participants' own perspectives. Studies which were deemed to have significant flaws were excluded and sensitivity analyses were used to assess the possible impact of study quality on the review's findings. Thomas and Harden's use of thematic narrative synthesis similarly applied quality criteria and developed criteria additional to those they found in the literature on quality assessment, relating to the extent to which people's views and perspectives had been privileged by researchers. It is worth noting not only that these methods apply quality criteria but that they are explicit about what they are: assessing quality is a key component in the review process for each of these methods. Likewise, Banning – the originator of ecological triangulation – sees quality assessment as important and adapts the Design and Implementation Assessment Device (DIAD) Version 0.3 (a quality assessment tool for quantitative research) for use when appraising qualitative studies [ 50 ]. Again, Banning writes of excluding studies deemed to be of poor quality.
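The exclusion-plus-sensitivity-analysis logic described above can be illustrated with a minimal Python sketch. It is not taken from any of the reviews discussed: the study names, theme labels, criteria counts and the cut-off `THRESHOLD` are all hypothetical, since the source does not state how many criteria a study had to meet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    name: str
    criteria_met: int      # how many of the 12 quality criteria the study met
    themes: frozenset      # themes the study contributes to the synthesis


def include(studies, threshold):
    """Keep only studies meeting at least `threshold` quality criteria."""
    return [s for s in studies if s.criteria_met >= threshold]


def themes_supported(studies):
    """Union of all themes contributed by the given studies."""
    supported = set()
    for s in studies:
        supported |= s.themes
    return supported


def sensitivity(studies, threshold):
    """Themes that would disappear if lower-quality studies were excluded."""
    return themes_supported(studies) - themes_supported(include(studies, threshold))


# Hypothetical data, for illustration only.
studies = [
    Study("A", 11, frozenset({"taste", "cost"})),
    Study("B", 4, frozenset({"cost", "peer pressure"})),
    Study("C", 9, frozenset({"taste"})),
]
THRESHOLD = 7  # assumed cut-off; not specified in the source

included = include(studies, THRESHOLD)
fragile = sensitivity(studies, THRESHOLD)
print([s.name for s in included])
print(sorted(fragile))
```

In this toy data, study B falls below the cut-off, and the theme it alone supports is flagged as dependent on lower-quality evidence, which is the kind of question a sensitivity analysis is meant to answer.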

Greenhalgh et al's meta-narrative review [ 32 ] modified a range of existing quality assessment tools to evaluate studies according to validity and robustness of methods; sample size and power; and validity of conclusions. The authors imply, but are not explicit, that this process formed the basis for the exclusion of some studies. Although not quite so clear about quality assessment methods as framework and thematic synthesis, it might be argued that meta-narrative synthesis shows a greater commitment to the concept that research can and should be assessed for quality than either meta-ethnography or grounded formal theory. The originators of meta-ethnography, Noblit and Hare [ 8 ], originally discussed quality in terms of quality of metaphor, while more recent use of this method has used amended versions of CASP (the Critical Appraisal Skills Programme tool, [ 31 ]), yet has only referred to studies being excluded on the basis of lack of relevance or because they weren't 'qualitative' studies [ 8 ]. In grounded theory, quality assessment is only discussed in terms of a 'personal note' being made on the context, quality and usefulness of each study. However, contrary to expectation, meta-narrative synthesis lies at the extreme end of the idealism/realism spectrum – as a subjective idealist approach – while meta-ethnography and grounded theory are classified as objective idealist approaches.

Finally, meta-study and critical interpretive synthesis – two more subjective idealist approaches – look to the content and utility of findings rather than methodology in order to establish quality. While earlier forms of meta-study included only studies which demonstrated 'epistemological soundness', in its most recent form [ 51 ] this method has sought to include all relevant studies, excluding only those deemed not to be 'qualitative' research. Critical interpretive synthesis also conforms to what we might expect of its approach to quality assessment: quality of research is judged as the extent to which it informs theory. The threshold of inclusion is informed by expertise and instinct rather than being articulated a priori.

In terms of quality assessment, it might be important to consider the academic context in which these various methods of synthesis developed. The reason why thematic synthesis, framework synthesis and ecological triangulation have such highly specified approaches to quality assessment may be that each of these was developed for a particular task, i.e. to conduct a multi-method review in which randomised controlled trials (RCTs) were included. The concept of quality assessment in relation to RCTs is much less contested and there is general agreement on criteria against which quality should be judged.

Problematizing the literature

Critical interpretive synthesis, the meta-narrative approach and the meta-theory element of meta-study all share some common ground in that their review and synthesis processes include examining all aspects of the context in which knowledge is produced. In conducting a review on access to healthcare by vulnerable groups, critical interpretive synthesis sought to question 'the ways in which the literature had constructed the problematics of access, the nature of the assumptions on which it drew, and what has influenced its choice of proposed solutions' [[ 34 ], p6]. Although not claiming to have been directly influenced by Greenhalgh et al's meta-narrative approach, Dixon-Woods et al do cite it as sharing similar characteristics in the sense that it critiques the literature it reviews.

Meta-study uses meta-theory to describe and deconstruct the theories that shape a body of research and to assess its quality. One aspect of this process is to examine the historical evolution of each theory and to put it in its socio-political context, which invites direct comparison with meta-narrative synthesis. Greenhalgh et al put a similar emphasis on placing research findings within their social and historical context, often as a means of seeking to explain heterogeneity of findings. In addition, meta-narrative shares with critical interpretive synthesis an iterative approach to searching and selecting from the literature.

Framework synthesis, thematic synthesis, textual narrative synthesis, meta-ethnography and grounded theory do not share the same approach to problematizing the literature as critical interpretive synthesis, meta-study and meta-narrative. In part, this may be explained by the extent to which studies included in the synthesis represented a broad range of approaches or methodologies. This, in turn, may reflect the broadness of the review question and the extent to which the concepts contained within the question are pre-defined within the literature. In the case of both the critical interpretive synthesis and meta-narrative reviews, terminology was elastic and/or the question formed iteratively. Similarly, both reviews placed great emphasis on employing multi-disciplinary research teams. Approaches which do not critique the literature in the same way tend to have more narrowly-focused questions. They also tend to include a more limited range of studies: grounded theory synthesis includes grounded theory studies, meta-ethnography (in its original form, as applied by Noblit and Hare) ethnographies. The thematic synthesis incorporated studies based on only a narrow range of qualitative methodologies (interviews and focus groups) which were informed by a similarly narrow range of epistemological assumptions. It may be that the authors of such syntheses saw no need for including such a critique in their review process.

Similarities and differences between primary studies

Most methods of synthesis are applicable to heterogeneous data (i.e. studies which use contrasting methodologies) apart from early meta-ethnography and synthesis informed by grounded theory. All methods of synthesis state that, at some level, studies are compared; many are not so explicit about how this is done, though some are. Meta-ethnography is one of the most explicit: it describes the act of 'translation' where terms and concepts which have resonance with one another are subsumed into 'higher order constructs'. Grounded theory, as represented by Eaves [ 17 ], is undertaken according to a long list of steps and sub-steps, and includes the production of generalizations about concepts/categories, which comes from classifying these categories. In meta-narrative synthesis, comparable studies are grouped together at the appraisal phase of review.

Perhaps more interesting are the ways in which differences between studies are explored. Those methods with a greater emphasis on critical appraisal may tend (although this is not always made explicit) to use differences in method to explain differences in finding. Meta-ethnography proposes 'refutational synthesis' to explain differences, although there are few examples of this in the literature. Some synthesis methods – for example, thematic synthesis – look at other characteristics of the studies under review, whether types of participants and their context vary, and whether this can explain differences in perspective.

All of these methods, then, look within the studies to explain differences. Other methods look beyond the study itself to the context in which it was produced. Critical interpretive synthesis and meta-study look at differences in theory or in socio-economic context. Critical interpretive synthesis, like meta-narrative, also explores epistemological orientation. Meta-narrative is unique in concerning itself with disciplinary paradigm (i.e. the story of the discipline as it progresses). It is also distinctive in that it treats conflicting findings as 'higher order data' [[ 32 ], p420], so that the main emphasis of the synthesis appears to be on examining and explaining contradictions in the literature.

Going 'beyond' the primary studies

Synthesis is sometimes defined as a process resulting in a product, a 'whole', which is more than the sum of its parts. However, the methods reviewed here vary in the extent to which they attempt to 'go beyond' the primary studies and transform the data. Some methods – textual narrative synthesis, ecological triangulation and framework synthesis – focus on describing and summarising their primary data (often in a highly structured and detailed way) and translating the studies into one another. Others – meta-ethnography, grounded theory, thematic synthesis, meta-study, meta-narrative and critical interpretive synthesis – seek to push beyond the original data to a fresh interpretation of the phenomena under review. A key feature of thematic synthesis is its clear differentiation between these two stages.

Different methods have different mechanisms for going beyond the primary studies, although some are more explicit than others about what these entail. Meta-ethnography proposes a 'Line of Argument' (LOA) synthesis in which an interpretation is constructed to both link and explain a set of parts. Critical interpretive synthesis based its synthesis methods on those of meta-ethnography, developing an LOA using what the authors term 'synthetic constructs' (akin to 'third order constructs' in meta-ethnography) to create a 'synthesising argument'. Dixon-Woods et al claim that this is an advance on Britten et al's methods, in that they reject the difference between first, second and third order constructs.

Meta-narrative, as outlined above, focuses on conflicting findings and constructs theories to explain these in terms of differing paradigms. Meta study derives questions from each of its three components to which it subjects the dataset and inductively generates a number of theoretical claims in relation to it. According to Eaves' model of grounded theory [ 17 ], mini-theories are integrated to produce an explanatory framework. In ecological triangulation, the 'axial' codes – or second level codes evolved from the initial deductive open codes – are used to produce Banning's 'ecological sentence' [ 39 ].

The synthetic product

In overviewing and comparing different qualitative synthesis methods, the ultimate question relates to the utility of the synthetic product: what is it for? It is clear that some methods of synthesis – namely, thematic synthesis, textual narrative synthesis, framework synthesis and ecological triangulation – view themselves as producing an output that is directly applicable to policy makers and designers of interventions. The example of framework synthesis examined here (on children's, young people's and parents' views of walking and cycling) involved policy makers and practitioners in directing the focus of the synthesis and used the themes derived from the synthesis to infer what kind of interventions might be most effective in encouraging walking and cycling. Likewise, the products of the thematic synthesis took the form of practical recommendations for interventions (e.g. 'do not promote fruit and vegetables in the same way in the same intervention'). The extent to which policy makers and practitioners are involved in informing either synthesis or recommendation is less clear from the documents published on ecological triangulation, but the aim certainly is to directly inform practice.

The outputs of synthesis methods which have a more constructivist orientation – meta-study, meta-narrative, meta-ethnography, grounded theory, critical interpretive synthesis – tend to look rather different. They are generally more complex and conceptual, sometimes operating on the symbolic or metaphorical level, and requiring a further process of interpretation by policy makers and practitioners in order for them to inform practice. This is not to say, however, that they are not useful for practice, more that they are doing different work. However, it may be that, in the absence of further interpretation, they are more useful for informing other researchers and theoreticians.

Looking across dimensions

After examining the dimensions of difference of our included methods, what picture ultimately emerges? It seems clear that, while similar in some respects, there are genuine differences in approach to the synthesis of what is essentially textual data. To some extent, these differences can be explained by the epistemological assumptions that underpin each method. Our methods split into two broad camps: the idealist and the realist (see Table 1 for a summary). Idealist approaches generally tend to have a more iterative approach to searching (and the review process), have less a priori quality assessment procedures and are more inclined to problematize the literature. Realist approaches are characterised by a more linear approach to searching and review, have clearer and more well-developed approaches to quality assessment, and do not problematize the literature.

Summary table

Dimension	Idealist	Realist
Searching	Iterative	Linear
Quality assessment	Less clear, less a priori; quality of content rather than method	Clear and a priori
Problematizing the literature	Yes	No
Question	Explore	Answer
Heterogeneity	Lots	Little
Synthetic product	Complex	Clear for policy makers and practitioners

N.B.: In terms of the above dimensions, it is generally a question of degree rather than of absolute distinctions.

Mapping the relationships between methods

What is interesting is the relationship between these methods of synthesis, the conceptual links between them, and the extent to which the originators cite – or, in some cases, don't cite – one another. Some methods directly build on others – framework synthesis builds on framework analysis, for example, while grounded formal theory and constant comparative analysis build on grounded theory. Others further develop existing methods – meta-study, critical interpretive synthesis and meta-narrative all adapt aspects of meta-ethnography, while also importing concepts from other theorists (critical interpretive synthesis also adapts grounded theory techniques).

Some methods share a clear conceptual link, without directly citing one another: for example, the analytical themes developed during thematic synthesis are comparable to the third order interpretations of meta-ethnography. The meta-theory aspect of meta-study is echoed in both meta-narrative synthesis and critical interpretive synthesis (see 'Problematizing the literature', above); however, the originators of critical interpretive synthesis only refer to the originators of meta-study in relation to their use of sampling techniques.

While methods for qualitative synthesis have many similarities, there are clear differences in approach between them, many of which can be explained by taking account of a given method's epistemology.

However, within the two broad idealist/realist categories, any differences between methods in terms of outputs appear to be small.

Since many systematic reviews are designed to inform policy and practice, it is important to select a method – or type of method – that will produce the kind of conclusions needed. However, it is acknowledged that this is not always simple or even possible to achieve in practice.

The approaches that result in more easily translatable messages for policy-makers and practitioners may appear to be more attractive than the others; but we do need to take account of lessons from the more idealist end of the spectrum, namely that some perspectives are not universal.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

Both authors made substantial contributions, with EBP taking a lead on writing and JT on the analytical framework. Both authors read and approved the final manuscript.

Pre-publication history

The pre-publication history for this paper can be accessed here:

http://www.biomedcentral.com/1471-2288/9/59/prepub

Supplementary Material

Dimensions of difference. Ranging from subjective idealism through objective idealism and critical realism to scientific realism to naïve realism.

Acknowledgements

The authors would like to acknowledge the helpful contributions of the following in commenting on earlier drafts of this paper: David Gough, Sandy Oliver, Angela Harden, Mary Dixon-Woods, Trisha Greenhalgh and Barbara L. Paterson. We would also like to thank the peer reviewers: Helen J Smith, Rosaline Barbour and Mark Rodgers for their helpful reviews. The methodological development was supported by the Department of Health (England) and the ESRC through the Methods for Research Synthesis Node of the National Centre for Research Methods (NCRM). An earlier draft of this paper currently appears as a working paper on the National Centre for Research Methods' website http://www.ncrm.ac.uk/ .

References

1. Dixon-Woods M, Agarwal S, Jones D, Young B, Sutton A. Synthesising qualitative and quantitative evidence: a review of possible methods. J Health Serv Res Pol. 2005;10(1):45–53b. doi: 10.1258/1355819052801804.
2. Barbour RS, Barbour M. Evaluating and synthesizing qualitative research: the need to develop a distinctive approach. J Eval Clin Pract. 2003;9(2):179–186. doi: 10.1046/j.1365-2753.2003.00371.x.
3. Mays N, Pope C, Popay J. Systematically reviewing qualitative and quantitative evidence to inform management and policy-making in the health field. J Health Serv Res Pol. 2005;10(Suppl 1):6–20. doi: 10.1258/1355819054308576.
4. Dixon-Woods M, Bonas S, Booth A, Jones DR, Miller T, Shaw RL, Smith J, Sutton A, Young B. How can systematic reviews incorporate qualitative research? A critical perspective. Qual Res. 2006;6:27–44. doi: 10.1177/1468794106058867.
5. Pope C, Mays N, Popay J. Synthesizing Qualitative and Quantitative Health Evidence: a Guide to Methods. Maidenhead: Open University Press; 2007.
6. Thorne S, Jenson L, Kearney MH, Noblit G, Sandelowski M. Qualitative metasynthesis: reflections on methodological orientation and ideological agenda. Qual Health Res. 2004;14:1342–1365. doi: 10.1177/1049732304269888.
7. Centre for Reviews and Dissemination. Systematic Reviews. CRD's Guidance for Undertaking Reviews in Health Care. York: CRD; 2008.
8. Noblit GW, Hare RD. Meta-Ethnography: Synthesizing Qualitative Studies. London: Sage; 1988.
9. Strike K, Posner G. In: Knowledge Structure and Use. Ward S, Reed L, editor. Philadelphia: Temple University Press; 1983. Types of synthesis and their criteria.
10. Turner S. Sociological Explanation as Translation. New York: Cambridge University Press; 1980.
11. Britten N, Campbell R, Pope C, Donovan J, Morgan M, Pill R. Using meta-ethnography to synthesise qualitative research: a worked example. J Health Serv Res. 2002;7:209–15. doi: 10.1258/135581902320432732.
12. Campbell R, Pound P, Pope C, Britten N, Pill R, Morgan M, Donovan J. Evaluating meta-ethnography: a synthesis of qualitative research on lay experiences of diabetes and diabetes care. Soc Sci Med. 2003;65:671–84. doi: 10.1016/S0277-9536(02)00064-3.
13. Pound P, Britten N, Morgan M, Yardley L, Pope C, Daker-White G, Campbell R. Resisting medicines: a synthesis of qualitative studies of medicine taking. Soc Sci Med. 2005;61:133–155. doi: 10.1016/j.socscimed.2004.11.063.
14. Schutz A. Collected Paper. Vol. 1. The Hague: Martinus Nijhoff; 1962.
15. Sandelowski M, Barroso J. Handbook for Synthesizing Qualitative Research. New York: Springer Publishing Company; 2007.
16. Kearney MH. Enduring love: a grounded formal theory of women's experience of domestic violence. Research Nurs Health. 2001;24:270–82. doi: 10.1002/nur.1029.
17. Eaves YD. A synthesis technique for grounded theory data analysis. J Adv Nurs. 2001;35:654–63. doi: 10.1046/j.1365-2648.2001.01897.x.
18. Finfgeld D. Courage as a process of pushing beyond the struggle. Qual Health Res. 1999;9:803–814. doi: 10.1177/104973299129122298.
19. Glaser BG, Strauss AL. The Discovery of Grounded Theory: Strategies for Qualitative Research. New York: Aldine De Gruyter; 1967.
20. Strauss AL, Corbin J. Basics of Qualitative Research: Grounded Theory Procedures and Techniques. Newbury Park, CA: Sage; 1990.
21. Strauss AL, Corbin J. Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory. Thousand Oaks, CA: Sage; 1998.
22. Charmaz K. In: Contemporary Field Research: A Collection of Readings. Emerson RM, editor. Waveland Press: Prospect Heights, IL; 1983. The grounded theory method: an explication and interpretation; pp. 109–126.
23. Chesler MA. Professionals' Views of the Dangers of Self-Help Groups: Explicating a Grounded Theoretical Approach. [Michigan]: Department of Sociology, University of Michigan, Ann Arbour Centre for Research on Social Organisation, Working Paper Series; 1987.
24. Kearney MH. Ready-to-wear: discovering grounded formal theory. Res Nurs Health. 1998;21:179–186. doi: 10.1002/(SICI)1098-240X(199804)21:2<179::AID-NUR8>3.0.CO;2-G.
25. Thomas J, Harden A. Methods for the thematic synthesis of qualitative research in systematic reviews. BMC Med Res Meth. 2008;8:45. doi: 10.1186/1471-2288-8-45.
26. Lucas PJ, Arai L, Baird, Law C, Roberts HM. Worked examples of alternative methods for the synthesis of qualitative and quantitative research in systematic reviews. BMC Med Res Meth. 2007;7(4).
27. Harden A, Garcia J, Oliver S, Rees R, Shepherd J, Brunton G, Oakley A. Applying systematic review methods to studies of people's views: an example from public health research. J Epidemiol Community H. 2004;58:794–800. doi: 10.1136/jech.2003.014829.
28. Paterson BL, Thorne SE, Canam C, Jillings C. Meta-Study of Qualitative Health Research. A Practical Guide to Meta-Analysis and Meta-Synthesis. Thousand Oaks, CA: Sage Publications; 2001.
29. Zhao S. Metatheory, metamethod, meta-data-analysis: what, why and how? Sociol Perspect. 1991;34:377–390.
30. Ritzer G. Metatheorizing in Sociology. Lexington, MA: Lexington Books; 1991.
31. CASP (Critical Appraisal Skills Programme) http://www.phru.nhs.uk/Pages/PHD/CASP.htm date unknown.
32. Greenhalgh T, Robert G, Macfarlane F, Bate P, Kyriakidou O, Peacock R. Storylines of research in diffusion of innovation: a meta-narrative approach to systematic review. Soc Sci Med. 2005;61:417–30. doi: 10.1016/j.socscimed.2004.12.001.
33. Kuhn TS. The Structure of Scientific Revolutions. Chicago: University of Chicago Press; 1962.
34. Dixon-Woods M, Cavers D, Agarwal S, Annandale E, Arthur A, Harvey J, Hsu R, Katbamna S, Olsen R, Smith L, Riley R, Sutton AJ. Conducting a critical interpretive synthesis of the literature on access to healthcare by vulnerable groups. BMC Med Res Meth. 2006;6(35).
35. Gough D. In: Applied and Practice-based Research. 2. Furlong J, Oancea A, editor. Vol. 22. Special Edition of Research Papers in Education; 2007. Weight of evidence: a framework for the appraisal of the quality and relevance of evidence; pp. 213–228.
36. Webb EJ, Campbell DT, Schwartz RD, Sechrest L. Unobtrusive Measures. Chicago: Rand McNally; 1966.
37. Denzin NK. The Research Act: a Theoretical Introduction to Sociological Methods. New York: McGraw-Hill; 1978.
38. Banning J. Ecological Triangulation. http://mycahs.colostate.edu/James.H.Banning/PDFs/Ecological%20Triangualtion.pdf
39. Banning J. Ecological Sentence Synthesis. http://mycahs.colostate.edu/James.H.Banning/PDFs/Ecological%20Sentence%20Synthesis.pdf
40. Brunton G, Oliver S, Oliver K, Lorenc T. A Synthesis of Research Addressing Children's, Young People's and Parents' Views of Walking and Cycling for Transport. London: EPPI-Centre, Social Science Research Unit, Institute of Education, University of London; 2006.
41. Oliver S, Rees R, Clarke-Jones L, Milne R, Oakley A, Gabbay J, Stein K, Buchanan P, Gyte G. A multidimensional conceptual framework for analysing public involvement in health services research. Health Expect. 2008;11:72–84. doi: 10.1111/j.1369-7625.2007.00476.x.
42. Pope C, Ziebland S, Mays N. Qualitative research in health care: analysing qualitative data. BMJ. 2000;320:114–116. doi: 10.1136/bmj.320.7227.114.
43. Ritchie J, Spencer L. In: Analysing Qualitative Data. Bryman A, Burgess R, editor. London: Routledge; 1993. Qualitative data analysis for applied policy research; pp. 173–194.
44. Miles M, Huberman A. Qualitative Data Analysis. London: Sage; 1984.
45. Evans D, Fitzgerald M. Reasons for physically restraining patients and residents: a systematic review and content analysis. Int J Nurs Stud. 2002;39:739–743. doi: 10.1016/S0020-7489(02)00015-9.
46. Suikkala A, Leino-Kilpi H. Nursing student-patient relationships: a review of the literature from 1984–1998. J Adv Nurs. 2000;33:42–50. doi: 10.1046/j.1365-2648.2001.01636.x.
47. Weed M. 'Meta-interpretation': a method for the interpretive synthesis of qualitative research. Forum: Qual Soc Res. 2005;6:Art 37.
48. Gough D, Thomas J. Dimensions of difference in systematic reviews. http://www.ncrm.ac.uk/RMF2008/festival/programme/sys1
49. Spencer L, Ritchie J, Lewis J, Dillon L. Quality in Qualitative Evaluation: a Framework for Assessing Research Evidence. London: Government Chief Social Researcher's Office; 2003.
50. Banning J. Design and Implementation Assessment Device (DIAD) Version 0.3: A response from a qualitative perspective. http://mycahs.colostate.edu/James.H.Banning/PDFs/Design%20and%20Implementation%20Assessment%20Device.pdf
51. Paterson BL. In: Reviewing Research Evidence for Nursing Practice. Webb C, Roe B, editor. [Oxford]: Blackwell Publishing Ltd; 2007. Coming out as ill: understanding self-disclosure in chronic illness from a meta-synthesis of qualitative research; pp. 73–83.

Qualitative Data Analysis in Systematic Reviews

What Is a Qualitative Systematic Review?

A qualitative systematic review aggregates, integrates, and interprets data from qualitative studies, typically collected through observation, interviews, and verbal interactions. Included studies may also use other qualitative methods of data collection found in the relevant literature. The review analyzes this information and focuses on the meanings derived from it.

A qualitative systematic review generally follows the same steps as indicated by most systematic review guidelines, including the application of eligibility criteria and the steps for searching and screening available literature. All of these then conclude in the final write-up, which involves tabulating the data into a summary of findings table and reporting on findings and conclusions. Qualitative systematic reviews differ in that they incorporate qualitative studies and use only qualitative methods to analyze and synthesize data.

Why Are Qualitative Systematic Reviews Valuable?

Apart from using a rigorous, methodical, and reproducible process, qualitative systematic reviews derive their conclusions from qualitative data, so they bring a human perspective into the process of answering the focused research question. This brings into view valuable findings that cannot be expressed quantitatively: results that are better stated than calculated, such as feelings of compliance or satisfaction following treatment with a new antidepressant.

As another example, if a systematic review dealing with pain associated with a certain drug considers qualitative data, it can reach conclusions that reflect how subjects feel when taking the medicine, e.g., the level of pain and tolerance.

Types Of Qualitative Systematic Reviews

Pioneers of qualitative systematic reviews suggest that these reviews can be divided into two types: aggregated and interpretive.

Aggregated Systematic Review

An aggregated systematic review simply summarizes the collected data. It generates a summary of the studies using aggregate data obtained from individual studies within the scoped literature.

Interpretive Systematic Review

An interpretive systematic review, which is the more common of the two types, analyzes the data. From the analysis, researchers can derive a new understanding that may lead to the development of a theory and can help understand or predict behavior as it relates to the topic of the review.

How to Analyze Data in a Qualitative Systematic Review

Qualitative systematic reviews deal with a lot of textual studies. This is why undertaking one requires a well-planned, systematic, and sustainable approach, as defined in your protocol. It also helps to employ literature review software like DistillerSR to take out a significant amount of manual labor, as it automates key stages in the entire methodology.

Here are four steps to take for qualitative data analysis in systematic reviews.

Collect and Review the Data

Based on your eligibility criteria, search and screen the studies relevant to your review. This involves scouring libraries and databases, gathering documents, and printing or saving transcripts. You can also check for studies in the reference lists of already eligible studies. The recommendation of similar articles by databases during searching should also be checked.

Once you’ve collected your data, get a sense of what it contains by reading the collected studies (you’ll likely need to do this several times).

This step can be easier with systematic review software such as DistillerSR, which gives you access to more sources and applies AI to identify the literature you need.

Create And Identify Codes

Connect your data by creating and identifying common ideas. Highlight keywords, and categorize information; it may even be helpful to create concept maps for easy reference.

Develop Themes

Combine your codes and revise them into themes, recognizing recurring concepts, language, opinions, beliefs, etc.
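As an illustration of moving from codes to themes, here is a minimal Python sketch. The excerpts, code labels, and theme names are all hypothetical, and the codebook mapping is assumed to have been agreed on by the reviewers beforehand; it is one possible way to organize the work, not a prescribed method.

```python
from collections import defaultdict

# Hypothetical coded excerpts: (study id, code assigned during the coding step).
coded_excerpts = [
    ("study_a", "fear of side effects"),
    ("study_a", "trust in clinician"),
    ("study_b", "fear of side effects"),
    ("study_c", "cost of treatment"),
    ("study_c", "trust in clinician"),
]

# Hypothetical codebook mapping each code to a broader theme.
code_to_theme = {
    "fear of side effects": "treatment concerns",
    "cost of treatment": "treatment concerns",
    "trust in clinician": "patient-provider relationship",
}


def group_by_theme(excerpts, codebook):
    """Return {theme: sorted list of study ids that contribute to the theme}."""
    themes = defaultdict(set)
    for study_id, code in excerpts:
        themes[codebook[code]].add(study_id)
    return {theme: sorted(studies) for theme, studies in themes.items()}


themes = group_by_theme(coded_excerpts, code_to_theme)
for theme, studies in themes.items():
    print(f"{theme}: {', '.join(studies)}")
```

Listing which studies feed each theme makes it easier to see whether a theme recurs across the literature or rests on a single source.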

Derive Conclusions and Summarize Findings

Present the themes that you’ve collected in a cohesive manner, using them to answer your review’s research question. Finally, derive conclusions from the data, and summarize your findings in a report.

Communicative Sciences and Disorders

Choosing a Review Type

For guidance related to choosing a review type, see:

"What Type of Review is Right for You?" - Decision Tree (PDF) This decision tree, from Cornell University Library, highlights key differences between narrative, systematic, umbrella, scoping and rapid reviews.
Reviewing the literature: choosing a review design Noble, H., & Smith, J. (2018). Reviewing the literature: Choosing a review design. Evidence Based Nursing, 21(2), 39–41. https://doi.org/10.1136/eb-2018-102895
What synthesis methodology should I use? A review and analysis of approaches to research synthesis Schick-Makaroff, K., MacDonald, M., Plummer, M., Burgess, J., & Neander, W. (2016). What synthesis methodology should I use? A review and analysis of approaches to research synthesis. AIMS Public Health, 3(1), 172-215. doi:10.3934/publichealth.2016.1.172 ABSTRACT: Our purpose is to present a comprehensive overview and assessment of the main approaches to research synthesis. We use "research synthesis" as a broad overarching term to describe various approaches to combining, integrating, and synthesizing research findings.
Right Review - Decision Support Tool Not sure of the most suitable review method? Answer a few questions and be guided to suitable knowledge synthesis methods. Updated in 2022 and featured in the Journal of Clinical Epidemiology 10.1016/j.jclinepi.2022.03.004

Types of Evidence Synthesis / Literature Reviews

Literature reviews are comprehensive summaries and syntheses of the previous research on a given topic. While narrative reviews are common across all academic disciplines, reviews that focus on appraising and synthesizing research evidence are increasingly important in the health and social sciences.

Most evidence synthesis methods use formal and explicit methods to identify, select and combine results from multiple studies, making evidence synthesis a form of meta-research.

The review purpose, methods used and the results produced vary among different kinds of literature reviews; some of the common types of literature review are detailed below.

Common Types of Literature Reviews 1

Narrative (Literature) Review

A broad term referring to reviews with a wide scope and non-standardized methodology
Search strategies, comprehensiveness of literature search, time range covered and method of synthesis will vary and do not follow an established protocol

Integrative Review

A type of literature review based on a systematic, structured literature search
Often has a broadly defined purpose or review question
Seeks to generate or refine a theory or hypothesis and/or develop a holistic understanding of a topic of interest
Relies on diverse sources of data (e.g. empirical, theoretical or methodological literature; qualitative or quantitative studies)

Systematic Review

Systematically and transparently collects and categorizes existing evidence on a question of scientific, policy or management importance
Follows a research protocol that is established a priori
Some sub-types of systematic reviews include: SRs of intervention effectiveness, diagnosis, prognosis, etiology, qualitative evidence, economic evidence, and more.
Time-intensive and often takes months to a year or more to complete
The most commonly referred to type of evidence synthesis; sometimes confused as a blanket term for other types of reviews

Meta-Analysis

Statistical technique for combining the findings from disparate quantitative studies
Uses statistical methods to objectively evaluate, synthesize, and summarize results
Often conducted as part of a systematic review

Scoping Review

Systematically and transparently collects and categorizes existing evidence on a broad question of scientific, policy or management importance
Seeks to identify research gaps, identify key concepts and characteristics of the literature and/or examine how research is conducted on a topic of interest
Useful when the complexity or heterogeneity of the body of literature does not lend itself to a precise systematic review
Useful if authors do not have a single, precise review question
May critically evaluate existing evidence, but does not attempt to synthesize the results in the way a systematic review would
May take longer than a systematic review

Rapid Review

Applies a systematic review methodology within a time-constrained setting
Employs methodological "shortcuts" (e.g., limiting search terms and the scope of the literature search), at the risk of introducing bias
Useful for addressing issues requiring quick decisions, such as developing policy recommendations

Umbrella Review

Reviews other systematic reviews on a topic
Often defines a broader question than is typical of a traditional systematic review
Most useful when there are competing interventions to consider

1. Adapted from:

Eldermire, E. (2021, November 15). A guide to evidence synthesis: Types of evidence synthesis. Cornell University LibGuides. https://guides.library.cornell.edu/evidence-synthesis/types

Nolfi, D. (2021, October 6). Integrative Review: Systematic vs. Scoping vs. Integrative. Duquesne University LibGuides. https://guides.library.duq.edu/c.php?g=1055475&p=7725920

Delaney, L. (2021, November 24). Systematic reviews: Other review types. UniSA LibGuides. https://guides.library.unisa.edu.au/SystematicReviews/OtherReviewTypes

Integrative Reviews

"The integrative review method is an approach that allows for the inclusion of diverse methodologies (i.e. experimental and non-experimental research)." (Whittemore & Knafl, 2005, p. 547).

The integrative review: Updated methodology Whittemore, R., & Knafl, K. (2005). The integrative review: Updated methodology. Journal of Advanced Nursing, 52(5), 546–553. doi:10.1111/j.1365-2648.2005.03621.x ABSTRACT: The aim of this paper is to distinguish the integrative review method from other review methods and to propose methodological strategies specific to the integrative review method to enhance the rigour of the process. ... An integrative review is a specific review method that summarizes past empirical or theoretical literature to provide a more comprehensive understanding of a particular phenomenon or healthcare problem. ... Well-done integrative reviews present the state of the science, contribute to theory development, and have direct applicability to practice and policy.

Conducting integrative reviews: A guide for novice nursing researchers Dhollande, S., Taylor, A., Meyer, S., & Scott, M. (2021). Conducting integrative reviews: A guide for novice nursing researchers. Journal of Research in Nursing, 26(5), 427–438. https://doi.org/10.1177/1744987121997907
Rigour in integrative reviews Whittemore, R. (2007). Rigour in integrative reviews. In C. Webb & B. Roe (Eds.), Reviewing Research Evidence for Nursing Practice (pp. 149–156). John Wiley & Sons, Ltd. https://doi.org/10.1002/9780470692127.ch11

Scoping Reviews

Scoping reviews are evidence syntheses that are conducted systematically, but begin with a broader scope of question than traditional systematic reviews, allowing the research to 'map' the relevant literature on a given topic.

Scoping studies: Towards a methodological framework Arksey, H., & O'Malley, L. (2005). Scoping studies: Towards a methodological framework. International Journal of Social Research Methodology, 8(1), 19-32. doi:10.1080/1364557032000119616 ABSTRACT: We distinguish between different types of scoping studies and indicate where these stand in relation to full systematic reviews. We outline a framework for conducting a scoping study based on our recent experiences of reviewing the literature on services for carers for people with mental health problems.
Scoping studies: Advancing the methodology Levac, D., Colquhoun, H., & O'Brien, K. K. (2010). Scoping studies: Advancing the methodology. Implementation Science, 5(1), 69. doi:10.1186/1748-5908-5-69 ABSTRACT: We build upon our experiences conducting three scoping studies using the Arksey and O'Malley methodology to propose recommendations that clarify and enhance each stage of the framework.
Methodology for JBI scoping reviews Peters, M. D. J., Godfrey, C. M., McInerney, P., Baldini Soares, C., Khalil, H., & Parker, D. (2015). The Joanna Briggs Institute reviewers’ manual: Methodology for JBI scoping reviews [PDF]. Retrieved from The Joanna Briggs Institute website: http://joannabriggs.org/assets/docs/sumari/Reviewers-Manual_Methodology-for-JBI-Scoping-Reviews_2015_v2.pdf ABSTRACT: Unlike other reviews that address relatively precise questions, such as a systematic review of the effectiveness of a particular intervention based on a precise set of outcomes, scoping reviews can be used to map the key concepts underpinning a research area as well as to clarify working definitions, and/or the conceptual boundaries of a topic. A scoping review may focus on one of these aims or all of them as a set.

Rapid Reviews

Rapid reviews are systematic reviews that are undertaken under a tighter timeframe than traditional systematic reviews.

Evidence summaries: The evolution of a rapid review approach Khangura, S., Konnyu, K., Cushman, R., Grimshaw, J., & Moher, D. (2012). Evidence summaries: The evolution of a rapid review approach. Systematic Reviews, 1(1), 10. doi:10.1186/2046-4053-1-10 ABSTRACT: Rapid reviews have emerged as a streamlined approach to synthesizing evidence - typically for informing emergent decisions faced by decision makers in health care settings. Although there is growing use of rapid review "methods," and proliferation of rapid review products, there is a dearth of published literature on rapid review methodology. This paper outlines our experience with rapidly producing, publishing and disseminating evidence summaries in the context of our Knowledge to Action (KTA) research program.
What is a rapid review? A methodological exploration of rapid reviews in Health Technology Assessments Harker, J., & Kleijnen, J. (2012). What is a rapid review? A methodological exploration of rapid reviews in Health Technology Assessments. International Journal of Evidence‐Based Healthcare, 10(4), 397-410. doi:10.1111/j.1744-1609.2012.00290.x ABSTRACT: In recent years, there has been an emergence of "rapid reviews" within Health Technology Assessments; however, there is no known published guidance or agreed methodology within recognised systematic review or Health Technology Assessment guidelines. In order to answer the research question "What is a rapid review and is methodology consistent in rapid reviews of Health Technology Assessments?", a study was undertaken in a sample of rapid review Health Technology Assessments from the Health Technology Assessment database within the Cochrane Library and other specialised Health Technology Assessment databases to investigate similarities and/or differences in rapid review methodology utilised.
Rapid Review Guidebook Dobbins, M. (2017). Rapid review guidebook. Hamilton, ON: National Collaborating Centre for Methods and Tools.
NCCMT Summary and Tool for Dobbins' Rapid Review Guidebook National Collaborating Centre for Methods and Tools. (2017). Rapid review guidebook. Hamilton, ON: McMaster University. Retrieved from http://www.nccmt.ca/knowledge-repositories/search/308
Last Updated: Jun 26, 2024 3:00 PM
URL: https://guides.nyu.edu/speech

Open access
Published: 26 June 2024

WHO, WHEN, HOW: a scoping review on flexible at-home respite for informal caregivers of older adults

Maude Viens, Alexandra Éthier, Véronique Provencher & Annie Carrier

BMC Health Services Research, volume 24, Article number: 767 (2024)

As the world population is aging, considerable efforts need to be put towards developing and maintaining evidence-based care for older adults. Respite services are part of the selection of homecare offered to informal caregivers. Although current best practices around respite are rooted in person-centeredness, there is no integrated synthesis of its flexible components. Such a synthesis could offer a better understanding of key characteristics of flexible respite and, as such, support its implementation and use.

To map the literature around the characteristics of flexible at-home respite for informal caregivers of older adults, a scoping study was conducted. Qualitative data from the review was analyzed using content analysis. The characterization of flexible at-home respite was built on three dimensions: WHO, WHEN and HOW. To triangulate the scoping results, an online questionnaire was distributed to homecare providers and informal caregivers of older adults.

A total of 42 documents were included in the review. The questionnaire was completed by 105 participants. The results summarize the characteristics of flexible at-home respite found in the literature. Flexibility in respite can be understood through three dimensions: (1) WHO is tendering it, (2) WHEN it is tendered and (3) HOW it is tendered. Firstly, human resources (WHO) must be compatible with the homecare sector as well as being trained and qualified to offer respite to informal caregivers of older adults. Secondly, flexible respite includes considerations of time, duration, frequency, and predictability (WHEN). Lastly, flexible at-home respite exhibits approachability, appropriateness, affordability, availability, and acceptability (HOW). Overall, flexible at-home respite adjusts to the needs of the informal caregiver and care recipient in terms of WHO, WHEN, and HOW.

This review is a step towards a more precise definition of flexible at-home respite. Flexibility of homecare, in particular respite, must be considered when designing, implementing and evaluating services.

It is an undeniable fact that the world population is aging [1]. The World Health Organization [1] estimates that from 2015 to 2050, the percentage of people over 60 years of age will nearly double (from 12 to 22%). Governments must therefore put in place policies, laws and funding infrastructures to provide evidence-based social services and healthcare that are in line with best practices to allow people to age in place [2]. Aging in place refers to “the ability to live in one’s own home and community safely, independently, and comfortably, regardless of age, income, or ability level” [3]. Relevant literature indicates that people do not want to age or end their lives in institutionalized care; most wish to receive care in their home and remain in their community with their informal caregivers [4].

There is therefore a need to adequately support informal caregivers (hereafter, caregivers) in the crucial role they play in allowing older adults to age in their own home. A caregiver is “a person who provides some type of unpaid, ongoing assistance with activities of daily living or instrumental activities of daily living” [5]. In their duties, caregivers of older adults are responsible for a considerable amount of homecare [6]: transportation, management of appointments and bills, domestic chores, etc. Private and public organizations offer a plethora of services to support caregivers of older adults (e.g., support groups, housekeeping), including respite. Respite is a service for caregivers consisting of “the temporary provision of care for a person, at home or in an institution, by people other than the primary caregiver” [7]. Maayan and collaborators [7] characterize all respite services according to three dimensions: (1) WHERE: the place (a private home, a daycare centre or a residential setting); (2) WHEN: the duration and planning (ranging from a couple of hours to a number of weeks, planned or unplanned); and (3) WHO: the person providing the service (trained or untrained individuals, paid staff or volunteers). Respite is widely recognized as necessary to support caregivers of older adults [8, 9]. Indeed, a large number of studies identify the need for and use of respite [9, 10, 11, 12]. For example, Dal Santo and colleagues (2007) found that caregivers of older adults (n = 1643) used respite to manage stressful caregiving situations, but also to have “time away” without having to worry about their caregiving role [13]. At-home respite seems to be favoured over other forms of respite, even with its perceived drawbacks, such as the privacy breach of having a care worker in one’s home [14, 15].

Studies suggest that caregivers of older adults seek flexibility as a main component of respite [16, 17, 18]. Flexibility, in line with person-centered care, allows respite to address their needs rather than being prescribed according to other criteria [16, 17]. Thus, flexibility, both in accessing respite and in the respite itself, is essential [19, 20, 21, 22, 23]. Although there seems to be a consensus around the broader definition of respite, there is no literature reviewing the characteristics of flexible at-home respite. Some studies and reports from organizations and governments document the flexible characteristics of their models, but few literature reviews address them specifically [18, 22, 24]. The reviews by Shaw et al. [18] and Neville et al. [19] both concede that an operational definition of respite (WHEN, WHERE, WHO) is not clear. Neville et al. [19] conclude that “respite has the potential to be delivered in flexible and positive ways”, without specifying these ways. The absence of a unified definition of flexible at-home respite contributes to the challenges of implementing and evaluating services, as well as measuring their effect. Although respite services are deemed necessary, they are seldom used [19, 25, 26, 27]; as few as 6% of all caregivers receiving any kind of support services in Canada actually use them. According to the scientific literature, the under-usage of respite services is a shared reality around the world [28]. One of the main reasons for this under-usage is the overall lack of flexibility in both obtaining and using respite [29, 30]. Synthesizing the characteristics of flexible at-home respite services is a first step towards a common operational definition. This could contribute to increasing respite use through the implementation or enrichment of programs in ways that answer the dyad’s (caregiver and older adult) needs.

Consequently, to support the implementation and evaluation of homecare programs, the objective of this study was to synthesize the knowledge on the characteristics of flexible at-home respite services offered to caregivers of older adults.

A scoping review [32, 33, 34] was conducted as part of a larger multi-method participatory research project, known as the AMORA project [31], to characterize flexible at-home respite. Scoping reviews make it possible to map the extent of the literature on a specific topic [32, 34]. The six steps proposed by Levac et al. [32] were followed: (1) identifying the research question; (2) searching for and (3) selecting pertinent documents; (4) extracting (or charting) relevant data; (5) collating, summarizing and reporting findings; and (6) consulting with stakeholders. The sixth step is optional.

Identifying the research question

The research question was: “What are the characteristics of flexible at-home respite services offered to caregivers of older adults?” As the research was conducted, this question was divided into three sub-questions:

WHO is tendering flexible respite?

WHEN is flexible respite tendered?

HOW is flexible respite tendered?

Identifying relevant documents

The search strategy consisted of two methods. First, the key words (1) respite, (2) informal caregivers and (3) older adults, searched in the title or abstract, were used to identify relevant documents (Table 1). The term “flexib*” was initially included but was removed from the search, given the low number of documents it generated (60, versus 1,179 without it). The first author and a librarian specialized in health sciences research documentation conducted the literature search in July 2021 and updated it in December 2022 in six databases (Ageline, Cochrane, CINAHL, Medline, PsychInfo, and Abstracts in Social Gerontology). Second, the expanded search strategy consisted of identifying relevant documents from the bibliographies of the selected documents, along with one article that was found while searching for unavailable references (alternative article).
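To illustrate how three key-word concepts can be combined into one Boolean search string, here is a minimal Python sketch. The synonyms listed are hypothetical placeholders (the terms actually used are those reported in Table 1), and real databases each have their own field tags and truncation syntax, which this sketch does not attempt to reproduce.

```python
def build_query(concepts):
    """Join synonyms with OR inside each concept, then join concepts with AND."""
    groups = []
    for synonyms in concepts:
        # Quote multi-word terms so they are searched as exact phrases.
        quoted = [f'"{term}"' if " " in term else term for term in synonyms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Hypothetical synonym lists for the three concepts named in the text.
concepts = [
    ["respite"],
    ["informal caregivers", "family caregivers", "carers"],
    ["older adults", "elderly", "seniors"],
]

query = build_query(concepts)
print(query)
```

Adding a fourth concept group narrows the result set further, which is consistent with the drop in retrieved documents reported when the flexibility term was included.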

Study selection

To review the most recent literature on flexible at-home respite service characteristics, the research team focused on writings within a 20-year span, as have other reviews (e.g., [35, 36]); documents thus had to be published between 2001 and 2022. The research team selected only documents written in French or English. Included documents had to come from either (1) scientific literature (i.e., articles in an academic journal presenting an empirical study or a review) or (2) reports and briefs from governments, homecare organizations or research centres. All study designs were included. The research team agreed that at-home respite is (1) an individual (i.e., not group) service (although, theoretically, two persons living in the same household could receive it), (2) provided by a professional or a volunteer, (3) that occurs in the home and (4) requires no transport for the dyad. To select documents related to flexible at-home respite, the research team identified those in which the respite displayed an ability to adapt to the dyad’s needs on at least one characteristic of the service, as presented by Maayan and collaborators (WHERE, WHO, WHEN; WHERE is not relevant to this review, as it focuses on at-home respite). The team concluded that these three dimensions lacked the precision to globally characterize the service. Indeed, they did not describe access to respite or the activities occurring during it, or, as the team called it, the HOW (Fig. 1). Excluded documents were those covering several services at once, which prevented the differentiation of elements specific to at-home respite services. As this is a scoping review, the research team did not include a critical appraisal of individual sources of evidence [32, 34].

Following the step-by-step Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines [37], the research team met to define the selection strategy. First, they screened the documents by title and abstract, before determining their eligibility based on the full text. Considering the limited human and financial resources, at each step of the PRISMA-ScR process, a second team member independently assessed 10% of the documents to co-validate the selection; the goal was to reach 80% agreement between both team members regarding document inclusion or exclusion. If agreement was not reached, they met to reach a consensus. The research team used the Zotero reference management software to store documents, as well as a cloud-based website to collaborate on the selection.
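The 80% agreement target can be checked with a simple percent-agreement calculation. The Python sketch below assumes that agreement means the share of co-validated documents on which both reviewers made the same include/exclude decision; the text does not state the exact formula, and the decisions listed are invented for illustration.

```python
def percent_agreement(decisions_a, decisions_b):
    """Percentage of documents on which two reviewers made the same decision."""
    if len(decisions_a) != len(decisions_b) or not decisions_a:
        raise ValueError("Both reviewers must rate the same non-empty set of documents.")
    matches = sum(a == b for a, b in zip(decisions_a, decisions_b))
    return 100 * matches / len(decisions_a)


# Hypothetical decisions on a 10-document co-validation sample.
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "include", "include", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]

agreement = percent_agreement(reviewer_1, reviewer_2)
target_met = agreement >= 80  # threshold stated in the text
print(f"Agreement: {agreement:.0f}% (target met: {target_met})")
```

Percent agreement does not correct for chance agreement; a statistic such as Cohen's kappa would, but the text reports only a percentage threshold.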

Fig. 1. Conceptual mapping of results: HOW, WHEN, WHO

Charting the data

The first author charted (or extracted) both quantitative and qualitative data. To quantitatively characterize documents, contextual data (country of origin, year of publication, type of document, etc.) was extracted into a Microsoft Excel table. For the qualitative data, the research team created an extraction table in Microsoft Word that included the three dimensions of respite (WHO, WHEN and HOW) and one “other” dimension, so as not to force any excerpts under the three dimensions. Specifically, the first author extracted elements related to a flexible characteristic of at-home respite (WHO, WHEN, HOW or other). Considering limited resources, the second and third authors co-validated the data charting by replicating the extraction for 10% of the documents. The authors met to reach a consensus where a disagreement arose.

Collating, summarizing, and reporting the results

The research team used content analysis to “attain a condensed and broad description of the phenomenon” [38]. To do so, data was prepared (familiarization with the data and extraction of pertinent excerpts) and organized (classification of excerpts) to build a characterization of flexible at-home respite. In this scoping review, a deductive content analysis began with three main categories (WHO, WHEN, HOW), with the addition of the temporary “other” category. Content analysis then aimed to inductively divide these main categories into several generic categories, which were in turn subdivided into sub-categories (Fig. 2). This made it possible to define the three main categories. While the WHO and WHEN categories describe the service itself (time, duration, qualified staff, etc.), the HOW category is specific to the interface between the organization offering respite and the dyad (assessing the needs of the dyad, coordinating care, etc.). An interface is a situation where two “subjects” interact and affect each other [39]. In the context of homecare services, Levesque, Harris and Russell (2013) have defined that interface as access [40]. Therefore, to define the generic categories of the HOW, the team used the five dimensions of their access to care framework: approachability, appropriateness, affordability, availability and acceptability [40]. Approachability relates to users recognizing the existence and accessibility of a service [40]. Appropriateness encompasses the alignment between services and users’ needs, considering timeliness and assessment of needs [40]. Affordability pertains to users’ economic capacity to allocate resources for accessing suitable services [40]. Availability signifies that services can be reached, both physically and in a timely manner [40]. Acceptability involves cultural and social factors influencing users’ willingness to accept services [40]. In other words, the HOW category focuses on the organizational or professional aspects of the service and how they can be adapted to the dyad.

To co-validate the classification, the research team met until all members were satisfied with the categorization. The first author then completed the classification. After the first author classified 20% of the documents, the second author commented on the classification. When the authors reached an agreement, the first author moved on to classifying another 20%. The first and second authors met to confer and adjust when disagreements about classification and categories arose. Finally, all categories were discussed with the third author until a consensus was reached. Once categorization was achieved, the team prepared a synthesis report. In this report, the team defined the main categories (WHO, WHEN, HOW, other) and their generic and sub-categories (Fig. 2) with pertinent excerpts from the reviewed literature. In summary, the results of the scoping review characterize flexible at-home respite under three attributes: WHO, WHEN and HOW.

Fig. 2. Content analysis: types of categories according to Elo and Kyngäs (2007) (with examples from results)

Consultation

Rather than conducting a focus group as suggested by Levac and collaborators [32], the team chose to triangulate the results with those from a survey as a consultation strategy. Specifically, the research team took advantage of a survey being conducted with relevant stakeholders in the larger study (AMORA project), as this made it possible to stay within the scoping review’s allocated resources. The survey aimed to define flexible at-home respite and the factors affecting its implementation and delivery. A committee including a researcher, a doctoral student and a representative of an organization funding homecare services in Québec (Canada) developed the survey following the three stages proposed by Corbière and Fraccaroli [41]. It originally included a total of 21 items: 13 closed-ended and 8 open-ended questions. Of these 8, 2 addressed the characteristics of an ideal at-home service and suggestions regarding respite, and were used here for triangulation purposes. The questionnaire was published online, in French, on the Microsoft Forms® platform in the summer of 2020. Participants (caregivers and people from the homecare sector) were recruited via email by contacting regional organizations (Eastern Townships, Québec, Canada). In addition, the 18 senior consultation tables spread throughout the province of Québec were solicited; working in collaboration with the governmental bodies in charge of services to older adults and caregivers, these tables bring together representatives of associations, groups or organizations concerned with their living conditions.

The goal was to triangulate the scoping review’s results, i.e., to identify what was common between the literature and real-world experiences, and, as such, to bring contextual value to the results. Accordingly, the team analyzed data using mixed categorization [42]. The categories from the scoping review served as a starting point (closed categorization), leaving room to create new categories as the analysis progressed (open categorization). Once all the data (scoping and survey) was categorized, the team identified the characteristics according to their sources. To do so, the team tabulated the occurrence of each category in the survey, in the scoping review, or in both. They then integrated the results to provide one unified categorization of flexible at-home respite. The AMORA project was approved by the research ethics committee of the Integrated University Health and Social Services Centre (CIUSSS) of the Eastern Townships (project number: 2021–3703).
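The tabulation of categories by source (survey, scoping review, or both) can be sketched with set operations. In the Python example below, the category labels echo terms used in this review, but their assignment to each source is hypothetical and does not reproduce the study's results.

```python
def tabulate_sources(scoping_categories, survey_categories):
    """Sort categories into those found in both sources or in only one of them."""
    scoping, survey = set(scoping_categories), set(survey_categories)
    return {
        "both": sorted(scoping & survey),
        "scoping only": sorted(scoping - survey),
        "survey only": sorted(survey - scoping),
    }


# Hypothetical category lists; the assignments are for illustration only.
scoping_categories = ["affordability", "availability", "predictability"]
survey_categories = ["availability", "predictability", "continuity of staff"]

table = tabulate_sources(scoping_categories, survey_categories)
for source, categories in table.items():
    print(f"{source}: {', '.join(categories)}")
```

Categories that appear under "survey only" correspond to the open categorization described above, where new categories emerge from stakeholders rather than from the literature.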

Of the 1,301 papers retrieved through the database searches, 1,146 were not eligible based on title and abstract, while 116 were excluded after reading their full texts, resulting in 39 included documents (Fig. 3 ). Documents were mainly excluded because they did not provide details about the respite service and its flexibility. The expanded search yielded three additional documents, resulting in a total of 42 documents, included in this scoping review. This section details (1) the characteristics of the selected documents and (2) the characterization of flexible at-home respite.
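As a quick arithmetic check, the selection counts reported above can be reproduced in a few lines. The variable names are illustrative; the figures are those stated in the text, and the number of full texts assessed (155) is derived rather than reported.

```python
# Counts reported in the text.
retrieved = 1301
excluded_title_abstract = 1146
excluded_full_text = 116
added_expanded_search = 3

# Derived values (the intermediate count is not stated in the text).
full_text_assessed = retrieved - excluded_title_abstract
included_from_databases = full_text_assessed - excluded_full_text
total_included = included_from_databases + added_expanded_search

print(full_text_assessed, included_from_databases, total_included)
```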

Fig. 3. Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMAScR) flow chart of the scoping review process [ 37 ]

Characteristics of selected documents

The majority (86%) of the documents in the review (Table 2 ) are from after 2005, with only 14% of the documents published before 2005, and are from 9 countries: United States ( n = 18; 42%), United Kingdom ( n = 11; 26%), Australia ( n = 4; 10%), Canada ( n = 2; 5%), Ireland ( n = 2; 5%), France ( n = 2; 5%), Belgium ( n = 2; 5%), Germany ( n = 1; 2%) and New Zealand ( n = 1; 2%). The types of documents were diverse: 68% ( n = 28) were empirical studies, 31% ( n = 13) theoretical papers and 1% ( n = 2) government briefs. Most ( n = 23; 56%) of the documents did not specify their research approach, while 10 and 9 took, respectively, a qualitative (23%) or quantitative approach (21%). Most documents address respite in the context of caregiving for someone living with Alzheimer’s disease or other neurocognitive disorders ( n = 25; 60%), while some targeted older adults in general ( n = 14; 34%), people in palliative care ( n = 4; 9%) or other older adult populations (for example, veterans) ( n = 3; 1%). Respite was usually tendered by community organizations specialized in homecare ( n = 32; 78%). Although the majority of the documents ( n = 31; 75%) did not address the type of region (rural, urban, or mixed) surrounding the caregivers, those that did ( n = 11; 26%) mainly reported being in a mixed environment ( n = 9; 21%).

Characteristics of survey participants

Although all 100 participants completed the questionnaire, 71 answered at least 1 of the 2 open-ended questions; the two questions received 66 and 41 answers, respectively. Of those 71 participants, most were women ( n = 60; 85%). Across all participants, the average age was 55 years (SD = 15). They were mostly from the Eastern Townships area ( n = 56; 79%). Most participants were either caregivers ( n = 24; 34%) or homecare workers ( n = 28; 39%), while some were service administrators ( n = 11; 15%), and some reported both being caregivers and working in the formal caregiving sector ( n = 7; 10%). Only one person reported being an older adult with a caregiver.

Characterization of flexible at-home respite

The characterization of flexible at-home respite is presented below in three main categories: WHO , WHEN , and HOW . Of note, 10 (24%) of the included documents had three categories of flexible components, 16 (38%) had 2 categories and the remaining documents had 1 category. Almost all documents discussed the HOW of flexible at-home respite ( n = 40, 95%). Out of the 33 categories constructed with the scoping review, only 6 (18%) were not reported in the questionnaire: (1) planned respite ( WHEN ), (2) screening of dyads ( HOW ), (3) determining frequency of respite ( HOW ), (4) coordination of care ( HOW ), (5) voucher approach ( HOW ) and (6) acceptability to low-income households ( HOW ). Moreover, the questionnaire added three characteristics that were not present in the scoping review: (1) respite needs to be approachable, (2) the organization must be prompt** and adhocratic** and (3) able to deliver respite regardless of the season** (availability). Generic or sub-categories present only in the scoping review are identified with 1 asterisk (*), while those present only in the questionnaire have 2 (**).

In the selected documents, the WHO dimension of flexible at-home respite services can be broken down into three qualifiers: (1) Compatible , (2) qualified and (3) trained (Table 3 ). This dimension includes all human resources contributing to homecare (administrative staff, governing bodies, paid and volunteer care workers). First, the workforce behind flexible respite is compatible , meaning it has personal characteristics and profiles relevant to homecare for caregivers of older adults [ 17 , 53 , 62 , 63 , 68 ]. Gendron and Adam explain this by describing how the role of the care worker in Baluchon Alzheimer™ goes beyond training: “The nature of their work with [Baluchon Alzheimer™] requires particular human and professional qualities that are quite as important as academic credentials” [ 53 ]. Personal characteristics such as flexibility [ 53 , 62 , 63 , 68 ], empathy and patience [ 17 , 53 , 62 ] are deemed essential attributes. Secondly, the workforce is qualified : It has the necessary skills, abilities and knowledge from past professional [ 14 , 45 , 62 , 70 ] and personal experience [ 62 ] to work, or volunteer, with caregivers of older adults. For a program like Baluchon Alzheimer™, “the backgrounds of the baluchonneuses vary […]; all have experience in gerontology” [ 53 ]. Other areas of qualification in the included documents are a nursing background [ 18 , 45 ] or knowledge related to dementia [ 69 ]. Finally, flexible at-home respite requires a trained workforce engaged in the process of acquiring knowledge and learning the skills to provide respite services to caregivers of older adults. For example, homecare organizations can offer specific training on various topics, depending on their target clientele: Dementia [ 44 ], palliative care [ 59 ], or homecare in general [ 44 ].

The WHEN dimension of flexible at-home respite contains 4 temporal features: (1) Time , (2) duration , (3) frequency and (4) predictability (Table 4 ). First, flexible respite is available on a wide range of possible time slots. For example, the service is “available 24 hours, but typically from 9 am to 10 pm” [ 64 ]. Secondly, flexible respite is accessible on a wide range of possible durations . The Community Dementia Support Service (CDSS) is an example of flexibility in duration by “[being] totally flexible, being available from 2 to 15 hours per week” [ 69 ]. Thirdly, the service is offered in different frequencies : It can be either recurrent or occasional, or a combination of both [ 18 , 64 , 66 ]. The last feature of the WHEN dimension is flexibility in predictability ; the respite service can be planned* or not. A study on respite services in South Australia found that most providers (93%) planned the respite care with the dyad, but that emergency or crisis services were still offered by 35% of them [ 50 ].

At-home respite is flexible when it demonstrates approachability : Caregivers can identify that some form of respite exists and can be reached (Table 5 ). For the respite service to be approachable, the organization needs to be reaching out to dyads; it proactively makes sure that caregivers of older adults have information on services, know of their existence and that they can be used. For example, the El Portal program put in place “advisory groups that included the local clergy, representatives from businesses, caregivers, and service providers who were used for outreach work” [ 66 ]. The organization also screens* dyads to assess their eligibility for respite, as well as for other services from the same program or organization. For example, the North Carolina (U.S.A.) Project C.A.R.E. has an initial assessment that considers the range of homecare services available, rather than just assessing for eligibility for a program [ 57 ]. In addition, flexible respite requires the organization to set attainable and inclusive requirements for eligibility, as to not discourage use [ 24 , 57 , 61 , 66 ]. Finally, the organization communicates consistently with the dyad. As Shanley explains in their literature review, “there are clear and open ways for carers to express concerns about the service, and an open mechanism is available for dealing with these concerns constructively” [ 17 ]. In addition, the survey participants discussed two other characteristics. First, for respite to be approachable, the organization is prompt**, respecting a reasonable delay between the request and the beginning of the service (wait list). Second, it is adhocratic**, meaning the organization does not depend on complex systems of rules and procedures to operate i.e., bureaucracy.

The second access dimension of flexible at-home respite is appropriateness (Table 6 ): The fit between respite services and the dyad’s needs, its timeliness, the amount of care spent in assessing their needs and determining the correct respite service. For the respite service to be appropriate, the organization assesses needs by collecting details about the dyad’s needs; this can include, but is not limited to, clinical, psychological, or social evaluation. The organization then proposes respite services from a wide range of options or packages: A multi-respite package, as presented by Arksey et al., can simply be the combination of at least two different respite services [ 44 ]. For the service to be appropriate, the organization also paces the respite. Apprehension towards service appropriateness can be mitigated by a gradual introduction to homecare, for example when the respite is presented as a trial [ 68 ]. The organization determines the service with the dyad and defines its different characteristics ( WHEN * , WHO ) so interventions correspond to their needs. The organization then determines the appropriate activities to do with the dyad during the respite. For example, the caregiver of older adults can be encouraged to use respite time for leisure (sleep, physical activity, etc.) [ 45 ], while the care worker supports the beneficiary in engaging in an activity such as a walk or a board game [ 14 ]. Furthermore, the organization coordinates* the services for the dyad and acts as a “respite broker” to arrange all aspects of care; this is especially relevant for programs that include a “care budget” that can be used at the caregivers’ discretion [ 58 ]. Finally, for the respite to be appropriate, the organization assures that it is in continuity with other health services, by connecting the dyads to pertinent resources. 
As described by Shaw, respite should be “embedded in a context that includes assessment, carer education, case management and counselling” [ 18 ].

The third access dimension of flexible at-home respite is affordability , referring to the economic capacity of the dyad to spend resources to use appropriate respite services (Table 7 ). The included documents only explored the direct cost of respite: The amount of money a dyad must pay to receive services. For the respite to be affordable, its direct cost is either (1) adapted, where the cost is modulated according to the dyad’s financial resources, for example on a sliding scale, based on income or (2) nonexistent [ 44 ].

Next, flexible at-home respite must demonstrate availability (Table 8 ): Services can be reached both physically and in a timely manner. Firstly, the organization offers respite in the dyads’ geographic area. Shanley described an at-home mobile respite program designed to reach rural and remote areas, where two care workers visit different locations for set periods of time [ 17 ]. Moreover, one sub-characteristic identified exclusively by the survey participants was seasonality. Indeed, the dyad has access to respite, regardless of the season**. Thus, the geography category is broken down between the access to service (1) in rural or remote areas and (2) notwithstanding the season. Flexibility in availability also requires that the dyads have access to unlimited respite time; the organization does not assign a finite bank of hours. Finally, the organization proposes diverse payment methods to the dyads. The consumer-directed approach is a way that homecare organizations offer flexibility. A care budget is allocated to the caregiver to purchase hours from homecare agencies or to hire their own respite workers. This includes payments to family members or friends to provide respite care [ 79 ]. An example of a type of consumer-directed approach is the use of vouchers*: Credit notes or coupons to purchase service hours from homecare agencies [ 44 ].

Finally, access to flexible at-home respite also relates to acceptability (Table 9 ): The cultural and social factors determining the possibility for the dyad to accept respite and the perception of the appropriateness of seeking services. For the respite to be acceptable, the organization targets and caters to the cultural diversity represented in its local population. The organization is also able to identify and accommodate underserved groups. In the included documents, underserved groups lacked access to respite for two reasons: (1) Geographic isolation or (2) the eligibility requirements for “traditional homecare” do not apply to them, as is the case, for example, for younger people with dementia and people with HIV/AIDS [ 17 ]. The organization can target and cater to low-income households*. Rosenthal Gelman and his collaborators detail a program where, after realizing that low-income caregivers have greater unmet needs, special funds were set aside for respite care vouchers to be distributed [ 70 ].

This scoping review conducted with Levac and colleagues’ method [ 32 ] synthesized the knowledge on the characteristics of flexible at-home respite services offered to caregivers of older adults, from 42 documents. The results provide a synthesis of the characteristics of flexible at-home respite discussed in the literature. The three dimensions of flexibility in respite relate to (1) WHO is tendering it, (2) WHEN it is tendered and (3) HOW it is tendered. First, human resources ( WHO ) must be compatible with the homecare sector as well as being trained and qualified to offer respite to caregivers of older adults. The second feature of flexible respite is temporality ( WHEN ): The time, duration, frequency, and predictability of the service. The last dimension, access ( HOW ), refers to the interface between the respite and the users. Flexible at-home respite exhibits approachability, appropriateness, affordability, availability, and acceptability. In the light of what we learned, flexible at-home respite could be characterized as a service that has the ability to adjust to the needs of the dyad on all three dimensions ( WHO , WHEN , HOW ). However, this seems to be more of an ideal than a reflection of reality.

The survey provided complementary results to the review; the concordance between the two is strong (27/33 = 82%). Six characteristics were missing from the survey results, including planned respite and the voucher approach ( HOW ). Moreover, the survey added three elements to the review results: The organization’s adhocracy ( HOW ) and promptness ( HOW ) as well as its ability to offer services, regardless of the season ( HOW ). These mismatches might reflect the Québec (and possibly Canadian) landscape of homecare. For example, in the Québec homecare system, respite is mostly planned; it is therefore not surprising that participants only mentioned that unplanned respite is lacking. The “voucher system” was not mentioned in the survey, probably in part because it does not exist in the province of Québec. Additionally, navigating the healthcare system to obtain free or affordable homecare can be treacherous [ 80 ]. In short, older adults have to go through (1) evaluation(s) by a social worker from a hospital or another public healthcare organization and (2) various administrative tasks ( adhocratic ) [ 2 ], before possibly being put on a waiting list ( prompt ) [ 81 ]. In addition, Canada can experience harsh winters ( seasonality ) that can make transport, which is an integral part of homecare, particularly laborious. Although those categories could reflect the particularity of homecare in Canada, a promising follow-up to this review would be to compare the characteristics of flexible respite from one territory to another. It would contribute to providing a more operational definition of flexible at-home respite.
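The concordance figure can be reproduced from the counts given in the Results. The sketch below is only an arithmetic illustration: the variable names are assumptions, and the size of the unified categorization (36) is derived here from the reported counts rather than stated in the text.

```python
# Counts reported in the text.
review_categories = 33   # categories constructed from the scoping review
review_only = 6          # review categories not reported in the questionnaire
survey_only = 3          # characteristics added by the questionnaire

# Derived values.
shared = review_categories - review_only           # categories found in both sources
concordance = shared / review_categories           # share of review categories confirmed
unified_total = review_categories + survey_only    # derived, not stated in the text

print(f"{shared}/{review_categories} = {concordance:.0%}")
```

Note that this ratio uses the review categories as the denominator, so the three survey-only characteristics do not lower the concordance.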

The remainder of this discussion will focus on two main points before touching on the limitations and strengths of this review. First, flexibility in at-home respite seems exceptional. Second, respite care workers are as skilled as they are underappreciated.

This review, in coherence with the literature, highlights the fact that respite services generally lack flexibility: This is the conclusion of several studies on respite [ 7 , 64 , 82 ]. A pattern seems to emerge in the countries represented in the review: Community organizations specialized in homecare (public and/or privately funded) offer respite on predetermined time slots, usually prescribed between traditional office hours (9 AM to 6 PM) [ 50 ]. This lack of flexibility could be explained in part by the rigidity of the structure of homecare services and the fact that its funding does not allow for customizable and punctual services [ 17 , 62 , 73 ]. Nevertheless, there were some examples of flexible respite models, such as Baluchon Alzheimer™ and consumer-directed approaches. Baluchon Alzheimer™ offers long-term at-home respite (4 to 14 days) by qualified and trained baluchonneuses . Prior to the relay of the caregivers, the baluchonneuse takes the time to learn about the dyad, including their environment and routine [ 53 , 62 ]. Caregivers report feeling refreshed upon their return and appreciate the diaries (or logbooks) that the baluchonneuse meticulously fills out [ 53 ]. Another example would be consumer-directed approaches, where caregivers are attributed a budget to hire their own care worker. Allowing caregivers to choose their care worker (either from a self-employed carer or family and friends) can increase the quality of care and satisfaction, while providing relatively affordable care, especially in a situation of labour shortage [ 51 , 79 ]. Even though these two models are a demonstration of how respite can be adapted to the caregiver-senior dyad, for the most part, flexibility is lacking on all three dimensions of respite ( WHO , WHEN , HOW ).

Secondly, the results from the scoping review highlight how homecare as a profession is often overlooked. Indeed, the reviewed documents state the necessary set of skills to offer respite; the level described is one of highly specialized care professionals with important liability. These skills must also transcend advanced knowledge and qualifications, to include interpersonal capabilities [ 17 , 53 , 62 , 63 , 68 ]. Furthermore, care workers must also be flexible to offer a wide range of service time and duration, in addition to being ready to provide “on-the-go” respite [ 53 , 68 ]. Yet, the occupation of homecare worker is an underappreciated and underpaid position [ 83 ]. Community care, like respite, is generally not a priority for social and healthcare funding [ 24 ]. This can be explained in part by the neoliberal approach to care in which the target is to minimize spending and maximize (measurable) outcomes [ 84 ]. Homecare outcomes are often overlooked in favour of service delivery evaluation, in part because they are difficult to measure [ 44 ]. This approach can also lead to prioritizing third party contracting instead of including respite in the range of public services, as to save on expenses related to employment (insurance and other benefits) [ 85 ]. Another contributor is that funding is used for service administration and not to adequately provide services or remunerate care workers [ 86 ]. Finally, care workers are mostly women, known for doing the invisible work that is at the heart of respite care (emotional support, etc.) [ 87 ]. A telling example from the reviewed documents is that Baluchon Alzheimer™ refers to their care workers as baluchonneuses (feminine form) and not baluchoneurs (masculine form) [ 53 ]. Consequently, the homecare sector is faced with recruitment and retention challenges [ 44 , 64 , 88 ]. 
Authors of the documents included in the review addressed the fact that flexibility in service meant that service providers had to function with excess capacity; for example, by building an “employee bank” to cover all the hours of the day and emergency calls [ 44 ]. Ultimately, staff turnover and shortage caused in part by the work being underappreciated could create a vicious cycle, leading to inflexibility in respite. In short, overlooking and underestimating the crucial and specialized work of homecare workers can contribute to staff turnover, which in turn could result in a lack of flexibility of at-home respite.

Limitations and strengths

The review’s methodological approach has some limitations and strengths. First, according to Levac, Colquhoun and O’Brien [ 32 ], research teams could conduct a sixth step in their scoping study, consisting of consulting experts through a focus group or workshop. This last phase aims to provide further insight into the review’s results and to begin the knowledge translation process. The team did not conduct a traditional consultation phase. Instead, they triangulated the review’s results through a questionnaire. This method was of interest because of the natural concordance between the results and the considerable number of participants ( n = 100). The survey still made it possible to refine the characterization of respite, but further knowledge transfer to homecare actors and caregivers is necessary. Although this approach is innovative, its validity as a consultation phase needs further investigation. Secondly, the theme of flexible at-home respite may have narrowed the search and identification of relevant documentation, and therefore caused the team to overlook some of the literature. Empirical studies and reviews on respite seldom include a detailed description of services [ 89 , 90 , 91 ]. This made it challenging to understand what services are like, operationally, for the dyad and to judge their flexibility. In addition, it complicated the extraction of relevant data, as descriptions were sparse and scattered throughout the documents. The team worked to mitigate these limitations in the documentation research and data charting phase. To begin, they sorted through all the literature on at-home respite for caregivers of older adults. In other words, the team not only searched for, but also included any explicit mention of flexibility. After selection, the extraction tables allowed enough versatility to include all the flexible characteristics of service, regardless of their placement in the text (introduction, methodology or discussion) or length.
Another limitation is that, due to resource constraints, only 10% of the document selection and extraction was assessed by two reviewers, although a minimum of 80% of agreement was met and discussions were used to reach consensus where a disagreement arose. To conclude, strengths of this review include the extensiveness and diversity of the documents and its rigorous methodology, co-validated by a peer and an experienced researcher, with assistance from a specialized librarian.
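For readers unfamiliar with the agreement threshold mentioned above, simple percent agreement between two reviewers can be computed as sketched below. The text does not specify which agreement statistic the team used, so percent agreement is an assumption here; the function name and the screening decisions are hypothetical and serve only to illustrate the calculation.

```python
def percent_agreement(decisions_a, decisions_b):
    """Share of items on which two reviewers made the same decision."""
    if len(decisions_a) != len(decisions_b):
        raise ValueError("Both reviewers must rate the same items.")
    if not decisions_a:
        raise ValueError("At least one rated item is required.")
    matches = sum(a == b for a, b in zip(decisions_a, decisions_b))
    return matches / len(decisions_a)


# Hypothetical screening decisions for ten documents (not the study's data).
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "exclude", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude",
              "exclude", "exclude", "include", "exclude", "include"]

agreement = percent_agreement(reviewer_1, reviewer_2)
meets_threshold = agreement >= 0.80

print(f"Agreement: {agreement:.0%}, threshold met: {meets_threshold}")
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected statistic would be a different calculation.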

This review has both scientific and practical implications. From a scientific point of view, the results contribute to the body of knowledge on flexible respite service models for caregivers of seniors, an under-documented topic. To our knowledge, this is the first review that aims to characterize flexible at-home respite. Our results suggest the relevance of further documenting the factors influencing the implementation and delivery of flexible respite services, as well as the consequences of the lack of flexibility in respite services, which may lead to service underuse. Moreover, researchers could focus on documenting respite programs in countries that are not represented in this review. There were notably no documents from the continents of Asia and Africa. Unfortunately, good practices can go unreported in peer-reviewed publications; therefore, a review focusing on government reports and publications aimed at professionals could shed some light on promising respite models. From a practical point of view, this review serves as a starting point for the implementation of flexible home respite that is tailored to the caregivers’ and older adults’ needs. Our characterization of flexible at-home respite can be used to guide the improvement of existing respite services and to design new resources that reflect best practices in homecare, ultimately contributing to successful aging in place for older adults.

Data availability

The data supporting this study’s findings are available from the corresponding author, upon reasonable request.

Abbreviations

PRISMAScR: Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews

World Health Organization. Ageing and health. In Newsroom. 2022. https://www.who.int/news-room/fact-sheets/detail/ageing-and-health . Accessed 3 Feb 2023.

Ministère de la Santé et des Services sociaux. Chez soi: le premier choix, politique de soutien à domicile. 2003. https://publications.msss.gouv.qc.ca/msss/document-001351/ . Accessed 20 Mar 2022.

Centers for Disease Control and Prevention. Healthy places terminology. In Healthy places. 2017. https://www.cdc.gov/healthyplaces/terminology.htm . Accessed 10 Mar 2022.

Low LF, Yap M, Brodaty H. A systematic review of different models of home and community care services for older persons. BMC Health Serv Res. 2011;11:1–15.

Roth DL, Fredman L, Haley WE. Informal caregiving and its impact on health: a reappraisal from population-based studies. Gerontologist. 2015;55(2):309–19.

Vandepitte S, Putman K, Van Den Noortgate N, Verhaeghe N, Annemans L. Cost-effectiveness of an in-home respite care program to support informal caregivers of persons with dementia: a model-based analysis. Int J Geriatr Psychiatry. 2020;35(6):601–9.

Maayan N, Soares-Weiser K, Lee H. Respite care for people with dementia and their carers. Cochrane Database Syst Rev. 2014;(1):CD004396.

Jeon YH, Brodaty H, Chesterson J. Respite care for caregivers and people with severe mental illness: literature review. J Adv Nurs. 2005;49(3):297–306.

O’Connell B, Hawkins M, Ostaszkiewicz J, Millar L. Carers’ perspectives of respite care in Australia: an evaluative study. Contemp Nurse J Aust Nurs Prof. 2012;41(1):111–9.

Chan J. What do people with acquired brain injury think about respite care and other support services? Int J Rehabil Res Int Z Rehabil Rev Int Rech Readaptation. 2008;31(1):3–11.

Chappell NL, Reid RC, Dow E. Respite reconsidered: a typology of meanings based on the caregiver’s point of view. J Aging Stud. 2001;15(2):201–16.

Strang VR, Haughey M, Gerdner LA, Teel CS, Strang VR. Respite - a coping strategy for family caregivers. West J Nurs Res. 1999;21(4):450–71.

Dal Santo TS, Scharlach AE, Nielsen J, Fox PJ. Stress process model of family caregiver service utilization: factors associated with respite and counseling service use. J Gerontol Soc Work. 2007;49(4):29–49.

Ryan T, Noble R, Thorpe P, Nolan M. Out and about: a valued community respite service. J Dement Care. 2008;16(2):34–5.

Grant I, McKibbin CL, Taylor MJ, Mills P, Dimsdale J, Ziegler M, et al. In-home respite intervention reduces plasma epinephrine in stressed Alzheimer caregivers. Am J Geriatr Psychiatry. 2003;11(1):62–72.

O’Shea E, Timmons S, O’Shea E, Irving K. Multiple stakeholders’ perspectives on respite service access for people with dementia and their carers. Gerontologist. 2019;59(5):e490–500.

Shanley C. Developing more flexible approaches to respite for people living with dementia and their carers. Am J Alzheimers Dis Other Demen. 2006;21(4):234–41.

Shaw C, McNamara R, Abrams K, Cannings-John R, Hood K, Longo M, et al. Systematic review of respite care in the frail elderly. Health Technol Assess. 2009;13(37):1–246.

Neville C, Beattie E, Fielding E, MacAndrew M. Literature review: use of respite by carers of people with dementia. Health Soc Care Community. 2015(1):51–3.

Ashworth M, Baker AH. Time and space: carers’ views about respite care. Health Soc Care Community. 2000;8(1):50–6.

Vandepitte S, Van Den Noortgate N, Putman K, Verhaeghe S, Annemans L. Effectiveness and cost-effectiveness of an in-home respite care program in supporting informal caregivers of people with dementia: design of a comparative study. BMC Geriatr. 2016;16:207–207.

Dubé V, Ducharme F, Lachance L, Perreault O. Résultats de l’enquête sur la satisfaction des proches aidants concernant les services obtenus par des organismes communautaires financés par les Appuis régionaux du Québec: Rapport présenté à l’Appui national. 2018. https://www.lappui.org/Organisations/Medias/Fichiers/National-Fichiers/Publications/Resultats-de-l-enquete-sur-la-satisfaction-des-proches-aidants . Accessed 13 Jul 2022.

Funk LM. Relieving the burden of navigating health and social services for older adults and caregivers. IRPP Study. 2019;(73):1.

Feinberg LF, Newman SL. Preliminary experiences of the States in implementing the National Family Caregiver Support Program: a 50-state study. J Aging Soc Policy. 2006;18(3/4):95–113.

Albouy FX, Lorenzi JH, Villemeur A, Khan S. Propositions pour une Société du Vieillissement harmonieuse: Pour un accompagnement renforcé, optimal et solidaire des aidants ! 2020. http://www.tdte.fr/article/show/les-positions-de-la-chaire-tdte-pour-un-accompagnement-renforce-optimal-et-solidaire-des-aidants-263 . Accessed 20 Mar 2020.

L’Appui pour les proches aidants d’aînés. Portrait démographique des proches aidants d’aînés au Québec. 2016. https://www.lappui.org/Organisations/Boite-a-outils/Portrait-demographique-des-proches-aidants-d-aines-au-Quebec . Accessed 20 Mar 2020.

Brandão D, Ribeiro O, Martín I. Underuse and unawareness of residential respite care services in dementia caregiving: constraining the need for relief. Health Soc Work. 2016;41(4):254–62.

O’Shea E, Timmons S, O’Shea E, Fox S, Irving K, Shea EO, et al. Key stakeholders’ experiences of respite services for people with dementia and their perspectives on respite service development: a qualitative systematic review. BMC Geriatr. 2017;17:1–14.

Huang HL, Shyu YIL, Chang MY, Weng LC, Lee I. Willingness to use respite care among family caregivers in Northern Taiwan. J Clin Nurs. 2008;18(2):191–8.

Leocadie MC, Roy MH, Rothan-Tondeur M. Barriers and enablers in the use of respite interventions by caregivers of people with dementia: an integrative review. Arch Public Health Arch Belg Sante Publique. 2018;76:72–72.

Laboratoire d’innovation par et pour les aînés. Projet AMORA. 2022. https://lippa.recherche.usherbrooke.ca/projet-amora/ . Accessed 10 Apr 2023.

Levac D, Colquhoun H, O’Brien KK. Scoping studies: advancing the methodology. Implement Sci. 2010;5(1):69.

Anderson S, Allen P, Peckham S, Goodwin N. Asking the right questions: scoping studies in the commissioning of research on the organisation and delivery of health services. Health Res Policy Syst. 2008;6(1):1–12.

Arksey H, O’Malley L. Scoping studies: towards a methodological framework. Int J Soc Res Methodol. 2005;8(1):19–32.

Wittenberg Y, Kwekkeboom R, Staaks J, Verhoeff A, de Boer A. Informal caregivers’ views on the division of responsibilities between themselves and professionals: a scoping review. Health Soc Care Community. 2018;26(4):e460–73.

Nissen RM, Serwe KM. Occupational therapy Telehealth Applications for the dementia-caregiver Dyad: a scoping review. Phys Occup Ther Geriatr. 2018;36(4):366–79.

Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for scoping reviews (PRISMAScR): Checklist and Explanation. Ann Intern Med. 2018;169:467–73. https://doi.org/10.7326/M18-0850 .

Elo S, Kyngäs H. The qualitative content analysis process. J Adv Nurs. 2008;62(1):107–15.

Collins English Dictionary [Internet]. Glasgow (Scotland): HarperCollins; c2024. Interface. [cited 2024 Feb 29]; [about 15 screens]. https://www.collinsdictionary.com/dictionary/english/interface .

Levesque JF, Harris MF, Russell G. Patient-centred access to health care: Conceptualising access at the interface of health systems and populations. Int J Equity Health. 2013;12(1):1–9.

Corbière M, Fraccaroli F. La conception, la validation, la traduction et l’adaptation transculturelle d’outils de mesure. Méthodes qualitatives, quantitatives et mixtes : Dans La recherche en sciences humaines, sociales et de la santé. Québec (QC): Presses de l’Université du Québec; 2014. pp. 577–623.

Miles H, Huberman AM, Saldana J. Qualitative data analysis: a methods sourcebook. 4th ed. Thousand Oaks, CA: Sage; 2019.

Administration for Community Living. The Lifespan Respite Care Program. 2020. https://acl.gov/sites/default/files/programs/2018-05/Fact%20Sheet_Lifespan_Respite_Care_2018.pdf . Accessed 20 Mar 2020.

Arksey H, Jackson K, Croucher K, Weatherly H, Golder S, Hare P et al. Review of respite services and short-term breaks for carers of people with dementia. 2004. http://eprints.whiterose.ac.uk/73255/ . Accessed 20 Mar 2020.

Barrett M, Wheatland B, Haselby P, Larson A, Kristjanson L, Whyatt D. Palliative respite services using nursing staff reduces hospitalization of patients and improves acceptance among carers. Int J Palliat Nurs. 2009;15(8):389–95.

Bayly M, Morgan D, Froehlich Chow A, Kosteniuk J, Elliot V. Dementia-related education and support service availability, accessibility, and use in rural areas: barriers and solutions. Can J Aging. 2020;39(4):545–85.

Bunn B, Baker C. Network. What a difference three hours can make. J Dement Care. 2006;14(4):10–1.

Caulfield M, Seddon D, Williams S, Hedd Jones C. Planning, commissioning and delivering bespoke short breaks for carers and their partner living with dementia: Challenges and opportunities. Health Soc Care Community. 2021. https://search.ebscohost.com/login.aspx?direct=true&db=mnh&AN=34363262&site=ehost-live . Accessed 20 Mar 2020.

Derence K. Dementia-specific respite: the key to effective caregiver support. N C Med J. 2005;66(1):48–51.

Evans D, Lee E. Respite services for older people. Int J Nurs Pract. 2013;19(4):431–6.

Feinberg LF. Ahead of the curve: emerging trends and practices in family caregiver support. 2006. https://search.ebscohost.com/login.aspx?direct=true&db=gnh&AN=110981&site=ehost-live . Accessed 20 Mar 2020.

Fox A. A new model for care and support: sharing lives and taking charge. Work Older People Community Care Policy Pract. 2011;15(2):58–63.

Gendron M, Adam E. Caregiving challenges. Baluchon Alzheimer©: an innovative respite and support service in the home of the family caregiver of a person with Alzheimer’s. Alzheimers Care Q. 2005;6(3):249–61.

Hesse E. PRO DEM: a community-based approach to care for dementia. Health Care Financ Rev. 2005;27(1):89–94.

Hopkinson J, King A, Young L, McEwan K, Elliott F, Hydon K, et al. Crisis management for people with dementia at home: mixed-methods case study research to identify critical factors for successful home treatment. Health Soc Care Community. 2021;29(4):1072–82.

Ingleton C, Payne S, Nolan M, Carey I. Respite in palliative care: a review and discussion of the literature. Palliat Med. 2003;17(7):567–75.

Kelly CM, Williams IC. Providing dementia-specific services to family caregivers: North Carolina’s Project C.A.R.E. program. J Appl Gerontol. 2007;26(4):399–412.

King A, Parsons M. An evaluation of two respite models for older people and their informal caregivers. N Z Med J. 2005;118(1214):U1440–1440.

Kristjanson LJ, Cousins K, White K, Andrews L, Lewin G, Tinnelly C, et al. Evaluation of a night respite community palliative care service. Int J Palliat Nurs. 2004;10(2):84–90.

LaVela SL, Johnson BW, Miskevics S, Weaver FM. Impact of a multicomponent support services program on informal caregivers of adults aging with disabilities. J Gerontol Soc Work. 2012;55(2):160–74.

Link G. The administration for community living: programs and initiatives providing family caregiver support. Generations. 2015;39(4):57–63.

Lucet F. [In-home respite for the families of Alzheimer’s patients]. Soins Gerontol. 2015;(115):24–9.

Marquant M. [A volunteer helper for carers of patients suffering from Alzheimer’s disease]. Soins Gerontol. 2010;(85):36–7.

Mason A, Weatherly H, Spilsbury K, Arksey H, Golder S, Adamson J, et al. A systematic review of the effectiveness and cost-effectiveness of different models of community-based respite care for frail older people and their carers. Health Technol Assess. 2007;11(40):iii–88.

McKay EA, Taylor AE, Armstrong C. What she told us made the world of difference: Carers’ perspectives on a hospice at home service. J Palliat Care. 2013;29(3):170–7.

Moriarty J. Welcome and introduction to the innovative practice section. Dement. 2002;1(1):113–20.

Noelker L, Bowdie R. Caring for the caregivers: developing models that work. Generations. 2012;1(1):103–6.

Parahoo K, Campbell A, Scoltock C. An evaluation of a domiciliary respite service for younger people with dementia. J Eval Clin Pract. 2002;8(4):377–85.

Perks A, Nolan M, Ryan T, Enderby P, Hemmings I, Robinson K. Breaking the mould: developing a new service for people with dementia and their carers. Qual Ageing. 2001;2(1):3–11.

Rosenthal Gelman C, Sokoloff T, Graziani N, Arias E, Peralta A. Individually-tailored support for ethnically-diverse caregivers: enhancing our understanding of what is needed and what works. J Gerontol Soc Work. 2014;57(6/7):662–80.

Smith SA. Longitudinal examination of a psychoeducational intervention and a respite grant for family caregivers of persons with Alzheimer’s or other dementias. 2006. https://search.ebscohost.com/login.aspx?direct=true&db=gnh&AN=938302&site=ehost-live . Accessed 20 Mar 2020.

Sorrell JM. Developing programs for older adults in a faith community. J Psychosoc Nurs Ment Health Serv. 2006;44(11):15–8.

Staicovici S. Respite care for all family caregivers: the LifeSpan Respite Care Act. J Contemp Health Law Policy. 2003;20(1):243–72.

Starns MK, Karner TX, Montgomery RJV. Exemplars of successful Alzheimer’s demonstration projects. Home Health Care Serv Q. 2002;21(3–4):141–75.

Swartzell KL, Fulton JS, Crowder SJ. State-level Medicaid 1915(c) home and community-based services waiver support for caregivers. Nurs Outlook. 2022;70(5):749–57.

Tompkins SA, Bell PA. Examination of a psychoeducational intervention and a respite grant in relieving psychosocial stressors associated with being an Alzheimer’s caregiver. J Gerontol Soc Work. 2009;52(2):89–104.

Vandepitte S, Putman K, Van Den Noortgate N, Verhaeghe S, Annemans L. Effectiveness of an in-home respite care program to support informal dementia caregivers: a comparative study. Int J Geriatr Psychiatry. 2019;34(10):1534–44.

Washington TR, Tachman JA. Gerontological social work student-delivered respite: a community-university partnership pilot program. J Gerontol Soc Work. 2017;60(1):48–67.

Whitlatch CJ, Feinberg LF. Family and friends as respite providers. J Aging Soc Policy. 2006;18(3/4):127–39.

Martin D, Miller AP, Quesnel-Vallée A, Caron NR, Vissandjée B, Marchildon GP. Canada’s universal health-care system: achieving its potential. Lancet Lond Engl. 2018;391(10131):1718–35.

Canadian Institute for Health Information. Wait times for home care services. In: Your health systems. 2023. https://yourhealthsystem.cihi.ca/hsp/inbrief?lang=en&_gl=1*2ysioj*_ga*MTYzNTk0MjAxMS4xNjc1NDQwNzQ3*_ga_44X3CK377B*MTY4MTkyMDYzMi4yLjEuMTY4MTkyMDY5MC4wLjAuMA.&_ga=2.134837618.2075493098.1681920633-1635942011.1675440747#!/indicators/089/wait-times-for-home-care-services/;mapC1;mapLevel2 ;/. Accessed 28 Apr 2020.

Carretero S, Garcés J, Ródenas F. Evaluation of the home help service and its impact on the informal caregiver’s burden of dependent elders. Int J Geriatr Psychiatry. 2007;22(8):738–49.

Bonnet T, Primerano J. The masks of recognition: the work of home care aides during the COVID-19 health crisis. Lien Soc Polit. 2022;88:89–110.

Rostgaard T. Quality reforms in Danish home care–balancing between standardisation and individualisation. Health Soc Care Community. 2012;20(3):247–54.

Plourde A. Les agences de placement comme vecteurs centraux de la privatisation des services de soutien à domicile. 2022. https://iris-recherche.qc.ca/wp-content/uploads/2022/01/IRIS_Agence_PlacementSSS_web-VF.pdf . Accessed 20 Mar 2020.

Scholey C, Schobel K. Mesure de la performance des organismes sans but lucratif: Le tableau de bord équilibré comme outil. 2016. https://www.cpacanada.ca/fr/ressources-en-comptabilite-et-en-affaires/strategie-risque-et-gouvernance/gouvernance-dosbl/publications/mesure-de-la-performance-des-osbl . Accessed 20 Mar 2020.

Khanam F, Langevin M, Savage K, Sharanjit U. Women working in paid care occupations. 2022. https://www150.statcan.gc.ca/n1/pub/75-006-x/2022001/article/00001-eng.htm . Accessed 20 Mar 2022.

Moore H, Dishman L, Fick J. The challenge of employee retention in medical practices across the United States: An exploratory investigation into the relationship between operational succession planning and employee turnover. In: Hefner JL, Nembhard IM, editors. Advances in health care management. 2021. pp. 45–75.

Clarkson P, Challis D, Hughes J, Roe B, Davies L, Russell I et al. Components, impacts and costs of dementia home support: a research programme including the DESCANT RCT. 2021. https://search.ebscohost.com/login.aspx?direct=true&db=mnh&AN=34181370&site=ehost-live . Accessed 20 Mar 2022.

Cobley CS, Fisher RJ, Chouliara N, Kerr M, Walker MF. A qualitative study exploring patients’ and carers’ experiences of early supported discharge services after stroke. Clin Rehabil. 2013;27(8):750–7.

Jegermalm M. Direct and indirect support for carers: patterns of support for informal caregivers to elderly people in Sweden. J Gerontol Soc Work. 2002;38(4):67–84.

Acknowledgements

The team thanks the Université de Sherbrooke’s library and archives service for their support. The team also wants to thank everyone who participated in the survey.

This article describes a part of a larger study on flexible respite funded by the Fonds de recherche du Québec – Santé (#309508) and the Conseil de recherches en sciences humaines du Canada (#892-2019-3075). Annie Carrier and Véronique Provencher are Fonds de recherche du Québec – Santé Junior 1 and Junior 2 researchers (#296437 and #297008, respectively). Alexandra Éthier is a Canadian Institutes of Health Research - Research Graduate Scholarships – Doctoral Program recipient (#476590 − 71729).

Author information

Authors and affiliations.

Université de Sherbrooke, Sherbrooke, Québec, Canada

Maude Viens, Alexandra Éthier, Véronique Provencher & Annie Carrier

Research Center on Aging, Sherbrooke, Québec, Canada

Contributions

MV conducted the review and co-wrote the article with AE. AE co-validated the study selection and co-wrote the article. AC co-validated the study selection, data charting and reviewed the article. VP reviewed the article. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Maude Viens .

Ethics declarations

Ethics approval and consent to participate.

The AMORA project was approved by the research ethics committee of the Integrated University Health and Social Services Centre (CIUSSS) of the Eastern Townships (project number: 2021–3703).

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article.

Viens, M., Éthier, A., Provencher, V. et al. WHO, WHEN, HOW: a scoping review on flexible at-home respite for informal caregivers of older adults. BMC Health Serv Res 24 , 767 (2024). https://doi.org/10.1186/s12913-024-11058-0

Received : 05 June 2023

Accepted : 29 April 2024

Published : 26 June 2024

DOI : https://doi.org/10.1186/s12913-024-11058-0

Home care/homecare
Informal caregiving
Older adults
Scoping review

BMC Health Services Research

ISSN: 1472-6963

http://orcid.org/0009-0008-5313-9272 Willow R Schanz 1 ,
Aunum Akhter 2 ,
Georgette Richardson 3 ,
http://orcid.org/0000-0003-0229-966X William T Story 4 ,
Riley Samuelson 5 ,
http://orcid.org/0000-0002-7026-0006 Aamer Imdad 6
1 The University of Iowa Roy J and Lucille A Carver College of Medicine , Iowa City , Iowa , USA
2 Division of Neonatology , The University of Iowa Health Care, Stead Family Department of Pediatrics, Roy J and Lucille A Carver College of Medicine , Iowa City , Iowa , USA
3 Division of Pediatric Psychology , The University of Iowa Health Care, Stead Family Department of Pediatrics , Iowa City , Iowa , USA
4 Department of Community and Behavioral Health , The University of Iowa College of Public Health , Iowa City , Iowa , USA
5 University of Iowa Hardin Library for the Health Sciences , Iowa City , Iowa , USA
6 Division of Gastroenterology, Hepatology, Pancreatology and Nutrition , University of Iowa Health Care, Stead Family Department of Pediatrics, Roy J and Lucille A Carver College of Medicine , Iowa City , Iowa , USA
Correspondence to Dr Aamer Imdad; aamer-imdad{at}uiowa.edu

Introduction The underdevelopment of preterm infants can lead to delayed progression through key early milestones. Demonstration of safe oral feeding skills, including a proper suck-swallow reflex, is a requirement for discharge from the neonatal intensive care unit (NICU) to ensure adequate nutrition acquisition. Helping an infant develop these skills can be draining and emotional for both families and healthcare staff involved in the care of preterm infants with feeding difficulties. Currently, there are no systematic reviews evaluating both family and healthcare team perspectives on aspects of oral feeding. Thus, we first aim to evaluate the current knowledge surrounding the perceptions, experiences and needs of families with preterm babies in the context of oral feeding in the NICU. Second, we aim to evaluate the current knowledge surrounding the perceptions, experiences and needs of healthcare providers (physicians, advanced practice providers, nurses, dietitians, speech-language pathologists and occupational therapists) in the context of oral feeding in the NICU.

Methods and analysis A literature search will be conducted in multiple electronic databases from their inception, including PubMed, CINAHL, Embase, the Cochrane Central Register of Controlled Trials and PsycINFO. No restrictions will be applied based on language or date of publication. Two authors will screen the titles and abstracts and then review the full text for the studies’ inclusion in the review. The data will be extracted into a pilot-tested data collection sheet by three independent authors. To evaluate the quality, reliability and relevance of the included studies, the Critical Appraisal Skills Programme checklist will be used. The overall evidence will be assessed using the Grading of Recommendations Assessment, Development and Evaluation criteria. We will report the results of the systematic review by following the Enhancing Transparency in Reporting the synthesis of Qualitative research checklist.

Ethics and dissemination Ethical approval of this project is not required as this is a systematic review using published and publicly available data and will not involve contact with human subjects. Findings will be published in a peer-reviewed journal.

PROSPERO registration number CRD42023479288.

Paediatric gastroenterology
Systematic Review
Perceived Social Support
NEONATOLOGY

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/ .

https://doi.org/10.1136/bmjopen-2024-084884

STRENGTHS AND LIMITATIONS OF THIS STUDY

This will be a systematic review evaluating both the perspectives of families and neonatal healthcare professionals on feeding practices of preterm infants in the neonatal intensive care unit (NICU).

Evaluating the perspectives of both family members and neonatal healthcare professionals involved in the care of preterm babies with feeding difficulties may uncover shared grievances and mutually beneficial opportunities for quality improvement in the NICU.

Included studies might be conducted in diverse settings, so generalisability to clinical practice may be affected by cultural, language and healthcare systems context.

Introduction

An estimated 13.4 million babies were born preterm (<37 weeks gestation) in 2020, which represented about 10% of all live births worldwide. 1 Preterm birth is a serious health event that contributes to significant morbidity, mortality and increased healthcare cost in neonates. Over 40% of premature infants will experience feeding difficulties, such as struggling to develop typical feeding reflexes (sucking, swallowing, appropriate breathing) and coordinated oesophageal bolus transport. 2 Consequently, feeding difficulties are associated with elevated healthcare costs due to increased length of stay in the neonatal intensive care unit (NICU) and invasive measures, such as a central line or other parenteral support, to supply the infant with adequate nutrients. 3 4 Poor feeding skills are associated with increased morbidity through malnutrition and growth restriction as well as increased mortality through oropharyngeal aspiration. 5 6

Despite the global prevalence, expense and severity of feeding difficulties, no universal guidelines function as the gold standard of care for feeding preterm infants. 7 The resulting high variability in approach may lead to dissatisfaction among NICU families and healthcare professionals. Families of preterm infants have been shown to express concerns about the technicality of feeding interventions, communication with providers regarding their child and feeling isolated from the feeding approaches in the NICU. 8 Tube feeding, a common feeding intervention for preterm infants, has been associated with increased cost, rehospitalisation, stress and anxiety for families. Due to the emotional nature of feeding a newborn, family members may struggle with learning to feed their infant in this manner. 8 Additionally, nurse perceptions of oral feeding in the NICU have emphasised the impactful role they hold in teaching feeding techniques and relieving emotional distress for the family, which has highlighted a need for greater collaboration between the family and care providers. 9 Family integrated care has been perceived to be helpful in the reduction of maternal stress by parents of preterm infants as well as a necessary and feasible care model by neonatologists and NICU nurses that has the potential to lower length of hospitalisation, decrease healthcare costs and improve breastfeeding rates in preterm infants. 10–12 The approach to feeding preterm infants requires a multidisciplinary effort, including the family, nurses, dietitians, occupational therapists, speech-language pathologists, social workers, advanced practice providers and physicians. Despite these experiences being reported, there is still limited understanding regarding the perceptions of families and caregivers on feeding preterm infants in the NICU. 13 14 This qualitative systematic review aims to analyse the current global knowledge of the perceptions, experiences and needs of families and healthcare staff (nurses, physicians, advanced practice providers, dietitians, occupational therapists, social workers and speech-language pathologists) involved in the feeding process of preterm infants in the NICU, as well as possible improvements to decrease barriers to high-quality care.

Methods and analysis

This systematic review will be conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and checklist. 15

Literature search

Systematic electronic queries, available in online supplemental appendix A , will be conducted in major databases, including PubMed, CINAHL, Embase, the Cochrane Central Register of Controlled Trials and PsycINFO from their inception to date of inquiry. Key terms used in the search are those related to population, context and phenomena of interest (perspectives, views, needs, experiences, perceptions, barriers, challenges). Studies will not be excluded based on the publication year, publication status, geographical location or language. Thus, this analysis will include studies from all countries. Studies evaluating specific racial, gender, geographic, age (of family or provider) differences will be included in this review as long as they evaluate qualitative aspects of our phenomena of interest. Bibliographic software (EndNote) will be used to combine database search results, and duplicates will be removed.
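The combining and de-duplication step described above can be illustrated with a minimal Python sketch. It is not the EndNote procedure the authors will use; it only shows one plausible matching rule, assumed here for illustration: records are treated as the same study when their DOIs match, or, when no DOI is present, when their normalised titles match. The record fields (`title`, `doi`, `source`), the function names and the sample records are all hypothetical.

```python
import re


def normalise_title(title: str) -> str:
    """Lower-case a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def record_key(record: dict) -> str:
    """Use the DOI as identity when present; otherwise fall back to the title."""
    doi = (record.get("doi") or "").strip().lower()
    return f"doi:{doi}" if doi else f"title:{normalise_title(record['title'])}"


def merge_search_results(*result_sets: list) -> list:
    """Combine result sets, keeping the first copy of each record
    and noting every database in which it was found."""
    merged = {}
    for results in result_sets:
        for record in results:
            key = record_key(record)
            if key not in merged:
                merged[key] = {**record, "sources": [record["source"]]}
            elif record["source"] not in merged[key]["sources"]:
                merged[key]["sources"].append(record["source"])
    return list(merged.values())


# Hypothetical records, for illustration only
pubmed = [
    {"title": "Feeding Preterm Infants: A Study", "doi": "10.1000/example.1", "source": "PubMed"},
]
embase = [
    {"title": "Feeding preterm infants - a study", "doi": "10.1000/EXAMPLE.1", "source": "Embase"},
    {"title": "Nurse views on oral feeding", "doi": "", "source": "Embase"},
]

unique = merge_search_results(pubmed, embase)
for rec in unique:
    print(rec["title"], rec["sources"])
```

A rule this simple misses a duplicate when one database supplies a DOI and another does not, which is one reason automated de-duplication is usually followed by a manual check.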

Eligibility criteria.

The populations of interest include families of preterm infants (parents, mother, father, grandmother, grandfather and guardians) and neonatal healthcare professionals (nurses, physicians, advanced practice providers, caregivers, dietitians, speech-language pathologists, social workers and occupational therapists) involved in the feeding care of preterm infants. For this review, preterm birth will be defined as gestational age <37 weeks at birth. If relevant, additional definitions such as late preterm: 34–36 weeks, moderately preterm: 32–34 weeks, very preterm: 28–32 weeks, extremely preterm: <28 weeks gestational age at birth, will be used and clearly reported.
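The gestational age categories above can be written as a small classification function. The ranges as stated share their boundary values (for example, 32 weeks and 34 weeks each appear in two categories), so this sketch assumes half-open intervals in which each boundary belongs to the later category. That convention is an assumption made for illustration, not something the protocol specifies, and the function name is hypothetical.

```python
def classify_preterm(gestational_age_weeks: float) -> str:
    """Map gestational age at birth (in weeks) to a preterm category.

    Assumed boundaries (half-open intervals):
      extremely preterm   : < 28
      very preterm        : 28 to < 32
      moderately preterm  : 32 to < 34
      late preterm        : 34 to < 37
      not preterm         : >= 37
    """
    if gestational_age_weeks <= 0:
        raise ValueError("gestational age must be positive")
    if gestational_age_weeks < 28:
        return "extremely preterm"
    if gestational_age_weeks < 32:
        return "very preterm"
    if gestational_age_weeks < 34:
        return "moderately preterm"
    if gestational_age_weeks < 37:
        return "late preterm"
    return "not preterm"


for weeks in (26, 30, 32, 35.5, 38):
    print(weeks, classify_preterm(weeks))
```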

We are interested in the global state of enteral and oral feeding in preterm infants while in NICUs from the perspective of both families and healthcare providers.

Phenomena of interest

The main phenomena of interest are the experiences and perceptions of families with preterm infants and healthcare providers of preterm infants as outlined below:

Family experiences regarding NICU feeding practices.

Family perceptions of NICU feeding practices.

Family needs regarding care of infants with feeding difficulties.

Family barriers regarding care of infants with feeding difficulties.

Healthcare staff perceptions of the NICU feeding practices.

Healthcare staff needs regarding care of infants with feeding difficulties.

Healthcare staff barriers regarding care of infants with feeding difficulties.

Screening and selection of studies

Screening of studies will be conducted through systematic review software Covidence by three authors (WRS, GR and AI). The initial review will consist of title and abstract filtering for relevance to systematic review objective by three authors (WRS, GR and AI). For studies to progress to future screening, they must evaluate the perceptions regarding feeding practices of preterm infants in the NICU in one of our two populations of interest: (1) families and (2) healthcare providers. Studies deemed irrelevant or out of context will be excluded, such as those evaluating children in the paediatric intensive care unit and those evaluating NICU graduates following up in outpatient clinics. The second stage of study selection will include a complete text review of each potential article by three authors (WRS, GR and AI). Conflicts at all stages will be resolved by discussion and contacting a senior author. Additionally, the references of relevant reviews will be evaluated for inclusion in the review. In the case that only an abstract is available for a given study, authors will be contacted to obtain information on and evaluate methods and results. If we are unable to obtain additional information, the abstract will be evaluated exclusively by inclusion criteria. If a paper is published in a language other than English, we will attempt to translate the article for use in this review. If we are unable to translate the article, we will exclude it from this review.
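The screening logic described above, in which three reviewers each assess a record and disagreements go to discussion, can be sketched as follows. Covidence handles this internally; the code below is only an illustration under the assumption that a record advances or is excluded when all reviewers agree and is otherwise flagged as a conflict. The function name, vote labels and study identifiers are hypothetical; the reviewer initials are those given in the protocol.

```python
VALID_DECISIONS = {"include", "exclude"}


def screening_outcome(votes: dict) -> str:
    """Return 'include' or 'exclude' when all reviewers agree,
    otherwise 'conflict' (to be resolved by discussion)."""
    if not votes:
        raise ValueError("at least one reviewer vote is required")
    decisions = set(votes.values())
    if not decisions <= VALID_DECISIONS:
        raise ValueError(f"unexpected decision(s): {decisions - VALID_DECISIONS}")
    if len(decisions) == 1:
        return decisions.pop()
    return "conflict"


# Hypothetical title/abstract screening votes
screening = {
    "study-001": {"WRS": "include", "GR": "include", "AI": "include"},
    "study-002": {"WRS": "include", "GR": "exclude", "AI": "include"},
    "study-003": {"WRS": "exclude", "GR": "exclude", "AI": "exclude"},
}

outcomes = {study: screening_outcome(votes) for study, votes in screening.items()}
to_discuss = [study for study, outcome in outcomes.items() if outcome == "conflict"]
print(outcomes)
print("Needs discussion:", to_discuss)
```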

Data extraction

Data will be extracted independently by three authors (WRS, GR and AI) and then compared. Conflicts will be resolved through discussion. To standardise data acquisition, a custom data extraction template will be piloted and used in Covidence. Information to be collected from each study will include:

Study design, study duration, study setting, setting country/region, study year and interventions.

Participants

Recruitment methods, including inclusion and exclusion criteria; group differences; sample size; sample size calculation; relevant baseline characteristics (family participants: maternal age, infant gestational age at birth, infant weight at birth, race/ethnicity, etc.; healthcare professional participants: role, experience, race/ethnicity, etc.); intervention groups.

Qualitative: Phenomena of interest (perceptions, experiences, change in satisfaction, change in feeding rate, etc); definitions of phenomena of interest.

Quantitative (if regarding phenomena of interest): variable type (continuous, dichotomous, qualitative); reporting measure (continuous variable: CIs, SD, SE, etc; dichotomous variable: the number of participants, percentage of participants, OR, etc; qualitative); statistical significance of outcome (p value).

Major themes addressed

Stress, anxiety, fear, needs, barriers, satisfaction, etc.

Other relevant constructs

First-order constructs (participant quotes); second-order constructs (author interpretations).

This data extraction protocol is modelled on thematic analysis principles of qualitative evidence synthesis and recommendations by the Cochrane Qualitative and Implementation Methods Group guidance for data extraction and data synthesis. 16 17 After data extraction, these data will be exported to Excel for synthesis and organised by relevant population.

Data synthesis

Data will be synthesised for each relevant population and outcome combination by three authors (WRS, GR and AI). Major themes will be described in a narrative fashion and simple descriptive statistics may be utilised for clarity. In the case of studies having quantitative measures of our qualitative interests, we will report the data as follows: If relevant, dichotomous data will be reported with OR, 95% CIs, and risk ratios, and continuous data will be reported as confidence intervals. Significant construct findings will be reported as quotes, percentages or other descriptive reports. Any inconsistencies or discrepancies between studies will be considered and reported. Data will be reported in narratives and tables for presentation.
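For the dichotomous measures mentioned above, the following sketch shows how an odds ratio and a risk ratio with 95% CIs can be computed from a 2×2 table using the standard log-scale (Wald) approximation. The protocol does not state which method or software will be used, so this is one common approach rather than the authors' procedure. The function names and the counts in the example are hypothetical.

```python
import math


def _check_cells(a: int, b: int, c: int, d: int) -> None:
    # The log-scale standard errors below are undefined when any cell is zero.
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this approximation")


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> tuple:
    """Odds ratio and CI for the 2x2 table
         group 1: a events, b non-events
         group 2: c events, d non-events
    """
    _check_cells(a, b, c, d)
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


def risk_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> tuple:
    """Risk ratio and CI for the same 2x2 layout."""
    _check_cells(a, b, c, d)
    estimate = (a / (a + b)) / (c / (c + d))
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts: 20 of 100 participants in one group and
# 10 of 100 in the other report the outcome of interest.
or_est, or_lo, or_hi = odds_ratio_ci(20, 80, 10, 90)
rr_est, rr_lo, rr_hi = risk_ratio_ci(20, 80, 10, 90)
print(f"OR {or_est:.2f} (95% CI {or_lo:.2f} to {or_hi:.2f})")
print(f"RR {rr_est:.2f} (95% CI {rr_lo:.2f} to {rr_hi:.2f})")
```

With these counts the point estimates are an OR of 2.25 and a risk ratio of 2.0; the interval bounds depend on the approximation chosen.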

Reporting results

Once the study analysis is complete, we will provide a narrative synthesis of all included studies and analysis between comparable studies. We will compare knowledge, beliefs, attitudes and perceptions of families with infants in the NICU within this population as well as compare these findings to the knowledge, beliefs, attitudes and perceptions of neonatal healthcare professionals. We will include all findings listed in the ‘Phenomena of interest’ section. Reporting of results will be in accordance with PRISMA and Enhancing Transparency in Reporting the synthesis of Qualitative research guidelines. 15 18

Critical appraisal of the studies

To evaluate the quality, reliability and relevance of the included studies, we plan to follow the Critical Appraisal Skills Programme checklist. 19 This tool is often used to appraise qualitative research and is adaptable to emphasise particular areas of interest within our research question. It is recommended by Cochrane and complements the use of the Grading of Recommendations Assessment, Development and Evaluation—Confidence in Evidence from Reviews of Qualitative Research (GRADE-CERQ) approach through evaluating the strengths and weaknesses of each study rather than on the basis of exclusion. This tool will be used by three members of the review team (WRS, GR and AI), and disagreements will be mediated through conversation.

Certainty of review findings

The GRADE-CERQ approach will be used to evaluate the overall certainty of evidence. 20 This approach is a comprehensive framework used to assess the overall certainty of the evidence for an outcome using study characteristics such as study design, inconsistency, indirectness of evidence, risk of bias, publication bias and imprecision estimates. We will include the GRADE-CERQ assessment results in an evidence profile that contains certainty ratings, including very low, low, moderate or high, based on the evidence across studies for primary outcomes. We will follow the GRADE-CERQ guidelines for assessing confidence in our qualitative evidence findings, which are based on four components: methodological limitations, relevance, adequacy and coherence. Based on analysis in each of these categories, the study will be given a score of either strong or weak. Concerns with any of the components may reduce our confidence in a review finding.

Patient and public involvement

Ethics and dissemination.

This is a qualitative systematic review that evaluates data present in the public domain through published studies and does not involve contact with human subjects. As a study of published literature, this study was not subject to formal IRB (Institutional Review Board) approval. We anticipate that the systematic review will be complete by fall of 2024 and will be submitted for publication in a peer-reviewed journal.

Ethics statements

Patient consent for publication.

Not applicable.

Acknowledgments

The authors would like to acknowledge Paul Casella for his help in editing the manuscript.



Contributors Conceptualisation: WRS, AA, GR and AI; Methodology: WRS, AA, GR, WTS, RS and AI; Writing–original draft preparation: WRS and AI; Writing–review and editing: WRS, AA, GR, WTS, RS and AI. All authors have read and agreed to the published version of the manuscript.

Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

Competing interests None declared.

Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

Provenance and peer review Not commissioned; externally peer reviewed.

Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.


Comparative analysis of open-source federated learning frameworks - a literature-based survey and review

Original Article
Open access
Published: 28 June 2024


Pascal Riedel ORCID: orcid.org/0000-0001-9910-3867 1 , 3 ,
Lukas Schick 2 ,
Reinhold von Schwerin 1 ,
Manfred Reichert 3 ,
Daniel Schaudt 1 &
Alexander Hafner 1

While Federated Learning (FL) provides a privacy-preserving approach to analyze sensitive data without centralizing training data, the field lacks a detailed comparison of emerging open-source FL frameworks. Furthermore, there is currently no standardized, weighted evaluation scheme for a fair comparison of FL frameworks that would support the selection of a suitable FL framework. This study addresses these research gaps by conducting a comparative analysis of 15 individual open-source FL frameworks filtered by two selection criteria, using the literature review methodology proposed by Webster and Watson. These framework candidates are compared using a novel scoring schema with 15 qualitative and quantitative evaluation criteria, focusing on features, interoperability, and user friendliness. The evaluation results show that the FL framework Flower outperforms its peers with an overall score of 84.75%, while Fedlearner lags behind with a total score of 24.75%. The proposed comparison suite offers valuable initial guidance for practitioners and researchers in selecting an FL framework for the design and development of FL-driven systems. In addition, the FL framework comparison suite is designed to be adaptable and extendable, accommodating the inclusion of new FL frameworks and evolving requirements.


1 Introduction

Federated Learning (FL) is a semi-distributed Machine Learning (ML) concept that has gained popularity in recent years, addressing data privacy concerns associated with centralized ML [ 1 , 2 , 3 , 4 , 5 , 6 , 7 ]. For example, data-driven applications with sensitive data such as in healthcare [ 8 , 9 , 10 , 11 , 12 ], finance [ 13 , 14 ], personalized IoT devices [ 15 , 16 ] or public service [ 17 , 18 ] require a technical guarantee of data privacy, which can be achieved by the use of FL.

In FL, a predefined number of clients with sensitive training data and a coordinator server jointly train a global model, while the local training data remains on the original client and is isolated from other clients [ 1 , 19 ]. In the FL training process, the global model is created by the server with randomly initialized weights and distributed to the clients of the FL system [ 20 , 21 ]. The goal of a federated training process is the minimization of the following objective function:

\[ \min_{w} F(w) = \sum_{k=1}^{N} \frac{n_k}{n} F_k(w) \]

where \(N\) is the number of clients, \(n_k\) the amount of sensitive training data on client \(k\), \(n\) the total amount of training data on all clients and \(F_k(w)\) is the local loss function [ 1 , 22 , 23 ]. Each client trains an initial model obtained from the coordinator server with the client’s local training data [ 24 ]. The locally updated model weights are asynchronously sent back to the coordinator server, where an updated global model is computed using an aggregation strategy such as Federated Averaging (FedAvg) [ 1 , 7 , 20 , 25 , 26 , 27 ]. The new global model is distributed back to the clients for a new federated training round. The number of federated training rounds is set in advance on the server side and is a hyperparameter that can be tuned [ 1 , 5 , 28 , 29 ]. An overview of the FL architecture is shown in Fig. 1 . In addition, FL can reduce the complexity and cost of model training by allowing a model to be trained on multiple smaller datasets on different clients, rather than on a single large, centralized dataset that requires an exhaustive data collection process beforehand [ 30 , 31 , 32 ]. Although several key challenges remain to be solved in the FL domain, security features such as homomorphic encryption [ 33 , 34 ] and differential privacy [ 6 , 35 , 36 ] are already used to guarantee and improve data privacy and security in FL systems [ 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 ].
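The server-side aggregation step can be illustrated with a short Python sketch. It assumes, for illustration only, that each client's model is a flat list of floats and that the server weights each client by its share \(n_k / n\) of the training data, as in FedAvg. The function name `fedavg` and the toy values are hypothetical and are not taken from any of the FL frameworks discussed here.

```python
def fedavg(client_weights, client_sizes):
    """Weighted average of client model weights (FedAvg-style aggregation).

    client_weights: one flat list of floats per client (illustrative layout).
    client_sizes:   n_k, the number of local training samples per client.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one sample count per client")
    n = sum(client_sizes)  # total amount of training data on all clients
    if n <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for w_k, n_k in zip(client_weights, client_sizes):
        if len(w_k) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, value in enumerate(w_k):
            # Each client contributes in proportion to its data share n_k / n.
            global_weights[i] += (n_k / n) * value
    return global_weights


# Toy example: two clients holding 1 and 3 local samples, respectively.
new_global = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In this toy run the second client holds three quarters of the data, so the aggregated weights lie closer to its local model.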

Fig. 1 Basic FL architecture overview

The advent of FL has spurred the development of various FL frameworks aimed at facilitating the deployment of FL applications, offering standardized functionalities and enhanced usability. Despite the proliferation of these frameworks, the selection of an optimal FL framework for specific project requirements remains a non-trivial challenge for practitioners due to the diversity and complexity of the choices available. This situation is exacerbated by two notable deficiencies in the FL research literature: first, the absence of a methodologically rigorous, in-depth comparative analysis of the most relevant open-source FL frameworks; and second, the lack of a standardized, weighted scoring scheme for a systematic and objective evaluation of these frameworks.

To the best of our knowledge, this comparative study is the most thorough to date, assessing the widest array of open-source FL frameworks against the broadest spectrum of criteria. Consequently, this study endeavors to fill the aforementioned research gaps by providing a robust FL framework comparison suite that could serve as a research-based guide for practitioners navigating the selection of suitable FL frameworks for their projects.

This study provides a comprehensive and user-targeted comparison of 15 open-source FL frameworks by performing a systematic literature review according to Webster and Watson [ 45 ]. In this way, relevant FL frameworks and comparison criteria are identified, which are the basis for the comparative analysis. A novel weighted scoring system is proposed for the evaluation of FL frameworks. The proposed comparison criteria and the scoring system in this study can be utilized by practitioners and researchers to determine whether a particular FL framework fulfills their needs. Thus, the major contributions of this study can be summarized as follows:

Proposing 15 comparison criteria for the evaluation of FL frameworks based on a methodological literature review.

Introducing a novel weighted scoring matrix for these comparison criteria.

Conducting an in-depth comparison of 15 relevant open-source FL frameworks.

In addition, a Research Question (RQ) oriented approach is used in this study with the aim to answer the following three RQs:

RQ 1: Which relevant frameworks for FL exist and are open-source?

RQ 2: Which criteria enable a qualitative and quantitative comparison of FL frameworks?

RQ 3: Which FL framework offers the most added value to practitioners and researchers?

The RQs are addressed and answered in ascending order in Sect. 5.4.

The remainder of this paper is organized as follows. Section 2 discusses relevant related work and shows how the main contribution of this paper differs from the others. Section 3 details the literature review methodology applied in this work. Section 4 briefly introduces inclusion criteria and the FL framework candidates. Section 5 presents and discusses the comparison criteria, the weighting schema and the scoring results from the conducted FL framework comparison analysis. Section 6 describes the limitations of this study and suggests future work. Finally, Sect. 7 draws the conclusions of this survey.

2 Related work

In recent years, several research papers have been published dealing with individual FL frameworks. Some developers published works detailing and highlighting their own FL frameworks. For instance, the developers of FedML [ 46 ], Sherpa.ai FL [ 47 ], IBM FL [ 48 ], OpenFL [ 49 ], FATE [ 50 ], Flower [ 51 ], FLUTE [ 52 ], FederatedScope [ 53 ], FedLab [ 54 ] and EasyFL [ 55 ] have all published white papers introducing the features of their released frameworks. These papers include a general introduction to FL, open FL challenges, and how their FL framework can address them, while [ 29 , 46 , 47 , 51 , 52 , 55 ] also provide small comparisons of a few existing FL frameworks. These comparisons were chosen subjectively and are biased, usually in favor of the FL framework developed by the author making the comparison, meaning a neutral, independent and holistic comparison is missing so far. In addition, there are research papers that address the current state of FL research, some of them using specific FL frameworks for technical implementation or evaluation purposes. For example, [ 5 ] showed a general and comprehensive overview of FL. They examined possible future research directions and challenges of FL, such as protection strategies against federated security attacks, and mentioned sources of federated bias. Moreover, they briefly introduced and described some popular FL frameworks, including FATE [ 56 ], PaddleFL [ 57 ], NVIDIA Clara (now a platform offering AI models for healthcare applications) [ 58 ], IBM FL [ 59 ], Flower [ 51 ] and FedLearner [ 60 ]. Another work [ 61 ] followed a similar approach as [ 5 ] and described central FL concepts such as the training process and FL algorithms in more detail before including a brief comparison overview of several FL frameworks. The authors of both works ( [ 5 ] and [ 61 ]) refrain from evaluating FL frameworks and drawing conclusions from their conducted comparison analyses. 
In contrast to the aforementioned works, [ 62 ] described an in-depth comparison of multiple FL frameworks (TFF [ 63 ], FATE [ 56 ], PySyft [ 64 ], PaddleFL [ 57 ] and FL&DP [ 65 ]). Both qualitative comparisons (in the form of a table comparing features of the frameworks) and quantitative comparisons (in the form of experiments measuring training time and accuracy for three classification problems) were performed. Based on their evaluations, [ 62 ] recommended PaddleFL for industrial use, citing its high test accuracy for model inference tasks and its range of features ready for practical use. A similar qualitative and quantitative FL framework comparison is provided by [ 66 ]. Their comparison contained more FL framework candidates than the comparison conducted by [ 62 ] (9 vs 5). Furthermore, [ 66 ] performed a larger set of benchmark experiments, in which different FL paradigms were considered. The qualitative comparison was of a similar scope to that in [ 62 ], although some criteria were left out (e.g., supported data types and protocols) and others were added (e.g., documentation availability and GPU support). Although the authors did not make a recommendation for a particular FL framework, they described a general decision-making process that can be used to determine the most appropriate FL framework.

In contrast to previous works, where the selection of comparison criteria for FL frameworks was often arbitrary, our study introduces a methodologically rigorous approach for a comparative analysis of FL frameworks. Prior works did not incorporate weighted importance of criteria nor did they employ a scoring mechanism for a systematic evaluation of FL frameworks. In addition, there was a lack of comprehensiveness in the inclusion of available and pertinent open-source FL frameworks. Our work advances the field by encompassing a broader spectrum of framework candidates and employing a more integrative methodology for evaluating FL frameworks with a novel weighted scoring approach. Leveraging the structured literature review methodology by Webster and Watson, this comparative study identifies the most pertinent quantitative and qualitative criteria for FL framework users, ensuring a selection of comparison criteria that is both comprehensive and methodically sound, surpassing the scope of similar studies.

3 Research method

We applied the literature review methodology proposed by Webster and Watson [ 45 ] to address the RQs (see Sect. 1). They introduced a systematic in-depth review schema for the identification and evaluation of relevant research literature. Webster and Watson’s literature review method was published in response to the lack of review articles in the information systems field, which the authors believe has slowed the progress in the field [ 45 ]. Their methodology has gained popularity since publication, with over 10 000 citations (based on Google Scholar citation count). According to [ 45 ], the collection process of relevant research literature should be concept-oriented or author-centric and is not limited to individual journals or geographical regions. They recommend identifying appropriate journal articles and conference proceedings by conducting a keyword-based search in different literature databases. Additional relevant sources should be identified by searching the references of the literature collected in this manner. This technique is called backward search and can be combined with forward search, which locates literature that cites one of the originally identified documents as a literature source. An overview of the search methodology applied in this paper is shown in Fig. 2 . We used the literature review methodology of Webster and Watson [ 45 ] to build the knowledge base for a literature-driven comparison analysis of open-source FL frameworks.

Fig. 2 Process flow used in this study to identify and filter relevant publications for the literature review

3.1 Literature databases and search string

For the literature search, the publication databases ACM Digital Library, EBSCOhost and IEEE Xplore were used to identify relevant publications and literature sources (see Fig. 2 ). As recommended by [ 45 ], we mainly searched for peer-reviewed journal articles and conference proceedings to ensure that the research rests on reliable sources. A logical combination of the following terms served as the search string:

‘federated learning’ AND ‘framework’ AND ‘open-source’ OR ‘federated framework’ AND ‘privacy-preserving machine learning’ AND ‘open-source’.
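The search string can be read as a Boolean predicate over a publication's title and abstract. The Python sketch below assumes that AND binds more tightly than OR, so the string is grouped as two alternative AND clauses. This grouping is an assumption, since the text does not state it and individual literature databases may parse operators differently. The function name `matches_search_string` is hypothetical.

```python
FIRST_CLAUSE = ("federated learning", "framework", "open-source")
SECOND_CLAUSE = (
    "federated framework",
    "privacy-preserving machine learning",
    "open-source",
)


def matches_search_string(text):
    """Return True if `text` satisfies the search string under the assumed
    grouping (A AND B AND C) OR (D AND E AND C)."""
    lowered = text.lower()  # case-insensitive matching, as most databases do
    first = all(term in lowered for term in FIRST_CLAUSE)
    second = all(term in lowered for term in SECOND_CLAUSE)
    return first or second


hit = matches_search_string("An open-source framework for federated learning")
miss = matches_search_string("Federated learning in healthcare")
```

A plain substring check like this ignores stemming and field restrictions that real database search engines apply, so it only approximates their behavior.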

In some cases, additional search keywords were used, determined by reviewing the tables of contents of the literature retrieved with the search string [ 45 ]. In addition, the research literature was filtered by publication date from 2016 to 2024 to obtain more recent sources. The year 2016 was chosen as the start of the date filter because it was the first year in which the term federated learning was officially used in a publication [ 1 ]. The forward and backward searches, as described by Webster and Watson [ 45 ], were used to identify additional relevant sources. This made it possible to identify publications that referenced other relevant publications, most of which were not much older than the original publications. One reason for this could be that the term federated learning did not exist before 2016, so the range of publication dates is quite narrow. For the forward search, Google Scholar, Science Direct, Semantic Scholar, and ArXiv were used in addition to the literature databases mentioned above.

3.2 Inclusion and exclusion criteria

To further filter the identified publications, the following inclusion and exclusion criteria were used:

Inclusion Criteria :

The identified publication deals with the topic of federated learning and contributes answers to at least one of the RQs (see Sect. 1).

The title and the abstract seem to contribute to the RQs and contain at least one of the following terms: framework, federated learning, machine learning, evaluation or open-source.

Exclusion Criteria :

The publication is not written in English.

The title and abstract do not appear to contribute to the RQs and do not contain a term from the search string (see Subsect. 3.1 ) or inclusion criteria.

The publication is a patent, master's thesis, or a non-relevant web page.

The publication is not electronically accessible without payment (i.e., it is available only as a print issue).

All relevant aspects of the publication are already included in another publication.

The publication only compares existing research and has no new input.

A publication is included in the pool of relevant literature for reviewing if both inclusion criteria are met, and it is excluded if any of the exclusion criteria is fulfilled. Sources that additionally serve to quantitatively or qualitatively support the comparison, such as GitHub repositories or the websites of the FL frameworks, are exceptions that are not subject to these criteria. Such sources are also included in our literature database with a low relevance score.
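The decision rule above (both inclusion criteria met, no exclusion criterion fulfilled, with an exception for supporting sources) can be written as a small predicate. The following Python sketch is illustrative only: the `Publication` record and its field names are hypothetical, and the criteria themselves are assumed to have been judged manually and stored as Boolean flags.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    title: str
    inclusion_met: tuple  # one flag per inclusion criterion (two in total)
    exclusion_hit: tuple  # one flag per exclusion criterion (six in total)
    supporting_source: bool = False  # e.g. a GitHub repository or framework website


def is_in_pool(pub):
    """Apply the stated rule: all inclusion criteria and no exclusion criterion."""
    if pub.supporting_source:
        # Supporting sources are exempt from the criteria.
        return True
    return all(pub.inclusion_met) and not any(pub.exclusion_hit)


paper = Publication("Survey of FL frameworks", (True, True), (False,) * 6)
patent = Publication("FL patent", (True, True), (False, False, True, False, False, False))
repo = Publication("Framework repository", (False, False), (False,) * 6, supporting_source=True)
```

Here `paper` and `repo` would enter the pool, while `patent` is rejected because one exclusion criterion applies.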

3.3 Pool of publications

We initially checked the titles and abstracts of the publications for the individual keywords of the search string (see Subsect. 3.1) and added the publications to the literature pool if there were any matches based on the defined inclusion and exclusion criteria (see Subsect. 3.2). Thus, 1328 individual publications from the literature databases were obtained. After reading the introductions and conclusions, 1196 publications were eliminated due to lack of relevance. As a result, 132 publications, including 60 peer-reviewed journal articles, 27 conference proceedings, 10 white papers and 35 online sources, form the basis for the literature-driven comparative analysis. In the refinement process (see step 3 in Fig. 2), duplicated sources were removed, since in some cases the same publication was listed in two or more literature databases.

3.4 Literature review

For the literature review, a concept-oriented matrix according to Webster and Watson was used, which enables a systematic relevance assessment of the identified literature [ 45 ]. A publication is rated according to the number of concepts covered. Based on the RQs (see Sect. 1), the individual concepts or topics for the literature review in this study are defined as follows:

FL General Information (GI)

FL Security Mechanisms (SM)

FL Algorithms (AL)

FL Frameworks (FW)

For each identified source, the title, the type of publication, the name of the publishing journal or conference if applicable, the number of citations, and a brief summary of the relevant content were noted. Afterwards, the literature was scored based on a scale of 1 to 4, with a publication scored 4 representing high relevance and a publication scored 1 representing low relevance. The rating schema is based on the concepts described above and defined as follows:

1 Point: Relevant to one specific concept except for FW.

2 Points: Relevant to at least two concepts or FW.

3 Points: Relevant to at least three concepts or FW and one or two other concepts.

4 Points: Relevant to all four concepts (GI, SM, AL and FW).

Additional sources not directly related to the concepts defined above were included in the concept Misc. and were automatically assigned a relevance score of 1. An excerpt of the applied concept-oriented tabular literature review according to Webster and Watson [ 45 ] can be found in Table 1. In this study, the knowledge base obtained from the literature review forms the basis for the weighted comparison and evaluation of different open-source FL frameworks (see Sect. 5).
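The rating schema can be expressed as a function of the set of concepts a publication covers. The sketch below is one possible reading of the rules, which overlap slightly as written: it assumes that FW on its own yields 2 points, that FW together with at least one other concept yields 3 points, and that sources covering none of the four concepts fall under Misc. with 1 point. The function name `relevance_score` is hypothetical.

```python
CONCEPTS = {"GI", "SM", "AL", "FW"}


def relevance_score(covered):
    """Map the covered concepts of a publication to a score from 1 to 4."""
    covered = set(covered) & CONCEPTS  # ignore labels outside the four concepts
    if not covered:
        return 1  # Misc. sources are assigned the lowest score
    if covered == CONCEPTS:
        return 4  # relevant to all four concepts
    has_fw = "FW" in covered
    if len(covered) >= 3 or (has_fw and len(covered) == 2):
        return 3  # three concepts, or FW plus one or two others
    if len(covered) == 2 or has_fw:
        return 2  # two concepts, or FW alone
    return 1  # a single concept other than FW


scores = {
    "GI only": relevance_score({"GI"}),
    "FW only": relevance_score({"FW"}),
    "FW + AL": relevance_score({"FW", "AL"}),
    "all four": relevance_score({"GI", "SM", "AL", "FW"}),
}
```

The extra weight given to FW in this reading mirrors the focus of the study on FL frameworks.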

3.5 Literature analysis

To analyze the research literature, Latent Dirichlet Allocation (LDA) was applied to the identified publications to discover common overlapping topics [ 67 ]. This could be used to verify the relevance of our chosen Literature Review Concepts. Stop words, numerical characters and conjunctions were filtered out in advance. The number of components of the LDA was set to 10. This number was chosen after conducting a grid search and analyzing the generated topics. With the number of components set to 10, a topic that could be assigned to the Literature Review Concept ‘FL Frameworks’ was included for the first time. Thus, this was the lowest number of topics with which all four of the identified Literature Review Concepts were captured by the LDA. For each topic, the LDA determined the 20 most relevant words from the provided literature. Relevance represents the number of times a word was assigned to a given topic [ 67 ]. Figure 5 (see Appendix) displays these identified topics and their most relevant words. The topics were further condensed into the previously defined four concepts in Table 2 . A word cloud consisting of the most common words in the identified literature can be seen in Fig. 6 (see Appendix).

The literature-driven analysis reveals that FL frameworks have not often been part of research works on FL (see Table 2 ). This work aims to close this research gap. Figure 3 shows the distribution of reviewed FL sources by publication year. Notably, FL received an overall boost in research interest in 2022 compared to 2021 (25 vs 14 publications). We expect the number of research publications on the four FL concepts described (see Subsect. 3.4) to increase in the future as more user-friendly FL frameworks make FL accessible to a wider range of users. It is worth mentioning that some sources dealing with FL frameworks are GitHub repositories and white papers of the framework developers. In conducting the literature review (see Table 1), a total of 18 FL frameworks were identified for the comparison and evaluation. To filter the number of FL frameworks, inclusion criteria are defined and used in this study. These filter criteria and the selected FL frameworks are described in the next section.

Fig. 3 Histogram of reviewed literature by year of publication from 2016 (first FL publication) to February 2024 (current research)

4 Federated learning frameworks

Although the term FL was coined as early as 2016 [ 1 ], it is only in recent years that more Python-based frameworks have emerged that attempt to provide FL in a more user-friendly and application-oriented manner (see Fig. 3). Some of the identified FL frameworks are hidden behind paywalls or are completely outdated and no longer actively developed and supported, making it impractical to include them in a fair comparison. Therefore, the following two inclusion criteria must be fulfilled by the FL frameworks in order to be considered as comparison candidates.

4.1 Inclusion criteria

Open-Source Availability In this paper, we also want to contribute to the topic of open-source in AI solutions and affirm its importance in the research community. In times when more and more AI applications are offered behind obfuscated paywalls (e.g., OpenAI [ 68 ]), researchers and developers should also consider the numerous advantages when developing innovative AI solutions as open-source products. After all, the rapid development of AI has only been possible due to numerous previous relevant open-source works. Thus, for the comparison study only open-source FL frameworks are chosen.

A few enterprises, such as IBM [ 59 ] or Microsoft [ 69 ], offer both a commercial integration and an open-source version of their FL frameworks for research purposes. For such FL frameworks, only the free versions are considered in our comparison analysis.

Commercial FL frameworks such as Sherpa.ai FL [ 47 , 65 ] are not considered in this work as they do not follow the spirit of open-source. Benchmarking frameworks such as LEAF [ 70 ] or FedScale [ 71 ] were also excluded.

Community Popularity Another inclusion criterion used for filtering FL frameworks is the popularity in the community. It can be assumed that FL frameworks with an active and large GitHub community are more actively developed, more likely to be supported in the long term and thus more beneficial for practitioners. Therefore, this criterion excludes smaller or experimental FL frameworks, such as OpenFed [ 72 ].

As a metric for community activity, the number of GitHub stars is used. FL frameworks that have received at least 200 GitHub stars for their code repositories are considered. The number of GitHub stars indicates how many GitHub users bookmarked the repository, which can be interpreted as a reflection of the popularity of a GitHub repository. In addition, only FL frameworks provided by a company or an academic institution are considered in this study.
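The two inclusion criteria amount to a simple filter over the identified frameworks. The Python sketch below uses placeholder records with invented names and star counts purely for illustration; they do not correspond to any framework in this study. The field names and the function `select_candidates` are likewise hypothetical.

```python
MIN_STARS = 200  # popularity threshold stated in the text


def select_candidates(frameworks, min_stars=MIN_STARS):
    """Keep open-source, institution-backed frameworks with enough GitHub stars."""
    return [
        fw["name"]
        for fw in frameworks
        if fw["open_source"] and fw["institutional"] and fw["stars"] >= min_stars
    ]


# Placeholder data, not real measurements.
frameworks = [
    {"name": "framework_a", "open_source": True, "institutional": True, "stars": 1500},
    {"name": "framework_b", "open_source": True, "institutional": True, "stars": 120},
    {"name": "framework_c", "open_source": False, "institutional": True, "stars": 900},
    {"name": "framework_d", "open_source": True, "institutional": True, "stars": 200},
]
candidates = select_candidates(frameworks)
```

Because the threshold is inclusive ("at least 200"), a repository with exactly 200 stars passes the filter.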

4.2 Considered frameworks

To provide an initial overview of the 15 filtered FL frameworks, Table 3 compares them based on the following metrics: the developer's country of origin, the number of GitHub stars, the number of Git releases, and the dates of the initial and latest releases. Notably, PySyft is the most popular FL framework with over 9000 GitHub stars, followed by FATE AI and FedML. In general, FL frameworks that were released earlier have a higher number of GitHub stars. PySyft and TFF have been updated the most, while FLUTE has not yet had an official release on GitHub. Apart from Flower, all other FL frameworks were developed either in China or in the USA. A threshold of 200 GitHub stars was chosen as the critical value for inclusion, as this produces a manageable number of FL frameworks with the greatest popularity. In addition, a clear break between widely and rarely followed frameworks can be seen in this value range, as only a few frameworks have between 200 and 500 stars, whereas the number of repositories increases drastically below 200 stars.

5 Framework comparison and evaluation

This section starts with the introduction of the comparison criteria and the weighted scoring system in Subsect. 5.1. Then, the comparison and evaluation of the 15 FL frameworks is performed and the results are presented in Subsect. 5.2. This section closes with a discussion and analysis of our findings in Subsect. 5.3.

5.1 Criteria and weighting definition

To ensure a detailed comparison, the FL frameworks are examined from three different perspectives, namely Features, Interoperability and User Friendliness, using a weighted scoring system. Within each of the three main comparison categories, the criterion weights add up to 100%. For each comparison category, this subsection describes individual comparison criteria and their weighting in descending order of relevance. The comparison criteria in each category were selected based on the systematic literature review described in Subsect. 3.4.

Features This comparison category aims to examine and compare the inherent features of each FL framework. From the user’s point of view, it is mandatory to know the relevant features of an FL framework in order to select a suitable framework for an FL project. Typical FL framework features include the support of different FL Paradigms (horizontal, vertical, and federated transfer learning), Security Mechanisms (cryptographic and algorithm-based methods), different FL Algorithms and specific federated ML Models [ 33 , 34 , 95 , 96 , 97 , 98 , 99 , 100 , 101 ].

In terms of weighting, Security Mechanisms is weighted most heavily at 35%, because increased data privacy and security is the main motivation for using FL in most applications [ 102 ] and the inherent properties of FL do not guarantee complete security [ 34 , 103 , 104 , 105 , 106 ].

FL Algorithms and ML Models are given equal weighting at 25% each, as a wide range of both algorithms and models is important to make an FL framework adaptable to different data-driven use cases [ 62 , 66 , 102 ].

The criterion FL Paradigms is weighted at 15%, because horizontal FL is still the most common FL paradigm [ 102 ], making the inclusion of other FL paradigms (i.e. vertical FL [ 107 ], and federated transfer learning [ 108 ]) less pertinent.

Interoperability

Interoperability is a mandatory factor in the evaluation of FL frameworks, particularly in terms of their compatibility with various software and hardware environments. This category includes support for multiple operating systems beyond the universally supported Linux containerization via Docker, CUDA support for leveraging GPUs, and the feasibility of deploying federated applications to physical edge devices [ 66 ].

The criterion Rollout To Edge Devices is weighted at 50%. This comparison criterion is crucial for the practical deployment of FL applications, enabling real-world applications rather than mere simulations confined to a single device [ 62 , 66 ]. Without this, the scope of FL frameworks would be significantly limited to theoretical or constrained environments.

Support for different Operating Systems is assigned a weight of 25%. This inclusivity ensures that a broader range of practitioners can engage with the FL framework, thereby expanding its potential user base and facilitating wider adoption across various platforms [ 62 ].

GPU Support is considered important due to the acceleration it can provide to model training processes, and is weighted at 15%. Although beneficial for computational efficiency, GPU support is not as critical as the other criteria for the core functionality of an FL framework [ 66 ].

Lastly, Docker Installation is recognized as a criterion with a 10% weight. Docker’s containerization technology offers a uniform and isolated environment for FL applications, mitigating setup complexities and compatibility issues across diverse computing infrastructures [ 109 ]. While Docker support enhances versatility and accessibility, it is deemed optional since there are FL frameworks available that may not necessitate containerization for running on other OSes. Although Docker’s containerization is a beneficial attribute for FL frameworks, it is not as heavily weighted as the capacity for edge device deployment or OS support, which are more essential for the practical implementation and broad usability of FL applications.
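The weights stated above for the Features and Interoperability categories can be combined into a category score with a weighted sum. The Python sketch below is a minimal illustration under one assumption that is not taken from the text: each criterion is rated on a normalized scale from 0 to 1. The dictionary keys, the function `category_score` and the example ratings are hypothetical.

```python
FEATURE_WEIGHTS = {
    "security_mechanisms": 0.35,
    "fl_algorithms": 0.25,
    "ml_models": 0.25,
    "fl_paradigms": 0.15,
}
INTEROPERABILITY_WEIGHTS = {
    "rollout_to_edge_devices": 0.50,
    "operating_systems": 0.25,
    "gpu_support": 0.15,
    "docker_installation": 0.10,
}


def category_score(weights, ratings):
    """Weighted category score in percent, given ratings in [0, 1] per criterion."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("criterion weights must add up to 100%")
    if set(ratings) != set(weights):
        raise ValueError("ratings must cover exactly the weighted criteria")
    if any(not 0.0 <= r <= 1.0 for r in ratings.values()):
        raise ValueError("ratings are assumed to lie in [0, 1]")
    return 100.0 * sum(weights[c] * ratings[c] for c in weights)


# Hypothetical ratings for an imaginary framework.
example_ratings = {
    "security_mechanisms": 1.0,
    "fl_algorithms": 0.5,
    "ml_models": 1.0,
    "fl_paradigms": 0.0,
}
features_score = category_score(FEATURE_WEIGHTS, example_ratings)
```

With these example ratings, the heavy weight on Security Mechanisms means that a full rating there contributes 35 of the 72.5 percentage points.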

User Friendliness The aim of this comparison category is to examine and compare the simplicity and user-friendliness of the individual FL frameworks when creating FL applications. The simple use of an FL framework can shorten the development times in an FL project and thus save costs. Therefore, the following comparison criteria should be considered in this criteria group: Development Effort needed to create and run an FL session, federated Model Accuracy on unseen data, available online Documentation , FL Training Speed , Data Preparation Effort , Model Evaluation techniques and, if existing, the Pricing Systems for additional functionalities (e.g., online dashboards and model pipelines) [ 62 , 66 ].

The criteria Development Effort and Model Accuracy are deemed most critical, each carrying a 25% weight, due to their direct impact on the usability of FL frameworks and the effectiveness of the resultant FL applications [ 110 ]. Development Effort quantifies the ease with which developers can use the framework to create and deploy FL applications, which directly influences the time-to-market and development costs of FL projects. Model Accuracy matters because the success of an FL application also depends on how well the federated model performs on unseen data [ 62 , 66 ].

The Documentation aspect is weighted with 20%. Given the novelty of many FL frameworks and the potential scarcity of coding examples, the availability and quality of documentation are evaluated [ 66 ]. This criterion underscores the importance of well-structured and informative documentation that can aid developers in effectively utilizing the FL framework, encompassing tutorials, API documentation, and example projects.

The Training Speed criterion is weighted lower at 10%: faster training is advantageous for any FL framework, but it is less relevant than high model accuracy [ 62 , 66 ]. The criterion reflects the optimization and computational efficiency of the framework in processing FL tasks.

The Data Preparation Effort is assigned a weight of 10%. It evaluates the degree to which an FL framework supports data preprocessing and readiness, considering the ease with which data can be formatted, augmented, and made suitable for federated training. Although not critical for the operational use of an FL framework, streamlined data preparation processes can enhance developer productivity.

Model Evaluation receives the lowest weighting of 5%. It scrutinizes the methodologies and tools available within the FL framework for assessing global model performance and robustness, including validation techniques and metrics. Different model evaluation methods are helpful for practitioners, but not necessary for the effective use of an FL framework [ 66 ]. Thus, this criterion plays a more supportive role in the broader context of FL application development.

Since the focus of this work is on open-source FL frameworks, the Pricing Systems criterion is likewise weighted at only 5%. For FL frameworks that offer additional functionalities through paid versions, it evaluates the cost-benefit ratio of such features. While the core focus is on open-source frameworks, the assessment of pricing systems is still relevant for understanding the scalability and industrial applicability of the framework’s extended features.

To assess the scores for the Development Effort, Model Accuracy, Training Speed, Data Preparation Effort and Model Evaluation criteria, a federated test application was created that simulates an FL setting while running on a single device. This application used the MNIST dataset [ 111 , 112 ] and performed a multi-class image classification task with a multi-layer perceptron neural network model. A grid search approach was used to identify an optimal hyperparameter configuration. The selected hyperparameters were used identically for model training in each FL framework test (see Table 4 on page 11).

Weighted Scoring In each of the three comparison categories mentioned above, the criteria are assigned weights that sum up to 100%. Consequently, the total score for all comparison criteria within a category represents the percentage score obtained by an evaluated FL framework in that particular category. These percentage scores for each category are then combined using a weighted sum to derive an overall total score. This serves as a final metric for selecting the best FL framework across all categories. All criterion weights are also listed in Table 7 on page 20 in the Appendix.

The distribution of the weighting of the three top level categories is as follows:

User Friendliness has the highest weighting ( 50% ), as the criteria in this category have the greatest impact for practitioners working with FL frameworks.

Features has the second highest weighting ( 30% ), as this category indicates which functionalities such as Security Mechanisms or FL Paradigms are supported in an FL framework.

Interoperability is weighted as the lowest ( 20% ), as it primarily indicates the installation possibilities of an FL framework, but does not represent core functionalities or the framework’s usability.

The FL frameworks can achieve one of three possible scores in each criterion: a score of zero is awarded if the FL framework does not fulfill the requirements of the criterion at all. A half score is awarded if the FL framework partially meets the requirements. A score of one is awarded if the FL framework fully meets the requirements. If a criterion cannot be verified or tested at all, then it is marked with N.A. (Not Available). This is treated as a score of zero in this criterion when calculating the total score. The detailed scoring schemes for each criterion are given in Table 7 on page 20 in the Appendix.
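The scoring scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the study: the function name `category_score` and the dictionary layout are assumptions made here. The weights are the User Friendliness weights stated above, and the example scores are the ones reported for EasyFL and FedLearner in Subsect. 5.2.

```python
# Illustrative sketch; helper names are hypothetical, weights and scores come from the text.
NA = None  # marks a criterion that could not be verified or tested

# User Friendliness criterion weights in percent (they sum to 100).
USER_FRIENDLINESS_WEIGHTS = {
    "Development Effort": 25,
    "Model Accuracy": 25,
    "Documentation": 20,
    "Training Speed": 10,
    "Data Preparation Effort": 10,
    "Model Evaluation": 5,
    "Pricing Systems": 5,
}

ALLOWED_SCORES = (0, 0.5, 1)  # not fulfilled, partially fulfilled, fully fulfilled


def category_score(weights, scores):
    """Return the percentage score of one framework in one category."""
    if sum(weights.values()) != 100:
        raise ValueError("criterion weights must sum to 100%")
    total = 0.0
    for criterion, weight in weights.items():
        score = scores[criterion]
        if score is NA:
            score = 0  # N.A. is treated as a score of zero
        elif score not in ALLOWED_SCORES:
            raise ValueError(f"invalid score for {criterion}: {score}")
        total += weight * score
    return total


# Criterion scores as described in Subsect. 5.2.
easyfl = {
    "Development Effort": 1,
    "Model Accuracy": 1,
    "Documentation": 1,
    "Training Speed": 1,
    "Data Preparation Effort": 0.5,
    "Model Evaluation": 1,
    "Pricing Systems": 1,
}
fedlearner = {
    "Development Effort": NA,
    "Model Accuracy": NA,
    "Documentation": 0,
    "Training Speed": NA,
    "Data Preparation Effort": NA,
    "Model Evaluation": NA,
    "Pricing Systems": 1,
}

print(category_score(USER_FRIENDLINESS_WEIGHTS, easyfl))      # 95.0
print(category_score(USER_FRIENDLINESS_WEIGHTS, fedlearner))  # 5.0
```

Both results agree with the User Friendliness scores of 95% and 5% discussed in Subsect. 5.3.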

5.2 Comparison results

Table 5 on page 12 shows the comparison matrix of the 15 FL framework candidates on the basis of the categories and criteria defined in Subsect. 5.1 on page 8. In the following, we explain our assessment of the individual comparison criteria for the FL frameworks. Note: we capitalize the names of the individual comparison criteria to highlight them.

Evaluation of Features It can be noted that for the first criterion, Security Mechanisms , five FL frameworks (PySyft, PaddleFL, FLARE, FLSim and FederatedScope) provide both cryptographic and algorithmic security features such as differential privacy, secure aggregation strategies, secure multiparty computation, trusted execution environments and homomorphic encryption [ 6 , 34 , 35 , 53 , 108 , 113 , 114 , 115 , 116 , 117 , 118 , 119 , 120 , 121 , 122 ]. Therefore, these FL frameworks receive the full score for this criterion. On the other hand, FATE AI, FedML, TFF, Flower, FedLearner, OpenFL, IBM FL and FLUTE all provide only one type of security mechanism. Thus, these FL frameworks receive half the score [ 48 , 49 , 50 , 52 , 57 , 63 , 64 , 66 , 78 , 86 , 87 , 123 ]. FedLab and EasyFL provide no security mechanisms and receive a score of zero in this criterion [ 54 , 55 , 92 ].

For the next criterion, FL Algorithms, the FL frameworks FedML, TFF, Flower, OpenFL, IBM FL, FLARE, FLUTE, FederatedScope and FedLab receive full scores because they provide out-of-the-box implementations of the FedAvg [ 1 ] algorithm as well as several different adaptive FL algorithms such as FedProx, FedOpt and FedAdam [ 124 , 125 ]. On the other hand, FATE AI, FedLearner, PaddleFL, FLSim and EasyFL only provide FedAvg as an aggregation strategy; other algorithms are not available in these FL frameworks by default, so they receive half the score on this criterion. PySyft is the only FL framework candidate that requires manual implementation of an FL strategy (even for FedAvg). Therefore, PySyft receives a zero score on this criterion, as it requires more effort to set up a training process [ 46 , 51 , 52 , 57 , 60 , 62 , 63 , 73 , 81 , 83 , 86 , 87 , 89 , 92 , 93 ].
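To give an impression of what a manual implementation of an FL strategy involves, the following sketch shows only the server-side aggregation step of FedAvg [ 1 ]: a weighted average of the client parameters, with each client weighted by its number of local training samples. It is written in plain Python and does not use the API of any evaluated framework. The function name `fedavg` and the flat-list parameter representation are assumptions made for illustration.

```python
def fedavg(client_weights, client_sizes):
    """Aggregate client parameter vectors into a global vector (FedAvg).

    client_weights: one equally long list of floats per client.
    client_sizes:   number of local training samples per client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one sample count per client")
    total_samples = sum(client_sizes)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")
    n_params = len(client_weights[0])
    if any(len(w) != n_params for w in client_weights):
        raise ValueError("all clients must report the same number of parameters")

    global_weights = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        share = size / total_samples  # this client's fraction of all samples
        for i, value in enumerate(weights):
            global_weights[i] += share * value
    return global_weights


# Two hypothetical clients holding 100 and 300 samples.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [100, 300]
print(fedavg(clients, sizes))  # [2.5, 3.5]
```

A complete strategy would additionally handle client selection, local training rounds and communication, which the frameworks with out-of-the-box FedAvg support provide by default.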

For building ML Models, PySyft, FATE AI, FedML, Flower, OpenFL, IBM FL, FLARE and FederatedScope support the deep learning libraries TensorFlow and PyTorch. They provide users with a wide range of federatable ML models. Therefore, these FL frameworks are awarded full marks on this criterion. However, TFF (TensorFlow), FedLearner (TensorFlow), PaddleFL (PaddlePaddle), FLSim (PyTorch), FLUTE (PyTorch), FedLab (PyTorch) and EasyFL (PyTorch) receive half the score because users are limited to only one supported ML library [ 52 , 57 , 60 , 62 , 63 , 64 , 76 , 78 , 81 , 83 , 86 , 87 , 89 , 91 , 93 ].

In terms of FL Paradigms , there are seven FL frameworks that support both horizontal and vertical FL and therefore receive full marks: PySyft, FATE AI, FedML, Flower, FedLearner, PaddleFL and FederatedScope. TFF, OpenFL, IBM FL, FLARE, FLSim, FLUTE, FedLab and EasyFL receive a zero score because they only support the standard horizontal FL paradigm [ 55 , 57 , 66 , 74 , 78 , 81 , 83 , 86 , 87 , 88 , 90 , 92 , 126 ].

Evaluation of Interoperability The Rollout To Edge Devices that allows FL applications to be implemented in real-world environments (e.g., on thin-clients or IoT devices) is possible with PySyft, FedML, Flower, IBM FL and FLARE. Therefore, they receive full marks on this criterion. However, PySyft only supports Raspberry Pi, while the other four FL frameworks also support the Nvidia Jetson Development Kits [ 86 ]. FATE AI, PaddleFL, FederatedScope and EasyFL each receive half the possible score because the rollout process on edge devices is more cumbersome compared to the other FL frameworks. For example, FATE AI and PaddleFL require edge devices with at least 100 GB of storage and 6 GB of RAM, which excludes most single-board computers. The FL frameworks TFF, FedLearner, OpenFL, FLUTE, FLSim and FedLab do not score on this criterion because they only support FL in simulation mode on a single device [ 46 , 52 , 60 , 62 , 63 , 64 , 77 , 83 , 87 , 89 , 91 , 93 ].

For the Operating System support, PySyft, FedML, Flower, IBM FL, FLARE, FLSim, FederatedScope and FedLab receive full marks, as both Windows and MacOS are natively supported. TFF (MacOS), OpenFL (MacOS), FLUTE (Windows) and EasyFL (MacOS) natively support only one of the two operating systems and receive half the score. FATE AI, FedLearner and PaddleFL run only on Linux and require Docker containers when used on Windows or MacOS. Therefore, these three FL frameworks do not receive any points for this criterion [ 50 , 57 , 60 , 73 , 76 , 78 , 79 , 81 , 83 , 84 , 87 , 88 , 90 , 91 , 93 ].

All compared FL frameworks offer GPU Support and receive full scores on this criterion, except for FLSim. The documentation of FLSim makes no reference to a CUDA acceleration mode during FL training and CUDA could not be enabled during the conducted experiments. Therefore, this FL framework receives a score of zero in this criterion [ 51 , 52 , 63 , 66 , 73 , 74 , 76 , 81 , 83 , 86 , 87 , 90 , 91 , 93 ].

Thirteen of the 15 FL framework candidates have a Docker containerization option and therefore receive full marks. These frameworks provide Docker images, which can be installed using the Docker Engine. A Docker container creates an isolated environment in which software can be installed even if the host system does not meet its requirements [ 109 ]. Some frameworks, such as FLARE and OpenFL, provide a Dockerfile that builds the image automatically, while others, such as PaddleFL, document how to install the Docker image manually. Surprisingly, FLSim and Microsoft’s FLUTE do not seem to support Docker containers: their documentation does not mention Docker, and containerized use was not possible during the experiments conducted. Therefore, these two FL frameworks receive zero points for this criterion [ 57 , 60 , 73 , 74 , 76 , 78 , 79 , 81 , 83 , 84 , 87 , 88 , 90 , 91 , 93 ].

Evaluation of User Friendliness For the FATE AI, PaddleFL and FedLearner FL frameworks, it was not possible to evaluate the criteria Development Effort, Model Accuracy, Training Speed, Data Preparation Effort and Model Evaluation because of a number of issues with these FL frameworks, such as failed installations on Windows, Linux or MacOS. Because test experiments could not be performed with them, these FL frameworks are marked as N.A. in the mentioned criteria.

For Development Effort, TFF, OpenFL, FedLab and EasyFL receive a score of one, as the setup of applications with these frameworks was intuitive, fast and required few lines of code. FedML, Flower, IBM FL, FLSim, FLUTE and FederatedScope receive a half score, since development requires more lines of code than with the four frameworks mentioned previously, although aspects of the training process like the federated aggregation step or the local loss step are already implemented. PySyft and FLARE require the most development effort because parts of the training process, such as gradient descent, must be implemented and set by the user, which is not the case for the other FL framework candidates. Thus, PySyft and FLARE receive zero points on Development Effort.

As for the global Model Accuracy , PySyft, Flower, OpenFL, IBM FL, FLARE, FLSim, FedLab and EasyFL achieved a test accuracy of over 90% in the performed MNIST classification simulation. On the other hand, FedML, TFF, FLUTE and FederatedScope performed worse, achieving an accuracy below the 90% threshold, thus receiving only half the score, even though the same model architecture, configuration and parameters have been used (see Table 4 on page 11). The test accuracies for the tested frameworks can be found in Table 6 on page 13.

Surprisingly, the amount and quality of Documentation available for the FL frameworks vary widely. PySyft [ 64 , 73 ], TFF [ 63 , 79 ], Flower [ 51 , 77 , 78 ], FLARE [ 84 , 85 , 86 ] and EasyFL [ 55 , 93 , 94 ] provide extensive API documentation, several sample applications and video tutorials to learn how to use these frameworks. These FL frameworks receive the full score on the criterion Documentation. However, FedLearner [ 60 ], PaddleFL [ 57 ], FLSim [ 87 ] and FLUTE [ 69 , 88 ] provide only little and mostly outdated documentation. Therefore, this group of FL frameworks receives zero points here. For FATE AI [ 56 , 74 ], FedML [ 46 , 75 , 76 ], OpenFL [ 49 , 81 ], IBM FL [ 48 , 59 , 83 ], FederatedScope [ 53 , 89 , 90 ] and FedLab [ 54 , 91 , 92 ], the available documentation is less extensive and at times outdated. These FL frameworks receive a score of 0.5 for this criterion.

When performing the test experiments with the FL framework candidates, there were also differences in the model Training Speed. With TFF, OpenFL, FLSim, FedLab and EasyFL, the federated training was completed in less than a minute, giving these frameworks a full score. FL frameworks with a training time between one and three minutes (FedML, Flower, FLARE, FLUTE, FederatedScope) received half of the score, while training on PySyft and IBM FL took longer than three minutes, resulting in a score of zero for these two frameworks. Since FLUTE can only be used on Windows [ 88 ], its training speed measurement may not be directly comparable to the measurements of the other FL frameworks, which were taken on another computer running MacOS with a different hardware specification. The exact training speeds for the tested frameworks can be found in Table 6 on page 13.
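The three time bands used for the Training Speed criterion can be written as a small mapping function. This is a sketch under stated assumptions: the function name `training_speed_score` is hypothetical, and the handling of the exact boundaries (60 s and 180 s) is an assumption, since the text only names the bands. The timings in the example are invented for illustration and are not the measurements from Table 6.

```python
def training_speed_score(seconds):
    """Map a measured federated training time to a Training Speed score."""
    if seconds < 0:
        raise ValueError("training time must be non-negative")
    if seconds < 60:      # completed in less than a minute: full score
        return 1.0
    if seconds <= 180:    # between one and three minutes: half score
        return 0.5
    return 0.0            # longer than three minutes: zero score


# Hypothetical timings in seconds, for illustration only.
example_timings = {"framework_a": 42.0, "framework_b": 95.5, "framework_c": 240.0}
scores = {name: training_speed_score(t) for name, t in example_timings.items()}
print(scores)
# {'framework_a': 1.0, 'framework_b': 0.5, 'framework_c': 0.0}
```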

For the assessment of the Data Preparation Effort, we considered the effort required to transform proxy training datasets such as MNIST [ 112 ] into the required data format of the FL frameworks. Here, PySyft, Flower, FLARE, FLUTE and FedLab required only minor adjustments (e.g., reshaping the input data) and therefore received full scores, while TFF and IBM FL required more preparation, so both FL frameworks received a score of zero. FedML, OpenFL, FLSim, FederatedScope and EasyFL received a score of 0.5.

For the Model Evaluation criterion, TFF, OpenFL, IBM FL, FLSim, FederatedScope, FedLab and EasyFL provide built-in evaluation methods that display test set loss and accuracy metrics for the federated training of a global model, resulting in a full score for these FL frameworks. PySyft receives a score of zero because all evaluation metrics must be implemented manually, which may require additional libraries (e.g., TensorBoard) and runs counter to the User Friendliness focus of this category. FedML, Flower, OpenFL and FLUTE provided evaluation methods with incomplete or convoluted output and thus received a score of 0.5.

For the Pricing System criterion, all FL framework candidates except FLUTE and IBM FL receive full marks because their features are freely accessible. FLUTE is integrated with Azure ML Studio [ 69 ]. Microsoft touts a faster and easier federated development process through its cloud service and presents FLUTE’s integration with Azure ML as one of its key benefits, as the federated application can be used directly in the Azure ecosystem [ 69 ]. IBM FL, on the other hand, is part of the IBM Watson Studio cloud service, where additional features such as UI-based monitoring and configuration are available that cannot be used in the open-source community edition [ 59 ]. Therefore, FLUTE and IBM FL do not score on this criterion.

5.3 Discussion

Considering the scores at the category level, there are some FL frameworks that received notable scores in certain categories. FederatedScope received the highest score in the Features category with 100%, offering differential privacy and homomorphic encryption as security mechanisms, support for different ML libraries and many FL algorithms like FedAvg, FedOpt and FedProx. Meanwhile, EasyFL received only 25% of the score, offering no security mechanisms, FedAvg as the only implemented FL algorithm and one ML library, while only horizontal FL is available as a paradigm.

The FL frameworks PySyft, FedML, Flower, IBM FL and FLARE earned a perfect score of 100% in the Interoperability category, while FedLearner and FLSim jointly performed worst, each receiving 25% of the category score (see Table 5 on page 12). FedLearner does not offer a rollout on edge devices and is not available for installation on either Windows or MacOS, limiting its potential user base. FLSim is available for both Windows and MacOS, but does not support a rollout on edge devices, GPU-based computation or Docker containerization.

Remarkably, EasyFL received the highest score of 95% in the User Friendliness category, fulfilling the most important criteria: Development Effort, Model Accuracy, Documentation and Training Speed. The FL frameworks for which no test application could be created received the lowest scores, with FedLearner and PaddleFL receiving the lowest score in this category with 5%, and FATE AI receiving 15%. These low scores are noteworthy, since these three FL frameworks all have a long development history and are popular within the community (see Table 3 on page 9).

Based on the conducted comparison and evaluation, a ranking of FL frameworks can be constructed, which is visualized in Fig. 4 on page 15. It can be concluded that in terms of the overall score, Flower performed best with 84.75%, followed by FLARE with 80.5% and FederatedScope with 78.75% (see Table 5 on page 12). PySyft, FedML, OpenFL, EasyFL, IBM FL, TFF and FedLab all received scores at or above 60% overall. FLSim received a score of 54.25% and FLUTE scored 43.25%, while FATE AI, PaddleFL and FedLearner all scored below 40% in total, with FedLearner’s 24.75% marking the lowest score of the frameworks in this comparison.
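The total scores follow from the category weights given in Subsect. 5.1 (User Friendliness 50%, Features 30%, Interoperability 20%). The sketch below, with hypothetical function names, combines category scores into a total and, as a consistency check, backs out the Features score implied for FedLearner by the values stated in this section (5% in User Friendliness, 25% in Interoperability, 24.75% in total). The resulting Features value is inferred here and is not quoted from Table 5.

```python
# Category weights in percent, as stated in Subsect. 5.1.
CATEGORY_WEIGHTS = {
    "User Friendliness": 50,
    "Features": 30,
    "Interoperability": 20,
}


def total_score(category_scores):
    """Weighted sum of the category percentage scores."""
    weighted = sum(CATEGORY_WEIGHTS[c] * category_scores[c] for c in CATEGORY_WEIGHTS)
    return weighted / 100


def implied_category_score(total, known_scores, missing):
    """Solve the weighted sum for the one category score that is not known."""
    known = sum(CATEGORY_WEIGHTS[c] * s for c, s in known_scores.items())
    return (100 * total - known) / CATEGORY_WEIGHTS[missing]


# Values stated in the text for FedLearner.
fedlearner_known = {"User Friendliness": 5, "Interoperability": 25}
features = implied_category_score(24.75, fedlearner_known, "Features")
print(features)                                                 # 57.5
print(total_score({**fedlearner_known, "Features": features}))  # 24.75
```

Because User Friendliness carries half of the total weight, a framework that could not be tested is capped well below the frameworks in the upper part of the ranking, regardless of its Features score.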

The graphical representation of the scores on the bar plot further shows that the top ten FL frameworks, although with big differences in the category scores, all achieved relatively high total scores (at or above 60%). This suggests that a number of FL frameworks could already offer a satisfying solution for practitioners. The total score for the final five FL frameworks on the bar plot decreases sharply, indicating significant shortcomings in categories or specific criteria. FLSim and FLUTE scored low in the Interoperability category at 25% and 27.5% respectively, while FATE AI, PaddleFL and FedLearner received low User Friendliness scores (15%, 5%, and 5%).

Fig. 4 Total scores (in percentage) of the compared frameworks

Generally, the difference in score between the FL frameworks in the Features category is small compared to the other categories. Only two frameworks score below 50%. Most variance in this category is introduced by the security and paradigm criteria. Should secure computation and communication be the focal point of development, then PySyft, PaddleFL, FLARE, FLSim and FederatedScope would provide the most extensive features for this use case.

In the Interoperability category, it is observable that only five of the FL frameworks (PySyft, FedML, Flower, IBM FL, FLARE) support a rollout on edge devices without strong limitations. This explains the high fluctuation of scores for this category, as the Rollout criterion was weighted heavily. Should the development of a fully realized, distributed FL application be central to a project, these five FL frameworks offer the best conditions and are most suitable for communication and real-time computing with IoT edge devices.

Examining the User Friendliness category, the Development Effort and Documentation criteria explain a lot of variability, while most FL frameworks generally perform well when tested for model test accuracy and federated training speed. An unexpectedly large variance was observed in the Training Speed criterion, with times ranging from under one minute to over three minutes. This may be explained by the different architecture of the FL frameworks and sequential and parallel computing approaches in simulation mode. Overall, the three FL frameworks (FATE AI, FedLearner, PaddleFL) for which no test application could be created are big outliers in this category. These three frameworks consequently also received the lowest total score, as displayed in Fig. 4 on page 15.

Furthermore, there are specific use cases for which some frameworks may be particularly suitable. FLARE is developed by NVIDIA, the same company that released Clara, an artificial intelligence suite focused on medical use cases. It may therefore be argued that FLARE profits from experience gained during the development of Clara. Meanwhile, FedML provides a website with an FL dashboard where projects can be tracked and shared with collaborators, allowing for easy deployment and sharing of applications. This may be advantageous when developing FL applications across organizations. Finally, an extension for FATE called FATE-LLM has been released, targeting the development of large language models in a federated setting and giving FATE a strong foundation in this area [ 127 ].

It can be concluded that the evaluated FL frameworks are relatively homogeneous regarding the criteria in the Features category. Support for a rollout on edge devices in the Interoperability category and differences in the availability and quality of documentation in the User Friendliness category are the major reasons for the variance in total score between the FL frameworks. To attract practitioners, the underperforming FL frameworks most urgently need to improve these two aspects.

5.4 Result summary

Based on the literature-driven comparison and analysis results, the RQs posed at the beginning of this paper (see Subsect. 1 on page 1) can be answered as follows:

RQ 1: Which relevant frameworks for FL exist and are open-source? 15 relevant FL frameworks were selected, reduced from a total of 18 identified FL frameworks after applying the inclusion criteria defined in Subsect. 4.1 on page 7. Table 3 on page 9 gives an overview of the selected FL frameworks. These filtered frameworks are all available as open-source software and have community and industry support. The FL frameworks are used as objects of study in the FL framework comparative analysis (see Sect. 5 on page 8).

RQ 2: Which criteria enable a qualitative and quantitative comparison of FL frameworks? The criteria, weights and evaluation schema introduced in Subsect. 5.1 and summarized in Table 7 on page 20 are used in the comparison in Subsect. 5.2. The criteria include quantitative measures such as Model Accuracy and Training Speed as well as qualitative measures such as the included Security Mechanisms and the quality and scope of the available Documentation. The evaluation schema based on these criteria creates a versatile and comprehensive comparison of FL frameworks.

RQ 3: Which FL framework offers the most added value to practitioners and researchers? Different FL frameworks received the highest scores in each of the three formulated categories (FederatedScope in Features; PySyft, FedML, Flower, IBM FL and FLARE in Interoperability; and EasyFL in User Friendliness). This indicates that any one of several FL frameworks might provide the most added value, depending on one’s preferences and needs regarding a particular project. The criteria, their weights and the presented results can in this case act as guidelines for FL framework selection. However, based on the comparative results (see Subsect. 5.2 on page 11), the FL framework Flower currently offers the most overall added value to practitioners and researchers.

6 Limitations and outlook

In this study, not all currently available FL frameworks are represented, since we formulated inclusion criteria to limit the number of FL framework candidates (see Subsect. 4.1 on page 7). The set of FL frameworks covered by the proposed comparison suite can be extended to include, for example, proprietary framework candidates that have not been considered in this study. A comparison of these with open-source FL frameworks could provide further interesting insights into the alignment and target audience of each FL framework. Additional experiments with FL frameworks in different FL settings could lead to more comprehensive benchmarking results. The vertical FL and federated transfer learning settings would be possible additions, should more frameworks support these paradigms in the future. Depending on the use case, an adjustment of the criteria weighting might also be required. Therefore, the comparison evaluation schema proposed in this paper can be adapted as desired to reflect the priorities of practitioners and researchers for particular FL projects.

FL is still a niche research field, but the number of scientific papers published each year is steadily increasing (see Fig. 3 on page 6) [ 128 , 129 , 130 , 131 , 132 ]. Based on this trend, we also expect a large number of new FL frameworks to be released in the near future. These emerging FL frameworks can be evaluated and compared to other FL frameworks upon release using the comparison methodology proposed in this paper.

7 Conclusion

In this study, a comparison suite to evaluate open-source Federated Learning (FL) frameworks was introduced. For this, a literature review was conducted following the guidelines set by Webster and Watson. The review method involved identifying relevant literature and organizing it based on the most significant concepts discovered by applying Latent Dirichlet Allocation (LDA) to the identified publications relevant to FL. Based on the filtered relevant literature, comparison criteria were formulated and a weighted scoring system was proposed. The criteria were grouped into the overarching categories of Features, Interoperability, and User Friendliness. Additionally, two inclusion criteria, namely open-source availability and community popularity, were established to narrow down the number of FL frameworks under consideration. This enabled us to conduct a more detailed comparison and evaluation of 15 relevant open-source FL frameworks as the study subjects. Both qualitative and quantitative aspects of the FL frameworks were compared, and a detailed score was calculated for each FL framework as a percentage.

The comparison demonstrated that among the investigated FL frameworks, Flower performed best, achieving a total score of 84.75%. Other FL framework candidates such as FLARE, FederatedScope, PySyft, FedML, OpenFL, EasyFL, IBM FL, TFF and FedLab also achieved a high total score (at or above 60%) but could not beat Flower in all aspects. Additionally, we observed that FederatedScope performed best in the Features category. PySyft, FedML, Flower, IBM FL and FLARE all received the highest score in the Interoperability category, while EasyFL performed best in the User Friendliness category. The worst-performing FL frameworks were FATE AI, PaddleFL and FedLearner with total scores of 38.5%, 35% and 24.75%, respectively, because they scored poorly in the Interoperability and particularly in the User Friendliness category. Due to their limitations, test experiments could not be conducted to accurately measure criteria such as Model Accuracy or Training Speed. While this study demonstrated the superior performance of FL frameworks such as Flower, FLARE or FederatedScope in most baseline scenarios, it is important to note that the priorities and requirements of practitioners and researchers may vary. Therefore, the results of this study can be used primarily as a guiding tool in the FL framework selection process for federated-driven analyses.

Data availability

The MNIST [ 111 , 112 ] proxy dataset that supports the findings of this study is openly available at http://yann.lecun.com/exdb/mnist/

References

McMahan HB, Moore E, Ramage D, Hampson S, Arcas BA (2017) Communication-efficient learning of deep networks from decentralized data. J Mach Learn Res 54:1273–1282

Abadi M, Chu A, Goodfellow I, McMahan HB, Mironov I, Talwar K, Zhang L (2016) Deep learning with differential privacy. 23rd ACM conference on computer and communications security (CCS 2016), 308–318. https://doi.org/10.1145/2976749.2978318

Hard A, Rao K, Mathews R, Beaufays F, Augenstein S, Eichner H, Kiddon C, Ramage D (2018) Federated learning for mobile keyboard prediction. arXiv:1811.03604

Li T, Sahu AK, Talwalkar A, Smith V (2020) Federated learning: challenges, methods, and future directions. IEEE Signal Process Mag 37:50–60. https://doi.org/10.1109/MSP.2020.2975749

Kairouz P, McMahan HB, Avent B, Bellet A, Bennis M, Bhagoji AN, Bonawitz KA, Charles Z, Cormode G, Cummings R, D’Oliveira RGL, Rouayheb SE, Evans D, Gardner J, Garrett Z, Gascon A, Ghazi B, Gibbons PB, Gruteser M, Harchaoui Z, He C, He L, Huo Z, Hutchinson B, Hsu J, Jaggi M, Javidi T, Joshi G, Khodak M, Konecny J, Korolova A, Koushanfar F, Koyejo S, Lepoint T, Liu Y, Mittal P, Mohri M, Nock R, Ozgur A, Pagh R, Raykova M, Qi H, Ramage D, Raskar R, Song D, Song W, Stich SU, Sun Z, Suresh AT, Tramer F, Vepakomma P, Wang J, Xiong L, Xu Z, Yang Q, Yu FX, Yu H, Zhao S (2021) Advances and open problems in federated learning. Found Trends Mach Learn 14:1–121. https://doi.org/10.1561/2200000083

Zhang L, Zhu T, Xiong P, Zhou W, Yu P (2023) A robust game-theoretical federated learning framework with joint differential privacy. IEEE Trans Knowl Data Eng 35:3333–3346. https://doi.org/10.1109/TKDE.2021.3140131

Jin H, Bai D, Yao D, Dai Y, Gu L, Yu C, Sun L (2023) Personalized edge intelligence via federated self-knowledge distillation. IEEE Trans Parallel Distrib Syst 34:567–580. https://doi.org/10.1109/TPDS.2022.3225185

Nguyen DC, Pham Q-V, Pathirana PN, Ding M, Seneviratne A, Lin Z, Dobre O, Hwang W-J (2022) Federated learning for smart healthcare: a survey. ACM Comput Surv 55:1–37

Antunes RS, da Costa CA, Küderle A, Yari IA, Eskofier B (2022) Federated learning for healthcare: systematic review and architecture proposal. ACM Trans Intell Syst Technol 13:1–23

Xing H, Xiao Z, Qu R, Zhu Z, Zhao B (2022) An efficient federated distillation learning system for multi-task time series classification. IEEE Trans Instrum Meas 71:1–12. https://doi.org/10.1109/TIM.2022.3201203

Riedel P, von Schwerin R, Schaudt D, Hafner A, Späte C (2023) ResNetFed: federated deep learning architecture for privacy-preserving pneumonia detection from covid-19 chest radiographs. J Healthcare Inf Res 7:203–224

Rahman A, Hossain MS, Muhammad G, Kundu D, Debnath T, Rahman M, Khan MSI, Tiwari P, Band SS (2023) Federated learning-based ai approaches in smart healthcare: concepts, taxonomies, challenges and open issues. Clust Comput 26:2271–2311. https://doi.org/10.1007/s10586-022-03658-4

Bharati S, Mondal MRH, Podder P, Prasath VBS (2022) Federated learning: applications, challenges and future directions. Int J Hybrid Intell Syst 18:19–35

Witt L, Heyer M, Toyoda K, Samek W, Li D (2023) Decentral and incentivized federated learning frameworks: a systematic literature review. IEEE Internet Things J 10:3642–3663

Xiao Z, Xu X, Xing H, Song F, Wang X, Zhao B (2021) A federated learning system with enhanced feature extraction for human activity recognition. Knowl-Based Syst. https://doi.org/10.1016/j.knosys.2021.107338

Boobalan P, Ramu SP, Pham QV, Dev K, Pandya S, Maddikunta PKR, Gadekallu TR, Huynh-The T (2022) Fusion of federated learning and industrial internet of things: a survey. Comput Netw 212

Pandya S, Srivastava G, Jhaveri R, Babu MR, Bhattacharya S, Maddikunta PKR, Mastorakis S, Thippa MJP, Gadekallu R (2023) Federated learning for smart cities: a comprehensive survey. Sustain Energy Technol Assess 55:2–13

Zhang T, Gao L, He C, Zhang M, Krishnamachari B, Avestimehr AS (2022) Federated learning for the internet of things: applications, challenges, and opportunities. IEEE Internet Things Mag 5:24–29

Zhang K, Song X, Zhang C, Yu S (2021) Challenges and future directions of secure federated learning: a survey. Front Comput Sci 16:1–8

Li C, Zeng X, Zhang M, Cao Z (2022) Pyramidfl: a fine-grained client selection framework for efficient federated learning. Proceedings of the 28th annual international conference on mobile computing and networking 28, 158–171

Huang W, Ye M, Du B (2022) Learn from others and be yourself in heterogeneous federated learning. 2022 IEEE/CVF conference on computer vision and pattern recognition (CVPR)

Wen J, Zhang Z, Lan Y, Cui Z, Cai J, Zhang W (2023) A survey on federated learning: challenges and applications. Int J Mach Learn Cybern 14:513–535. https://doi.org/10.1007/s13042-022-01647-y

Guendouzi BS, Ouchani S, Assaad HE, Zaher ME (2023) A systematic review of federated learning: challenges, aggregation methods, and development tools. J Netw Comput Appl. https://doi.org/10.1016/j.jnca.2023.103714

Zhao Y, Li M, Lai L, Suda N, Civin D, Chandra V (2018) Federated learning with non-iid data arXiv:1806.00582

Almanifi ORA, Chow C-O, Tham M-L, Chuah JH, Kanesan J (2023) Communication and computation efficiency in federated learning: a survey. Internet Things 22:100742

Xu C, Qu Y, Xiang Y, Gao L (2023) Asynchronous federated learning on heterogeneous devices: a survey. Comput Sci Rev. https://doi.org/10.1016/j.cosrev.2023.100595

Qi P, Chiaro D, Guzzo A, Ianni M, Fortino G, Piccialli F (2023) Model aggregation techniques in federated learning: a comprehensive survey. Futur Gener Comput Syst 150:272–293. https://doi.org/10.1016/j.future.2023.09.008

Li Q, Diao Y, Chen Q, He B (2022) Federated learning on non-iid data silos: an experimental study. 2022 IEEE 38th international conference on data engineering (ICDE)

Wang Z, Xu H-Z, Xu Y, Jiang Z, Liu J, Chen S (2024) Fast: enhancing federated learning through adaptive data sampling and local training. IEEE Trans Parallel Distrib Syst 35:221–236. https://doi.org/10.1109/TPDS.2023.3334398

Abreha HG, Hayajneh M, Serhani MA (2022) Federated learning in edge computing: a systematic survey. Sensors 22:450

Zhang T, Mao S (2021) An introduction to the federated learning standard. GetMobile Mobile Comput Commun 25:18–22

Beltrán ETM, Pérez MQ, Sánchez PMS, Bernal SL, Bovet G, Pérez MG, Pérez GM, Celdrán AH (2023) Decentralized federated learning: fundamentals, state of the art, frameworks, trends, and challenges. IEEE Commun Surv Tutorials 25:2983–3013. https://doi.org/10.1109/COMST.2023.3315746

Yang Q, Liu Y, Chen T, Tong Y (2019) Federated machine learning: concept and applications. ACM Trans Intell Syst Technol 10:1–19. https://doi.org/10.1145/3298981

Gong X, Chen Y, Wang Q, Kong W (2023) Backdoor attacks and defenses in federated learning: state-of-the-art, taxonomy, and future directions. IEEE Wirel Commun 30:114–121. https://doi.org/10.1109/MWC.017.2100714

Dwork C, Roth A (2014) The algorithmic foundations of differential privacy. Found Trends Theor Comput Sci 9:211–407. https://doi.org/10.1561/0400000042

McMahan HB, Ramage D, Talwar K, Zhang L (2018) Learning differentially private recurrent language models. International Conference on Learning Representations

Shaheen M, Farooq MS, Umer T, Kim B-S (2022) Applications of federated learning; taxonomy, challenges, and research trends. Electronics 11:670

Rodríguez-Barroso N, Jiménez-López D, Luzón MV, Herrera F, Martínez-Cámara E (2023) Survey on federated learning threats: concepts, taxonomy on attacks and defences, experimental study and challenges. Inf Fusion 90:148–173

Cummings R, Gupta V, Kimpara D, Morgenstern JH (2019) On the compatibility of privacy and fairness. Adjunct publication of the 27th conference on user modeling, adaptation and personalization, 309–315. https://doi.org/10.1145/3314183.3323847

Kusner MJ, Loftus JR, Russell C, Silva R (2017) Counterfactual fairness. 31st conference on neural information processing systems 30, 4069–4079

Ding J, Tramel E, Sahu AK, Wu S, Avestimehr S, Zhang T (2022) Federated learning challenges and opportunities: an outlook. ICASSP 2022 - 2022 IEEE international conference on acoustics, speech and signal processing (ICASSP)

Buolamwini J, Gebru T (2018) Gender shades: intersectional accuracy disparities in commercial gender classification. Proc Mach Learn Res 81:1–15

Zhang X, Kang Y, Chen K, Fan L, Yang Q (2023) Trading off privacy, utility, and efficiency in federated learning. ACM Trans Intell Syst Technol 14:98–18931. https://doi.org/10.1145/3595185

Khan M, Glavin FG, Nickles M (2023) Federated learning as a privacy solution - an overview. Procedia Comput Sci 217:316–325. https://doi.org/10.1016/j.procs.2022.12.227

Webster J, Watson RT (2002) Analyzing the past to prepare for the future: writing a literature review. MIS Q 26(2)

He C, Li S, So J, Zhang M, Wang H, Wang X, Vepakomma P, Singh A, Qiu H, Shen L, Zhao P, Kang Y, Liu Y, Raskar R, Yang Q, Annavaram M, Avestimehr S (2020) Fedml: a research library and benchmark for federated machine learning arXiv:2007.13518

Barroso NR, Stipcich G, Jimenez-Lopez D, Ruiz-Millan JA, Martinez-Camara E, Gonzalez-Seco G, Luzon MV, Veganzones MA, Herrera F (2020) Federated learning and differential privacy: software tools analysis, the sherpa.ai fl framework and methodological guidelines for preserving data privacy. Inf Fusion 64:270–292

Ludwig H, Baracaldo N, Thomas G, Zhou Y, Anwar A, Rajamoni S, Ong YJ, Radhakrishnan J, Verma A, Sinn M, Purcell M, Rawat A, Minh TN, Holohan N, Chakraborty S, Witherspoon S, Steuer D, Wynter L, Hassan H, Laguna S, Yurochkin M, Agarwal M, Chuba E, Abay A (2020) Ibm federated learning: an enterprise framework white paper v0.1 arXiv:2007.10987

Reina GA, Gruzdev A, Foley P, Perepelkina O, Sharma M, Davidyuk I, Trushkin I, Radionov M, Mokrov A, Agapov D, Martin J, Edwards B, Sheller MJ, Pati S, Moorthy PN, Wang HS, Shah P, Bakas S (2021) Openfl: an open-source framework for federated learning arXiv:2105.06413

Liu Y, Fan T, Chen T, Xu Q, Yang Q (2021) Fate: an industrial grade platform for collaborative learning with data protection. J Mach Learn Res 22:1–6

Beutel DJ, Topal T, Mathur A, Qiu X, Parcollet T, Lane ND (2020) Flower: a friendly federated learning research framework arXiv:2007.14390

Dimitriadis D, Garcia MH, Diaz DM, Manoel A, Sim R (2022) Flute: a scalable, extensible framework for high-performance federated learning simulations arXiv:2203.13789

Xie Y, Wang Z, Gao D, Chen D, Yao L, Kuang W, Li Y, Ding B, Zhou J (2023) Federatedscope: a flexible federated learning platform for heterogeneity. Proc VLDB Endowment 16:1000–1012. https://doi.org/10.14778/3579075.3579076

Zeng D, Liang S, Hu X, Wang H, Xu Z (2023) Fedlab: a flexible federated learning framework. J Mach Learn Res 24:1–7

Zhuang W, Gan X, Wen Y, Zhang S (2022) Easyfl: a low-code federated learning platform for dummies. IEEE Internet Things J 9:13740–13754. https://doi.org/10.1109/JIOT.2022.3143842

FedAI: what is FATE? https://fate.fedai.org/overview/ Accessed 20 Feb 2024

PaddlePaddle: GitHub Repository PaddlePaddle/PaddleFL. https://github.com/PaddlePaddle/PaddleFL Accessed 20 Feb 2024

NVIDIA: NVIDIA Clara: an application framework optimized for healthcare and life sciences developers. https://developer.nvidia.com/clara Accessed 30 May 2023

IBM Research: IBM Federated Learning. https://ibmfl.res.ibm.com Accessed 20 Feb 2024

ByteDance: GitHub Repository FedLearner. https://github.com/bytedance/fedlearner Accessed 20 Feb 2024

Liu J, Huang J, Zhou Y, Li X, Ji S, Xiong H, Dou D (2022) From distributed machine learning to federated learning: a survey. Knowl Inf Syst 64:885–917

Kholod I, Yanaki E, Fomichev D, Shalugin ED, Novikova E, Filippov E, Nordlund M (2021) Open-source federated learning frameworks for iot: a comparative review and analysis. Sensors 21:167–189. https://doi.org/10.3390/s21010167

TensorFlow: TensorFlow Federated: Machine Learning on Decentralized Data. https://www.tensorflow.org/federated Accessed 20 Feb 2024

OpenMined: OpenMined. https://www.openmined.org Accessed 20 Feb 2024

Sherpa.ai: Sherpa.ai: Privacy-Preserving Artificial Intelligence. https://www.sherpa.ai Accessed 20 Feb 2024

Liu X, Shi T, Xie C, Li Q, Hu K, Kim H, Xu X, Li B, Song D (2022) Unifed: a benchmark for federated learning frameworks arXiv:2207.10308

SciKitLearn: Latent Dirichlet Allocation. https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html Accessed 24 April 2023

OpenAI: OpenAI: Pricing. https://openai.com/pricing Accessed 20 Feb 2024

Microsoft: FLUTE: a scalable federated learning simulation platform. https://bit.ly/3KnvugJ Accessed 20 Feb 2024

Caldas S, Duddu SMK, Wu P, Li T, Konečný J, McMahan HB, Smith V, Talwalkar A (2018) Leaf: a benchmark for federated settings

Lai F, Dai Y, Singapuram S, Liu J, Zhu X, Madhyastha H, Chowdhury M (2022) Fedscale: benchmarking model and system performance of federated learning at scale. Proceedings of the 39th international conference on machine learning 162

FederalLab: GitHub Repository OpenFed. https://github.com/FederalLab/OpenFed Accessed 20 Feb 2024

OpenMined: GitHub Repository OpenMined/PySyft. https://github.com/OpenMined Accessed 20 Feb 2024

FedAI: GitHub Repository FedAI/FATE. https://github.com/FederatedAI/FATE Accessed 20 Feb 2024

FedML: FedML: The Federated Learning/Analytics and Edge AI Platform. https://fedml.ai Accessed 20 Feb 2024

FedML: GitHub Repository FedML-AI. https://github.com/FedML-AI Accessed 20 Feb 2024

Adap: Adap: Fleet AI. https://adap.com/en Accessed 20 Feb 2024

Adap: GitHub Repository Adap/Flower. https://github.com/adap/flower Accessed 20 Feb 2024

TensorFlow: GitHub Repository TensorFlow/Federated. https://github.com/tensorflow/federated Accessed 20 Feb 2024

Baidu research: Baidu PaddlePaddle releases 21 new capabilities to accelerate industry-grade model development. http://research.baidu.com/Blog/index-view?id=126 Accessed 07 Aug 2023

Intel: GitHub Repository Intel/OpenFL. https://github.com/intel/openfl Accessed 20 Feb 2024

University of Pennsylvania: CBICA: The Federated Tumor Segmentation (FeTS) Initiative. https://www.med.upenn.edu/cbica/fets/ Accessed 24 Aug 2022

IBM: GitHub Repository IBM Federated Learning. https://github.com/IBM/federated-learning-lib Accessed 20 Feb 2024

NVIDIA: GitHub Repository NVIDIA FLARE. https://github.com/NVIDIA/NVFlare Accessed 20 Feb 2024

Dogra, P.: Federated learning with FLARE: NVIDIA brings collaborative AI to healthcare and beyond. https://blogs.nvidia.com/blog/2021/11/29/federated-learning-ai-nvidia-flare/ Accessed 02 Aug 2023

NVIDIA: NVIDIA FLARE Documentation. https://nvflare.readthedocs.io/en/2.1.1/index.html Accessed 20 Feb 2024

Meta Research: GitHub Repository FLSim. https://github.com/facebookresearch/FLSim Accessed 20 Feb 2024

Microsoft: GitHub Repository Microsoft FLUTE. https://github.com/microsoft/msrflute Accessed 20 Feb 2024

FederatedScope: FederatedScope. https://federatedscope.io Accessed 20 Feb 2024

FederatedScope: GitHub FederatedScope. https://github.com/alibaba/FederatedScope Accessed 20 Feb 2024

FedLab: GitHub FedLab. https://github.com/SMILELab-FL/FedLab Accessed 20 Feb 2024

FedLab: ReadTheDocs FedLab. https://fedlab.readthedocs.io/en/master/ Accessed 20 Feb 2024

EasyFL: GitHub EasyFL. https://github.com/EasyFL-AI/EasyFL/tree/master Accessed 20 Feb 2024

EasyFL: ReadTheDocs EasyFL. https://easyfl.readthedocs.io/en/latest/ Accessed 20 Feb 2024

Bonawitz K, Eichner H, Grieskamp W, Huba D, Ingerman A, Ivanov V, Kiddon C, Konečný J, Mazzocchi S, McMahan B, Overveldt TV, Petrou D, Ramage D, Roselander J (2019) Towards federated learning at scale: system design. Proc Mach Learn Syst 1:374–388

Mansour Y, Mohri M, Ro J, Suresh AT (2020) Three approaches for personalization with applications to federated learning arXiv:2002.10619

Silva PR, Vinagre J, Gama J (2023) Towards federated learning: an overview of methods and applications. WIREs Data Min Knowl Discov 13:1–23

Zhu H, Xu J, Liu S, Jin Y (2021) Federated learning on non-iid data: a survey. Neurocomputing 465:371–390. https://doi.org/10.1016/j.neucom.2021.07.098

Nilsson A, Smith S, Ulm G, Gustavsson E, Jirstrand M (2018) A performance evaluation of federated learning algorithms. DIDL ’18: Proceedings of the second workshop on distributed infrastructures for deep learning 2, 1–8. https://doi.org/10.1145/3286490.3286559

Asad M, Moustafa A, Ito T, Aslam M (2020) Evaluating the communication efficiency in federated learning algorithms. Proceedings of the 27th ACM symposium on operating systems principles. https://doi.org/10.1109/CSCWD49262.2021.9437738

Smith V, Chiang C-K, Sanjabi M, Talwalkar A (2017) Federated multi-task learning. 31st conference on neural information processing systems (NIPS 2017), 4427–4437

Lo SK, Lu Q, Wang C, Paik H, Zhu L (2021) A systematic literature review on federated machine learning: from a software engineering perspective. ACM Comput Surv 54(5):1–39. https://doi.org/10.1145/3450288

Lyu L, Yu H, Zhao J, Yang Q (2020) Threats to federated learning. Lecture Notes Artif Intell 12500:3–16. https://doi.org/10.1007/978-3-030-63076-8_1

Bagdasaryan E, Veit A, Hua Y, Estrin D, Shmatikov V (2020) How to backdoor federated learning. Proceedings of the 23rd international conference on artificial intelligence and statistics, 2938–2948

Shejwalkar V, Houmansadr A, Kairouz P, Ramage D (2022) Back to the drawing board: a critical evaluation of poisoning attacks on production federated learning. 2022 IEEE symposium on security and privacy (SP)

Fu J, Zhang X, Ji S, Chen J, Wu J, Guo S, Zhou J, Liu AX, Wang T (2022) Label inference attacks against vertical federated learning. Proceedings of the 31st USENIX security symposium 31

Feng S, Yu H (2020) Multi-participant multi-class vertical federated learning arXiv:2001.11154

Liu Y, Kang Y, Xing C, Chen T, Yang Q (2020) A secure federated transfer learning framework. IEEE Intell Syst 35(4):70–82. https://doi.org/10.1109/MIS.2020.2988525

Docker Inc.: The industry-leading container runtime. https://www.docker.com/products/container-runtime/ Accessed 07 June 2023

Fayad M, Schmidt D (1997) Object-oriented application frameworks. Commun ACM 40(10):32–38. https://doi.org/10.1145/262793.262798

Ge D-Y, Yao X-F, Xiang W-J, Wen X-J, Liu E-C (2019) Design of high accuracy detector for mnist handwritten digit recognition based on convolutional neural network. 2019 12th international conference on intelligent computation technology and automation (ICICTA), 658–662. https://doi.org/10.1109/ICICTA49267.2019.00145

Deng L (2012) The mnist database of handwritten digit images for machine learning research. IEEE Signal Process Mag 29(6):141–142. https://doi.org/10.1109/MSP.2012.2211477

Avent B, Korolova A, Zeber D, Hovden T, Livshits B (2017) Blender: enabling local search with a hybrid differential privacy model. J Privacy Confid 9, 747–764. https://doi.org/10.29012/jpc.680

Cheu A, Smith AD, Ullman J, Zeber D, Zhilyaev M (2019) Distributed differential privacy via shuffling. IACR Cryptol. ePrint Arch, 375–403. https://doi.org/10.1007/978-3-030-17653-2_13

Roth E, Noble D, Falk BH, Haeberlen A (2019) Honeycrisp: large-scale differentially private aggregation without a trusted core. Proceedings of the 27th ACM Symposium on Operating Systems Principles, 196–210. https://doi.org/10.1145/3341301.3359660

Song S, Chaudhuri K, Sarwate AD (2013) Stochastic gradient descent with differentially private updates. 2013 IEEE global conference on signal and information processing, 245–248. https://doi.org/10.1109/GlobalSIP.2013.6736861

Masters O, Hunt H, Steffinlongo E, Crawford J, Bergamaschi F (2019) Towards a homomorphic machine learning big data pipeline for the financial services sector. IACR Cryptol. ePrint Arch, 1–21

Yao AC-C (1986) How to generate and exchange secrets. Proceedings of the 27th annual symposium on foundations of computer science, 162–167

Kaissis G, Ziller A, Passerat-Palmbach J, Ryffel T, Usynin D, Trask A, Lima I, Mancuso J, Jungmann F, Steinborn M-M, Saleh A, Makowski M, Rueckert D, Braren R (2021) End-to-end privacy preserving deep learning on multi-institutional medical imaging. Nat Mach Intell 3(6):473–484. https://doi.org/10.1038/s42256-021-00337-8

Subramanyan P, Sinha R, Lebedev IA, Devadas S, Seshia SA (2017) A formal foundation for secure remote execution of enclaves. Proceedings of the 2017 ACM SIGSAC conference on computer and communications security, 2435–2450. https://doi.org/10.1145/3133956.3134098

Hardy S, Henecka W, Ivey-Law H, Nock R, Patrini G, Smith G, Thorne B (2017) Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption arXiv:1711.10677

Nikolaenko V, Weinsberg U, Ioannidis S, Joye M, Boneh D, Taft N (2013) Privacy-preserving ridge regression on hundreds of millions of records. 2013 IEEE symposium on security and privacy, 334–348. https://doi.org/10.1109/SP.2013.30

So J, He C, Yang C-S, Li S, Yu Q, Ali RE, Guler B, Avestimehr S (2022) Lightsecagg: a lightweight and versatile design for secure aggregation in federated learning. Proc Mach Learn Syst 4:694–720

Li T, Sahu AK, Zaheer M, Sanjabi M, Talwalkar A, Smith V (2020) Federated optimization in heterogeneous networks. Proc Mach Learn Syst 2:429–450

Reddi SJ, Charles Z, Zaheer M, Garrett Z, Rush K, Konečný J, Kumar S, McMahan HB (2021) Adaptive federated optimization. International conference on learning representations ICLR 2021

Romanini D, Hall AJ, Papadopoulos P, Titcombe T, Ismail A, Cebere T, Sandmann R, Roehm R, Hoeh MA (2021) Pyvertical: a vertical federated learning framework for multi-headed splitnn. ICLR 2021 Workshop on distributed and private machine learning

Fan T, Kang Y, Ma G, Chen W, Wei W, Fan L, Yang Q (2023) Fate-llm: a industrial grade federated learning framework for large language models. Arxiv Preprint

Velez-Estevez A, Ducange P, Perez IJ, Cobo MJ (2022) Conceptual structure of federated learning research field. Procedia Comput Sci 214:1374–1381

Farooq A, Feizollah A, Rehman MH (2021) Federated learning research trends and bibliometric analysis. Stud Comput Intell 965:1–19. https://doi.org/10.1007/978-3-030-70604-3_1

Gong M, Zhang Y, Gao Y, Qin AK, Wu Y, Wang S, Zhang Y (2024) A multi-modal vertical federated learning framework based on homomorphic encryption. IEEE Trans Inf Forensics Secur 19:1826–1839. https://doi.org/10.1109/TIFS.2023.3340994

Caramalau R, Bhattarai B, Stoyanov D (2023) Federated active learning for target domain generalisation. ArXiv abs/2312.02247. https://doi.org/10.48550/arXiv.2312.02247

Matsuda K, Sasaki Y, Xiao C, Onizuka M (2024) Benchmark for personalized federated learning. IEEE Open J Comput Soc 5:2–13. https://doi.org/10.1109/OJCS.2023.3332351

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and affiliations.

University of Applied Sciences Ulm, Prittwitzstraße 10, 89075, Ulm, Baden-Württemberg, Germany

Pascal Riedel, Reinhold von Schwerin, Daniel Schaudt & Alexander Hafner

University of Tübingen, Geschwister-Scholl-Platz, 72074, Tübingen, Baden-Württemberg, Germany

Lukas Schick

University of Ulm, Helmholtzstraße 16, 89081, Ulm, Baden-Württemberg, Germany

Pascal Riedel & Manfred Reichert

Corresponding author

Correspondence to Pascal Riedel.

Ethics declarations

Conflict of interest.

The authors have no conflicts of interest to declare that are relevant to the content of this article, and there are no financial interests.

Ethical approval

The data and models used are purely for scientific purposes and do not replace a clinical COVID-19 diagnosis by medical specialists.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A Supplemental Material

See Figs. 5, 6 and Table 7.

List of topics, words and frequencies using LDA

Graphical representation of the most common words used in the identified literature

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Riedel, P., Schick, L., von Schwerin, R. et al. Comparative analysis of open-source federated learning frameworks - a literature-based survey and review. Int. J. Mach. Learn. & Cyber. (2024). https://doi.org/10.1007/s13042-024-02234-z

Received: 13 August 2023

Accepted: 28 May 2024

Published: 28 June 2024

DOI: https://doi.org/10.1007/s13042-024-02234-z

Federated learning
Machine learning
Open source
Framework comparison

COMMENTS

Literature review as a research methodology: An overview and guidelines
As mentioned previously, there are a number of existing guidelines for literature reviews. Depending on the methodology needed to achieve the purpose of the review, all types can be helpful and appropriate to reach a specific goal (for examples, please see Table 1). These approaches can be qualitative, quantitative, or have a mixed design depending on the phase of the review.
Qualitative Research: Literature Review
In The Literature Review: A Step-by-Step Guide for Students, Ridley presents that literature reviews serve several purposes (2008, p. 16-17). Included are the following points: Historical background for the research; Overview of current field provided by "contemporary debates, issues, and questions;" Theories and concepts related to your research;
How to Write a Literature Review
Examples of literature reviews. Step 1 - Search for relevant literature. Step 2 - Evaluate and select sources. Step 3 - Identify themes, debates, and gaps. Step 4 - Outline your literature review's structure. Step 5 - Write your literature review.
Chapter 9. Reviewing the Literature
A literature review is a comprehensive summary of previous research on a topic. It includes both articles and books—and in some cases reports—relevant to a particular area of research. Ideally, one's research question follows from the reading of what has already been produced. For example, you are interested in studying sports injuries ...
Criteria for Good Qualitative Research: A Comprehensive Review
For this review, a comprehensive literature search was performed from many databases using generic search terms such as Qualitative Research, Criteria, etc. The following databases were chosen for the literature search based on the high number of results: IEEE Xplore, ScienceDirect, PubMed, Google Scholar, and Web of Science.
PDF Qualitative Analysis Techniques for the Review of the Literature
Leech and Onwuegbuzie (2008) presented a typology for qualitative data analysis wherein qualitative data were conceptualized as representing one of four major sources; namely, talk, observations, drawings/photographs/videos, and documents. We believe that all four source types serve as relevant literature review sources.
Qualitative systematic reviews: their importance for our understanding
A qualitative systematic review brings together research on a topic, systematically searching for research evidence from primary qualitative studies and drawing the findings together. There is a debate over whether the search needs to be exhaustive [1, 2]. Methods for systematic reviews of quantitative research are well established and explicit ...
Methodological Approaches to Literature Review
A literature review is defined as "a critical analysis of a segment of a published body of knowledge through summary, classification, and comparison of prior research studies, reviews of literature, and theoretical articles" (The Writing Center, University of Wisconsin-Madison 2022). A literature review is an integrated analysis, not just a summary of scholarly work on a specific topic.
Writing a literature review
Writing a literature review requires a range of skills to gather, sort, evaluate and summarise peer-reviewed published data into a relevant and informative unbiased narrative. Digital access to research papers, academic texts, review articles, reference databases and public data sets are all sources of information that are available to enrich ...
Guidance on Conducting a Systematic Literature Review
Literature review is an essential feature of academic research. Fundamentally, knowledge advancement must be built on prior existing work. To push the knowledge frontier, we must know where the frontier is. By reviewing relevant literature, we understand the breadth and depth of the existing body of work and identify gaps to explore.
Research Guides: Literature Reviews: What is a Literature Review?
A literature review is a review and synthesis of existing research on a topic or research question. A literature review is meant to analyze the scholarly literature, make connections across writings and identify strengths, weaknesses, trends, and missing conversations. A literature review should address different aspects of a topic as it ...
Chapter 9 Methods for Literature Reviews
9.3. Types of Review Articles and Brief Illustrations. EHealth researchers have at their disposal a number of approaches and methods for making sense out of existing literature, all with the purpose of casting current research findings into historical contexts or explaining contradictions that might exist among a set of primary research studies conducted on a particular topic.
What is Qualitative in Qualitative Research
What is qualitative research? If we look for a precise definition of qualitative research, and specifically for one that addresses its distinctive feature of being "qualitative," the literature is meager. In this article we systematically search, identify and analyze a sample of 89 sources using or attempting to define the term ...
Why Qualitative Research Needs More and Better Systematic Review
Those doing qualitative research cannot "opt out" of knowing their relevant scholarly conversations. Undertaking a qualitative systematic review provides a vital means to know and tune into the past conversation in your topic area that allows the researcher to position themselves and their work substantively, ontologically, theoretically, and methodologically in this landscape.
Research Guides: Systematic Reviews: Types of Literature Reviews
Qualitative, narrative synthesis. Thematic analysis, may include conceptual models. Rapid review. Assessment of what is already known about a policy or practice issue, by using systematic review methods to search and critically appraise existing research. Completeness of searching determined by time constraints.
A Guide to Writing a Qualitative Systematic Review Protocol to ...
Methodology: The key elements required in a systematic review protocol are discussed, with a focus on application to qualitative reviews: Development of a research question; formulation of key search terms and strategies; designing a multistage review process; critical appraisal of qualitative literature; development of data extraction ...
Qualitative or Quantitative?
Quantitative Research (an operational definition) Quantitative research: an operational description. Purpose: explain, predict or control phenomena through focused collection and analysis of numerical data. Approach: deductive; tries to be value-free/has objectives/is outcome-oriented. Hypotheses: Specific, testable, and stated prior to study.
PDF Literature Review: An Overview
The literature review provides the researcher with an opportunity to identify any gaps that may exist in the body of literature and to provide a rationale for how the proposed study may contribute to the existing body of knowledge. The literature review helps the researcher to refine the research questions and embed them in guiding hypotheses ...
How to Operate Literature Review Through Qualitative and ...
3.5 Step 5: Qualitative Analysis. The literature review is an essential part of the research process. There are several types of the literature review [44, 45]. However, in general, the literature review is a process of questioning. It is intended to answer some questions about a particular topic: What are the primary literature sources?
Are Systematic Reviews Qualitative or Quantitative
A systematic review can be qualitative, quantitative, or a combination of the two. The approach that is chosen is determined by the research question and the scope of the research. When qualitative and quantitative techniques are used together in a given study, it is called a mixed method. In a mixed-method study, synthesis for the quantitative ...
Methods for the synthesis of qualitative research: a critical review
Background. The range of different methods for synthesising qualitative research has been growing over recent years [1,2], alongside an increasing interest in qualitative synthesis to inform health-related policy and practice [].While the terms 'meta-analysis' (a statistical method to combine the results of primary studies), or sometimes 'narrative synthesis', are frequently used to describe ...
Qualitative Data Analysis in Systematic Reviews
A qualitative systematic review aggregates, integrates, and interprets data from qualitative studies, which is collected through observation, interviews, and verbal interactions. Included studies may also use other qualitative methodologies of data collection in the relevant literature. The use of qualitative systematic reviews analyzes the ...
What are Literature Reviews?
Literature reviews are comprehensive summaries and syntheses of the previous research on a given topic. While narrative reviews are common across all academic disciplines, reviews that focus on appraising and synthesizing research evidence are increasingly important in the health and social sciences. Most evidence synthesis methods use formal and explicit methods to identify, select and ...
WHO, WHEN, HOW: a scoping review on flexible at-home respite for
A scoping review [32,33,34] was conducted as part of a larger multi-method participatory research project, known as the AMORA project, to characterize flexible at-home respite. Scoping reviews make it possible to map the extent of literature on a specific topic [32, 34]. The six steps proposed by Levac et al. were followed: identifying the research question; searching and selecting pertinent ...
Perceptions of families and healthcare providers about feeding preterm
Methods and analysis: A literature search will be conducted in multiple electronic databases from their inception, including PubMed, CINAHL, Embase, the Cochrane Central Register for Controlled Trials and PsycINFO. No restrictions will be applied based on language or date of publication. Two authors will screen the titles and abstracts and then review the full text for the studies' inclusion ...
Narrative Reviews: Flexible, Rigorous, and Practical
Narrative reviews have many strengths. They are flexible and practical, and ideally provide a readable, relevant synthesis of a diverse literature. Narrative reviews are often helpful for teaching or learning about a topic because they deliver a general overview. They are also useful for setting the stage for future research, as they offer an ...
Comparative analysis of open-source federated learning frameworks
These framework candidates are compared using a novel scoring schema with 15 qualitative and quantitative evaluation criteria, focusing on features, interoperability, and user friendliness. ... We used the research literature review of Webster and Watson to build the knowledge base for a literature-driven comparison analysis of open-source FL ...

Archer Library

Exploring the literature review

Your Literature Review

1. Select a Topic

Consider Purpose

★ Schedule a research appointment

2. Search the Literature

Books & eBooks: Archer Library & OhioLINK

Databases: Scholarly & Practitioner Journals

Databases: Theses & Dissertations

Newspapers: Databases & Internet

Search Strategies & Boolean Operators

Overview of boolean terms

Database Search Limiters

★ Truncating Search Terms

Asterisk (*) Wildcard

★ EBSCO Databases & Google Drive

Researching in an EBSCO database?

EBSCO Databases & Google Drive

Defining Literature Review

Recommended Reading

About this page

Archer Librarians

Archer Library • Ashland University © Copyright 2023. An Equal Opportunity/Equal Access Institution.

How to Write a Literature Review | Guide, Examples, & Templates

Make a list of keywords

Search for relevant sources

Take notes and cite your sources

Chronological

Methodological

Theoretical

Shona McCombes

Chapter 9. Reviewing the Literature

Examples of Literature Reviews

Reader-Friendly Example: The Power of Peers

Authoritative Academic Journal Example: Working Class Students’ College Expectations

Embracing Theory

Helpful Tips

Where to Start Looking for Literature

Keep a List of Your Keywords

Think Laterally

Read Outside the Canon

Consider Multiple Uses for Literature

Describing Gaps in the Literature

Use Concept Mapping

Ask Yourself, “How Is This Sociology (or Political Science or Public Policy, Etc.)?”

Don’t Treat This as a Chore

Supplement: Two More Literature Review Examples

”Changing Dispositions among the Upwardly Mobile” by Curl, Lareau, and Wu ( 2018 )

Further Readings

Libraries | Research Guides

Systematic Reviews

What Makes a Systematic Review Different from Other Types of Reviews?

A Guide to Writing a Qualitative Systematic Review Protocol to Enhance Evidence-Based Practice in Nursing and Health Care

University Libraries

Literature Reviews

Qualitative researchers TEND to:

Qualitative Research (an operational definition)

Some Other Guidelines

Quantitative researchers TEND to:

Quantitative Research (an operational definition)

Methods for the synthesis of qualitative research: a critical review

James Thomas

Overview of synthesis methods

Grounded theory

Thematic Synthesis

Textual Narrative Synthesis

Meta-narrative