
A Hybrid Methodology to Improve Speaking Skills in English Language Learning Using Mobile Applications


1. Introduction

2. Mobile Learning in English as a Second Language Education

  • Portability: Knowledge acquisition can move to any place and time, supporting students’ learning wherever they are.
  • Availability: Constant access to information.
  • Personalization: Information can be adapted to the user’s needs and learning style.
  • Social Connectivity: Collaborative work among users of the same application.
  • Motivation: It encourages students to get involved in their own education.
  • The apps consider the four core English language skills: reading, speaking, listening, and writing.
  • Learners can access their apps anytime, anywhere.
  • Learners can interact through the apps with their peers or with other people around the world.
  • The apps are free of charge.
  • They can reinforce knowledge and provide tips to perform better in different skills.
  • The ubiquity of mobile apps provides learners with an autonomous, ever-present education. They only need a smartphone, an app they like, and the right motivation to start a process of responsible, dynamic learning.

2.1. Mobile Assisted Language Learning

  • Mobility of technology: Refers to all devices that support the IEEE 802.11 wireless standard (Wi-Fi). Over Wi-Fi, teaching and learning material allows learners to acquire knowledge anywhere and at any time; they choose when and where to learn, which applications to use to interact with other users, create their own calendars, and relax from time to time.
  • Mobility of learning: Refers to student-centered autonomous learning, where the teaching process is a personalized, collaborative, ubiquitous, and permanent experience that considers each student’s level, helping them recognize their knowledge, their learning process, their growth, their achievements, and their goals.
  • Mobility of the learner: Refers to free and independent learners who use teaching tools without the barriers of age, time, and learning ability. This avoids the comparisons between students that often cause shyness, fear, and a lack of confidence when expressing ideas aloud. In this way, mobile learners personalize their learning activities based on their own interests and objectives, thus engaging in continuous learning [ 33 ].

2.2. English Language Teaching–Learning Models

  • Effective interaction only occurs through speaking skills.
  • Practicing speaking skills allows the learning space to become a socially dynamic place.
  • Speaking with others immerses students in real contexts and social experiences when sharing a message with others.
  • The skill of speaking enhances the relationships that exist in the classroom. It all depends on the methodology used by the teacher to create a trusting classroom environment that allows everyone to express his or her opinions in a spoken manner without fear.
  • The ability to speak helps students develop metacognitive and creative thinking, because it involves other actors in communication and gives learning a collaborative role.
  • Grammar: It enables L2 learners to order and structure sentences so that they contain the correct meaning and sense and gives them the ability to communicate effectively orally and in writing. Grammar allows them to combine words based on rules and structures to create sentences that can be understood by others.
  • Vocabulary: Relates to the knowledge of words, their meaning, and their use in a real context. Knowing how to pronounce them within a conversation on a specific topic is what allows communication within the speaking discourse. To fulfill this purpose, vocabulary must be familiar and extensive so that the student is able to talk about any topic without the barrier of not knowing what it means or how to say certain words.
  • Pronunciation: Considers the phonology of the words and their grammatical elements that mark the sounds of the words in each language. Pronunciation is one of the most important elements of speech ability, since it is difficult for a person who does not have good pronunciation to make himself understood and communicate even if he has the correct grammatical elements and vocabulary.
  • Fluency: Allows communicating naturally and accurately. It is the goal of every L2 learner to be able to communicate with others using an appropriate speed and precise pauses when constructing a conversation. This rhythm of speech indicates that the learner is knowledgeable of the language and does not take too much time to structure sentences or search for precise vocabulary, but rather is able to use expressions that come to mind quickly and naturally.
  • Comprehension: Allows one to understand the message received and to convey a message that is understood. This encompasses all of the above: when grammar, vocabulary, pronunciation, and fluency are used well, comprehension is effective and opens the way for students to engage in meaningful and efficient communication.
  • Traditional approach: Grammatical translation methodology
  • The natural approach: Direct learning methodology and Berlitz method
  • Structural approach: Audio-oral method, situational method, and audio-visual method
  • Communicative approach: Communicative method and notional-functional method
  • Humanistic approach: Total physical response method, silence method, suggestopedia method
  • The Direct Method
  • The Audio-lingual Method
  • The Total Physical Response
  • Communicative Language Teaching
  • Task-based Language Learning
  • Suggestopedia
  • Which mobile application can be used for teaching English language speaking skills?
  • Is it possible to propose an innovative methodology that combines traditional methodologies and mobile learning to improve English language communication skills?

3. Selection of Methodologies and Mobile App for the Research Proposal

Identification of a mobile application for research.

  • First step: A keyword search was performed, returning about 250 applications for learning the English language and developing communication skills.
  • Second step: From this group, we chose only the apps that were top-rated in the Google Play Store (4.5–4.9) and that had been downloaded between 50,000+ and 100,000,000+ times, as shown in Table 2.
  • Third step: All of the apps were downloaded to a smartphone, and those most suitable for the educational environment were chosen because they were free, easy to use, and had outstanding educational content.
  • Fourth step: We applied an evaluation rubric, as shown in Table 3, which allowed us to quantitatively identify the best application to use in this research [ 40 ].
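The rating and download filter in the second step can be sketched in a few lines of code. This is a minimal illustration only: the app records and code labels below are hypothetical placeholders, not the study’s full list of roughly 250 candidates.

```python
# Minimal sketch of the second selection step: keep only apps rated
# 4.5-4.9 in the Google Play Store with at least 50,000+ downloads.
# The app records below are illustrative placeholders, not the
# study's actual dataset.
apps = [
    {"code": "A1", "rating": 4.7, "downloads": 10_000_000},
    {"code": "A2", "rating": 4.4, "downloads": 1_000_000},   # rating too low
    {"code": "A3", "rating": 4.6, "downloads": 20_000},      # too few downloads
]

def shortlist(apps, min_rating=4.5, max_rating=4.9, min_downloads=50_000):
    """Return the apps that pass both the rating and download filters."""
    return [a for a in apps
            if min_rating <= a["rating"] <= max_rating
            and a["downloads"] >= min_downloads]

print([a["code"] for a in shortlist(apps)])  # ['A1']
```

The same filter can be rerun with looser thresholds to see how sensitive the shortlist is to the chosen cut-offs.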

4. Findings

5. Case Study

5.1. Timing

5.2. Learning Methodology

  • Grammar: Structuring sentences that use tenses, phrases, and grammatical rules correctly.
  • Sociolinguistics: Refers to the function of communication and interaction with other actors within different social and cultural contexts.
  • Strategy: The ability to understand and be understood when presenting ideas, and to overcome difficulties and misunderstandings that may arise in communication.

5.3. Limitations

6. Discussion

7. Conclusions

  • Pedagogical coherence
  • Feedback and self-correction
  • Customization

8. Future Work

Author Contributions

Institutional Review Board Statement

Informed Consent Statement

Conflicts of Interest

  • Januariyansah, S.; Rohmantoro, D. The Role of Digital Classroom Facilities to Accommodate Learning Process of the Z and Alpha Generations. In Proceedings of the 2nd International Conference on Child-Friendly Education, Muhammadiyah Surakarta University, Jawa Tengah, Indonesia, 21–22 April 2018; Volume 1994, pp. 434–439. [ Google Scholar ]
  • Gooding de Palacios, F.A. Enfoques para el aprendizaje de una segunda lengua: Expectativa en el dominio del idioma inglés. Rev. Cient. Orb. Cógn. 2020 , 4 , 20–38. [ Google Scholar ] [ CrossRef ]
  • Rani, K.J.; Kranthi, T.; Anjaneyulu, G. Teaching and Learning English as a Foreign Language. Int. J. Engl. Lang. Teach. 2018 , 5 , 57. [ Google Scholar ] [ CrossRef ]
  • Hafifah, H. The Effectiveness of Duolingo in Improving Students’ Speaking Skill at Madrasah Aliyah Bilingual Batu School Year 2019/2020. Lang.-Edu. J. 2019 , 10 , 1–7. [ Google Scholar ]
  • Prensky, M. Digital Natives, Digital Immigrants. Horizon 2001 , 9 , 1–6. [ Google Scholar ]
  • Al-Masri, E.; Kabu, S.; Dixith, P. Emerging Hardware Prototyping Technologies as Tools for Learning. IEEE Access 2020 , 8 , 80207–80217. [ Google Scholar ] [ CrossRef ]
  • Ritgerð Mobile Apps for Learning English. 2013. Available online: https://1library.net/document/q7o4v2oy-mobile-apps-for-learning-english.html (accessed on 21 March 2022).
  • Black, E.; Richard, F.; Lindsay, T. K-12 Virtual Schooling, COVID-19, and Student Success. JAMA—J. Am. Med. Assoc. 2020 , 324 , 833–834. [ Google Scholar ] [ CrossRef ]
  • Guggenberger, T.; Lockl, J.; Röglinger, M.; Schlatt, V.; Sedlmeir, J.; Stoetzer, J.-C.; Urbach, N.; Völter, F. Emerging digital technologies to combat future crises: Learnings from COVID-19 to be prepared for the future. Int. J. Innov. Technol. Manag. 2020 , 18 , 2140002. [ Google Scholar ] [ CrossRef ]
  • Sarica, G.N.; Cavus, N. New trends in 21st Century English learning. Procedia—Soc. Behav. Sci. 2009 , 1 , 439–445. [ Google Scholar ] [ CrossRef ]
  • Bolstad, R.; Lin, M. Students’ Experiences of Learning in Virtual Classrooms ; NZCER: Wellington, New Zealand, 2014; Volume 15, Retrieved May 2012. [ Google Scholar ]
  • Arshad, M.; Saeed, M.N. Emerging technologies for e-learning and distance learning: A survey. In Proceedings of the 2014 International Conference on Web and Open Access to Learning (ICWOAL), Dubai, United Arab Emirates, 25–27 November 2014; IEEE: Piscataway, NJ, USA, 2015. [ Google Scholar ] [ CrossRef ]
  • Dingli, A.; Seychell, D. The New Digital Natives ; Springer: Berlin, Germany, 2015; ISBN 9783662465905. [ Google Scholar ]
  • Choo, C.C.; Devakaran, B.; Chew, P.K.H.; Zhang, M.W.B. Smartphone application in postgraduate clinical psychology training: Trainees’ perspectives. Int. J. Environ. Res. Public Health 2019 , 16 , 4206. [ Google Scholar ] [ CrossRef ]
  • Sharma, S. Smartphone based language learning through mobile apps. Int. J. Recent Technol. Eng. 2019 , 8 , 8040–8043. [ Google Scholar ] [ CrossRef ]
  • Parsons, D. (Ed.) Combining E-Learning and M-Learning: New Applications of Blended Educational Resources ; Massey University: Palmerston North, New Zealand, 2011. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Persson, V.; Nouri, J. A systematic review of second language learning with mobile technologies. Int. J. Emerg. Technol. Learn. 2018 , 13 , 188–210. [ Google Scholar ] [ CrossRef ]
  • Vate-U-Lan, P. Mobile learning: Major challenges for engineering education. In Proceedings of the 2008 38th Annual Frontiers in Education Conference, Saratoga Springs, NY, USA, 22–25 October 2008; pp. 11–16. [ Google Scholar ] [ CrossRef ]
  • Cisco. Annual Internet Report (2018–2023). 2020. Available online: https://www.cisco.com/c/en/us/solutions/collateral/executive-perspectives/annual-internet-report/white-paper-c11-741490.html (accessed on 1 April 2022).
  • Segev, E. Mobile Learning_ Improve Your English Anytime, Anywhere_British Coun. 2014. Available online: https://www.britishcouncil.org/voices-magazine/mobile-learning-improve-english-anytime-anywhere (accessed on 2 April 2022).
  • Godwin-Jones, R. Emerging technologies from memory palaces to spacing algorithms: Approaches to second-language vocabulary learning. Lang. Learn. Technol. 2010 , 14 , 4–11. [ Google Scholar ]
  • Guo, H. Analysing and Evaluating Current Mobile Applications for Learning English Speaking. British Council ELT Master’s Dissertation Awards: Commendation 2014; pp. 2–92. Available online: https://www.teachingenglish.org.uk/sites/teacheng/files/analysing_and_evaluating_current_mobile_applications_v2.pdf (accessed on 3 April 2022).
  • Hossain, M. Exploiting Smartphones and Apps for Language Learning: A Case Study with The EFL Learners in a Bangladeshi University. Rev. Public Adm. Manag. 2018 , 6 , 1. [ Google Scholar ] [ CrossRef ]
  • Hashemi, M.; Ghasemi, B. Using mobile phones in language learning/teaching. Procedia-Soc. Behav. Sci. 2011 , 15 , 2947–2951. [ Google Scholar ] [ CrossRef ]
  • Huang, L. Acceptance of mobile learning in classroom instruction among college English teachers in China using an extended TAM. In Proceedings of the 2017 International Conference of Educational Innovation through Technology (EITT), Osaka, Japan, 7–9 December 2017; IEEE: Piscataway, NJ, USA, 2018; Volume 2018, pp. 283–287. [ Google Scholar ] [ CrossRef ]
  • Metafas, D.; Politi, A. Mobile-assisted learning: Designing class project assistant, a research-based educational app for project based learning. In Proceedings of the 2017 IEEE Global Engineering Education Conference (EDUCON), Athens, Greece, 25–28 April 2017; pp. 667–675. [ Google Scholar ] [ CrossRef ]
  • Wan Azli, W.U.A.; Shah, P.M.; Mohamad, M. Perception on the Usage of Mobile Assisted Language Learning (MALL) in English as a Second Language (ESL) Learning among Vocational College Students. Creat. Educ. 2018 , 9 , 84–98. [ Google Scholar ] [ CrossRef ]
  • Mierlus-Mazilu, I. M-learning objects. In Proceedings of the 2010 International Conference on Electronics and Information Engineering, Kyoto, Japan, 1–3 August 2010; Volume 1, pp. V1-113–V1-117. [ Google Scholar ] [ CrossRef ]
  • Özdoǧan, K.M.; Başoǧlu, N.; Erçetin, G. Exploring major determinants of mobile learning adoption. In Proceedings of the 2012 Proceedings of PICMET ‘12: Technology Management for Emerging Technologies, Vancouver, BC, Canada, 29 July–2 August 2012; pp. 1415–1423. [ Google Scholar ]
  • Yang, Z. A study on self-efficacy and its role in mobile-assisted language learning. Theory Pract. Lang. Stud. 2020 , 10 , 439–444. [ Google Scholar ] [ CrossRef ]
  • Criollo-C, S.; Lujan-Mora, S.; Jaramillo-Alcazar, A. Advantages and disadvantages of m-learning in current education. In Proceedings of the IEEE World Engineering Education Conference (EDUNINE), Buenos Aires, Argentina, 11–14 March 2018. [ Google Scholar ] [ CrossRef ]
  • Eshankulovna, R.A. Modern technologies and mobile apps in developing speaking skill. Linguist. Cult. Rev. 2021 , 5 , 1216–1225. [ Google Scholar ] [ CrossRef ]
  • Ozdamli, F.; Cavus, N. Basic elements and characteristics of mobile learning. Procedia-Soc. Behav. Sci. 2011 , 28 , 937–942. [ Google Scholar ] [ CrossRef ]
  • Zou, B.; Li, J. Exploring Mobile Apps for English Language Teaching and Learning ; Research-Publishing: Wuhan, China, 2015; pp. 564–568. [ Google Scholar ] [ CrossRef ]
  • Garay-Cortes, J.; Uribe-Quevedo, A. Location-based augmented reality game to engage students in discovering institutional landmarks. In Proceedings of the 2016 7th International Conference on Information, Intelligence, Systems & Applications (IISA), Chalkidiki, Greece, 13–15 July 2016. [ Google Scholar ]
  • Ameri, M. The Use of Mobile Apps in Learning English Language. Bp. Int. Res. Crit. Linguist. Educ. J. 2020 , 3 , 1363–1370. [ Google Scholar ] [ CrossRef ]
  • Alzatma, A.A.; Awfiq Khader, K. Using Mobile Apps to Improve English Speaking Skills of EFL Students at the Islamic University of Gaza. Master’s Thesis, Islamic University of Gaza, Gaza, Palestine, 2020. [ Google Scholar ] [ CrossRef ]
  • Rizqiningsih, S.; Hadi, M.S. Multiple Intelligences (MI) on Developing Speaking Skills. Engl. Lang. Focus 2019 , 1 , 127. [ Google Scholar ] [ CrossRef ]
  • Aznar, A. Different Methodologies Teaching English. 2014. Available online: https://documents.pub/document/different-methodologies-teaching-english-277pdf-analyze-each-of-these-methods.html?page=1 (accessed on 1 April 2022).
  • Chen, X. Evaluating Language-learning Mobile Apps for Second-language Learners. J. Educ. Technol. Dev. Exch. 2016 , 9 , 42–43. [ Google Scholar ] [ CrossRef ]
  • Duolingo. Duolingo—La Mejor Manera de Aprender un Idioma a Nivel Mundial. 2020. Available online: https://es.duolingo.com/course/en/es/Aprender-ingl%C3%A9s (accessed on 7 July 2022).
  • Teske, K. Learning Technology Review—Duolingo. Calico J. 2017 , 34 , 393–401. [ Google Scholar ] [ CrossRef ]
  • Reima, A.-J. Mobile Apps in the EFL College Classroom Aims of Study Why Use Mobile Apps Types of Language Apps. J. Res. Sch. Prof. Engl. Lang. Teach. 2020 , 4 , 1–7. [ Google Scholar ]
  • Cambridge. Assessing Speaking Performance—Level A2 Examiners and Speaking Assessment in the A2 Key Exam. 2011. Available online: https://docplayer.net/30294467-Assessing-speaking-performance-level-a2.html (accessed on 2 April 2022).
  • Bustillo, J.; Rivera, C.; Guzmán, J.G.; Ramos Acosta, L. Benefits of using a mobile application in learning a foreign language. Sist. Telemática 2017 , 15 , 55–68. [ Google Scholar ] [ CrossRef ]
  • Baldauf, M.; Brandner, A.; Wimmer, C. Mobile and gamified blended learning for language teaching—Studying requirements and acceptance by students, parents and teachers in the wild. In Proceedings of the 16th International Conference on Mobile and Ubiquitous Multimedia, Stuttgart, Germany, 26–29 November 2017; pp. 13–24. [ Google Scholar ] [ CrossRef ]


| Category | Least Suitable (1–3) | Average (4–7) | Most Suitable (8–10) |
| --- | --- | --- | --- |
| Content should provide opportunities to advance learners’ English skills, with connection to their prior knowledge. | Content fails to help achieve learning goals or autonomous learning. | Content helps achieve the learning goals but supports neither autonomous learning nor connection to prior knowledge. | Content helps achieve the learning goals, supports autonomous learning, and relates prior knowledge to new content. |
| The skills provided in the app should be consistent with the targeted learning goal. | Skills (especially listening and speaking) reinforced in the app were not consistent with the targeted skill or concept. | Skills (especially listening and speaking) reinforced were prerequisite or foundation skills for the targeted skill or concept. | Skills (especially listening and speaking) reinforced were strongly connected to the targeted skill or concept. |
| Learners should be provided with feedback to conduct self-evaluation. | Feedback is limited to marking the learner’s response correct. | Feedback is specific and allows learners to try again in order to improve learning performance. | Feedback is specific and improves learner performance; data are available to learners and instructors. |
| Elements are embedded to engage and motivate language learners to use the app. | No elements are embedded to encourage learners’ self-directed learning. | Limited elements are embedded to encourage learners’ self-directed learning. | Elements are embedded to encourage learners’ self-directed learning. |
| Learners are provided with clearly indicated menus and icons to easily navigate the app. | Menus and icons are not clearly indicated, no on-screen help and/or tutorials are available, and learners need constant help to use the app. | Menus and icons are clearly indicated, but no on-screen help and/or tutorials are available; learners need an instructor to review how to use the app. | Menus and icons are clearly indicated, and on-screen help and/or tutorials are available so that learners can launch and navigate the app independently. |
| Learners have their individualized needs met, including font size and customizable settings to personalize their learning. | Text size cannot be adjusted, and few customizations are provided. | Text size can be adjusted to the user’s needs, and some customizations are provided. | Text size can be adjusted to suit diverse needs, and more individualized customization options are provided. |
| Learners can share their learning progress, issues, or concerns. | Performance data are limited, or the learner’s progress is not accessible. | Performance data or learner progress is available in the app, but exporting is limited. | A specific performance summary or learner progress is saved in the app and can be exported to a teacher or an audience. |
| Mobile App (Code *) | App Name | Google Play Store Rating | Downloads | Comment |
| --- | --- | --- | --- | --- |
| A | Aba English—aprender inglés | 4.6 | 10,000,000+ | does not track progress |
| B | Andy English—habla en inglés | 4.6 | 1,000,000+ | poor content |
| C | Aprende a hablar inglés (talk englihs) | 4.6 | 5,000,000+ | little content, lots of ads |
| D | Aprende inglés—escuchando y hablando | 4.7 | 1,000,000+ | no levels |
| E | Aprende inglés, vocabulario | 4.7 | 1,000,000+ | focuses on vocabulary |
| F | Aprender a hablar inglés (hello) | 4.7 | 1,000,000+ | poor vocabulary and assessment |
| G | Aprender a hablar inglés (miracle full box) | 4.7 | 1,000,000+ | no levels |
| H | Basic English for beginners | 4.7 | 1,000,000+ | low level |
| J | Bytalk: speak English online | 4.7 | 100,000+ | not safe |
| K | Cake | 4.8 | 50,000,000+ | no specific themes |
| L | Conversación en inglés | 4.6 | 1,000,000+ | focuses on listening |
| N | Curso de inglés para principiantes gratis | 4.9 | 1,000,000+ | low level |
| Q | English 1500 conversation | 4.7 | 1,000,000+ | focuses on listening |
| R | English conversation | 4.9 | 1,000,000+ | lots of ads |
| T | English conversation practice case | 4.9 | 100,000+ | time consuming |
| V | English skills—practicar y aprender inglés | 4.5 | 1,000,000+ | lack of short dialogues |
| X | Habla inglés—comunicar | 4.5 | 1,000,000+ | no audios |
| Y | Hablar inglés americano | 4.7 | 500,000+ | complicated to use |
| Z | Hallo: hablar inglés | 4.5 | 1,000,000+ | not safe |
| AA | Hello English: learn English | 4.6 | 10,000,000+ | complicated to use |
| BB | Hello English: learn English skills | 4.5 | 1,000,000+ | audios do not work |
| CC | Ielts listening English—eli | 4.9 | 1,000,000+ | standardized test |
| FF | Practica conversar en inglés (talk english) | 4.5 | 5,000,000+ | not eye-catching |
| HH | Pronunciación correcta—aprende inglés | 4.6 | 1,000,000+ | focuses on pronunciation |
| II | Reallife | 4.9 | 100,000+ | focuses on listening |
| JJ | Redkiwi: escucha&habla inglés | 4.7 | 100,000+ | focuses on listening |
| KK | English pronunciation (yobimi group) | 4.7 | 500,000+ | focuses on pronunciation |
| LL | Speak English pro: American pronunciation | 4.7 | 100,000+ | focuses on pronunciation |
| MM | Speak English! | 4.5 | 1,000,000+ | not safe |
| Category | O | P | W | GG | NN | DD | S | M | EE | IU |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Content quality | 7 | 7 | 6 | 8 | 8 | 3 | 8 | 8 | 8 | 3 |
| Pedagogical coherence | 8 | 7 | 7 | 6.5 | 9 | 5 | 8 | 8 | 8 | 4 |
| Feedback and self-correction | 8 | 5 | 7.5 | 7 | 3 | 7 | 5 | 7 | 5 | 7 |
| Motivation | 4 | 6 | 6 | 7 | 5 | 6 | 3 | 6 | 3 | 5 |
| Usability | 6 | 7 | 6 | 6 | 7 | 9 | 4 | 3 | 4 | 8 |
| Customization | 8 | 8 | 6 | 3 | 6 | 4 | 5 | 4 | 4 | 5 |
| Sharing | 7 | 7 | 6 | 6 | 3 | 7 | 7 | 4 | 4 | 3 |
| Total score | 48 | 47 | 44.5 | 43.5 | 41 | 41 | 40 | 40 | 36 | 35 |
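As a consistency check, the total score for each app can be reproduced by summing its seven category scores. A minimal sketch, with one column per app in the order the codes appear in the rubric results above:

```python
# Category scores per app, transcribed from the rubric results above
# (one column per app code, in the header's order).
scores = {
    "Content quality":              [7, 7, 6, 8, 8, 3, 8, 8, 8, 3],
    "Pedagogical coherence":        [8, 7, 7, 6.5, 9, 5, 8, 8, 8, 4],
    "Feedback and self-correction": [8, 5, 7.5, 7, 3, 7, 5, 7, 5, 7],
    "Motivation":                   [4, 6, 6, 7, 5, 6, 3, 6, 3, 5],
    "Usability":                    [6, 7, 6, 6, 7, 9, 4, 3, 4, 8],
    "Customization":                [8, 8, 6, 3, 6, 4, 5, 4, 4, 5],
    "Sharing":                      [7, 7, 6, 6, 3, 7, 7, 4, 4, 3],
}

# Total score per app = sum of its seven category scores
# (zip(*...) transposes the rows into per-app columns).
totals = [sum(col) for col in zip(*scores.values())]
print(totals)  # [48, 47, 44.5, 43.5, 41, 41, 40, 40, 36, 35]
```

The computed totals match the "Total score" row, and the highest-scoring app (48 points) is the one selected for the research proposal.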
Week 1: Greetings
Objective: To identify the different forms of greetings and apply them in conversations in different spaces, such as offices, restaurants, parks, medical centers, etc.
Resources: teacher, students, flash cards, smartphones, Duolingo app
Warm up: The teacher will start the class with a “Say Hello” game. The teacher will ask the students to go out to the playground and form a circle. Then, the teacher will explain the dynamics of the game to the students.
The students should greet the classmates who meet the teacher’s characteristics. For example, say “hello” to whoever is wearing green pants.
5 min
Duolingo Time: The students have already created their accounts and profiles and will start with the first session.
The teacher asks the students to choose a space on the playground and work on the “Greeting” session of the application.
The students will complete the following activities: complete sentences and questions to make a short conversation, match words with their meaning, read short paragraphs and choose the correct answer, answer questions, listen to sentences and repeat them or write them down, and translate sentences from Spanish to English or from English to Spanish. All of these exercises are related to the theme.
The students have 15 min to do so. At the end of the lesson, the students should show the teacher their crown, which signifies that they have successfully completed their session.
While the students work on the application, the teacher observes the students for any questions or doubts.
15 min
Speaking in real context
(Evaluation)
The teacher will ask students to enter the classroom. The teacher will hold in his/her hands two bags containing flashcards about space and time.
Each group of students will choose whether their conversation takes place in the morning, afternoon, evening, or night and in which space.
The teacher will ask the students to record their conversations with their cell phones and then send them to his or her email.
15 min
Feedback: The teacher will take notes on errors as the activity unfolds. At the end of the speaking activity, the teacher will give general feedback and correct errors across the board.
As homework, the students will have to complete the same level of Duolingo at home.
10 min
Week 1: Making Plans for the Weekend
Objective: To know how to ask a friend what he/she would like to do on the weekend.
Resources: teacher, students, Sarah’s, Lukas’s and Jake’s Photo, smartphones, Duolingo app
Warm up: The teacher will ask the students to go out to the playground and form groups of the size that he/she indicates.
The student who is left out of a group will answer the following question: What would you like to do on the weekend?
The teacher will play this game with the students several times so that different students can answer the question.
The teacher will explain to the students that the theme of the class will be to talk about plans for the weekend.
5 min
Duolingo Time: The teacher asks the students to choose a space on the playground and work on the “Plans” session of the application. Students will complete the following activities: complete sentences and questions to make a short conversation, match words with their meaning, read short paragraphs and choose the correct answer, answer questions, listen to sentences and repeat them or write them down, and translate sentences from Spanish to English or from English to Spanish. All of these exercises are related to the theme.
The students have 15 min to do so. At the end of the lesson, the students should show the teacher their crown, which signifies that they have successfully completed their session.
While the students work on the application, the teacher observes the students for any questions or doubts.
15 min
Speaking in real context
(Evaluation)
The teacher will show the group a picture of three teenagers: Sarah, Jake, and Lukas. The teacher will tell the students their story. Sarah is a teenager who is going out with her friends for the weekend.
The teacher will ask the students to choose a character and ask their friends about their plans for the weekend.
For this activity, students will work in pairs. Each pair will come to the front of the class and discuss their plans.
15 min
Feedback: The teacher will take notes on errors as the activity unfolds. At the end of the speaking activity, the teacher will give general feedback and correct errors across the board.
As homework, the students will have to complete the same level of Duolingo at home.
10 min
Week 2: Comparatives and Superlatives
Objective: To identify how to use and express the comparative and superlative forms in a sentence. Vocabulary (adjectives)
Resources: teacher, students, students’ family photos, smartphones, Duolingo app
Warm up: The teacher will show the students a picture of his/her family. The teacher will describe each family member using comparative and superlative sentences. Example: “This is my sister Karen. She is the youngest in the family.” In this way, the teacher will indicate the topic of the activity to the students.
5 min
Duolingo Time: The teacher asks the students to choose a space on the playground and work on the “Comparatives and Superlatives” session of the application.
Students will complete the following activities: complete sentences and questions to make a short conversation, match words with their meaning, read short paragraphs and choose the correct answer, answer questions, listen to sentences and repeat them or write them down, and translate sentences from Spanish to English or from English to Spanish. All of these exercises are related to the theme.
The students have 15 min to do so. At the end of the lesson, the students should show the teacher their crown, which signifies that they have successfully completed their session.
While the students work on the application, the teacher observes the students for any questions or doubts.
15 min
Speaking in real context
(Evaluation)
The teacher will ask students to bring in a family photo.
In each of the groups, the students will describe their family pictures. Example: This is my family. My brother Carlos is the youngest of all, my sister Laura is taller than my mom, but my dad is the tallest in the family.
This time, only students who volunteer will come to the front of the class. This is to see whether students have developed more confidence in speaking in front of their peers.
15 min
Feedback: The teacher will take notes on errors as the activity unfolds. At the end of the speaking activity, the teacher will give general feedback and correct errors across the board.
As homework, the students will have to complete the same level of Duolingo at home.
10 min
Week 2: Likes and Dislikes
Objective: To identify how to use and express likes and dislikes in a sentence.
Resources: teacher, students, questions, smartphones, Duolingo app
Warm up: The teacher will start the class with a “Speaking Telephone” game. The teacher will ask the students to form two lines. The teacher will say the phrase “I like fruits and candy, but I don’t like soup, pizza and noodles” to the first student in each line.
The students will have to relay the message to their classmates. At the end, the last student will say the message out loud.
The students will compare their final message with the original; the distortions along the way are usually amusing.
5 min
Duolingo Time
The teacher asks the students to choose a space on the playground and work on the “Likes and Dislikes” session of the application.
Students will complete the following activities: complete sentences and questions to make a short conversation, match words with their meaning, read short paragraphs and choose the correct answer, answer questions, listen to sentences and repeat them or write them down, and translate sentences from Spanish to English or from English to Spanish. All of these exercises are related to the theme.
The students have 15 min to do so. At the end of the lesson, the students should show the teacher their crown, which signifies that they have successfully completed their session.
While the students work on the application, the teacher observes the students for any questions or doubts.
15 min
Speaking in real context
(Evaluation)
The teacher will share with each group of students a list of questions.
The students will have to discuss these questions and present their points of view. This activity should be recorded and sent to the teacher’s email.
Questions:
15 min
Feedback
The teacher will take notes on errors as the activity unfolds. At the end of the speaking activity, the teacher will give general feedback and correct errors across the board.
As homework, the students will have to complete the same level of Duolingo at home.
10 min
Week 3: What Kind of Friend Are You?
Objective: To identify and use adjectives to describe themselves and others to express what kinds of friends they are.
Resources: teacher, students, TikTok questions, smartphones, Duolingo app
Warm up
Students will record a video with a friend answering 14 questions.
The teacher will ask students to form pairs, then they should put a cell phone in front of their faces.
Students should play the questions, close their eyes, and point to the friend who best fits each question. For example: Who is the funnier of the two?
5 min
Duolingo Time
The teacher asks the students to choose a space on the playground and work on the “Friends” session of the application.
Students will complete the following activities: complete sentences and questions to make a short conversation, match words with their meaning, read short paragraphs and choose the correct answer, answer questions, listen to sentences and repeat them or write them down, and translate sentences from Spanish to English or from English to Spanish. All of these exercises are related to the theme.
The students have 15 min to do so. At the end of the lesson, the students should show the teacher their crown, which signifies that they have successfully completed their session.
While the students work on the application, the teacher observes the students for any questions or doubts.
15 min
Speaking in real context
(Evaluation)
The teacher will write the names of all the students on slips of paper, then ask each student to draw one slip.
The student will read the name and describe that person so their classmates can guess who it is.
15 min
Feedback
The teacher will take notes on errors as the activity unfolds. At the end of the speaking activity, the teacher will give general feedback and correct errors across the board.
As homework, the students will have to complete the same level of Duolingo at home.
10 min
Week 3: Emotions
Objective: To identify and use emotion vocabulary to express how they feel about different actions or events.
Resources: teacher, students, Spider-Man movie pictures, smartphones, Duolingo app
Warm up
The teacher will show the class pictures of famous actors from the latest Spider-Man movie and ask the students about the emotion in each picture: whether the actor or actress looks happy, upset, or excited, and why.
Students will respond to each question by expressing their point of view.
5 min
Duolingo Time
The teacher asks the students to choose a space on the playground and work on the “Emotions” session of the application.
Students will complete the following activities: complete sentences and questions to make a short conversation, match words with their meaning, read short paragraphs and choose the correct answer, answer questions, listen to sentences and repeat them or write them down, and translate sentences from Spanish to English or from English to Spanish. All of these exercises are related to the theme.
The students have 15 min to do so. At the end of the lesson, the students should show the teacher their crown, which signifies that they have successfully completed their session.
While the students work on the application, the teacher observes the students for any questions or doubts.
15 min
Speaking in real context
(Evaluation)
The teacher will read short paragraphs to his/her students.
Students should listen and then say how they feel about the situation.
For example: Pandemic, Animals in Danger of Extinction, The Premiere of a New Movie, etc.
15 min
Feedback
The teacher will take notes on errors as the activity unfolds. At the end of the speaking activity, the teacher will give general feedback and correct errors across the board.
As homework, the students will have to complete the same level of Duolingo at home.
10 min
Week 4: Nature
Objective: To identify nature-related vocabulary and use it in conversations.
Resources: teacher, students, nature audio, smartphone, Duolingo app
Warm up
The teacher will play an audio recording of nature sounds.
The teacher will ask the students to use the sounds to describe a place.
Students who wish to do so will raise their hand and give a brief description of a place based on the sounds they heard.
5 min
Duolingo Time
The teacher asks the students to choose a space on the playground and work on the “Nature” session of the application.
Students will complete the following activities: complete sentences and questions to make a short conversation, match words with their meaning, read short paragraphs and choose the correct answer, answer questions, listen to sentences and repeat them or write them down, and translate sentences from Spanish to English or from English to Spanish. All of these exercises are related to the theme.
The students have 15 min to do so. At the end of the lesson, the students should show the teacher their crown, which signifies that they have successfully completed their session.
While the students work on the application, the teacher observes the students for any questions or doubts.
15 min
Speaking in real context
(Evaluation)
The teacher will ask the students to record with their phones a video advertising a company that organizes camping trips.
Students should mention in their videos the activities that can be performed at the camp.
Students should send their videos to the teacher’s email address.
15 min
Feedback
The teacher will take notes on errors as the activity unfolds. At the end of the speaking activity, the teacher will give general feedback and correct errors across the board.
As homework, the students will have to complete the same level of Duolingo at home.
10 min
Week 4: Hobbies
Objective: To identify hobby-related vocabulary and use it in conversations.
Resources: teacher, students, fancy hat, smartphones, Duolingo app
Warm up
The teacher will bring a fancy hat to class.
The teacher will sit at the front of the class and ask one of the students to be the interviewer and wear the fancy hat; the topic will be hobbies.
Students can take turns wearing the hat and interviewing the teacher.
5 min
Duolingo Time
The teacher asks the students to choose a space on the playground and work on the “Hobbies” session of the application.
Students will complete the following activities: complete sentences and questions to make a short conversation, match words with their meaning, read short paragraphs and choose the correct answer, answer questions, listen to sentences and repeat them or write them down, and translate sentences from Spanish to English or from English to Spanish. All of these exercises are related to the theme.
The students have 15 min to do so. At the end of the lesson, the students should show the teacher their crown, which signifies that they have successfully completed their session.
While the students work on the application, the teacher observes the students for any questions or doubts.
15 min
Speaking in real context
(Evaluation)
This time, the teacher will put on the fancy hat and interview the students briefly.
Students will choose to be a superhero like Wonder Woman or Aquaman, wear their masks and answer questions about their hobbies.
15 min
Feedback
The teacher will take notes on errors as the activity unfolds. At the end of the speaking activity, the teacher will give general feedback and correct errors across the board.
As homework, the students will have to complete the same level of Duolingo at home.
10 min
Intervention Proposal
Improve Speaking Skills in the English Language Teaching Process through the Use of the Duolingo App.
Name: Where Am I?
I speak fluently, clearly and loudly.
I understand what my friends are talking about.
My friends understand when I say something.
I am not afraid to express my ideas out loud.
When I work in a group, we all listen to each other.
I am more confident to speak when I work in a group than when I work alone.
B2 | Grammar and Vocabulary | Discourse Management | Pronunciation | Interactive Communication
Performance shares features of Bands 3 and 5
Performance shares features of Bands 1 and 3
Performance below Band 1
MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

Criollo-C, S.; Guerrero-Arias, A.; Vidal, J.; Jaramillo-Alcazar, Á.; Luján-Mora, S. A Hybrid Methodology to Improve Speaking Skills in English Language Learning Using Mobile Applications. Appl. Sci. 2022 , 12 , 9311. https://doi.org/10.3390/app12189311




Improving Students’ Speaking Skills through Task-Based Learning: An Action Research at the English Department

Ratna Dewanti Dewanti

2020, International Journal of Multicultural and Multireligious Understanding

This study aims to improve students' speaking skills at the Department of English. Based on interviews carried out to get initial data on the students' speaking skills, it was shown that the students had problems in speaking due to inadequate knowledge of the language, which in turn made the students feel unconfident to speak. The students were not familiar with various speaking activities facilitating them to speak; they read from texts to convey ideas and lacked strategies when speaking. To help the students, task-based learning was adapted through action research in a one-semester course. Fifteen students in the third semester participated in this study. The data were taken from the results of the pre-test to post-test, interview, and observation. The findings reveal that the use of task-based learning helps the students improve their speaking skills on the three indicators assessed: accuracy, vocabulary, and comprehension. The students manage to complete the tasks by conduct...

Related Papers

International Journal of Multicultural and Multireligious Understanding

Moh. Farid Maftuh

In the preliminary study, the researchers found that the average value of fluency in English speaking is 2.25. This is because the teacher always uses a monotonous, less enjoyable learning strategy and focuses only on completing the material, so the learning process merely transfers knowledge; the level of mastery of the material is low, which ultimately decreases student achievement. To overcome this problem, the researchers propose a strategy for teaching speaking skill, namely the role-playing learning strategy. This study is designed to improve fluency in English speaking by using the role-playing learning strategy. This study aims to explain how the role-playing learning strategy can improve fluency in English speaking in the 4th semester, English Study Program, Business Administration Department, State Polytechnic of Madiun. This study uses a Classroom Action Research (CAR) design which is collaborative in nature, where researchers and teacher collaborate in carrying out...

research proposal improving speaking skill

khusnul khoiriyah

The aims of this research were to design the integrated-skills materials for the fourth grade students of Islamic elementary schools, and to find out the appropriateness of the materials developed. The type of this research was Research and Development (R & D). The research followed the Dick and Carey model in developing the materials, which was simplified into six steps. This research employed three techniques in collecting the data, namely survey, interview and observation. Based on the techniques, the instruments of the research were questionnaires, the guideline for interview and observation sheets. The result of this study is the integrated-skills materials for the students of Islamic elementary school entitled “Let’s learn English”, which consisted of integrated-skills activities and some Islamic values in every unit. The unit design of the materials presented the warming-up activity, the main activity including integrated skills activities, games, and grammar notes, glossary, re...

eka apriani

The present research aimed at investigating kinds of cohesive devices and the problems of using those cohesive devices in writing English paragraphs. 10 undergraduate English students from an institute in Curup, Bengkulu, Indonesia were involved as the participants purposively. Document analysis was conducted towards students’ written paragraphs to garner the data about kinds of cohesive devices, and they were then interviewed to reveal information with respect to their problems of using cohesive devices. The data were analyzed using an interactive model. The present research revealed that the students had used some kinds of cohesive devices such as references in the form of personal pronoun and demonstrative reference. They used conjunctions in the form of additive, adversative, and clausal conjunctions. They used reiteration in the form of making repetitions of the same words. This condition indicated that they had moderately been able to use general cohesive devices. However, the...

Sumardi Sumardi

This research is focused on the use of Powtoon as a digital medium to improve the students’ pronunciation in speaking class. The researcher applies a classroom action research design consisting of two cycles with three meetings in every cycle. Collections of quantitative and qualitative data are gained from test, observation, questionnaire, interview, and diary. The quantitative data obtained from the test are analyzed in order to find out the mean score. Furthermore, the qualitative data are analyzed by: assembling the data, coding the data, comparing the data, building interpretations, and reporting the outcomes. The result of the research shows that Powtoon as a digital medium could improve: (1) the students’ pronunciation in speaking class; and (2) the students’ learning motivation. Hence, the findings reveal that improving the students’ pronunciation in speaking using Powtoon was successful viewed from some dimensions. The implementation of using Powtoon improved the teaching and...

Suranto Suranto

Speaking as one of the competences in English has become a main goal for those who study English. This competence becomes a symbol for students to show that they have mastered English. In fact, teaching speaking in school often fails because the teacher doesn’t include culture in teaching it. Most teachers forget that language is an important part of the culture. The aim of this research was to investigate deeply the strategy of teaching speaking through culture. Using an ethnographic design, the researcher revealed the strategy used by the teacher in teaching speaking through culture. The object of the study was students of a Senior High School at the higher level. Data were collected through observation, interviews with the teachers and students, and document review. The finding indicated that the teacher integrated the adaptive strategy in teaching speaking with three levels of culture, that is: cultural knowledge, cultural awareness and cultural competence.

Sudirman Wilian

The aim of this study was to determine whether or not the classroom learning environment strategy is effective in increasing students’ vocabulary acquisition. The research design was an experimental study. There were four groups in this design: two experimental groups and two control groups. The sampling technique was random sampling, which meant each subject or unit had an equal chance of being selected. The techniques of data collection in this research were documentation and testing. Data were analyzed by using the t-test formula and t-table. The sample for this study consisted of one hundred and sixty-four (164) students. The data were collected through pre-test, treatment, and post-test, where the experimental groups were treated with the classroom learning environment strategy whereas the control groups were taught using common teaching. The finding showed that the t-test values were higher than the t-table value: 5.3839 and 7.0249 > 1.990 at the .05 significance level.

tarlan ghashghaei

The emergence of Computer Assisted Language Learning (CALL) has drastically changed the mode of teaching in many educational contexts. CALL can not only facilitate meaningful language learning but can also accelerate it while giving students’ learning more depth and breadth. Different types of learning in general, and distance learning in particular, which is now prevalent in the present Corona pandemic, are not feasible without computer-based training. A mixed-method research design was adopted over one semester. The study measurement consisted of two sections: a quantitative section in which a survey questionnaire was utilized, and a qualitative one in which, on the whole, 30 sessions of semi-structured interviews were conducted with participants to understand instructors’ opinions about the merits of using CALL in teaching English. Using the non-parametric test of Spearman correlation and the independent samples t-test, it was illuminated that having computer facilities impels instructors to...

Mansour Amini

This experimental study investigated the relationship between noticing of corrective feedback and L2 development considering the learners’ perspective on error correction. Specifically, it aimed to uncover the noticeability and effectiveness of recasts, prompts, a combination of the two, to determine a relationship between noticing of CF and learning of the past tense. The participants were four groups of college ESL learners (n = 40). Each group was assigned to a treatment condition, but the researcher taught the control group. CF was provided to learners in response to their mistakes in forming the past tense. While noticing of CF was assessed through immediate recall and questionnaire responses, learning outcomes were measured through picture description administered via pre-test, post-test, and delayed post-test design. Learner beliefs about CF were probed by means of a 40-item questionnaire. The results indicated that the noticeability of CF is dependent on the grammatical targ...


Endang Situmorang

This research’s purpose is to investigate the compatibility of materials in the textbook Bahasa Inggris based on Tomlinson’s theory. The researchers use descriptive research as the type of the research. The data of the study are the content of the English textbook entitled Bahasa Inggris SMA/SMK K-13. The analysis is done by using the Three Levels of Analysis by Littlejohn (2011). They are: 1) Level 1 Analysis: ‘What is There’ (Objective Description), 2) Level 2 Analysis: ‘What is Required of Users’ (Subjective Analysis), and 3) Level 3 Analysis: ‘What is Implied’ (Subjective Inference). The result of this study shows that the English textbook Bahasa Inggris fulfills 15 criteria, or 93.75%, of Tomlinson’s theory. Therefore, the textbook is suitable for use by the students.


SYSTEMATIC REVIEW article

Assessing speaking proficiency: a narrative review of speaking assessment research within the argument-based validation framework.

Jason Fan

  • 1 Language Testing Research Centre, The University of Melbourne, Melbourne, VIC, Australia
  • 2 Department of Linguistics, University of Illinois at Urbana-Champaign, Champaign, IL, United States

This paper provides a narrative review of empirical research on the assessment of speaking proficiency published in selected journals in the field of language assessment. A total of 104 published articles on speaking assessment were collected and systematically analyzed within an argument-based validation framework ( Chapelle et al., 2008 ). We examined how the published research is represented in the six inferences of this framework, the topics that were covered by each article, and the research methods that were employed in collecting the backings to support the assumptions underlying each inference. Our analysis results revealed that: (a) most of the collected articles could be categorized into the three inferences of evaluation, generalization, and explanation; (b) the topics most frequently explored by speaking assessment researchers included the constructs of speaking ability, rater effects, and factors that affect spoken performance, among others; (c) quantitative methods were more frequently employed to interrogate the inferences of evaluation and generalization whereas qualitative methods were more frequently utilized to investigate the explanation inference. The paper concludes with a discussion of the implications of this study in relation to gaining a more nuanced understanding of task- or domain-specific speaking abilities, understanding speaking assessment in classroom contexts, and strengthening the interfaces between speaking assessment and teaching and learning practices.

Introduction

Speaking is a crucial language skill which we use every day to communicate with others, to express our views, and to project our identity. In today's globalized world, speaking skills are recognized as essential for international mobility, entrance to higher education, and employment ( Fulcher, 2015a ; Isaacs, 2016 ), and are now a major component in most international and local language examinations, due at least in part to the rise of the communicative movement in language teaching and assessment ( Fulcher, 2000 ). However, despite its primacy in language pedagogy and assessment, speaking has been considered as an intangible construct which is challenging to conceptualize and assess in a reliable and valid manner. This could be attributable to the dynamic and context-embedded nature of speaking, but may be also due to the various forms that it can assume (e.g., monolog, paired conversation, group discussion) and the different conditions under which speaking happens (e.g., planned or spontaneous) (e.g., Luoma, 2004 ; Carter and McCarthy, 2017 ). When assessing speaking proficiency, multiple factors come into play which potentially affect test takers' performance and subsequently their test scores, including task features, interlocutor characteristics, rater effects, and rating scale, among others ( McNamara, 1996 ; Fulcher, 2015a ). In the field of language assessment, considerable research attention and efforts have been dedicated to researching speaking assessment. This is evidenced by the increasing number of research papers with a focus on speaking assessment that have been published in the leading journals in the field.

This prolonged growth in speaking assessment research warrants a systematic review of major findings that can help subsequent researchers and practitioners to navigate the plethora of published research, or provide them with sound recommendations for future explorations in the speaking assessment domain. Several review or position papers are currently available on speaking assessment, either reviewing the developments in speaking assessment more broadly (e.g., Ginther, 2013 ; O'Sullivan, 2014 ; Isaacs, 2016 ) or examining a specific topic in speaking assessment, such as pronunciation ( Isaacs, 2014 ), rating spoken performance ( Winke, 2012 ) and interactional competence ( Galaczi and Taylor, 2018 ). Needless to say, these papers are valuable in surveying related developments in speaking proficiency assessment and sketching a broad picture of speaking assessment for researchers and practitioners in the field. Nonetheless, they typically adopt the traditional literature review approach, as opposed to the narrative review approach that was employed in this study. According to Norris and Ortega (2006 , p. 5, cited in Ellis, 2015 , p. 285), a narrative review aims to “scope out and tell a story about the empirical territory.” Compared with a traditional literature review, which tends to rely on a reviewer's subjective evaluation of the important or critical aspects of the existing knowledge on a topic, a narrative review is more objective and systematic in the sense that the results are usually based on the coding analysis of the studies that are collected through applying some pre-specified criteria. Situated within the argument-based validation framework ( Chapelle et al., 2008 ), this study is aimed at presenting a narrative review of empirical research on speaking assessment published in two leading journals in the field of language assessment, namely, Language Testing (LT) and Language Assessment Quarterly (LAQ).
Following the systematic research procedures of narrative review (e.g., Cooper et al., 2019 ), we survey the topics of speaking assessment that researchers have explored, as well as the research methods that have been utilized, with a view to providing recommendations for future speaking assessment research and practice.

Theoretical Framework

Emerging from the validation of the revised Test of English as a Foreign Language (TOEFL), the argument-based validation framework adopted in this study represents an expansion of Kane's (2006) argument-based validation model, which posits that a network of inferences needs to be verified to support test score interpretation and use. A graphic display of this framework is presented in Figure 1 . As shown in this figure, the plausibility of six inferences needs to be verified to build a validity argument for a language test, including: domain definition, evaluation, generalization, explanation, extrapolation, and utilization. Also included in the framework are the key warrants that license each inference and its underlying assumptions. This framework was adopted as the guiding theoretical framework of this review study in the sense that each article collected for this study was classified into one or several of these six inferences in the framework. As such, it is necessary to briefly explain these inferences in Figure 1 in the context of speaking assessment. The explanation of the inferences, together with their warrants and assumptions, is largely based on Chapelle et al. (2008) and Knoch and Chapelle (2018) . To facilitate readers' understanding of these inferences, we use the TOEFL speaking test as an example to provide an illustration of the warrants, key assumptions, and backings for each inference.


Figure 1 . The argument-based validation framework (adapted from Chapelle et al., 2008 , p. 18).

The first inference, domain definition , links the target language use (TLU) domain to test takers' observed performance on a speaking test. The warrant supporting this inference is that observation of test takers' performance on a speaking test reveals the speaking abilities and skills required in the TLU domain. In the case of the TOEFL speaking test, the TLU domain is the English-medium institutions of higher education. Therefore, the plausibility of this inference hinges on whether observation of test takers' performance on the speaking tasks reveals essential academic speaking abilities and skills in English-medium universities. An important assumption underlying this inference is that speaking tasks that are representative of language use in English-medium universities can be identified and simulated. Backings in support of this assumption can be collected through interviews with academic English experts to investigate speaking abilities and skills that are required in English-medium universities.

The warrant for the next inference, evaluation , is that test takers' performance on the speaking tasks is evaluated to provide observed scores which are indicative of their academic speaking abilities. The first key assumption underlying this warrant is that the rating scales for the TOEFL speaking test function as intended by the test provider. Backings for this assumption may include: a) using statistical analyses (e.g., many-facets Rasch measurement, or MFRM) to investigate the functioning of the rating scales for the speaking test; and b) using qualitative methods (e.g., raters' verbal protocols) to explore raters' use of the rating scales for the speaking test. Another assumption for this warrant is that raters provide consistent ratings on each task of the speaking test. Backing for this assumption typically entails the use of statistical analyses to examine rater reliability on each task of the speaking test. The third assumption is that detectable rater characteristics do not introduce systematic construct-irrelevant variance into their ratings of test takers' performance. Bias analyses are usually implemented to explore whether certain rater characteristics (e.g., experience, L1 background) interact with test taker characteristics (e.g., L1 background) in significant ways.
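As a point of reference for readers unfamiliar with MFRM (this gloss is ours, not part of the original framework description), the many-facet Rasch model mentioned above is commonly written as:

```latex
\log \frac{P_{nijk}}{P_{nij(k-1)}} = B_n - D_i - C_j - F_k
```

where $B_n$ is the ability of test taker $n$, $D_i$ the difficulty of task $i$, $C_j$ the severity of rater $j$, and $F_k$ the difficulty of scale step $k$ relative to step $k-1$. The bias analyses described above extend this model with interaction terms (e.g., a rater-by-test-taker term) to detect systematic construct-irrelevant variance.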

The third inference is generalization. The warrant that licenses this inference is that test takers' observed scores reflect their expected scores over multiple parallel versions of the speaking test and across different raters. A few key assumptions underlie this inference: (a) a sufficient number of tasks are included in the TOEFL speaking test to provide stable estimates of test takers' speaking ability; (b) multiple parallel versions of the speaking test feature similar levels of difficulty and tap into similar academic English speaking constructs; and (c) raters rate test takers' performance consistently at the test level. To support the first assumption, generalizability theory (i.e., G-theory) analyses can be implemented to explore the number of tasks required to achieve the desired level of reliability. For the second assumption, backings can be collected through: (a) statistical analyses to ascertain whether multiple parallel versions of the speaking test have comparable difficulty levels; and (b) qualitative methods such as expert review to explore whether the parallel versions tap into similar academic English speaking constructs. Backing for the third assumption typically entails statistical analyses of the scores that raters have awarded to test takers to examine their reliability at the test level.
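A full G-theory decision study estimates variance components for persons, tasks, and raters, but the core question above, how many tasks are needed to reach a target reliability, can be approximated with the Spearman-Brown prophecy formula. This is a simplification of G-theory, and the reliability figures below are hypothetical:

```python
import math

def tasks_needed(single_task_reliability: float, target_reliability: float) -> int:
    """Smallest number of parallel tasks whose aggregate score reaches the
    target reliability, via the Spearman-Brown prophecy formula."""
    r1, rt = single_task_reliability, target_reliability
    # Lengthening factor: n = rt(1 - r1) / (r1(1 - rt))
    n = rt * (1 - r1) / (r1 * (1 - rt))
    return math.ceil(n)

# Hypothetical example: if a single speaking task has reliability .55,
# how many parallel tasks are needed for a composite reliability of .85?
print(tasks_needed(0.55, 0.85))  # -> 5
```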

The fourth inference is explanation . The warrant of this inference is that test takers' expected scores can be used to explain the academic English speaking constructs that the test purports to assess. The key assumptions for this inference include: (a) features of the spoken discourse produced by test takers on the TOEFL speaking test can effectively distinguish L2 speakers at different proficiency levels; (b) the rating scales are developed based on academic English speaking constructs that are clearly defined; and (c) raters' cognitive processes when rating test takers' spoken performance are aligned with relevant theoretical models of L2 speaking. Backings of these three assumptions can be collected through: (a) discourse analysis studies aiming to explore the linguistic features of spoken discourse that test takers produce on the speaking tasks; (b) expert review of the rating scales to ascertain whether they reflect relevant theoretical models of L2 speaking proficiency; and (c) rater verbal protocol studies to examine raters' cognitive processes when rating performance on the speaking test.

The fifth inference in the framework is extrapolation. The warrant that supports this inference is that the speaking constructs assessed in the speaking test account for test takers' spoken performance in English-medium universities. The first key assumption underlying this warrant is that test takers' performance on the TOEFL speaking test is related to their ability to use language in English-medium universities. Backing for this assumption is typically collected through correlation studies, that is, by correlating test takers' performance on the speaking test with an external criterion representing their ability to use language in the TLU domain (e.g., teachers' evaluations of students' academic English speaking proficiency). The second key assumption for extrapolation is that raters' use of the rating scales reflects how spoken performance is evaluated in English-medium universities. For this assumption, qualitative studies can be undertaken to compare raters' cognitive processes with those of linguistic laypersons in English-medium universities, such as subject teachers.
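The correlational backing described above can be illustrated with a Pearson coefficient between speaking test scores and an external criterion such as teachers' evaluations. Both data series below are invented purely for illustration:

```python
from statistics import mean, stdev

def pearson(x, y):
    """Pearson correlation between paired observations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

# Hypothetical data: speaking test scores (0-30) and teachers' 1-6 evaluations
test_scores = [18, 22, 25, 14, 30, 26, 20, 17]
teacher_eval = [3, 4, 4, 2, 6, 5, 3, 3]
r = pearson(test_scores, teacher_eval)
print(round(r, 2))  # -> 0.97
```

A high correlation with a trustworthy criterion would support extrapolation; in practice, researchers would also attend to the criterion's own reliability and to attenuation.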

The last inference is utilization . The warrant supporting this inference is that the speaking test scores are communicated in appropriate ways and are useful for making decisions. The assumptions that underlie the warrant include: (a) the meaning of the TOEFL speaking test scores is clearly interpreted by relevant stakeholders, such as admissions officers, test takers, and teachers; (b) cut scores are appropriate for making relevant decisions about students; and (c) the TOEFL speaking test has a positive influence on English teaching and learning. To collect the backings for the first assumption, qualitative studies (e.g., interviews, focus groups) can be conducted to explore stakeholders' perceptions of how the speaking test scores are communicated. For the second assumption, standard setting studies are often implemented to interrogate the appropriateness of cut scores. The last assumption is usually investigated through test washback studies, exploring how the speaking test influences English teaching and learning practices.

The framework was used in the validation of the revised TOEFL, as reported in Chapelle et al. (2008), as well as in low-stakes classroom-based assessment contexts (e.g., Chapelle et al., 2015). According to Chapelle et al. (2010), this framework features several salient advantages over its alternatives. First, given the dynamic and context-mediated nature of language ability, it is extremely challenging to use the definition of a language construct as the basis for building a validity argument. Instead of relying on an explicit definition of the construct, the argument-based approach advocates the specification of a network of inferences, together with their supporting warrants and underlying assumptions, that links test takers' observed performances to score interpretation and use. This framework also makes it easier to formulate validation research plans. Since every assumption is associated with a specific inference, research questions targeting each assumption are developed 'in a more principled way as a piece of an interpretative argument' (Chapelle et al., 2010, p. 8). As such, the relationship between validity argument and validation research becomes more apparent. Another advantage of this approach to test validation is that it enables the structuring and synthesis of research results into a logical and coherent validity argument, rather than a mere amalgamation of research evidence. In so doing, it depicts the logical progression whereby the conclusion from one inference becomes the starting point of the next, and shows how each inference is supported by research. Finally, by constructing a validity argument, this approach allows for a critical evaluation of both the logical development of the validity argument and the research that supports each inference. Beyond these advantages for test validation research, the framework is also comprehensive, making it particularly suitable for this review study.

By incorporating this argument-based validation framework in a narrative review of the published research on speaking assessment, this study aims to address the following research questions:

RQ1. How does the published research on speaking assessment represent the six inferences in the argument-based validation framework?

RQ2. What speaking assessment topics constituted the focus of the published research?

RQ3. What methods did researchers adopt to collect backings for the assumptions involved in each inference?

This study followed the research synthesis steps recommended by Cooper et al. (2019): (1) problem formulation; (2) literature search; (3) data evaluation; (4) data analysis; (5) interpretation of results; and (6) public presentation. This section details the article search and selection process and the methods used to synthesize the collected studies.

Article Search and Selection

We collected the articles on speaking assessment published in LT from 1984 to 2018 and in LAQ from 2004 to 2018. These two journals were targeted because: (a) both are recognized as leading high-impact journals in the field of language assessment; and (b) both have an explicit focus on the assessment of language abilities and skills. We acknowledge that numerous other journals in applied linguistics or educational evaluation also publish research on speaking and its assessment, and that extending the scope of our review to more journals might yield different findings; however, given the high impact of these two journals in the field, a review of their published research on speaking assessment over the past three decades or so should provide a sufficient indication of the directions in assessing speaking proficiency. This limitation is discussed at the end of this paper.

The PRISMA flowchart in Figure 2 illustrates the process of article search and selection in this study. A total of 120 articles were initially retrieved by manually surveying each issue in the electronic archives of the two journals, covering all articles published in LT from 1984 to 2018 and in LAQ from 2004 to 2018. Two inclusion criteria were applied: (a) the article had a clear focus on speaking assessment (articles that targeted a whole language test involving multiple skills were excluded); and (b) the article reported an empirical study, in the sense that it investigated one or more aspects of speaking assessment through the analysis of data from either speaking assessments or designed experimental studies.


Figure 2 . PRISMA flowchart of article search and collection.

Through a careful reading of the abstracts, 13 articles were excluded from our analysis: two special issue editorials and 11 review or position papers. A further examination of the remaining 107 articles revealed that three of them involved multiple language skills and thus lacked a primary focus on speaking assessment. These three articles were therefore excluded, yielding 104 studies in our collection. Of the 104 articles, 73 (70.19%) were published in LT and 31 (29.81%) in LAQ. All articles were downloaded in PDF format and imported into NVivo 12 (QSR, 2018) for analysis.

Data Analysis

To respond to RQ1, we coded the collected articles into the six inferences in the argument-based validation framework based on each article's focus of investigation, which was determined by a close examination of the abstract and research questions. If the primary focus did not emerge clearly in this process, we read the full text. As the coding progressed, we noticed that some articles had more than one focus and therefore should be coded into multiple inferences. For instance, Sawaki (2007) interrogated several aspects of an L2 speaking test that were considered essential to its construct validity, including the interrelationships between the different dimensions of spoken performance and the reliability of test scores. The former was considered pertinent to the explanation inference, as it explores the speaking constructs through the analysis of test scores; the latter was deemed more relevant to the generalization inference, as it concerns the consistency of test scores at the whole test level (Knoch and Chapelle, 2018). This article was therefore coded into both the explanation and generalization inferences.

To answer RQ2, the open coding method (Richards, 1999) was employed to explore the speaking assessment topics that constituted the focus of each article in our collection. This means that a coding scheme was not specified a priori; rather, it was generated by examining the abstracts or full texts to determine the topics and subtopics. RQ3 was investigated by coding the research methods that speaking assessment researchers employed. A broad coding scheme consisting of three categories was used: (a) quantitatively oriented; (b) qualitatively oriented; and (c) mixed methods with both quantitative and qualitative orientations. Next, the open coding method was adopted to code the specific methods utilized under each broad category. Matrix coding analysis (Miles et al., 2014) was subsequently implemented in NVivo to explore the relationships between the speaking assessment topics, the research methods, and the six inferences in the argument-based validation framework. This enabled us to sketch the broad patterns of: (a) which topics on speaking assessment tended to be investigated under each of the six inferences; and (b) which research methods were frequently employed to collect the backings for the assumptions that underlie each inference.
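Matrix coding of the kind NVivo performs is, at bottom, a cross-tabulation of two coding dimensions. The same tally can be reproduced outside NVivo in a few lines; the article codes below are invented for illustration:

```python
from collections import Counter

# Hypothetical coding: each article carries one topic and one or more inferences
coded_articles = [
    {"topic": "rater effects", "inferences": ["evaluation", "generalization"]},
    {"topic": "speaking constructs", "inferences": ["explanation"]},
    {"topic": "rater effects", "inferences": ["evaluation"]},
    {"topic": "score generalizability", "inferences": ["generalization"]},
]

# Each matrix cell (topic, inference) counts the articles coded with both,
# which is why multiply-coded articles inflate the column totals.
matrix = Counter(
    (article["topic"], inference)
    for article in coded_articles
    for inference in article["inferences"]
)
print(matrix[("rater effects", "evaluation")])  # -> 2
```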

The coding process underwent three iterative stages to ensure the reliability of the coding results. First, both authors independently coded 10 articles randomly selected from the dataset and then compared their coding results; differences were resolved through discussion. Next, the first author coded the rest of the articles in NVivo, using the coding scheme generated during the first stage while adding new categories as they emerged. Finally, the second author coded 20 articles (19.23%) randomly selected from the dataset, using the coding scheme determined during the second stage. Intercoder agreement was verified by calculating Cohen's kappa statistic in NVivo (k = 0.93), which suggested satisfactory coding reliability.
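Cohen's kappa corrects raw percent agreement for the agreement expected by chance from the coders' marginal distributions. The computation can be sketched directly; the coder labels below are fabricated for illustration:

```python
from collections import Counter

def cohens_kappa(coder1, coder2):
    """Cohen's kappa for two coders assigning nominal labels to the same items."""
    n = len(coder1)
    observed = sum(a == b for a, b in zip(coder1, coder2)) / n
    m1, m2 = Counter(coder1), Counter(coder2)
    # Chance agreement: sum over labels of the product of marginal proportions
    expected = sum(m1[label] * m2[label] for label in m1) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical inference codes assigned independently by two coders
coder1 = ["eval", "gen", "expl", "eval", "gen", "expl", "eval", "util"]
coder2 = ["eval", "gen", "expl", "eval", "expl", "expl", "eval", "util"]
print(round(cohens_kappa(coder1, coder2), 2))  # -> 0.83
```

Note that tools such as NVivo typically report kappa per code and source; aggregating those values into a single figure is a further methodological choice.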

Results and Discussion

Overall, our coding results indicate that a wide range of research on speaking assessment has been conducted to interrogate the six inferences in the argument-based validation framework. These studies cover a variety of research topics and employ quantitative, qualitative, and mixed research methods. In this section, we describe and discuss the analysis results by showcasing the broad patterns that emerged from our coding process. Illustrative studies are used as appropriate to exemplify the research undertaken in assessing speaking proficiency.

Representation of the Published Research in the Six Inferences

Table 1 presents the representation of the published research in the six inferences. As indicated in this table, most of our collected articles were categorized into the three inferences of evaluation ( n = 42, 40.38%), generalization ( n = 42, 40.38%), and explanation ( n = 50, 48.08%); in contrast, a much smaller number of studies targeted the other three inferences of domain description ( n = 4, 3.85%), extrapolation ( n = 7, 6.73%), and utilization ( n = 5, 4.81%). Despite the highly skewed representation of the published research in the six inferences, the findings were not entirely surprising. According to the argument-based validation framework ( Chapelle et al., 2008 ), backings in support of the assumptions that underlie the three inferences of evaluation, generalization , and explanation relate to almost all key components in the assessment of speaking proficiency, including rater effects, rating scale, task features or administration conditions, interlocutor effects in speaking tasks such as paired oral, interview or group discussion, and features of produced spoken discourse. These components essentially represent the concerns surrounding the development, administration, and validation of speaking assessment (e.g., McNamara, 1996 ; Fulcher, 2015a ). Take the inference of evaluation as an example. In the argument-based validation framework, this inference pertains to the link from the observation of test takers' performance on a speaking test to their observed scores. As mentioned previously (see section Theoretical Framework), backings in support of the key assumptions underlying this inference include an evaluation of rating scales as well as rater effects at the task level. Given the pivotal role that raters and rating scales play in speaking assessment (e.g., Eckes, 2011 ), it is not surprising to observe a reasonably high proportion of studies exploring the plausibility of this inference. 
Almost half of our collected articles ( n = 50, 48.08%) interrogated the explanation inference. This finding can be interpreted in relation to the centrality of understanding the construct in language test development and validation (e.g., Alderson et al., 1995 ; Bachman and Palmer, 1996 ), which lies at the core of the explanation inference.


Table 1 . Representation of the published research in the six inferences ( n = 104).

One possible explanation for the limited research on domain description is related to the journals that formed the basis for this review study. Both LT and LAQ have an explicit focus on language assessment, whereas in many cases, exploration of language use in TLU domains, which is the focus of domain description , might be reported as needs assessment studies in test development reports, which were beyond the purview of this study. Another plausible explanation, as pointed out by one of the reviewers, might lie in the lack of theoretical sophistication regarding this inference. The reason why few studies targeted the extrapolation inference might be attributable to the challenges in pinpointing the external criterion measure, or in collecting valid data to represent test takers' ability to use language in TLU domains. These challenges could be exacerbated in the case of speaking ability due to its intangible nature, the various forms that it may assume in practice, and the different conditions under which it happens. Similarly, very few studies focused on the utilization inference which concerns the communication and use of test scores. This could relate to the fact that test washback or impact studies have to date rarely focused exclusively on speaking assessment ( Yu et al., 2017 ). Speaking assessment researchers should consider exploring this avenue of research in future studies, particularly against the backdrop of the increasingly extensive application of technology in speaking assessment ( Chapelle, 2008 ).

Speaking Assessment Topics

Table 2 presents the matrix coding results for the speaking assessment topics and the six inferences in the argument-based validation framework. It should be noted that some of the frequency statistics in this table are over-estimates because, as mentioned previously, some articles were coded into multiple inferences; however, this should not affect the general patterns that emerged from the results in a significant way. The topics that emerged from our coding process are largely consistent with the themes that Fulcher (2015a) identified in his review of speaking assessment research. One noteworthy difference is many-facets Rasch measurement (MFRM), which appears as a topic in Fulcher (2015a) but was coded as a research method in our study (see section Research Methods). In what follows, we focus on the three topics most frequently investigated by speaking assessment researchers, namely, speaking constructs, rater effects, and factors that affect speaking performance, as examples to illustrate the research undertaken on speaking assessment.


Table 2 . Matrix coding results of inferences and speaking assessment topics ( n = 104).

Speaking Constructs

Table 2 shows that “speaking constructs” ( n = 47) is the topic that was investigated most frequently in our collected studies. Matrix coding results indicate that this topic area appears most frequently under the inference of explanation ( n = 39, 37.50%). The importance of a clear understanding of the construct cannot be overemphasized in language test development and validation (e.g., Alderson et al., 1995 ; Bachman and Palmer, 1996 ). Indeed, construct definition forms the foundation of several highly influential test validation frameworks in the field (e.g., Messick, 1989 ; Weir, 2005 ). Our analysis indicates that considerable research has been dedicated to disentangling various speaking constructs. Two topics that feature prominently in this topic area are the analysis of spoken discourse and interactional competence.

A common approach to investigating the speaking constructs is through the analysis of produced spoken discourse (Carter and McCarthy, 2017), usually focusing on linguistic features that can distinguish test takers at different proficiency levels, such as complexity, accuracy, and fluency (e.g., Iwashita, 2006; Gan, 2012; Bosker et al., 2013). Research in this area can provide substantial evidence concerning speaking proficiency. Iwashita (2006), for instance, examined the syntactic complexity of the spoken performance of L2 Japanese learners. The results reveal that learners' oral proficiency could be predicted significantly by several complexity indicators, including T-unit length, the number of clauses per T-unit, and the number of independent clauses per T-unit. In another discourse analysis study, Gan (2012) probed the syntactic complexity of test takers' spoken discourse and examined the relationship between syntactic complexity and task type in L2 speaking assessment. Gan's results show that, compared with the group interaction task, test takers' discourse on the individual presentation task featured longer T-units and utterances as well as a significantly greater number of T-units, clauses, verb phrases, and words. These discourse analysis studies have implications for understanding speaking proficiency as well as its development and maturity among L2 learners.

Interactional competence (IC) is yet another topic that features prominently in this topic area. Despite the recognized need to include IC in speaking assessment (e.g., Kramsch, 1986; McNamara, 1997), how it should be conceptualized remains a contentious issue. Research has shown that this construct consists of multiple dimensions and is susceptible to the influence of a range of personal, cognitive, and contextual factors (Galaczi and Taylor, 2018). Our review suggests that IC was approached both by analyzing test takers' spoken discourse and by exploring raters' perspectives. Galaczi (2008), for instance, performed elaborate analyses of test takers' spoken discourse on the paired speaking task in the First Certificate in English (FCE) speaking test. The results led the researcher to conclude that test takers' interactions on paired oral assessment tasks primarily featured three patterns: collaborative, parallel, and blended interaction (i.e., a mixture of collaborative/parallel or collaborative/asymmetric features). In a more recent study, Lam (2018) analyzed test takers' spoken discourse on a school-based group oral speaking assessment for the Hong Kong Diploma of Secondary Education (HKDSE) English Language Examination. Instead of exploring IC more broadly, as in Galaczi (2008), this study targeted a particular feature of IC, namely, producing responses contingent on previous speakers' contributions. The analyses pointed to three kinds of conversational actions that underpinned such a response: formulating previous speakers' contributions, accounting for (dis)agreement with previous speakers' ideas, and extending previous speakers' ideas.

Some other studies explored the construct of IC from raters' perspectives. A typical study was reported by May (2011), who explored the features that were salient to raters on a paired speaking test. The study identified a repertoire of features that were salient to raters, and hence potentially integral to the IC construct, including, for example, the ability to manage a conversation, ask for opinion or clarification, challenge or disagree with an interactional partner, demonstrate effective body language, and engage in interactive listening. While suggesting that IC is a highly complex and slippery construct, these studies have significant implications for clarifying the IC construct and promoting its valid operationalization in speaking assessment. The findings are particularly meaningful in contexts where interactive tasks are increasingly used in speaking assessment.

Rater Effects

Raters play a significant role in speaking assessment; their performance is affected by a host of non-linguistic factors that are often irrelevant to the speaking constructs of interest, hence causing construct-irrelevant variance (Messick, 1989) or contamination (AERA et al., 2014). Not surprisingly, the next topic area most frequently explored by speaking assessment researchers is rater effects (n = 39). The studies that focused on this topic were mostly classified into the two inferences of evaluation (n = 27, 25.96%) and generalization (n = 23, 22.12%). Knoch and Chapelle (2018) applied the argument-based validation framework to the analysis of rater effects and rating processes in language assessment research. They observed that several important aspects of rater effects could be mapped onto the evaluation and generalization inferences. The key assumptions of the evaluation inference relate to raters' consistency at the task level, the bias that raters display against task types or other aspects of the assessment situation, and the impact of raters' characteristics on the ratings that they assign. When it comes to the generalization inference, the key assumptions largely concern raters' consistency at the whole test level and the number of raters required to achieve the desired level of consistency. Research on rater effects has significant implications for enhancing both the validity and fairness of speaking assessment (e.g., McNamara et al., 2019).

Two topics that feature prominently in the study of rater effects are the impact of raters' characteristics on their rating behaviors and rater cognition, that is, the cognitive processes that raters engage in when assigning scores to a spoken performance. Raters' characteristics, such as language background, experience, and qualifications, may have an appreciable impact on their ratings. This topic has attracted considerable research attention as it has implications for test fairness and rater training programs. One such study was reported by Kim (2009), who examined and compared the rating behaviors of native and non-native English teachers when assessing students' spoken performance. The results indicate that native-speaker (NS) and non-native-speaker (NNS) teachers on the whole exhibited similar severity levels and internal consistency; however, in comparison with NNS teachers, NS teachers provided more detailed and elaborate comments on students' performance. The findings generally concur with Zhang and Elder (2011), who compared the rating behaviors of NS and NNS teachers in the context of the College English Test - Spoken English Test (CET-SET), a large-scale high-stakes speaking test in China. Instead of focusing on raters' L1 background, Winke et al. (2013) examined whether raters' accent familiarity, defined as their L2 learning experience, constituted a potential source of bias when they rated test takers' spoken performance. In other words, if a rater studies Chinese as his or her L2, is he or she biased toward test takers who have Chinese as their L1? The findings indicate that raters with Spanish or Chinese as their L2 were significantly more lenient toward L1 Spanish and Chinese test takers than they were toward those from other L1 backgrounds. However, in both cases, the effect sizes were small, suggesting that such effects had minimal impact in practice.
The results are largely consistent with some other studies in our collection (e.g., Yan, 2014 ; Wei and Llosa, 2015 ), which explored a similar topic.

Rater cognition or rating processes constitute yet another important topic under the topic area of “rater effects”. Studies along this line are typically implemented through analyzing raters' verbal protocols to explore their cognitive processes when applying the rating criteria or assigning scores to a spoken performance. Research into raters' cognitive processes can generate valuable insights into the validity of the rating scales as well as the speaking constructs that are being assessed in a speaking test. Findings from these studies have important implications for the revision of rating scales, improving rater training programs, and enhancing the validity and usefulness of the speaking test in focus. In a qualitative study, Kim (2015) explored the rating behaviors of three groups of raters with different levels of experience on an L2 speaking test by analyzing their verbal reports of rating processes. The study revealed that the three groups of raters exhibited varying uses of the analytic rating scales, hence suggesting that experience was an important variable affecting their rating behaviors. Furthermore, an analysis of their performance over time revealed that the three groups of raters demonstrated different degrees of improvement in their rating performance. It should be noted that several studies in our collection examined raters' rating processes with a view to either complementing or accounting for the quantitative analyses of speaking test scores. For instance, both Kim (2009) and Zhang and Elder (2011) , two studies which were reviewed previously, investigated raters' rating processes, and the findings significantly enriched our understanding of the rating behaviors of raters from different backgrounds.

Factors That Affect Spoken Performance

The third topic area that emerged from our coding process is "factors that affect spoken performance" (n = 30). As shown in Table 2, most of the studies in this topic area were classified into the inference of generalization (n = 19, 18.27%). This is understandable, as factors such as task features, administration conditions, and planning time might affect the generalizability of speaking test scores. Indeed, understanding the factors that affect test performance has long been one of the central concerns of language assessment research as a whole (e.g., Bachman, 1990; Bachman et al., 1995). Research along this line has implications for speaking test development and implementation, and for test score interpretation and use. Our coding analyses indicate that a range of factors have been explored by speaking assessment researchers, of which 'interlocutor effects' features most prominently. This could be related to the increasingly widespread use of interview, paired oral, or group discussion tasks to assess speaking ability in applied linguistics and language pedagogy. A notable advantage of these assessment formats lies in the unscripted and dynamic nature of the interactions involved, which is key to increasing the authenticity of speaking assessments. Nonetheless, interlocutor characteristics, such as gender, proficiency level, personality, and style of interaction, might have a considerable impact on test takers' spoken performance, thus impinging on the validity, fairness, and overall usefulness of these tasks.


Table 3 . Matrix coding results of research methods and inferences ( n = 104).

An earlier study on interlocutor effects was reported by McNamara and Lumley (1997), who examined the potential impact of interlocutor characteristics on test scores in the context of the Occupational English Test (OET), a high-stakes speaking test for health professionals in Australia. Their study indicated that interlocutor characteristics had some influence on the ratings that test takers received. For example, they found that raters tended to compensate for interlocutors' incompetence in conducting the speaking test; in other words, if an interlocutor was perceived as less competent, test takers tended to receive higher ratings than expected. They also observed that an interlocutor's ability to build rapport with test takers had a positive effect on the ratings that test takers received. In another study, Brown (2003) probed the effects of interlocutor characteristics on test takers' performance in the context of a conversational interview. She performed elaborate analyses of the interactions between the interviewers (i.e., interlocutors) and test takers, revealing that the interlocutors differed quite significantly in terms of: (a) how they structured topical sequences; (b) their questioning techniques; and (c) how they provided feedback and built rapport with test takers. Further analyses uncovered that interviewer styles had a quite significant impact on the ratings that test takers received. Resonating with McNamara and Lumley (1997), the findings of this study again call for a reconceptualization of speaking proficiency.

Several other studies focused on the effects of interaction partners in paired or group oral tasks on spoken performance. Ockey (2009) , for instance, investigated the potential effects of group members' assertiveness levels on spoken performance on a group discussion task. Results confirmed that test takers' assertiveness levels had an impact on the scores that they received. Specifically, assertive test takers were awarded higher scores than expected when grouped with non-assertive test takers; this trend, however, was reversed when they were grouped with test takers of similar assertiveness levels. A plausible explanation could be that raters viewed assertive test takers more positively when other members in the groups were non-assertive, but more negatively when other group members, who were also assertive, competed to be the leaders in the interactions. This study reiterates the co-constructed nature of speaking proficiency. Despite the research that has been undertaken on interlocutor effects, controversy remains as to whether this variation is part of the speaking construct and therefore should be incorporated in the design of a speaking test, or whether it should be controlled to such an extent that it poses minimal threat to the reliability and fairness of speaking test scores ( Fulcher, 2015a ).

In addition to the three topics above, researchers also explored speaking test design ( n = 14) in terms of task features (e.g., Wigglesworth and Elder, 2010 ; Ahmadi and Sadeghi, 2016 ) and the use of technology in speaking test delivery (e.g., Nakatsuhara et al., 2017 ; Ockey et al., 2017 ). The next topic is test score generalizability ( n = 7), typically investigated through G-theory analysis (e.g., Lee, 2006 ; Sawaki, 2007 ; Xi, 2007 ). Furthermore, six studies in our collection evaluated the rating scales for speaking assessments, including comparing the effectiveness of different types of rating scales (e.g., Hirai and Koizumi, 2013 ) and examining whether a rating scale functioned as intended by the test developer (e.g., Isaacs and Thomson, 2013 ). Finally, five studies focused on the use of speaking assessments, mainly relating to test takers' perceptions of speaking assessments (e.g., Scott, 1986 ; Qian, 2009 ) and standard setting studies to determine cut scores for certain purposes (e.g., Pill and McNamara, 2016 ).
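For readers less familiar with G-theory, the analyses cited above rest on a standard variance decomposition of observed scores. The following sketch uses the conventional notation for a fully crossed person × task × rater design; the symbols are generic textbook notation, not drawn from any individual study in our collection:

```latex
% Variance decomposition for a fully crossed
% p (person) x t (task) x r (rater) design
\sigma^2(X_{ptr}) = \sigma^2_p + \sigma^2_t + \sigma^2_r
                  + \sigma^2_{pt} + \sigma^2_{pr} + \sigma^2_{tr}
                  + \sigma^2_{ptr,e}

% Generalizability coefficient for relative decisions,
% with n'_t tasks and n'_r raters per test taker
E\rho^2 = \frac{\sigma^2_p}{\sigma^2_p
          + \sigma^2_{pt}/n'_t
          + \sigma^2_{pr}/n'_r
          + \sigma^2_{ptr,e}/(n'_t\, n'_r)}
```

In a decision study, the coefficient is recomputed for hypothetical numbers of tasks and raters, which is how studies of score dependability typically estimate how many tasks and raters are needed for speaking scores to generalize adequately.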

Research Methods

Table 3 presents the matrix coding results of research methods and inferences. As indicated in this table, quantitative research methods were more frequently employed by speaking assessment researchers ( n = 50) than qualitative methods ( n = 23). It is worth noting that a number of studies ( n = 31) utilized a mixed-methods design, which combines both quantitative and qualitative orientations.

Table 3 indicates that quantitative methods were most frequently used to collect backings in support of the evaluation ( n = 21, 20.19%) and generalization inferences ( n = 27, 25.96%). This finding can be interpreted in relation to the key assumptions that underlie these two inferences (see section Theoretical Framework). According to the argument-based validation framework, the assumptions of these two inferences largely concern rater consistency at task and whole-test level, the functioning of the rating scales, as well as the generalizability of speaking test scores across tasks and raters. Understandably, quantitative methods are widely used to collect the backings to test these assumptions. In addition to the overall representation of quantitative methods in speaking assessment research, we also went a step further to examine the use of specific quantitative methods. As shown in Table 3 , while traditional data analysis methods such as ANOVA or regression ( n = 34) continued to be utilized, mainly in the interrogation of the inferences of evaluation ( n = 13, 12.50%), generalization ( n = 14, 13.46%), and explanation ( n = 15, 14.42%), Rasch analysis methods were also embraced by speaking assessment researchers ( n = 28). Note that Rasch analysis is an overarching term which encompasses a family of related models, among which the many-facets Rasch model (MFRM) is frequently used in speaking assessment (e.g., McNamara and Knoch, 2012 ). As an extension of the basic Rasch model, the MFRM allows for the inclusion of multiple aspects or facets in a speaking context (e.g., rater severity, task difficulty, difficulty of rating scales). Furthermore, compared with traditional data analysis methods such as correlation and ANOVA which can only provide results at the group level, the MFRM can provide both group- and individual-level statistics ( Eckes, 2011 ). This finding concurs with Fulcher (2015a) who identified the MFRM as an important theme in speaking assessment. 
It also resonates with the observation of Fan and Knoch (2019 , p. 136) who commented that Rasch analysis has indeed become “one of the default methods or analysis techniques to examine the technical quality of performance assessments.” The power of Rasch analysis in speaking assessment research is best illustrated by studies such as Bonk and Ockey (2003) , Eckes (2005) , and Winke et al. (2013) , among others, all of which examined rater effects on speaking assessments in different contexts. Finally, G-theory ( n = 7) and structural equation modeling ( n = 5), two more sophisticated quantitative methods, were also utilized by speaking assessment researchers.
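As a concrete illustration of the MFRM discussed above, the model can be written in its standard rating-scale formulation (along the lines presented in Eckes, 2011 ; the notation here is the conventional one, not taken verbatim from any study in our collection):

```latex
% Many-facets Rasch model (rating scale formulation):
% log-odds of test taker n being rated in category k rather than k-1
% on task i by rater j
\ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right)
  = \theta_n - \beta_i - \alpha_j - \tau_k
```

where \(\theta_n\) is the test taker's ability, \(\beta_i\) the task difficulty, \(\alpha_j\) the rater's severity, and \(\tau_k\) the threshold for rating category k. Because each facet is estimated on a common logit scale, the model yields the individual-level severity and difficulty estimates noted above, rather than group-level statistics only.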

In terms of qualitative research methods, discourse analysis was the most frequently employed by speaking assessment researchers ( n = 25). Matrix coding results indicate that this method features most prominently under the inference of explanation ( n = 20, 19.23%). This finding is aligned with the key assumptions that underlie the explanation inference, namely, (a) features of the spoken discourse produced by test takers can effectively distinguish L2 speakers at different proficiency levels, and (b) raters' cognitive processes are consistent with the theoretical models of L2 speaking, both entailing the use of discourse analysis to explore test takers' spoken responses and raters' rating processes. Importantly, our analysis results indicate that conversation analysis (CA) appeared frequently under the category of “discourse analysis.” This is best represented by studies such as Galaczi (2008) , Lam (2018) , and Roever and Kasper (2018) , all endeavoring to elucidate the construct of interactional competence. As a data analysis method, CA provides speaking researchers with a principled and fine-grained approach to analyzing the interactions between test takers and examiners in interview, paired oral, or group discussion tasks. Table 3 shows that some other qualitative methods were also quite frequently used by speaking researchers, including interviews/focus groups ( n = 11), written comments ( n = 11), and verbal protocol reports ( n = 10). These research methods were typically adopted following the quantitative analyses of test takers' scores, which explains the increasingly widespread use of mixed methods in speaking assessment research ( n = 31). This finding resonates with the observation that mixed-methods research has been gaining momentum in language assessment research more broadly (e.g., Turner, 2013 ; Jang et al., 2014 ; Moeller et al., 2016 ).
As shown in Table 3 , mixed-methods design is most frequently employed to collect backings in support of the inferences of evaluation ( n = 17, 16.35%) and explanation ( n = 16, 15.38%). For the evaluation inference, mixed-methods designs were often utilized to research rater effects, where quantitative and qualitative analyses were used sequentially to examine rating results and processes. When it comes to the explanation inference, researchers tended to use a combination of quantitative and qualitative analyses to explore the differences in test takers' speaking scores as well as the spoken discourse that they produced.

Conclusions and Implications

In this study, we conducted a narrative review of published empirical research on assessing speaking proficiency within the argument-based validation framework ( Chapelle et al., 2008 ). A total of 104 articles on speaking assessment were collected from LT (1984–2018) and LAQ (2004–2018), two highly influential journals in the field of language assessment. Following the coding of the collected articles, matrix coding analyses were utilized to explore the relationships between the speaking assessment topics, research methods, and the six inferences in the argument-based validation framework.

The analysis results indicate that speaking assessment was investigated from various perspectives, primarily focusing on seven broad topic areas, namely, the constructs of speaking ability, rater effects, factors that affect spoken performance, speaking test design, test score generalizability, rating scale evaluation, and test use. The findings of these studies have significantly enriched our understanding of speaking proficiency and how assessment practice can be made more reliable and valid. In terms of research methods, it was revealed that quantitative research methods were most frequently utilized by speaking assessment researchers, a trend which was particularly pronounced in the inferences of evaluation and generalization . Though traditional quantitative methods such as ANOVA, regression, and correlation continued to be employed, Rasch analysis played a potent role in researching speaking assessment. In comparison, qualitative methods were least frequently used, mainly for the interrogation of the explanation inference. Mixed-methods design, recognized as “an alternative paradigm” ( Jang et al., 2014 , p. 123), ranked in the middle in terms of frequency, suggesting its increasingly widespread use in speaking assessment research. This is particularly noteworthy when it comes to the evaluation and explanation inferences.

Despite the abundance of research on speaking assessment and the variety of research topics and methods that emerged from our coding process, we feel that there are several areas which have not been explored extensively by language assessment researchers, and therefore warrant more future research endeavors. First, more studies should be conducted to interrogate the three inferences of domain description, extrapolation , and utilization in the argument-based validation framework. As indicated in our study, only a small fraction of studies have been dedicated to examining these three inferences in comparison with evaluation, generalization , and explanation (see Table 2 ). Regarding domain description , we feel that more research could be undertaken to understand task- and domain-specific speaking abilities and communicative skills. This would have significant implications for enhancing the authenticity of speaking assessment design, and for constructing valid rating scales for evaluating test takers' spoken performance. The thick description approach advocated by Fulcher et al. (2011) could be attempted to portray a nuanced picture of speaking ability in the TLU domains, especially in the case of Language for Specific Purposes (LSP) speaking assessment. When it comes to the extrapolation inference, though practical difficulties in collecting speaking performance data in the TLU domains are significant indeed, new research methods and perspectives, as exemplified by the corpus-based register analysis approach taken by LaFlair and Staples (2017) , could be attempted in the future to enable meaningful comparisons between spoken performance on the test and speaking ability in TLU domains. In addition, the judgments of linguistic laypersons may also be employed as a viable external criterion (e.g., Sato and McNamara, 2018 ). The utilization inference is yet another area that language assessment researchers might consider exploring in the future.
Commenting on the rise of computer-assisted language assessment, Chapelle (2008 , p. 127) argued that “test takers have needed to reorient their test preparation practices to help them prepare for new test items.” As such, it is meaningful for language assessment researchers to explore the impact of computer-mediated speaking assessments and automated scoring systems on teaching and learning practices.

Next, though the topic of speaking constructs has attracted considerable research attention from the field, as evidenced by the analysis results of this study, it seems that we are still far from achieving a comprehensive and fine-grained understanding of speaking proficiency. The results of this study suggest that speaking assessment researchers tended to adopt a psycholinguistic approach, aiming to analyze the linguistic features of produced spoken discourse that distinguish test takers at different proficiency levels. However, given the dynamic and context-embedded nature of speaking, there is a pressing need for a sociocultural perspective to better disentangle the speaking constructs. Using pronunciation as an example, Fulcher (2015b) convincingly argued for the inadequacy of a psycholinguistic approach in pronunciation assessment research; rather, a sociocultural approach, which aims to demystify the rationales, linguistic or cultural, that underlie (dys)fluency, could significantly enrich our understanding of the construct. Such an approach should be attempted more productively in future studies. In addition, as the application of technology is becoming prevalent in speaking assessment practices ( Chapelle, 2008 ), it is essential to explore whether and to what extent technology mediation has altered the speaking constructs, and the implications for score interpretation and use.

We also found that several topics were under-represented in the studies that we collected. Important areas that received relatively limited coverage in our dataset include: (a) classroom-based or learning-oriented speaking assessment; (b) diagnostic speaking assessment; and (c) speaking assessment for young language learners (YLLs). The bulk of the research in our collection targeted large-scale high-stakes speaking assessments. This is understandable, perhaps, because results on these assessments are often used to make important decisions which have significant ramifications for stakeholders. In comparison, scanty research attention has been dedicated to speaking assessments in classroom contexts. A recent study reported by May et al. (2018) aimed to develop a learning-oriented assessment tool for interactional competence, so that detailed feedback could be provided about learners' interactional skills in support of their learning. More research of such a nature is needed in the future to reinforce the interfaces between speaking assessment and teaching and learning practices. In the domain of L2 writing research, it has been shown that simply using analytic rating scales does not mean that useful diagnostic feedback can be provided to learners ( Knoch, 2009 ). Arguably, this also holds true for speaking assessment. In view of the value of diagnostic assessment ( Lee, 2015 ) and the call for more integration of learning and assessment (e.g., Alderson, 2005 ; Turner and Purpura, 2015 ), more research could be conducted to develop diagnostic speaking assessments so that effective feedback can be provided to promote L2 learners' speaking development. Finally, YLLs have specific needs and characteristics which have implications for how they should be assessed (e.g., McKay, 2006 ). This is particularly challenging with speaking assessment in terms of task design, implementation, and score reporting. This topic, however, has rarely been explored by speaking assessment researchers and therefore warrants more future research.

In terms of research methods, we feel that speaking assessment researchers should consider further exploring the potential of qualitative methods, which are well-suited to investigating an array of research questions related to speaking assessment. Our analysis results indicate that despite the quite frequent use of traditional qualitative methods such as interviews and focus groups, new qualitative methods that are supported by technology (e.g., eye-tracking) have only recently been utilized by speaking assessment researchers. For example, a recent study by Lee and Winke (2018) demonstrated the use of eye-tracking in speaking assessment through examining test takers' cognitive processes when responding to computer-based speaking assessment tasks. Eye-tracking is advantageous in the sense that, as opposed to traditional qualitative methods such as introspective think-aloud protocols, it causes minimal interference with the test-taking process. Our final comment concerns the use of mixed-methods design in speaking assessment research. Although it has been applied quite frequently in researching speaking assessment, it appears that only the sequential explanatory design (i.e., the use of qualitative research to explain quantitative findings) was usually employed. Speaking assessment researchers may consider other mixed-methods design options (e.g., convergent parallel design or embedded mixed-methods design, see Moeller et al., 2016 ) to investigate more complex research questions in speaking assessment.

We acknowledge a few limitations with this study. As mentioned previously, we targeted only two highly influential journals in the field of language assessment, namely, LT and LAQ, while being aware that numerous other journals in applied linguistics or educational evaluation also publish research on speaking and its assessment. As such, caution needs to be exercised when interpreting the research findings that emerged from this study. Future studies could be undertaken to include more journals and other publication types (e.g., research reports, PhD dissertations) to depict a more representative picture of speaking assessment research. In addition, given the sheer volume of published research on speaking assessment available, our research findings can only be presented as indications of possible trends in the wider publishing context, as reflected in the specific articles we explored. Arguably, the findings might be more revealing if we zoomed in on a few key topics in speaking assessment (e.g., rater effects, speaking constructs), analyzed specific studies on these topics in detail, and compared their findings. Finally, it would be worthwhile to explore how the research on some key topics in speaking assessment has been evolving over time. Such an analysis could provide a valuable reference point for speaking assessment researchers and practitioners. Such a developmental trend perspective, however, was not incorporated in our analysis and could be attempted in future research.

Data Availability Statement

The datasets generated for this study are available on request to the corresponding author.

Author Contributions

JF designed the study, collected and coded the data, and drafted the article. XY collected and coded the data, and drafted this article together with JF.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling editor declared a past collaboration with one of the authors, JF.

Acknowledgments

The preparation of this manuscript was supported by the National Planning Office for Philosophy and Social Sciences (NPOPSS) of the People's Republic of China under the project title Reform of English speaking assessment and its impact on the teaching of English speaking (19BYY234). We would like to thank Angela McKenna and three reviewers for their insightful and perspicacious comments on the previous draft of this article.

1. LT and LAQ made their debut in 1984 and 2004, respectively.

References

AERA, APA, and NCME (2014). Standards for Educational and Psychological Testing . Washington, DC: AERA.

Ahmadi, A., and Sadeghi, E. (2016). Assessing English language learners' oral performance: a comparison of monologue, interview, and group oral test. Lang. Assess. Q. 13, 341–358. doi: 10.1080/15434303.2016.1236797

Alderson, J. C. (2005). Diagnosing Foreign Language Proficiency: The interface between Learning and Assessment . London: Bloomsbury.

Alderson, J. C., Clapham, C., and Wall, D. (1995). Language Test Construction and Evaluation . Cambridge: Cambridge University Press.

Bachman, L. F. (1990). Fundamental Considerations in Language Testing . Oxford England; New York, NY: Oxford University Press.

Bachman, L. F., Davidson, F., Ryan, K., and Choi, I.-C. (1995). An Investigation into the Comparability of Two Tests of English as a Foreign Language , Vol. 1. Cambridge: Cambridge University Press.

Bachman, L. F., and Palmer, A. S. (1996). Language Assessment in Practice: Designing and Developing Useful Language Tests . Oxford: Oxford University Press.

Bonk, W. J., and Ockey, G. J. (2003). A many-facet rasch analysis of the second language group oral discussion task. Lang. Test. 20, 89–110. doi: 10.1191/0265532203lt245oa

Bosker, H. R., Pinget, A.-F., Quené, H., Sanders, T., and De Jong, N. H. (2013). What makes speech sound fluent? The contributions of pauses, speed and repairs. Lang. Test. 30, 159–175. doi: 10.1177/0265532212455394

Brown, A. (2003). Interviewer variation and the co-construction of speaking proficiency. Lang. Test. 20, 1–25. doi: 10.1191/0265532203lt242oa

Carter, R., and McCarthy, M. (2017). Spoken grammar: where are we and where are we going? Appl. Linguistics 38, 1–20. doi: 10.1093/applin/amu080

Chapelle, C. A. (2008). “Utilizing technology in language assessment,” in Encyclopedia of Language and Education , Vol. 7 (New York, NY: Springer), 123–134.

Chapelle, C. A., Cotos, E., and Lee, J. (2015). Validity arguments for diagnostic assessment using automated writing evaluation. Lang. Test. 32, 385–405. doi: 10.1177/0265532214565386

Chapelle, C. A., Enright, M. K., and Jamieson, J. (2010). Does an argument-based approach to validity make a difference? Educ. Meas. Iss. Pract. 29, 3–13. doi: 10.1111/j.1745-3992.2009.00165.x

Chapelle, C. A., Enright, M. K., and Jamieson, J. M. (2008). Building a Validity Argument for the Test of English as a Foreign Language . New York, NY; London: Routledge; Taylor & Francis Group.

Cooper, H., Hedges, L. V., and Valentine, J. C. (2019). The Handbook of Research Synthesis and Meta-Analysis . New York, NY: Russell Sage Foundation.

Eckes, T. (2005). Examining rater effects in TestDaF writing and speaking performance assessments: a many-facet Rasch analysis. Lang. Assess. Q: Int. J. 2, 197–221. doi: 10.1207/s15434311laq0203_2

Eckes, T. (2011). Introduction to Many-Facet Rasch Measurement . Frankfurt: Peter Lang.

Ellis, R. (2015). Introduction: complementarity in research syntheses. Appl. Linguistics 36, 285–289. doi: 10.1093/applin/amv015

Fan, J., and Knoch, U. (2019). Fairness in language assessment: what can the Rasch model offer? Pap. Lang. Test. Assess. 8, 117–142. Available online at: http://www.altaanz.org/uploads/5/9/0/8/5908292/8_2_s5_fan_and_knoch.pdf

Fulcher, G. (2000). The ‘communicative' legacy in language testing. System 28, 483–497. doi: 10.1016/S0346-251X(00)00033-6

Fulcher, G. (2015a). Assessing second language speaking. Lang. teaching 48, 198–216. doi: 10.1017/S0261444814000391

Fulcher, G. (2015b). Re-Examining Language Testing: a Philosophical and Social Inquiry . New York, NY: Routledge.

Fulcher, G., Davidson, F., and Kemp, J. (2011). Effective rating scale development for speaking tests: performance decision trees. Lang. Test. 28, 5–29. doi: 10.1177/0265532209359514

Galaczi, E., and Taylor, L. (2018). Interactional competence: conceptualisations, operationalisations, and outstanding questions. Lang. Assess. Q. 15, 219–236. doi: 10.1080/15434303.2018.1453816

Galaczi, E. D. (2008). Peer-peer interaction in a speaking test: the case of the First Certificate in English examination . Lang. Assess. Q. 5, 89–119. doi: 10.1080/15434300801934702

Gan, Z. (2012). Complexity measures, task type, and analytic evaluations of speaking proficiency in a school-based assessment context. Lang. Assess. Q. 9, 133–151. doi: 10.1080/15434303.2010.516041

Ginther, A. (2013). “Assessment of speaking,” in The Encyclopedia of Applied Linguistics , ed C. A. Chapelle (New York, NY: Blackwell Publishing Ltd.), 1–7.

Hirai, A., and Koizumi, R. (2013). Validation of empirically derived rating scales for a story retelling speaking test. Lang. Assess. Q. 10, 398–422. doi: 10.1080/15434303.2013.824973

Isaacs, T. (2014). “Assessing pronunciation,” in The Companion to Language Assessment , Vol. 1, ed A. J. Kunnan (New York, NY: John Wiley & Sons), 140–155.

Isaacs, T. (2016). “Assessing speaking,” in Handbook of Second Language Assessment, Vol. 12 , eds D. Tsagari and J. Banerjee (Boston, MA; Berlin, Germany: De Gruyter), 131–146.

Isaacs, T., and Thomson, R. I. (2013). Rater experience, rating scale length, and judgments of L2 pronunciation: revisiting research conventions. Lang. Assess. Q. 10, 135–159. doi: 10.1080/15434303.2013.769545

Iwashita, N. (2006). Syntactic complexity measures and their relation to oral proficiency in Japanese as a foreign language. Lang. Assess. Q. Int. J. 3, 151–169. doi: 10.1207/s15434311laq0302_4

Jang, E. E., Wagner, M., and Park, G. (2014). Mixed methods research in language testing and assessment. Annu. Rev. Appl. Linguistics 34, 123–153. doi: 10.1017/S0267190514000063

Kane, M. T. (2006). “Validation,” in Educational Measurement , ed R. L. Brennan (Westport, CT: American Council on Education), 17–64.

Kim, H. J. (2015). A qualitative analysis of rater behavior on an L2 speaking assessment. Lang. Assess. Q. 12, 239–261. doi: 10.1080/15434303.2015.1049353

Kim, Y.-H. (2009). An investigation into native and non-native teachers' judgments of oral english performance: a mixed methods approach. Lang. Test. 26, 187–217. doi: 10.1177/0265532208101010

Knoch, U. (2009). Diagnostic assessment of writing: a comparison of two rating scales. Lang. Test. 26, 275–304. doi: 10.1177/0265532208101008

Knoch, U., and Chapelle, C. A. (2018). Validation of rating processes within an argument-based framework. Lang. Test. 35, 477–499. doi: 10.1177/0265532217710049

Kramsch, C. (1986). From language proficiency to interactional competence. Mod. Lang. J. 70, 366–372. doi: 10.1111/j.1540-4781.1986.tb05291.x

LaFlair, G. T., and Staples, S. (2017). Using corpus linguistics to examine the extrapolation inference in the validity argument for a high-stakes speaking assessment. Lang. Test. 34, 451–475. doi: 10.1177/0265532217713951

Lam, D. M. (2018). What counts as “responding”? Contingency on previous speaker contribution as a feature of interactional competence. Lang. Test. 35, 377–401. doi: 10.1177/0265532218758126

Lee, S., and Winke, P. (2018). Young learners' response processes when taking computerized tasks for speaking assessment. Lang. Test. 35, 239–269. doi: 10.1177/0265532217704009

Lee, Y.-W. (2006). Dependability of scores for a new ESL speaking assessment consisting of integrated and independent tasks. Lang. Test. 23, 131–166. doi: 10.1191/0265532206lt325oa

Lee, Y.-W. (2015). Diagnosing diagnostic language assessment. Lang. Test. 32, 299–316. doi: 10.1177/0265532214565387

Luoma, S. (2004). Assessing Speaking . Cambridge: Cambridge University Press.

May, L. (2011). Interactional competence in a paired speaking test: features salient to raters. Lang. Assess. Q. 8, 127–145. doi: 10.1080/15434303.2011.565845

May, L., Nakatsuhara, F., Lam, D. M., and Galaczi, E. (2018). “Learning-oriented assessment feedback for interactional competence: developing a checklist to support teachers and learners,” Paper presented at the Language Testing Research Colloquium (Auckland).

McKay, P. (2006). Assessing Young Language Learners . Cambridge: Cambridge University Press.

McNamara, T. (1996). Measuring Second Language Proficiency . London: Longman.

McNamara, T., and Knoch, U. (2012). The Rasch wars: the emergence of Rasch measurement in language testing. Lang. Test. 29, 553–574. doi: 10.1177/0265532211430367

McNamara, T., Knoch, U., and Fan, J. (2019). Fairness, Justice and Language Assessment . Oxford: Oxford University Press.

McNamara, T. F. (1997). ‘Interaction' in second language performance assessment: whose performance? Appl. Linguistics 18, 446–466. doi: 10.1093/applin/18.4.446

McNamara, T. F., and Lumley, T. (1997). The effect of interlocutor and assessment mode variables in overseas assessments of speaking skills in occupational I settings. Lang. Test. 14, 140–156. doi: 10.1177/026553229701400202

Messick, S. (1989). “Validity,” in Educational Measurement , 3rd Edn, ed R. L. Linn (New York, NY: McMillan; American Council on Education), 13–103.

Miles, M. B., Huberman, A. M., and Saldaña, J. (2014). Qualitative Data Analysis: A Methods Sourcebook, 3rd Edn. Thousand Oaks, CA: Sage.

Moeller, A. K., Creswell, J. W., and Saville, N. (2016). Second Language Assessment and Mixed Methods Research . Cambridge: Cambridge University Press.

Nakatsuhara, F., Inoue, C., Berry, V., and Galaczi, E. (2017). Exploring the use of video-conferencing technology in the assessment of spoken language: a mixed-methods study. Lang. Assess. Q. 14, 1–18. doi: 10.1080/15434303.2016.1263637

Norris, J. M., and Ortega, L. (2006). Synthesizing Research on Language Learning and Teaching , Vol. 13, Amsterdam; Philadelphia, PA: John Benjamins Publishing.

Ockey, G. J. (2009). The effects of group members' personalities on a test taker's L2 group oral discussion test scores. Lang. Test. 26, 161–186. doi: 10.1177/0265532208101005

Ockey, G. J., Gu, L., and Keehner, M. (2017). Web-based virtual environments for facilitating assessment of L2 oral communication ability. Lang. Assess. Q. 14, 346–359. doi: 10.1080/15434303.2017.1400036


Keywords: speaking assessment, speaking proficiency, argument-based validation framework, research methods, narrative review

Citation: Fan J and Yan X (2020) Assessing Speaking Proficiency: A Narrative Review of Speaking Assessment Research Within the Argument-Based Validation Framework. Front. Psychol. 11:330. doi: 10.3389/fpsyg.2020.00330

Received: 20 November 2019; Accepted: 11 February 2020; Published: 27 February 2020.

Copyright © 2020 Fan and Yan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Jason Fan, jinsong.fan@unimelb.edu.au ; Xun Yan, xunyan@illinois.edu

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.


Two Unconventional Ways to Improve Your Public Speaking Skills

  • June 17, 2024
  • Member News

Engineers tend to invest in technical skills rather than communication skills. However, developing the latter can help you deliver your message in a way that a wider (and not necessarily technical) audience can understand.

Do you think you need to improve your public speaking skills? Do you wonder whether you are ready to speak at a conference? Welcome to the club! I bet you and I are not the only members.

What kind of advice have you usually received on the topic? Probably that you should practice as much as possible with a smaller audience, such as preparing some slides and running a presentation in front of your team.

This is a good approach, but I can imagine possible second thoughts you might have. Maybe it’s imposter syndrome whispering, “What if I look too nervous and they won’t believe I know the subject?” or “They will ask questions, I won’t know the answers, and they will think I’m stupid.” Or maybe you don’t want to practice in a work environment and would like to have a less official experience.

Let me share a couple of other methods you could try. This is not based on any scientific research, only personal opinion. I’ve tried these ideas to improve my own speaking skills, and I enjoyed both of them a lot.

Join a Debate Club

Debate clubs are an excellent place to develop essential public speaking skills.

First of all, only eight participants are required for a debate round, plus judges. This means you probably won’t feel overwhelmed by the audience size. Next, the discussion topic is shared 15 minutes before the start of the round, so you do not have time to stress out — you only have time to prepare your speech. Moreover, the side your team is playing on might be opposite to your personal beliefs, so you will have a chance to hear and understand other points of view.

If you attend regularly, you will get to know the other regulars and speak in front of familiar people. Eventually, it will feel almost the same as presenting in front of your own team.

Other participants can ask you questions during your speech, so you will practice answering them without prior preparation. Practicing this skill in a controlled environment makes it come more easily when the need arises in a real-life public speaking situation.

Discussion topics can be unexpected and related to politics, economics, art, health — basically anything. You will be challenged to prepare a speech about something you may have never thought about previously. You will also need to anticipate possible arguments from your opponents and have questions ready for them. This exercise makes your brain work fast, and this skill can be useful if you are invited to a discussion panel in the future.

Attend an Acting Class

Inside the acting studio, you are in a safe place: people come here to practice their acting, not to criticize you. This understanding can help you be more open during the class without self-censoring your creative flow. In my case, it wasn’t like that from the very beginning, as I usually need some time to feel comfortable around new people.

Classes differ and focus on various stage techniques. For example, one day it's pure improvisation, where your classmates suggest what is happening to your character and you need to adapt accordingly. This can prepare you for the moment something unexpected happens during a conference talk.

When a class is devoted to a group performance, you need to notice your partners and hold several points of attention simultaneously. This is helpful practice for maintaining eye contact with the audience.

You might also have exercises to improve your pronunciation and to better control your breath and voice. All of these will come in handy when you start practicing your future talk.

I have personally tried both of these methods to improve my self-confidence in preparation for becoming a public speaker. They are working for me: after four months of the acting course, I gained enough courage to perform a dramatic etude in front of 80 people.

Yevheniia Trefilova

Yevheniia Trefilova is a Ukrainian expat in Poland. She has been in the IT industry for around 10 years, previously as a quality assurance engineer and currently as a product owner. She enjoys traveling and hopes one day to be able to say she has visited every country in the world.

ABOUT ALL TOGETHER

All Together is the blog of the Society of Women Engineers. Find stories about SWE members, engineering, technology, and other STEM-related topics. It offers up-to-date information and news about the Society and how our members are making a difference every day.

