Publications | Qiao (Judy) WANG

2025

Evaluation of automated vocabulary quiz generation with VocQGen

Qiao Wang, Ralph Rose, Ayaka Sugawara, and 1 more author

Vocabulary Learning and Instruction, Mar 2025

VocQGen is an automated tool designed to generate multiple-choice cloze (MCC) questions for vocabulary assessment in second language learning contexts. It leverages several natural language processing (NLP) tools and OpenAI’s GPT-4 model to produce MCC items quickly from user-specified word lists. To evaluate its effectiveness, we used the first sublist in the Academic Word List (AWL) to generate 60 questions with VocQGen. Then we compared the quality of the 60 autogenerated questions with that of 40 manually created ones through expert reviews and through pilot testing with 68 students. Expert review results indicate that automatically generated questions exhibit higher grammatical accuracy and clearer contexts in question stems. However, the tool occasionally produces distractors that are acceptable as correct responses. Pilot testing results show that, in general, the number of correct responses is higher for autogenerated questions, indicating that these questions are less challenging. The study concludes that manual checking is still required for questions generated by VocQGen and that future work should focus on improving distractor effectiveness.
@article{wang2024evaluation, title = {Evaluation of automated vocabulary quiz generation with VocQGen}, author = {Wang, Qiao and Rose, Ralph and Sugawara, Ayaka and Orita, Naho}, journal = {Vocabulary Learning and Instruction}, volume = {14}, number = {1}, pages = {2079--2079}, year = {2025}, month = mar, doi = {10.29140/vli.v14n1.2079}, }

2024

Effectiveness of Large Language Models in Automated Evaluation of Argumentative Essays: Finetuning vs. Zero-Shot Prompting

Qiao Wang and John Maurice Gayed

Computer Assisted Language Learning, Jul 2024

To address the long-standing challenge facing traditional automated writing evaluation (AWE) systems in assessing higher-order thinking, this study built an AWE system for scoring argumentative essays by finetuning the GPT-3.5 Large Language Model. The system’s effectiveness was compared with that of the non-finetuned GPT-3.5 and GPT-4 base models via zero-shot prompting, which involves applying the model to perform tasks without any prior specific training or examples on those tasks. The dataset used was the TOEFL Public Writing Dataset provided by Educational Testing Service (ETS), containing 480 argumentative essays with ground truth scores under two essay prompts. Three finetuned models were generated: two finetuned exclusively on either prompt and one on both. All finetuned and base models were used to score the remaining essays after finetuning and their scoring effectiveness was compared with ground truth scores, i.e., benchmark scores assigned by ETS-trained human raters. The impact of the variety of finetuning prompts and the robustness of finetuned models were also explored. Results showed 100% consistency of all models across two scoring sessions. More importantly, the finetuned models significantly outperformed the base models in accuracy and reliability. The best-performing model, finetuned on prompt 1, showed an RMSE of 0.57, a percentage agreement (score discrepancy ≤ 0.5) of 84.72% and a QWK of 0.78. Further, the model finetuned on both prompts did not exhibit enhanced performance, and the two models finetuned on one prompt remained robust when scoring essays from the alternative prompt. These results suggest (1) task-specific finetuning for AWE is beneficial; (2) finetuning does not require a large variety of essay prompts; and (3) finetuned models are robust to unseen essays.
@article{wang2024effectiveness, title = {Effectiveness of Large Language Models in Automated Evaluation of Argumentative Essays: Finetuning vs. Zero-Shot Prompting}, author = {Wang, Qiao and Gayed, John Maurice}, journal = {Computer Assisted Language Learning}, year = {2024}, month = jul, doi = {10.1080/09588221.2024.2371395}, }
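The three metrics named in the abstract above (RMSE, percentage agreement within a 0.5-point discrepancy, and quadratic weighted kappa) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the half-point 1.0 to 5.0 score scale, and the four toy scores are all assumptions made for illustration.

```python
import math

# Assumed score scale: half-point steps from 1.0 to 5.0 (hypothetical, for illustration).
LABELS = [1.0 + 0.5 * i for i in range(9)]


def rmse(pred, gold):
    """Root mean squared error between predicted and benchmark scores."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(gold))


def agreement_within(pred, gold, tol=0.5):
    """Share of essays whose predicted score is within `tol` of the benchmark."""
    hits = sum(1 for p, g in zip(pred, gold) if abs(p - g) <= tol)
    return hits / len(gold)


def quadratic_weighted_kappa(pred, gold, labels):
    """Cohen's kappa with quadratic disagreement weights over ordered labels."""
    index = {label: i for i, label in enumerate(labels)}
    k, n = len(labels), len(gold)
    # Observed confusion matrix: rows are benchmark scores, columns are predictions.
    observed = [[0] * k for _ in range(k)]
    for p, g in zip(pred, gold):
        observed[index[g]][index[p]] += 1
    gold_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            num += weight * observed[i][j]
            # Expected count under independence of the two raters.
            den += weight * gold_hist[i] * pred_hist[j] / n
    return 1.0 - num / den


# Toy data only; these are not scores from the study.
gold = [3.0, 4.0, 2.5, 5.0]
pred = [3.0, 3.5, 3.5, 5.0]
print(rmse(pred, gold))
print(agreement_within(pred, gold))
print(quadratic_weighted_kappa(pred, gold, LABELS))
```

In practice a library routine such as scikit-learn's `cohen_kappa_score` with `weights="quadratic"` could replace the hand-written kappa, provided the half-point scores are first mapped to integer categories.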
Assessing the Efficacy of Grammar Error Correction: A Human Evaluation Approach in the Japanese Context

Qiao Wang and Zheng Yuan

In Proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), May 2024

In this study, we evaluated the performance of the state-of-the-art sequence tagging grammar error detection and correction model (SeqTagger) using Japanese university students’ writing samples. With an automatic annotation toolkit, ERRANT, we first evaluated SeqTagger’s performance on error correction with human expert correction as the benchmark. Then a human-annotated approach was adopted to evaluate SeqTagger’s performance in error detection using a subset of the writing dataset. Results indicated a precision of 63.66% and a recall of 20.19% for error correction in the full dataset. For the subset, after manual exclusion of irrelevant errors such as semantic and mechanical ones, the model showed an adjusted precision of 97.98% and an adjusted recall of 42.98% for error detection, indicating the model’s high accuracy but also its conservativeness. Thematic analysis of errors undetected by the model revealed that determiners and articles, especially the latter, were predominant. Specifically, in terms of context-independent errors, the model occasionally overlooked basic ones and faced challenges with overly erroneous or complex structures. Meanwhile, context-dependent errors, notably those related to tense and noun number, as well as those possibly influenced by the students’ first language (L1), remained particularly challenging.
@inproceedings{wang2024assessing, title = {Assessing the Efficacy of Grammar Error Correction: A Human Evaluation Approach in the Japanese Context}, author = {Wang, Qiao and Yuan, Zheng}, booktitle = {Proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING)}, year = {2024}, month = may, }
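The precision and recall figures in the abstract above compare system edits against human reference edits. The sketch below shows that calculation in a simplified form, assuming each edit is a hypothetical `(start, end, correction)` tuple that counts as correct only on an exact match. It is not ERRANT's implementation, and the toy edits are invented for illustration.

```python
def precision_recall(system_edits, reference_edits):
    """Precision and recall of system edits against reference edits (exact match)."""
    system, reference = set(system_edits), set(reference_edits)
    true_positives = len(system & reference)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Toy edits only: (token start, token end, proposed correction).
reference = [(0, 1, "The"), (3, 4, "went"), (6, 7, "an"), (9, 10, "books")]
system = [(3, 4, "went"), (6, 7, "a")]

precision, recall = precision_recall(system, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A system that proposes few edits, most of them correct, scores high on precision and low on recall, which is the conservative profile the abstract describes.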
Reducing Redundancy in Japanese-to-English Translation: A Multi-Pipeline Approach for Translating Repeated Elements in Japanese

Qiao Wang, Yixuan Huang, and Zheng Yuan

In Proceedings of the Ninth Conference on Machine Translation (WMT), May 2024

This paper presents a multi-pipeline Japanese-to-English machine translation (MT) system designed to address the challenge of translating repeated elements from Japanese into fluent and lexically diverse English. The system is developed as part of the Non-Repetitive Translation Task at WMT24, which focuses on minimizing redundancy while maintaining high translation quality. Our approach utilizes MeCab, the de facto NLP tool for Japanese, for the identification of repeated elements, and Claude Sonnet 3.5, a large language model (LLM), for translation and proofreading. The system effectively accomplishes the shared task by identifying and translating in a diversified manner 89.79% of the 470 repeated instances in the testing dataset, and achieving an average translation quality score of 4.60 out of 5, significantly surpassing the baseline score of 3.88. Analysis also revealed the challenges encountered, particularly in identifying standalone noun-suffix elements and occasional cases of consistent translations or mistranslations.
@inproceedings{wang2024reducing, title = {Reducing Redundancy in Japanese-to-English Translation: A Multi-Pipeline Approach for Translating Repeated Elements in Japanese}, author = {Wang, Qiao and Huang, Yixuan and Yuan, Zheng}, booktitle = {Proceedings of the Ninth Conference on Machine Translation (WMT)}, year = {2024}, pages = {1047--1055}, organization = {Association for Computational Linguistics}, doi = {10.18653/v1/2024.wmt-1.107}, }
Sequence Tagging Approach in Grammar Error Detection: Identifying Areas of Improvement for the State-of-the-Art

Qiao Wang and Zheng Yuan

preprint, May 2024

This study provides a qualitative evaluation of Seqtagger, a state-of-the-art machine learning-based sequence-tagging model developed for grammatical error detection (GED) and correction (GEC). The model’s performance is evaluated on error detection against human benchmarks, with academic texts written by Japanese university students. Through human annotation and subsequent thematic analysis of failures in error detection, this study reveals that Seqtagger performs well in detecting errors related to simpler grammatical rules such as adverb position and prepositions in fixed collocations, with poorer performance on errors possibly influenced by the Japanese language, macro-structure errors and errors where human judgment is required. The underlying reasons for failures in detection are identified to be a narrow context window that fails to capture broader textual information, insufficient training data, particularly data that fully represents the linguistic characteristics of the Japanese students, and overgeneralization of patterns from the training data. These findings highlight the need for sequence-tagging GED and GEC tools to widen their context window, to be more adaptable to the diverse linguistic features of global learners, and to better handle the linguistic complexities of the English language.
@article{wang2024sequence, title = {Sequence Tagging Approach in Grammar Error Detection: Identifying Areas of Improvement for the State-of-the-Art}, author = {Wang, Qiao and Yuan, Zheng}, year = {2024}, journal = {preprint}, }
Automated Generation of Multiple-Choice Cloze Questions for Assessing English Vocabulary Using GPT-turbo 3.5

Qiao Wang, Ralph Rose, Naho Orita, and 1 more author

arXiv preprint arXiv:2403.02078, May 2024

A common way of assessing language learners’ mastery of vocabulary is via multiple-choice cloze (i.e., fill-in-the-blank) questions. But the creation of test items can be laborious for individual teachers or in large-scale language programs. In this paper, we evaluate a new method for automatically generating these types of questions using large language models (LLMs). The VocaTT (vocabulary teaching and training) engine is written in Python and comprises three basic steps: pre-processing target word lists, generating sentences and candidate word options using GPT, and finally selecting suitable word options. To test the efficiency of this system, 60 questions were generated targeting academic words. The generated items were reviewed by expert reviewers who judged the well-formedness of the sentences and word options, adding comments to items judged not well-formed. Results showed a 75% rate of well-formedness for sentences and a 66.85% rate for suitable word options. This is a marked improvement over the generator used earlier in our research, which did not take advantage of GPT’s capabilities. Post-hoc qualitative analysis reveals several points for improvement in future work, including cross-referencing part-of-speech tagging, better sentence validation, and improving GPT prompts.
@article{wang2024automated, title = {Automated Generation of Multiple-Choice Cloze Questions for Assessing English Vocabulary Using GPT-turbo 3.5}, author = {Wang, Qiao and Rose, Ralph and Orita, Naho and Sugawara, Ayaka}, journal = {arXiv preprint arXiv:2403.02078}, year = {2024}, }

2023

Automated Generation of Multiple-Choice Cloze Questions for Assessing English Vocabulary Using GPT-turbo 3.5

Qiao Wang, Ralph Rose, Naho Orita, and 1 more author

In Proceedings of the NLP4DH-IWCLUL 2023 Conference, Nov 2023

This paper presents a novel approach to generating multiple-choice cloze questions for English vocabulary assessment using GPT-turbo 3.5. The system demonstrates promising results in creating contextually appropriate and pedagogically sound vocabulary assessment items.
@inproceedings{wang2023automated, title = {Automated Generation of Multiple-Choice Cloze Questions for Assessing English Vocabulary Using GPT-turbo 3.5}, author = {Wang, Qiao and Rose, Ralph and Orita, Naho and Sugawara, Ayaka}, booktitle = {Proceedings of the NLP4DH-IWCLUL 2023 Conference}, year = {2023}, month = nov }
Mapping the research trends of digital game-based language learning (DGBLL): a scientometrics review

Ke Li, Mark Peterson, Qiao Wang, and 1 more author

Computer Assisted Language Learning, Nov 2023

Research on digital game-based language learning (DGBLL) keeps growing, but a comprehensive account of its development over the past two decades is still lacking. Therefore, the study presents a scientometrics review of the field based on the bibliometric records retrieved from the Web of Science Core Database. 205 publications with their references were included in this review. The science mapping software CiteSpace was employed to compute the properties of the publications with a view to outlining the major research areas and detecting the trends of DGBLL. The document co-citation analysis identified the major research clusters as educational game, MMORPG, out-of-school gameplay, vocabulary, game-based learning and theoretical underpinnings. The content of the major clusters was examined to gain a deeper understanding of the field, and the burstness analysis revealed that over the years, research on MMORPG has experienced fluctuations in activity, whereas the interest in vocabulary learning remained stable. The findings also highlighted the need for more research into the educational vs. commercial adoption in the context of language learning and teaching. As the first scientometrics review in the field, the study supplements traditional reviews by tracing the development of DGBLL over time via data-driven analysis. The discussion concludes by identifying gaps in the literature and offering suggestions for future research.
@article{li2023mapping, title = {Mapping the research trends of digital game-based language learning (DGBLL): a scientometrics review}, author = {Li, Ke and Peterson, Mark and Wang, Qiao and Wang, Haitao}, journal = {Computer Assisted Language Learning}, pages = {1--30}, year = {2023}, publisher = {Taylor \& Francis}, doi = {10.1080/09588221.2023.2299436}, }
The role of computer games in Chinese students’ narrative writing: A case study with The Sims

Qiao Wang and Ke Li

Osaka JALT Journal, Dec 2023

This single-participant study explores the role of story-rich computer games in an EFL narrative writing course for Chinese university students. In a two-month English writing class using the game The Sims 4, a participant received instruction on gameplay techniques and language, completed game quests set by the teacher, wrote 14 narratives based on gameplay events, and received all-encompassing corrective feedback on her writing samples. A pre-test on game vocabulary and two writing pre- and post-tests were also administered. The researchers evaluated the participant’s EFL narrative writing performance with an in-depth analysis of linguistic features using NLP tools which included syntax, lexicon, cohesion, and content. A follow-up interview was conducted to complement the results from the writing evaluation and to provide information on the participant’s perception and attitude towards the GBW class. Results show that while there was no consistent improvement in the participant’s EFL narrative writing performance, the rich stories and contextualized vocabulary in the game contributed to the participant’s content and lexicon and helped the participant to better understand how to write narratives.
@article{wang2023role2, title = {The role of computer games in Chinese students' narrative writing: A case study with The Sims}, journal = {Osaka JALT Journal}, volume = {10}, author = {Wang, Qiao and Li, Ke}, year = {2023}, month = dec, }
The role of live transcripts in synchronous online L2 classrooms: Learning outcomes and learner perceptions

Qiao Wang and Yijun Chen

Education and Information Technologies, Apr 2023

This study explored the role of live transcripts in online synchronous academic English classrooms by focusing on how automatically generated live transcripts influence the learning outcomes of lower-proficiency and higher-proficiency learners and on their perceptions towards live transcripts. The study adopted a 2 × 2 factorial design, with the two factors being learner proficiency (high vs. low) and availability of live transcription (presence vs. absence). The participants were 129 second-year Japanese university students from four synchronous classes taught on Zoom by the same teacher under an academic English reading course. Learning outcomes in this study were evaluated according to the course syllabus through grades and participation in class activities. A questionnaire consisting of nine Likert-scale questions and a comment box was administered to explore participants’ perceived usefulness of, perceived ease of use of, and perceived reliance on live transcripts. Results showed that contrary to previous studies reporting the effectiveness of captioned audiovisual materials in L2 learning, live transcripts as a special type of captions were not effective in promoting the grades of learners of either proficiency. However, they significantly improved the activity participation of lower-proficiency learners, but not that of higher-proficiency learners. Questionnaire results showed that there were no significant differences between learners of the two proficiency levels in their perceptions towards live transcription, which contradicts previous findings that lower-proficiency learners tend to rely more on captions. Besides enhancement of lecture comprehension, participants reported innovative uses of live transcripts such as screenshots with transcripts for notetaking purposes and transcripts downloaded for later review.
@article{wang2023role, title = {The role of live transcripts in synchronous online L2 classrooms: Learning outcomes and learner perceptions}, author = {Wang, Qiao and Chen, Yijun}, journal = {Education and Information Technologies}, volume = {28}, number = {11}, pages = {14783--14804}, year = {2023}, month = apr, publisher = {Springer}, doi = {10.1007/s10639-023-11784-8} }
A content-controlled monolingual comparable corpus approach to comparing learner and proficient argumentative writing

Qiao Wang, Laurence Anthony, and Nurul Ihsan Arshad

Research Methods in Applied Linguistics, Aug 2023

This mixed-methods study approaches the differences between learner and proficient argumentative writing by building a content-controlled monolingual comparable corpus (CCMCC) that contains learner-teacher sample pairs of the same semantic content. Twenty-seven learner samples were collected from 27 Chinese university students who each wrote on one topic from the second writing task of IELTS Academic. To generate content-controlled teacher samples, an experienced teacher revised or rewrote each learner sample after confirming the ideas learners intended to express through individual and face-to-face communication with each learner. Then, a native speaker checked the language of the teacher samples. In data analysis, each learner-teacher sample pair was analyzed using Coh-Metrix to generate statistics in 45 indices under text length, syntax, lexicon, and cohesion, after which a shortlist of indices of both statistically and practically significant differences was identified. Qualitatively, the researchers identified the differences through side-by-side comparisons of sample pairs and coded the important patterns in the differences to explore their underlying reasons. This approach generated different quantitative results from previous corpus-based comparative writing studies, such as the ineffectiveness of cohesion indices to distinguish learner and proficient writing. Qualitative analysis further revealed noteworthy findings including the lack of concision in learner writing and learners’ unfamiliarity with using prepositional phrases to express actions. The advantages, limitations and implications of this approach are discussed.
@article{wang2023content, title = {A content-controlled monolingual comparable corpus approach to comparing learner and proficient argumentative writing}, volume = {2}, issn = {2772-7661}, doi = {10.1016/j.rmal.2023.100053}, number = {2}, journal = {Research Methods in Applied Linguistics}, publisher = {Elsevier BV}, author = {Wang, Qiao and Anthony, Laurence and Arshad, Nurul Ihsan}, year = {2023}, month = aug, pages = {100053}, }
The Use of Network-Based Virtual Worlds in Second Language Education: A Research Review

Mark Peterson, Qiao Wang, and Maryam Sadat Mirzaei

In Research Anthology on Virtual Environments and Building the Metaverse, Dec 2023

This chapter reviews 28 learner-based studies on the use of network-based social virtual worlds in second language learning published during the period 2007-2017. The purpose of this review is to establish how these environments have been implemented and to identify the target languages, methods used, research areas, and important findings. Analysis demonstrates that research is characterized by a preponderance of small-scale studies conducted in higher education settings. The target languages most frequently investigated were English, Spanish, and Chinese. In terms of the methodologies adopted, analysis reveals the majority of studies were qualitative in nature. It was found that the investigation of learner target language production, interaction, and affective factors represent the primary focus of research. Although positive findings relating to the above areas have been reported, the analysis draws attention to gaps in the current research base. The researchers provide suggestions for future research.
@inbook{peterson2023use, title = {The Use of Network-Based Virtual Worlds in Second Language Education: A Research Review}, isbn = {9781668475980}, doi = {10.4018/978-1-6684-7597-3.ch011}, booktitle = {Research Anthology on Virtual Environments and Building the Metaverse}, publisher = {IGI Global}, author = {Peterson, Mark and Wang, Qiao and Mirzaei, Maryam Sadat}, year = {2023}, month = dec, pages = {218--236}, }

2022

The use of semantic similarity tools in automated content scoring of fact-based essays written by EFL learners

Qiao Wang

Education and Information Technologies, Jun 2022

This study searched for open-source semantic similarity tools and evaluated their effectiveness in automated content scoring of fact-based essays written by English-as-a-Foreign-Language (EFL) learners. Fifty writing samples under a fact-based writing task from an academic English course in a Japanese university were collected and a gold standard was produced by a native expert. A shortlist of carefully selected tools, including InferSent, spaCy, DKPro, ADW, SEMILAR and Latent Semantic Analysis, generated semantic similarity scores between student writing samples and the expert sample. Three teachers who were lecturers of the course manually graded the student samples on content. To ensure validity of human grades, samples with discrepant agreement were excluded and an inter-rater reliability test was conducted on remaining samples with quadratic weighted kappa. After the grades of the remaining samples were proven valid, a Pearson correlation analysis between semantic similarity scores and human grades was conducted and results showed that InferSent was the most effective tool in predicting the human grades. The study further pointed to the limitations of the six tools and suggested three alternatives to traditional methods in turning semantic similarity scores into reporting grades on content.
@article{wang2022use, title = {The use of semantic similarity tools in automated content scoring of fact-based essays written by EFL learners}, volume = {27}, issn = {1573-7608}, doi = {10.1007/s10639-022-11179-1}, number = {9}, journal = {Education and Information Technologies}, publisher = {Springer Science and Business Media LLC}, author = {Wang, Qiao}, year = {2022}, month = jun, pages = {13021--13049}, }
Evaluation Dataset of Multiple-Choice Cloze Items for Vocabulary Training and Testing

Ralph L. Rose, Naho Orita, Ayaka Sugawara, and 1 more author

In Proceedings of the 2022 ACM International Joint Conference on Pervasive and Ubiquitous Computing, Sep 2022

Vocabulary learning is a typical part of nearly any second language learning curriculum. This entails methodologies and materials for training and testing vocabulary knowledge in learners. In large-scale programs, the preparation of such materials can be labor intensive and thus automatic means of generation are desirable. VocaTT (Vocabulary Training and Testing) is an ongoing project to use machine learning methods to generate novel multiple choice cloze (i.e., fill-in-the-blank) items for use in second language learning programs. This paper describes the ongoing creation of a gold standard set of multiple-choice cloze items to be used in training a machine learning algorithm. Machine-generated multiple choice cloze items were reviewed by two experienced language teachers, who evaluated each item for well-formedness (i.e., suitability as multiple-choice cloze test item) with three options: reject as unsalvageable, keep as-is, or revise into a well-formed item as they thought best. Results for a 600-item set that both checkers evaluated show moderate agreement on the question of rejection but slight agreement for keeping as-is. For revised items, the agreement on what type of revisions to make was slight to fair. In an expanded set of 2,792 items, checkers judged most items as needing revision but made varying kinds of revisions to yield well-formed items. Interested researchers may contact the authors to inquire about how they may access and use the evaluation dataset.
@inproceedings{rose2022evaluation, series = {UbiComp/ISWC '22}, title = {Evaluation Dataset of Multiple-Choice Cloze Items for Vocabulary Training and Testing}, doi = {10.1145/3544793.3560378}, booktitle = {Proceedings of the 2022 ACM International Joint Conference on Pervasive and Ubiquitous Computing}, publisher = {ACM}, author = {Rose, Ralph L. and Orita, Naho and Sugawara, Ayaka and Wang, Qiao}, year = {2022}, month = sep, collection = {UbiComp/ISWC '22}, }
Out-of-school language learning through digital gaming: a case study from an activity theory perspective

Ke Li, Mark Peterson, and Qiao Wang

Computer Assisted Language Learning, May 2022

This study applies Activity Theory to describe and analyze an out-of-school project in which eight Chinese university students utilized a massively multiplayer online game (MMOG) to learn English. Based on data collected through questionnaires, gaming journals, gaming recordings and interviews, thematic analysis was performed to identify the recurrent themes, which were then mapped onto the activity system. Four contradictions were identified in the process. Temporary contradictions dominated the early phase of the project and were easily resolved. However, inherent contradictions, mainly manifesting themselves through inadequate competence and learner variation, remained unresolved. Efforts to overcome these tensions resulted in the evolvement of the activity system. In terms of the actual outcomes, there was evidence for the development and exercise of autonomy. Learners also reported enhanced confidence and gains in vocabulary, listening and oral fluency. The study contributes new knowledge to the field by revealing how non-gamers make use of digital gaming for language learning in an informal setting. Pedagogical implications for digital game-based language learning are discussed and suggestions for future research are also provided.
@article{li2022out, title = {Out-of-school language learning through digital gaming: a case study from an activity theory perspective}, issn = {1744-3210}, doi = {10.1080/09588221.2022.2067181}, journal = {Computer Assisted Language Learning}, publisher = {Informa UK Limited}, author = {Li, Ke and Peterson, Mark and Wang, Qiao}, year = {2022}, month = may, pages = {1--29}, }
A Review of Research on the Application of Digital Games in Foreign Language Education

Mark Peterson, Jeremy White, Maryam Sadat Mirzaei, and 1 more author

In Research Anthology on Developments in Gamification and Game-Based Learning, May 2022

The use of digital games represents an expanding domain in computer-assisted language learning (CALL) research. This chapter reviews the findings of 26 learner-based studies in this area that are informed by cognitive and social accounts of SLA. The analysis shows that massively multiplayer online role-playing games (MMORPGs) are the most frequently investigated game type and the majority of studies involved EFL learners in higher education. Mixed methods were the most frequent research tool utilized by researchers. Limitations of current research include the preponderance of small-scale experimental studies that investigated only a limited number of factors. Although the research is not conclusive, findings indicate that game play facilitates collaboration, the production of target language output, and vocabulary learning, and that it reduces the influence of factors that inhibit learning. This chapter concludes by identifying promising areas for future research.
@inbook{peterson2022review, title = {A Review of Research on the Application of Digital Games in Foreign Language Education}, doi = {10.4018/978-1-6684-3710-0.ch094}, booktitle = {Research Anthology on Developments in Gamification and Game-Based Learning}, publisher = {IGI Global}, author = {Peterson, Mark and White, Jeremy and Mirzaei, Maryam Sadat and Wang, Qiao}, year = {2022}, pages = {1948--1971}, }

2021

Using Community of Inquiry to Scaffold Language Learning in Out-of-School Gaming: A Case Study

Ke Li, Mark Peterson, and Qiao Wang

International Journal of Game-Based Learning, Jan 2021

This paper reports on a project that draws upon the framework of the community of inquiry to support game-based language learning outside the classroom. A case study design was employed to collect and analyze both qualitative and quantitative data, with a view to investigating the participants’ language development, participation, and perception. This study spanned a 6-week period and involved 11 intermediate English learners in China. The volunteer participants played an interactive adventure game in an out-of-class setting, with the instructor present and scaffolds available online. Results showed that the participants gained statistically significant vocabulary development and believed they made progress in listening and reading. Moreover, it was found that the participants were most active in the first two and final weeks. The findings also showed general satisfaction and improved learning autonomy, highlighting the pivotal role of the instructor. The paper concludes by discussing its limitations and identifying future research directions.
@article{li2021using, title = {Using Community of Inquiry to Scaffold Language Learning in Out-of-School Gaming: A Case Study}, volume = {11}, issn = {2155-6857}, doi = {10.4018/ijgbl.2021010103}, number = {1}, journal = {International Journal of Game-Based Learning}, publisher = {IGI Global}, author = {Li, Ke and Peterson, Mark and Wang, Qiao}, year = {2021}, month = jan, pages = {31--52}, }

2020

The Role of Classroom-Situated Game-Based Language Learning in Promoting Students’ Communicative Competence

Qiao Wang

International Journal of Computer-Assisted Language Learning and Teaching, Apr 2020

The study is the second in a series of mixed-methods studies on the integration of The Sims 4, a life-simulation game, into language classrooms. In this study, the researcher explores the effect of game-based language learning (GBLL) on students’ English communicative competence in three aspects (interaction, fluency and content) at a Japanese university. In class, students received instruction from the teacher on game language and gameplay skills, played the game on their own and presented gameplay stories. The presentations were recorded for evaluation. Surveys were also administered to gather students’ perceptions of the GBLL classroom. Quantitative evaluation showed no clear improvement in communicative competence. Qualitative data, however, indicated that the game afforded students interesting events and proper expressions in presentations and that the teacher played a vital role in ensuring ample interactional opportunities and linguistic support. Suggestions for future research in classroom-situated GBLL were also proposed.
@article{wang2020role, title = {The Role of Classroom-Situated Game-Based Language Learning in Promoting Students' Communicative Competence}, volume = {10}, issn = {2155-7101}, doi = {10.4018/ijcallt.2020040104}, number = {2}, journal = {International Journal of Computer-Assisted Language Learning and Teaching}, publisher = {IGI Global}, author = {Wang, Qiao}, year = {2020}, month = apr, pages = {59--82}, }

2019

Classroom intervention for integrating simulation games into language classrooms: An exploratory study with the SIMS 4

Qiao Wang

CALL-EJ, Apr 2019

This study explored three forms of classroom intervention (teacher instruction, peer interaction and in-class activities) for the purpose of integrating simulation games into a vocabulary-focused English classroom. The aim was to establish which intervention is most effective, as well as what improvements should be made for future application. The study took the form of a controlled experiment and evaluation of the interventions was based on concurrently collected quantitative and qualitative data. The researcher concluded that while quantitative data failed to confirm any statistically significant difference between the two groups, qualitative data suggested two forms of intervention, teacher instruction and in-class activities, were effective. Peer interaction, however, did little to promote vocabulary acquisition. The researcher proposes implementing more diversified in-class activities and game quests relating to curriculum goals in existing classroom interventions. The discussion concludes by highlighting promising areas for future research.
@article{wang2019classroom, title = {Classroom intervention for integrating simulation games into language classrooms: An exploratory study with the SIMS 4}, author = {Wang, Qiao}, journal = {CALL-EJ}, volume = {20}, number = {2}, pages = {101--127}, year = {2019}, publisher = {Asia-Pacific Association for Computer-Assisted Language Learning (APACALL)} }
The Use of Network-Based Virtual Worlds in Second Language Education: A Research Review

Mark Peterson, Qiao Wang, and Maryam Sadat Mirzaei

Apr 2019

This chapter reviews 28 learner-based studies on the use of network-based social virtual worlds in second language learning published during the period 2007-2017. The purpose of this review is to establish how these environments have been implemented and to identify the target languages, methods used, research areas, and important findings. Analysis demonstrates that research is characterized by a preponderance of small-scale studies conducted in higher education settings. The target languages most frequently investigated were English, Spanish, and Chinese. In terms of the methodologies adopted, analysis reveals the majority of studies were qualitative in nature. It was found that the investigation of learner target language production, interaction, and affective factors represents the primary focus of research. Although positive findings relating to the above areas have been reported, the analysis draws attention to gaps in the current research base. The researchers provide suggestions for future research.
@inbook{peterson2019use, title = {The Use of Network-Based Virtual Worlds in Second Language Education: A Research Review}, issn = {2372-1111}, doi = {10.4018/978-1-5225-7286-2.ch001}, booktitle = {Advances in Linguistics and Communication Studies}, publisher = {IGI Global}, author = {Peterson, Mark and Wang, Qiao and Mirzaei, Maryam Sadat}, year = {2019}, pages = {1--25} }