ENQUIRE PROJECT DETAILS BY GENERAL PUBLIC

Project Details
Funding Scheme : General Research Fund
Project Number : 14600019
Project Title(English) : Chinese Lexicon Project Part II: A Database of Normed Naming Performance for Chinese Two-character Compound Words 
Project Title(Chinese) : 漢語詞匯方案第二部分:一個關於雙字詞的標準化唸名任務的數據庫 
Principal Investigator(English) : Prof Tse, Chi-Shing 
Principal Investigator(Chinese) :  
Department : Educational Psychology
Institution : The Chinese University of Hong Kong
E-mail Address : cstse@cuhk.edu.hk 
Tel :  
Co-Investigator(s) : Prof Yap, Melvin J.
Panel : Humanities, Social Sciences
Subject Area : Psychology and Linguistics
Exercise Year : 2019 / 20
Fund Approved : 470,227
Project Status : Completed
Completion Date : 30-6-2022
Project Objectives :
To examine the main effects of lexical variables and interactive effects among them on young adults’ speeded naming performance with two-character Chinese compound words in multiple regression analyses
To compare the lexical effects on young adults’ speeded naming and lexical decision performance with two-character Chinese compound words to determine whether the effects are task-general or task-specific
To make available online the repository of lexical variables and speeded naming behavioral measures for two-character Chinese compound words
Abstract as per original application (English/Chinese):

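This project aims to establish a database of normed naming reaction times and accuracy rates for 25,281 two-character words, and to examine how character-level and lexical variables affect university students' naming reaction times and accuracy for these words.
Background: With the support of the Hong Kong Research Grants Council, we created a database of normed lexical decision (i.e., deciding whether two characters form a Chinese two-character word) reaction times and accuracy rates for two-character words. This database provides lexical variables for the two-character words and their descriptive statistics (Tse et al., 2017). Like the lexical decision task, performance in the naming task (i.e., reading two-character words aloud) has often been used in previous research on Chinese word recognition processes.
Design: We plan to extend the Chinese Lexicon Project by creating a database of naming reaction times and accuracy rates. We will collect university students' naming reaction time and accuracy data for the 25,281 two-character words compiled by Tse et al. We will also add further variables related to character and word phonology. We will analyse mean naming reaction times and accuracy rates with item-level multiple regression models.
Significance: Using the lexical variable data and the normed naming reaction times and accuracy rates for two-character words, we plan to address the following questions. First, we will examine the main effects of lexical variables and their potential interactions, to test the power of orthographic variables (e.g., stroke count), phonological variables (e.g., homophone density), and semantic variables (e.g., semantic transparency) to predict naming reaction times and accuracy. Second, we will compare the effects of lexical variables on lexical decision and naming performance, to determine whether many findings about word recognition processes are task-specific (e.g., occurring only in naming but not in lexical decision) or task-general. Third, for some lexical effects that are theoretically important but for which the evidence from previous factorial-design experiments is less robust (e.g., the homophone density × character frequency interaction), we will use item-level multiple regression analyses of the normed naming reaction times and accuracy rates to verify whether these effects emerge in a database of more than 20,000 words.
Overall, this research will provide the academic community with an important database that improves our understanding of Chinese word recognition processes and makes an important contribution to psycholinguistic research.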
Realisation of objectives: We successfully recruited participants and collected data as planned in our proposal. We collected norms for speeded naming reaction times and accuracy rates, and compiled lexical variables (e.g., phonological consistency and semantic neighborhood size) for 25,281 two-character Chinese words. We published our work (i.e., Tse, Chan, Yap, & Tsang, in press) in Behavior Research Methods, a top journal in the field (impact factor = 5.953; 5-year impact factor = 7.867) that is ranked 6th out of 91 in the Psychology, Experimental category and 1st out of 13 in the Psychology, Mathematical category in the most recent edition of the Social Sciences Citation Index. In Tse, Chan, Yap, and Tsang (in press), we reported this database and, using the megastudy approach, conducted item-level regression analyses to test the relative predictive power of orthographic variables (e.g., stroke count), phonological variables (e.g., phonological consistency), and semantic variables (e.g., semantic transparency) in speeded naming performance. This fulfills our Objective 1. We also compared the effects of lexical variables on naming performance and on Tse et al.’s (2017, Behavior Research Methods) lexical decision performance to examine the extent to which those effects are task-specific or task-general. This fulfills our Objective 2. We have made this dataset freely accessible to the research community (see https://osf.io/vwnps), as this resource provides a valuable addition to other influential mega-databases, such as Balota et al. (2004), and furthers our understanding of Chinese word recognition processes. The large-scale corpus data of various lexical variables and normed lexical decision and speeded naming performance for more than 25,000 Chinese words provide important benchmarking data for Chinese word processing, which will help inform and constrain computational models.
Freely accessible to the research community, this resource provides a valuable addition to other influential Lexicon Projects (e.g., English, Dutch, French, and Malay) that are being rapidly developed across the world, in line with the current research zeitgeist. This fulfills our Objective 3. In Tse, Yap, and Chan (2023), a poster titled “Are Chinese words accessed as a whole or via characters in lexical decision and speeded naming task?” that is under review for the 33rd International Congress of Psychology, we performed linear mixed-effect analyses on Tse et al.’s (2017) lexical decision data and Tse et al.’s (in press) speeded naming data. We first controlled for the effects of orthographic (e.g., stroke count), phonological (e.g., consistency), and semantic (e.g., number of meanings) character-variables and word-variables. Next, we tested interactive effects among semantic transparency (ST, whether the characters are semantically related to the word), word frequency (WF), and character frequency (CF). In lexical decision, we obtained ST x CF and WF x CF interactions, but not an ST x WF interaction. The effect of CF was stronger for transparent words than for opaque words and was stronger for low-frequency words than for high-frequency words. In speeded naming, we found ST x CF and WF x CF interactions, resembling the patterns seen in lexical decision. Importantly, there was also an ST x WF interaction: the effect of ST was positive (faster recognition for transparent, compared to opaque, words) for high-frequency words but negative for low-frequency words. These results suggest that word-level processing is involved in Chinese word processing, as demonstrated by the influence of the WF and ST variables, and that character-level processing is also involved, as indicated by the modulation of the CF variable. These results shed light on the interactions between character-level and word-level properties.
These empirical benchmarks will help inform and constrain models of Chinese word processing. This also fulfills our Objective 1. Furthermore, we completed data analyses to investigate the main effects of neighborhood size, neighbor frequency, homophone density, and phonological frequency, as well as the interaction effects associated with neighborhood size, in lexical decision and speeded naming performance, as motivated by previous research (e.g., Li et al., 2015; Tse & Yap, 2018; Wu et al., 2013). A manuscript based on these analyses has been submitted to the Journal of Memory and Language and is attached to this report.
Summary of objectives addressed:
Objectives | Addressed | Percentage achieved
1. To examine the main effects of lexical variables and interactive effects among them on young adults’ speeded naming performance with two-character Chinese compound words in multiple regression analyses | Yes | 100%
2. To compare the lexical effects on young adults’ speeded naming and lexical decision performance with two-character Chinese compound words to determine whether the effects are task-general or task-specific | Yes | 100%
3. To make available online the repository of lexical variables and speeded naming behavioral measures for two-character Chinese compound words | Yes | 100%
Research Outcome
Major findings and research outcome: In Tse, Chan, Yap, and Tsang (in press, Behavior Research Methods), we reported a database of norms for speeded naming reaction times and accuracy rates, as well as the compilation of lexical variables (e.g., consistency and neighborhood size) for 25,281 two-character Chinese words. In item-level regression analyses, we tested the relative predictive power of orthographic variables (e.g., stroke count), phonological variables (e.g., regularity), and semantic variables (e.g., semantic transparency) in speeded naming performance. We replicated most of the benchmark findings in speeded naming, such as the effects of phonological consistency, neighborhood size, and phonological regularity. Moreover, we compared the influence of the words and/or the first and second characters on speeded naming and found that the first character had a stronger influence than the second character for most of the lexical variables (e.g., character frequency and semantic transparency). This suggests that the pronunciation of Chinese words is likely activated through the combination of the characters, rather than via a holistic representation of the word. Furthermore, we compared the predictive power of phonological versus semantic variables in speeded naming versus lexical decision by comparing the current speeded naming dataset with the lexical decision dataset that we established in our previous work (Tse et al., 2017, Behavior Research Methods). Importantly, we observed that task demands influence the relative contribution of phonological and semantic variables to speeded naming and lexical decision performance: orthographic variables accounted for more variance than semantic and phonological variables in both lexical decision and speeded naming performance, but semantic variables accounted for more variance than phonological variables in lexical decision, whereas phonological variables accounted for more variance than semantic variables in speeded naming.
In Tse, Yap, and Chan (2023), we performed linear mixed-effect analyses on Tse et al.’s (2017, in press) data to test interactive effects among semantic transparency, word frequency, and character frequency. These findings suggest that both word-level and character-level processing are at play in Chinese word recognition, whether participants make lexical decisions or read the words aloud. As in our previous work (Tse et al., 2017), we have made the speeded naming dataset freely available to researchers across various disciplines. They can use it to select word stimuli matched on extraneous lexical variables across conditions when designing factorial-design experiments, run virtual replication experiments to test whether previous findings can be replicated with different stimuli that share similar lexical characteristics, and compile more lexical variables to address future research questions.
Potential for further development of the research and the proposed course of action:
Using the current speeded naming dataset and Tse et al.’s lexical decision dataset, we are now investigating the interaction effects among lexical variables in lexical decision and speeded naming performance. This research is motivated by previous findings, such as the word frequency x neighborhood size interaction (Li et al., 2015). Those interaction effects were previously obtained in factorial-design experiments and may not necessarily generalize when more extraneous variables are controlled for (as was the case in our megastudy approach). This work has been submitted for publication and is attached to this report. As the next step, we plan to apply for a General Research Fund grant to norm auditory lexical decision performance. In this task, participants listen to a sound clip of a two-character Chinese word or a recombined-character Chinese nonword and decide by keypress whether it is a real Chinese word. By comparing this dataset with Tse et al.'s (2017) visual lexical decision dataset, we will examine the relative importance of lexical variables influencing performance in different modalities (visual and auditory). Similar projects have been reported for English (Goh, Yap, & Chee, 2020; Tucker et al., 2019) and French (Ferrand et al., 2018). To our knowledge, no such project has been reported for Chinese two-character words.
Layman's Summary of Completion Report:
In this project, we have created a large-scale repository of lexical variables and speeded naming responses for more than 25,000 traditional Chinese two-character words. Using this dataset and the one we previously developed for lexical decision responses, we identified how continuously varying lexical characteristics, such as the number of strokes and character/word frequency, individually and jointly modulate word recognition performance. We also tested whether this modulation depends on the task demands, such as judging the lexicality of the word or reading aloud the word. This repository of lexical characteristics and behavioral measures is a valuable resource for researchers. It allows them to select word stimuli matched on lexical variables across conditions when designing factorial-design experiments, run virtual replication experiments to test whether previous findings can be replicated with different stimuli that share similar lexical characteristics, and compile new lexical variables to address research questions of their interest. This resource is freely accessible to the research community and provides a valuable addition to other influential Lexicon Projects, such as those for English, Dutch, French, and Malay, that are being rapidly developed across the world and are in line with the current research zeitgeist.
Research Output
Peer-reviewed journal publication(s) arising directly from this research project : (* denotes the corresponding author)
Year of Publication | Author(s) | Title and Journal/Book | Accessible from Institution Repository
 | Chi-Shing Tse*, Yuen-Lai Chan, Melvin J. Yap, Ho Chung Tsang | The Chinese Lexicon Project II: A megastudy of speeded naming performance for 25,000+ traditional Chinese two-character words | No
 | Chi-Shing Tse*, Melvin J. Yap, and Yuen-Lai Chan | The Role of Lexical Variables in the Lexical Decision and Naming of Two-Character Chinese Words: A Megastudy Analysis | No
Recognized international conference(s) in which paper(s) related to this research project was/were delivered :
Month/Year/City | Title | Conference Name
Online | A megastudy of speeded naming performance for 8,427 traditional Chinese two-character compound words in Hong Kong Cantonese speakers | The 62nd Annual Meeting of the Psychonomic Society
Other impact (e.g. award of patents or prizes, collaboration with other research institutions, technology transfer, etc.):
