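The "computational turn" in communication research is becoming increasingly evident, and attempts to combine text mining techniques with news research keep emerging. New research tools bring new challenges for researchers. News theory has traditionally been anchored in meso-level abstract concepts such as frames, news values, reporting stance, and metaphors, whereas the basic unit of analysis in text mining is the micro-level word. As a result, the conceptualization and the operationalization of news text mining take place at two different levels. How to bridge these two levels is a problem researchers urgently need to solve, and it is the point of departure for this study. Here we propose a research design model for corpus-based qualitative content analysis. The model critically inherits three elements: content analysis, the interpretive paradigm, and text mining techniques. It also demonstrates a systematic, stable, data-driven method of keyword selection that lets researchers set the selection threshold according to the overall distribution of the data, which helps avoid over-reliance on rules of thumb.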
The emerging age of computational communication has seen rapid growth in the use of text mining technology in news analysis, bringing both opportunities and challenges to researchers. Among the challenges, conventional journalism theory has relied largely on meso-level concepts, such as frames, news value, and metaphors, while the unit of analysis in text mining is the micro-level word. This creates a gap between the conceptualization and the operationalization of text analysis. To address this problem, we propose a new approach, corpus-based qualitative content analysis, which critically integrates the conventions of content analysis, the interpretive paradigm, and text mining technology, and demonstrates a systematic, data-driven method of keyword selection. We hope this comprehensive approach can strike a balance between qualitative interpretation and quantitative calculation and can be further employed by other researchers in this field.