Text Analytics for Supporting Decision Discussion in Online Discourse
|School||Shanghai Jiaotong University|
|Course||Management Science and Engineering|
|Keywords||Text analytics in online discourse; Sense making; Deception detection; Network opinion modeling; Social media; Language-action perspective|
The rapid growth of social media has dramatically changed how people communicate. Face-to-face meetings are no longer the only venue for discussion and decision-making within organizations. Organizations increasingly use social media technologies to support their business-related functions. Although organizations derive considerable benefits from this use, text analytics in online discourse presents several important challenges. How to use the text in online discourse to support decision discussion remains a critical issue in practice. In this study, we focus on text analytics to support decision discussion in online discourse.

This dissertation first introduces the research background and significance, and defines the research object as the text of online discourse. After reviewing the literature relevant to discourse data, we describe the research gap and then state the four research questions to be explored. The main work and contributions are summarized as follows:

(1) A LAP-based text analytics approach to support sense-making in online discourse is proposed. Based on the “Language-Action Perspective” (LAP) theory, we propose a LAP-based text analytics approach, comprising a LAP-based text analysis framework and a collection of hypotheses, to support sense-making in online discourse. It converts the disrupted messages into SATrees. Existing text analytics tools tend to focus on the semantic dimension of language (e.g., word segmentation, POS tagging, and sentiment analysis), and previous research cannot resolve the problem of disrupted turn adjacency. SATrees provide an enhanced representation of coherence relations and communication actions. Guided by LAP-based text analytics, we developed LTAS.
The experimental results show that: 1) SATrees generated by LTAS improve the representation of conversation structure and actions/intentions; 2) SATrees with more accurate identification of coherence relations improve the representation of social network centrality measures for discussion participants; 3) compared with other methods, SATrees facilitate enhanced user sense-making involving action, situated action, and symbolic action in online discourse.

(2) This dissertation proposes a series of text analysis algorithms, including conversation disentanglement, coherence analysis, and speech act classification. We developed algorithms for identifying coherence relations and speech acts. ① The conversation disentanglement algorithm (DSA) uses linguistic features to compute inter-message similarity. The experimental results show that DSA significantly outperformed all five comparison methods in terms of precision, recall, and F-measure. ② The discussion logic feature is input into a machine learning classifier, TBL. We collectively refer to TBL plus the residual matching method as TBL-RM. This method can automatically identify the coherence relations among messages. The experimental results demonstrate the efficacy of the discussion logic feature. Furthermore, TBL-RM outperformed all three comparison methods, and the gain of the manual method over TBL-RM was not significant. ③ Speech act classification uses a two-stage approach comprising an initial classifier and a tree kernel-based classifier. The experimental results show that the proposed speech act classification outperformed all five comparison methods.

(3) We detect deceptive opinion spam by incorporating deception linguistic features. People increasingly publish their reviews and comments on social media platforms. Consequently, websites containing reviews are becoming targets of opinion spam.
While recent work has focused primarily on manually identifiable instances of opinion spam, in this work we study deceptive opinion spam: fictitious reviews that have been deliberately written to sound authentic. Integrating work from psychology and deception behavior in social media, we propose 11 deception linguistic cues, which are divided into three categories: term frequency, information abundance, and information persuasiveness. Next, we develop an online review spam detection system and compare different sets of deception features. The experimental results demonstrate that the precision of fake review identification is nearly 80% on our gold-standard review dataset. Lastly, the analysis of deception linguistic cues reveals relations between fake reviews and previous deception detection theory. The research will contribute to the identification of imaginative-writing reviews.

(4) A “topic-stakeholder-sentiment” network opinion modeling method is proposed. In previous studies, network opinion analysis was based on structured information processing. From the perspective of understanding text content, we propose a network opinion modeling framework comprising data preparation and network modeling. The data preparation part collects and filters online reviews on a decision problem, and then labels the linguistic content with various semantic marks. The network opinion modeling part involves topic analysis of the decision problem, similarity computation between online review content and topics, stakeholder discovery, and sentiment analysis. Finally, we create the “topic-stakeholder-sentiment” model, which enhances the ability to analyze the sentiment polarity of online reviews toward specific decision problems. We collected text data from major websites and forums as the experimental dataset, and we chose “workers' high burden of medical expenses” and “the rural cooperative medical system” as the discussion problems. Case analysis demonstrates the efficacy of the proposed network opinion modeling method.
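
As a rough illustration of the modeling steps in contribution (4), the following minimal sketch assigns each review to its most similar topic, scores its sentiment, and averages the scores per (topic, stakeholder) pair. It is not the dissertation's implementation: the bag-of-words cosine similarity, the lexicon-based polarity score, the topic keyword lists, the sentiment word sets, the pre-labeled stakeholders, and all function names are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical topic keywords and sentiment lexicon (illustrative only).
TOPICS = {
    "medical_expense": ["expense", "cost", "burden", "bill"],
    "rural_cooperative": ["rural", "cooperative", "reimbursement", "village"],
}
POSITIVE = {"good", "helpful", "affordable", "improved"}
NEGATIVE = {"high", "heavy", "unfair", "slow"}


def bow(text):
    """Return a lowercased bag-of-words vector as a Counter."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def closest_topic(text):
    """Return the most similar topic name, or None if nothing overlaps."""
    vec = bow(text)
    scores = {name: cosine(vec, Counter(words)) for name, words in TOPICS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def polarity(text):
    """Positive word count minus negative word count (assumed scoring rule)."""
    tokens = text.lower().split()
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def build_model(reviews):
    """Average polarity per (topic, stakeholder); off-topic reviews are dropped."""
    grouped = defaultdict(list)
    for stakeholder, text in reviews:
        topic = closest_topic(text)
        if topic is not None:
            grouped[(topic, stakeholder)].append(polarity(text))
    return {key: sum(vals) / len(vals) for key, vals in grouped.items()}


# Toy reviews with stakeholders already identified (an assumption; the
# dissertation discovers stakeholders as part of the modeling step).
reviews = [
    ("worker", "the medical bill is a heavy burden"),
    ("worker", "cost is high"),
    ("farmer", "rural cooperative reimbursement is helpful"),
    ("farmer", "nice weather today"),
]
model = build_model(reviews)
print(model)
```

In this toy setting, the last review matches no topic keyword and is filtered out, mirroring the filtering performed during data preparation. A realistic system would replace the keyword lists with learned topics and the lexicon with a trained sentiment classifier.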