Data mining method of hot-toponym and its co-occurrence in crowdsourcing text written by tourists (众包旅游文本热度地名的共现挖掘)
ZHI Liehui (智烈慧); LI Renjie (李仁杰); FU Xueqing (傅学庆); GUO Fenghua (郭风华)
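Abstract:
Crowdsourced tourism texts contain a large amount of information on tourists' spatio-temporal behavior that has yet to be mined. To address this, the paper proposes a computational method that combines toponym weight allocation over a text collection with a co-occurrence matrix. The method mines the toponym information in crowdsourced tourism texts to obtain the co-occurrence relations among the hot toponyms of a tourist destination. Results are stored both as a co-occurrence matrix and as triples, which makes it easy to extract, by toponym type, the subsets of co-occurrence relations needed for different research perspectives. An empirical study of Jiuzhaigou automatically extracts the destination's hot toponyms and visualizes four types of co-occurrence relations: among hot landscape names within the destination, and between Jiuzhaigou and (a) hot city names inside the province, (b) hot city names outside the province, and (c) tourist destination names inside the province. The results show that the method meets the needs of tourism geography and related disciplines for mining tourism text content, and that the mined results are significantly representative of the characteristics and structure of specific groups' spatial perception of tourism.
Keywords: crowdsourcing; tourism text; hot toponym; co-occurrence relation; data mining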
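Funding: National Natural Science Foundation of China (41171105, 41471127); Hebei Province Science Fund for Distinguished Young Scholars, Cultivation Project (D2015205208); Hebei Province Soft Science Research Program (134060020); Hebei Province University-Level Graduate Innovation Funding Project (xj2015015)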
DOI: 10.16251/j.cnki.1009-2307.2016.08.030
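
The pipeline outlined in the abstract (weight toponyms over a text collection, keep the hot ones, then store their co-occurrence as both a matrix and triples) can be illustrated with a minimal Python sketch. The abstract does not give the paper's weighting formula, so the sketch assumes a simple stand-in: the share of texts that mention a toponym. The toy corpus, the type labels, the threshold, and all function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each travelogue reduced to the toponyms it mentions.
docs = [
    ["Jiuzhaigou", "Five Flower Lake", "Nuorilang Waterfall", "Chengdu"],
    ["Jiuzhaigou", "Five Flower Lake", "Chengdu", "Huanglong"],
    ["Jiuzhaigou", "Nuorilang Waterfall", "Five Flower Lake"],
    ["Jiuzhaigou", "Chengdu", "Beijing"],
]

# Hypothetical type labels used to pull out relation subsets.
toponym_type = {
    "Jiuzhaigou": "destination",
    "Five Flower Lake": "landscape",
    "Nuorilang Waterfall": "landscape",
    "Chengdu": "city_in_province",
    "Huanglong": "destination_in_province",
    "Beijing": "city_out_of_province",
}

def toponym_weights(docs):
    """Assumed weight: fraction of texts mentioning the toponym at least once."""
    doc_freq = Counter(t for doc in docs for t in set(doc))
    return {t: n / len(docs) for t, n in doc_freq.items()}

def hot_toponyms(weights, threshold):
    """Keep toponyms whose weight reaches the (assumed) threshold."""
    return sorted(t for t, w in weights.items() if w >= threshold)

def cooccurrence_matrix(docs, hot):
    """Symmetric matrix: cell (i, j) counts texts mentioning both toponyms."""
    index = {t: i for i, t in enumerate(hot)}
    matrix = [[0] * len(hot) for _ in hot]
    for doc in docs:
        present = sorted(set(doc) & set(hot))
        for a, b in combinations(present, 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return matrix

def to_triples(matrix, hot):
    """Upper triangle of the matrix as (toponym, toponym, count) triples."""
    return [
        (hot[i], hot[j], matrix[i][j])
        for i in range(len(hot))
        for j in range(i + 1, len(hot))
        if matrix[i][j] > 0
    ]

def subset_by_type(triples, types, type_a, type_b):
    """Triples whose two toponyms have exactly the requested pair of types."""
    return [
        (a, b, n) for a, b, n in triples
        if {types[a], types[b]} == {type_a, type_b}
    ]

weights = toponym_weights(docs)
hot = hot_toponyms(weights, threshold=0.5)
matrix = cooccurrence_matrix(docs, hot)
triples = to_triples(matrix, hot)
landscape_pairs = subset_by_type(triples, toponym_type, "landscape", "landscape")
```

Keeping both representations serves different uses: the matrix suits whole-network computation, while the triples are easy to filter by toponym type, as `subset_by_type` does for pairs of landscape names.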