- Web Data Mining
Exploring Hyperlinks, Contents, and Usage Data
- © 2011
- Latest edition
Dept. Computer Science, University of Illinois, Chicago, Chicago, USA
- Covers all key tasks and techniques of Web search and Web mining, i.e., structure mining, content mining, and usage mining
- Includes major algorithms from data mining, machine learning, information retrieval and text processing, which are crucial for many Web mining tasks
- Contains a rich blend of theory and practice, addressing seminal research ideas and also looking at the technology from a practical point of view
- Second edition includes new/revised sections on supervised learning, opinion mining and sentiment analysis, recommender systems and collaborative filtering, and query log mining
- Ideally suited for classes on data mining, Web mining, Web search, and knowledge discovery in data bases
- Provides internet support with lecture slides and project problems
- Includes supplementary material: sn.pub/extras
Part of the book series: Data-Centric Systems and Applications (DCSA)
134k Accesses
361 Citations
4 Altmetric
About this book
Liu has written a comprehensive text on Web mining, which consists of two parts. The first part covers the data mining and machine learning foundations, where all the essential concepts and algorithms of data mining and machine learning are presented. The second part covers the key topics of Web mining, where Web crawling, search, social network analysis, structured data extraction, information integration, opinion mining and sentiment analysis, Web usage mining, query log mining, computational advertising, and recommender systems are all treated both in breadth and in depth. His book thus brings all the related concepts and algorithms together to form an authoritative and coherent text.
The book offers a rich blend of theory and practice. It is suitable for students, researchers and practitioners interested in Web mining and data mining both as a learning text and as a reference book. Professors can readily use it for classes on data mining, Web mining, and text mining. Additional teaching materials such as lecture slides, datasets, and implemented algorithms are available online.
- Information Integration
- Information Retrieval
- Machine Learning
- Opinion Mining
- Pattern Mining
- Recommender Systems
- Schema Matching
- Semi-Supervised Learning
- Social Network Analysis
- Structured Data Extraction
- Unsupervised Learning
- Web Crawling
- Web Link Analysis
- Web Usage Mining
- Wrapper Generation
Table of contents (12 chapters)
Front Matter
Introduction
Data Mining Foundations
Association Rules and Sequential Patterns
Supervised Learning
Partially Supervised Learning
- Bing Liu, Wee Sun Lee
Information Retrieval and Web Search
- Bing Liu, Filippo Menczer
Structured Data Extraction: Wrapper Generation
Opinion Mining and Sentiment Analysis
- Bing Liu, Bamshad Mobasher, Olfa Nasraoui
Back Matter
From the reviews:
"This is a textbook about data mining and its application to the Web. […] Liu succeeds in helping readers appreciate the key role that data mining and machine learning play in Web applications. […] It also motivates the student by adding immediacy and relevance to the concepts and algorithms described. I liked the way the concepts are introduced in a stepwise manner. […] I also appreciated the bibliographical notes at the end of each chapter." ACM Computing Reviews, W. Hu, January 2009
From the reviews of the second edition:
“Liu (Univ. of Illinois, Chicago) discusses all three types of Web mining--structure, content, and usage--in the technology’s efforts to glean information from hyperlinks, Web page content, and usage logs. […] Practical examples complement the discussions throughout the text, and each chapter includes useful ‘Bibliographic Notes’ and an extensive bibliography. […] Liu states that his intended audience includes both undergraduate and graduate students, but notes that researchers and Web programmers could benefit from this text as well. Summing Up: Recommended. Upper-division undergraduates through professionals.” J. Johnson, Choice, Vol. 49 (5), January 2012
"[...] Liu's book provides a comprehensive, self-contained introduction to the major data mining techniques and their use in Web data mining. [...] Professionals and researchers alike will find this excellent book handy as a reference. Its extensive lists of references at the end of each chapter provide hundreds of pointers for further reading. As a textbook, it is also suitable for advanced undergraduate and graduate courses on Web mining; it is highly self-contained and includes many easy-to-understand examples that will help readers grasp the key ideas behind current Web data mining techniques." ACM Computing Reviews, Fernando Berzal, February 2012
Authors and Affiliations
About the author.
Bing Liu is a professor of Computer Science at the University of Illinois at Chicago (UIC). He received his PhD in Artificial Intelligence from the University of Edinburgh. Before joining UIC, he was with the National University of Singapore. His current research interests include opinion mining and sentiment analysis, text and Web mining, data mining, and machine learning. He has published extensively in top journals and conferences in these fields. Several of his publications are considered seminal papers in these fields and are highly cited. He has also given more than 30 keynote and invited talks in academia and in industry. In professional service, Liu has served as an associate editor of IEEE Transactions on Knowledge and Data Engineering (TKDE), Journal of Data Mining and Knowledge Discovery (DMKD), and SIGKDD Explorations, and is on the editorial boards of several other journals. He has also served as program chair of the IEEE International Conference on Data Mining (ICDM-2010), ACM Conference on Web Search and Data Mining (WSDM-2010), ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2008), SIAM Conference on Data Mining (SDM-2007), ACM Conference on Information and Knowledge Management (CIKM-2006), and Pacific Asia Conference on Data Mining (PAKDD-2002). Additionally, Liu has served extensively as an area chair and program committee member of leading conferences on data mining, Web mining, natural language processing, and machine learning. More information about him can be found at http://www.cs.uic.edu/~liub.
Bibliographic Information
Book Title : Web Data Mining
Book Subtitle : Exploring Hyperlinks, Contents, and Usage Data
Authors : Bing Liu
Series Title : Data-Centric Systems and Applications
DOI : https://doi.org/10.1007/978-3-642-19460-3
Publisher : Springer Berlin, Heidelberg
eBook Packages : Computer Science , Computer Science (R0)
Copyright Information : Springer-Verlag GmbH Germany, part of Springer Nature 2011
Hardcover ISBN : 978-3-642-19459-7 Published: 26 June 2011
Softcover ISBN : 978-3-642-26891-5 Published: 03 August 2013
eBook ISBN : 978-3-642-19460-3 Published: 25 June 2011
Series ISSN : 2197-9723
Series E-ISSN : 2197-974X
Edition Number : 2
Number of Pages : XX, 624
Topics : Information Storage and Retrieval , Statistics for Engineering, Physics, Computer Science, Chemistry and Earth Sciences , Data Mining and Knowledge Discovery , Pattern Recognition , Artificial Intelligence
- Corpus ID: 18032585
Web Mining - Concepts, Applications & Research Directions
- J. Srivastava , P. Desikan , Vipin Kumar
- Computer Science
Figures from this paper
![web mining research papers 2019 pdf figure 3.1](https://d3i71xaburhd42.cloudfront.net/eaa8d33e838c65d54e562d163f8f4b5cbfe98ae4/4-Figure3.1-1.png)
17 Citations
- A review on web mining
- Analysis of web mining technology and their impact on semantic web
- A Lime Light on the Emerging Trends of Web Mining
- A guesstimate on web usage mining algorithms and techniques
- A hand to hand taxonomical survey on web mining
- A survey: techniques of an efficient search annotation based on web content mining
- Web mining to create semantic content: a case study for the environment
- Web usage mining for automatic link generation
- Survey of web content mining and relation extraction techniques
62 References
- Web mining: information and pattern discovery on the World Wide Web
- Web usage mining: discovery and applications of usage patterns from Web data
- Web mining research: a survey
- Data preparation for mining World Wide Web browsing patterns
- Data mining on the Web
- Mining e-commerce data: the good, the bad, and the ugly
- Adaptive Web sites: conceptual cluster mining
- The World-Wide Web: quagmire or gold mine
- The anatomy of a large-scale hypertextual Web search engine
- A belief-driven method for discovering unexpected patterns
Current Journal of Applied Science and Technology
Published: 2023-08-14
DOI: 10.9734/cjast/2023/v42i244179
Page: 32-42
Issue: 2023 - Volume 42 [Issue 24]
Review Article
Exploring the Landscape of Web Data Mining: An In-depth Research Analysis
Laxmi Choudhary *
Computer Science, Sabarmati University, Ahmedabad, India.
Shashank Swami
Department of Computer Science, Sabarmati University, Ahmedabad, India.
*Author to whom correspondence should be addressed.
The exponential growth of Web services and Web-based applications has led to an enormous volume of data, providing a rich source for mining valuable insights. Web mining differs from traditional data mining due to the unique nature of the data it handles. Web data exists in diverse forms, including web server logs, news pages, and hyperlinks. As the usage of the internet continues to surge, web mining has become essential to extract meaningful information and patterns from these varied data sources. Traditional data mining methods may not be directly applicable to web data due to its unstructured and heterogeneous nature. Web server logs contain valuable information about user interactions, click-streams, and user preferences, which can be mined to understand user behavior and improve website performance. News pages and other forms of web content are valuable sources for sentiment analysis, topic modeling, and information retrieval, helping businesses and researchers gain insights into public opinions and trends. Additionally, web structure mining deals with the analysis of hyperlinks, enabling the discovery of relationships between web pages and the identification of authoritative sources. The continuous growth of web-based data necessitates the use of specialized methods in web mining to effectively extract knowledge and valuable patterns. Researchers and practitioners in this field are constantly exploring innovative techniques to make sense of the vast amount of data available on the World Wide Web. This paper surveys web mining techniques for web data and presents the latest advancements, so that researchers and practitioners can gain insight into the state of the field and identify potential areas for further exploration. It also compares and summarizes various web data mining methods and their applications, giving an overview of research developments and some important open research issues.
Keywords: Information retrieval, semantic web, text mining, web crawling, web mining, web content mining, web data mining, web structure mining, web usage mining
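To make the web structure mining idea in the abstract concrete, the following is a minimal sketch of a PageRank-style power iteration, one well-known way to score pages by their incoming hyperlinks. It is not taken from the paper: the function name `pagerank`, the damping value 0.85, the iteration count, and the toy link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by link structure.

    `links` maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's damped score evenly among its out-links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without out-links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page site used only for illustration.
toy_links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}
scores = pagerank(toy_links)
most_authoritative = max(scores, key=scores.get)
```

In this toy graph the page with the most incoming links ends up with the highest score, which is the sense in which hyperlink analysis can point to authoritative sources.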
Information Systems IE&IS
The Information Systems (IS) group studies novel tools and techniques that help organizations use their information systems to support better operational decision making.
Create value through intelligent processing of business information
Information Systems are at the core of modern-day organizations, both within and between them. The Information Systems group studies tools and techniques that help organizations use these systems in the best possible way and get the most value out of them.
In order to do that, the IS group helps organizations to: (i) understand the business needs and value propositions and accordingly design the required business and information system architecture; (ii) design, implement, and improve the operational processes and supporting (information) systems that address the business need, and (iii) use advanced data analytics methods and techniques to support decision making for improving the operation of the system and continuously reevaluating its effectiveness.
We do so in various sectors from transportation and logistics, mobility services, high-tech manufacturing, service industry, and e-commerce to healthcare.
Against this background, IS research concentrates on the following topics:
- Business model design and service systems engineering for digital services.
- Managing digital transformation.
- Data-driven business process engineering and execution.
- Innovative process modeling techniques and execution engines.
- Human aspects of information systems engineering.
- Intelligent decision support through Artificial Intelligence and Computational Intelligence.
- Data-driven decision making.
- Machine learning to optimize resource allocation.
Research Areas
We work on Information Systems topics in three related research areas.
Process Engineering
Process Engineering (PE) develops integrated tools and techniques for data-driven decision support in the design and execution of…
AI for decision-making
AI for Decision-Making (AI4DM) develops methods, techniques and tools for AI-driven decision making in operational business process.
Business Engineering
Business Engineering (BE) investigates and develops new concepts, methods, and techniques - including novel data-driven approaches - for the…
Application domains
We focus on the application of Information Systems in the following domains.
Service Industry
Service organizations, including banks, insurance companies, and governmental bodies, fully rely on information provisioning to do their…
Information Systems are the backbone of modern health(care) ecosystems. They are critical for clinical research, clinical operations, and…
Information Systems focuses on the business architecture design of new mobility solutions that are safe, efficient, affordable and…
Transportation and Logistics
Information Systems facilitate monitoring and planning of transportation and logistics resources. By doing so, they ultimately help to…
Smart Industry
The digital transformation of industry is leveraged by Information Systems providing integrated data and process management and AI-enabled…
Meet some of our researchers
Yingqian Zhang, Pieter Van Gorp, Karolin Winter, Baris Ozkan, Isel Grau Garcia, Laura Genga, Sybren de Kinderen, Banu Aysolmaz, Remco Dijkman, Maryam Razavian, Laurens Bliek, Oktay Türetken, Konstantinos Tsilionis.
Recent Publications
Our most recent peer reviewed publications
- Acceptance of Mobility-as-a-Service: Insights from empirical studies on influential factors
- A revised cognitive mapping methodology for modeling and simulation
- Backpropagation through time learning for recurrence-aware long-term cognitive networks
- An explainable data-driven decision support framework for strategic customer development
- Data-driven aggregate modeling of a semiconductor wafer fab to predict WIP levels and cycle time distributions
Open source
We encourage innovation from our research. This is why we share the open-source code from our research projects.
Work with us!
Please check out the TU/e vacancy pages for opportunities within our group.
If you are a student, potential sponsor or industrial partner and want to work with us, please contact the IS secretariat or the Information Systems group chair, dr.ir. Remco Dijkman
How IBM helps Wimbledon use generative AI to drive personalised fan engagement
This collaboration with Wimbledon teams extends beyond the fan-facing digital platform, into enterprise-wide transformation.
Authentication vs. authorization: What’s the difference?
6 min read - Authentication verifies a user’s identity, while authorization gives the user the right level of access to system resources.
Applying generative AI to revolutionize telco network operations
5 min read - Learn the many potential applications that operators and suppliers are capitalizing on to enhance network operations for telco.
Re-evaluating data management in the generative AI age
4 min read - A good place to start is refreshing the way organizations govern data, particularly as it pertains to its usage in generative AI solutions.
Top 7 risks to your identity security posture
5 min read - Identity misconfigurations and blind spots stand out as critical concerns that undermine an organization’s identity security posture.
June 27, 2024
IBM announces new AI assistant and feature innovations at Think 2024
June 26, 2024
A major upgrade to Db2® Warehouse on IBM Cloud®
June 25, 2024
Increase efficiency in asset lifecycle management with Maximo Application Suite’s new AI-power...
Achieving operational efficiency through Instana’s Intelligent Remediation
June 24, 2024
Manage the routing of your observability log and event data
Best practices for augmenting human intelligence with AI
2 min read - Enabling participation in the AI-driven economy to be underpinned by fairness, transparency, explainability, robustness and privacy.
Microcontrollers vs. microprocessors: What’s the difference?
6 min read - Microcontroller units (MCUs) and microprocessor units (MPUs) are two kinds of integrated circuits that, while similar in certain ways, are very different in many others.
Mastering budget control in the age of AI: Leveraging on-premises and cloud XaaS for success
2 min read - As organizations harness the power of AI while controlling costs, leveraging anything as a service (XaaS) models emerges as strategic.
Highlights by topic
Use IBM Watsonx’s AI or build your own machine learning models
Automate IT infrastructure management
Cloud-native software to secure resources and simplify compliance
Run code on real quantum systems using a full-stack SDK
Aggregate and analyze large datasets
Store, query and analyze structured data
Manage infrastructure, environments and deployments
Run workloads on hybrid cloud infrastructure
Responsible AI can revolutionize tax agencies to improve citizen services
Generative AI can revolutionize tax administration and drive toward a more personalized and ethical future.
Intesa Sanpaolo and IBM secure digital transactions with fully homomorphic encryption
6 min read - Explore how European bank Intesa Sanpaolo and IBM partnered to deliver secure digital transactions using fully homomorphic encryption.
What is AI risk management?
8 min read - AI risk management is the process of identifying, mitigating and addressing the potential risks associated with AI technologies.
How IBM and AWS are partnering to deliver the promise of responsible AI
4 min read - This partnership between IBM and Amazon SageMaker is poised to play a pivotal role in shaping responsible AI practices across industries
Speed, scale and trustworthy AI on IBM Z with Machine Learning for IBM z/OS v3.2
4 min read - Machine Learning for IBM® z/OS® is an AI platform made for IBM z/OS environments, combining data and transaction gravity with AI infusion.
The recipe for RAG: How cloud services enable generative AI outcomes across industries
4 min read - While the AI is the key component of the RAG framework, other “ingredients” such as PaaS solutions are integral to the mix
Rethink IT spend in the age of generative AI
3 min read - It's critical for organizations to consider frameworks like FinOps and TBM for visibility and accountability of all tech expenditure.
Use of web mining in studying innovation
2014, Scientometrics
Related Papers
Scientometrics
Janna Axenbeck
Existing approaches to model innovation ecosystems have been mostly restricted to qualitative and small-scale levels or, when relying on traditional innovation indicators such as patents and questionnaire-based survey, suffered from a lack of timeliness, granularity, and coverage. Websites of firms are a particularly interesting data source for innovation research, as they are used for publishing information about potentially innovative products, services, and cooperation with other firms. Analyzing the textual and relational content on these websites and extracting innovation-related information from them has the potential to provide researchers and policy-makers with a cost-effective way to survey millions of businesses and gain insights into their innovation activity, their cooperation, and applied technologies. For this purpose, we propose a web mining framework for consistent and reproducible mapping of innovation ecosystems. In a large-scale pilot study we use a database with ...
Web-based innovation indicators may provide new insights into firm-level innovation activities. However, little is known yet about the accuracy and relevance of web-based information. In this study, we use 4,485 German firms from the Mannheim Innovation Panel (MIP) 2019 to analyze which website characteristics are related to innovation activities at the firm level. Website characteristics are measured by several text mining methods and are used as features in different Random Forest classification models that are compared against each other. Our results show that the most relevant website characteristics are the website’s language, the number of subpages, and the total text length. Moreover, our website characteristics show a better performance for the prediction of product innovations and innovation expenditures than for the prediction of process innovations.
Web-based innovation indicators may provide new insights into firm-level innovation activities. However, little is known yet about the accuracy and relevance of web-based information for measuring innovation. In this study, we use data on 4,487 firms from the Mannheim Innovation Panel (MIP) 2019, the German contribution to the European Community Innovation Survey (CIS), to analyze which website characteristics perform as predictors of innovation activity at the firm level. Website characteristics are measured by several data mining methods and are used as features in different Random Forest classification models that are compared against each other. Our results show that the most relevant website characteristics are textual content, the use of English language, the number of subpages and the amount of characters on a website. In our main analysis, models using all website characteristics jointly yield AUC values of up to 0.75 and increase accuracy scores by up to 18 percentage point...
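The AUC values reported in these abstracts can be read as the probability that a randomly chosen innovative firm receives a higher model score than a randomly chosen non-innovative one. The sketch below computes that quantity directly from pairwise comparisons. It is only an illustration of the metric, not the authors' evaluation code; the function name `roc_auc` and the four example firms are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    `labels` holds 1 for positive cases (e.g. innovative firms) and 0 otherwise;
    `scores` holds the classifier's predicted score for each case.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0   # positive case ranked above negative case
            elif pos_score == neg_score:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical data: two innovative and two non-innovative firms.
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.1]
example_auc = roc_auc(example_labels, example_scores)
```

In this made-up example, three of the four positive/negative pairs are ordered correctly, so the AUC is 0.75. The quadratic loop is fine for a small illustration; a rank-based formula would be preferable for thousands of firms.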
6th Global TechMining Conference (GTM)
Marisela Rodríguez Salvador
Understanding the impact of the Management Innovation Systems in a company remains a challenge due to the difficulty of recovering all the activities around the innovation process. Has the company increased its innovative activities after the certification process? We can document outputs of innovation such as increases in projects, increased collaborations, awards, and registered patents. This information can be obtained only from traditional sources. However, we would like to find a way to predict the early deployment of innovation in a company; for this reason, we can use its business web pages to recognize and quantify these changes. The approach of the study is very new. We wish to study the innovation outputs of companies (before and after certification), using information from traditional sources and also by mining their websites.
Technovation
Philip Shapira
Quantitative Science Studies
Mikaël Héroux-Vaillancourt
This study explores the use of web content analysis to build innovation indicators from the complete texts of 79 corporate websites of Canadian nanotechnology and advanced materials firms. Indicators of four core concepts (R&D, IP protection, collaboration, and external financing) of the innovation process were built using keywords frequency analysis. These web-based indicators were validated using several indicators built from a classic questionnaire-based survey with the following methods: correlation analysis, multitraits multimethods (MTMM) matrices, and confirmatory factor analysis (CFA). The results suggest that formative indices built with the questionnaire and web-based indicators measure the same concept, which is not the case when considering the items from the questionnaire separately. Web-based indicators can act either as complements to direct measures or as substitutes for broader measures, notably the importance of R&D and the importance of IP protection, which are no...
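The keyword frequency approach described in this abstract can be sketched as follows: count how often concept-specific keywords occur in a website's text and normalize by text length. The keyword lists, the per-1,000-words scaling, and the names `CONCEPT_KEYWORDS` and `concept_indicators` are assumptions made for illustration; the study's actual dictionaries and weighting are not given here.

```python
import re
from collections import Counter

# Hypothetical keyword lists for the four concepts named in the abstract.
CONCEPT_KEYWORDS = {
    "rd": {"research", "development", "laboratory", "prototype"},
    "ip_protection": {"patent", "patents", "trademark", "licensed"},
    "collaboration": {"partner", "partners", "collaboration", "alliance"},
    "external_financing": {"investors", "funding", "venture", "grant"},
}


def concept_indicators(text):
    """Return keyword hits per 1,000 words for each concept."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return {
        concept: 1000.0 * sum(counts[word] for word in keywords) / total
        for concept, keywords in CONCEPT_KEYWORDS.items()
    }


# Made-up website text used only to exercise the function.
sample_text = (
    "Our research laboratory holds two patents. "
    "We partner with universities and received venture funding."
)
indicators = concept_indicators(sample_text)
```

Normalizing by text length keeps large websites from scoring higher simply because they contain more words, which matters when indicators are compared across firms.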
Andrew Kusiak
A newly introduced product or service becomes an innovation after it has been proven in market. No one likes the fact that market failures of products and services are much more common than commercial successes. The ideas introduced in this paper are applicable to the evaluation of the innovativeness of planned introductions of design changes, products, and services. In fact, blends of products and services could be the most promising way of bringing innovations to the market. The most important toll gates of innovation are a generation of new ideas and their evaluation. People have limited ability to generate and evaluate a large number of potential innovation alternatives. The proposed approach provides a number of such alternatives and evaluates them from the market perspective.
The Journal of Technology Transfer
Entrepreneurial scholarship suggests that a small firm’s ability to grow is a function of its capacity to sense and respond to changes in the market as well as the broader environment for the firm’s goods and services. Developing detailed measures of internal capabilities at a large scale, however, is often hampered by limitations in the availability of data from conventional sources, low survey response rates and panel attrition. The emergence of new information sources, including big data sets derived from the online activities of firms, coupled with advanced computational approaches, raises fresh analytical possibilities. In this exploratory study, we turn to freely accessible website data to gauge internal capabilities, specifically for market sensing and responding. To operationalize the construct of seizing, the paper uses an application of topic modeling, a text mining approach commonly used in computer science, on archived website data from the Wayback Machine for two time p...
Rosa Rio-Belver
EUROPEAN RESEARCH STUDIES JOURNAL
Izabela Dembińska
RELATED PAPERS
4th International Conference on Advanced Research Methods and Analytics (CARMA 2022)
Davide Pulizzotto
IJESRT Journal
Data Mining IX
Nelson Francisco Favilla Ebecken
Joanna Wiśniewska, Agnieszka Barczak
Economics of Innovation and New Technology
Open Research Europe
Levan Bzhalava
… opportunities: the development of an online …
Keith Bevis
Cybermetrics
Loet Leydesdorff
Travis Horsley
Umi Hartanti
Banu Demirel, Burcu İlter, Güzin Özdağoğlu
Sylvan Katz
Marina Rybalka
WSEAS TRANSACTIONS ON BUSINESS AND ECONOMICS
Bożena Kaczmarska
California Management Review
Deborah Raccagni, Gianmario Verona, Emanuela Prandelli
Internet Histories
Maria Priestley
International Journal of Production Economics
SSRN Electronic Journal
Stanislav Zaichenko
Bianica Pires
BISMA (Bisnis dan Manajemen)
Luthfina Ariyani
Marcelo Persegona, Alan Porter, Roberto de Camargo
ICERI2019 Proceedings
Isabel Seruca
On Nov 28, 2019, Mrs Sunita and others published Research on Web Data Mining.
Barsagade [2] provides a survey paper on web mining usage and pattern discovery. Chau et al. [4] discuss personalized multilingual web content mining. Kolari and Joshi [24] provide an overview of past and current work in the three main areas of web mining research (content, structure, and usage) as well as emerging work in semantic web mining.
Address: 20, Myasnitskaya Street, Moscow 101000, Russia. Abstract. This work analyzes the intellectual structure of data mining as a scientific discipline. To do this, we use topic analysis ...
Abstract and Figures. Web Data Mining is an important area of Data Mining which deals with the extraction of interesting knowledge from the World Wide Web. It can be classified into three ...
It is suitable for students, researchers and practitioners interested in Web mining and data mining both as a learning text and as a reference book. Professors can readily use it for classes on data mining, Web mining, and text mining. Additional teaching materials such as lecture slides, datasets, and implemented algorithms are available online.
The Web mining research is a converging research area from several research communities, such as database, IR, and AI research communities especially from machine learning and NLP. This paper is an attempt to put the research done in a more structured way from the machine learning point of view.
Web Mining Research: A Survey Raymond Kosala Department of Computer Science Katholieke Universiteit Leuven Celestijnenlaan 200A, B-3001 Heverlee, Belgium [email protected] Hendrik Blockeel Department of Computer Science Katholieke Universiteit Leuven Celestijnenlaan 200A, B-3001 Heverlee, Belgium [email protected] ABSTRACT
The 20 most prolific institutions involved in this research have each published 21 or more research articles. The mean is 1.09 research articles per institution. Of the 3034 institutions, the top 20 published 556 (16.75%) research papers and the remaining institutions published 2764 (83.25%).
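The figures in this snippet can be checked with a few lines of arithmetic. This is only a consistency check, assuming that the 556 and 2764 papers together make up the full publication set and that the mean is the total divided by the number of institutions; the variable names are illustrative.

```python
top20_papers = 556       # papers from the 20 most prolific institutions
other_papers = 2764      # papers from all remaining institutions
institutions = 3034      # institutions in the sample

total_papers = top20_papers + other_papers

# Shares of the total, rounded to two decimals as in the text.
share_top20 = round(100 * top20_papers / total_papers, 2)
share_other = round(100 * other_papers / total_papers, 2)

# Mean number of papers per institution (assumed: total / institutions).
mean_per_institution = round(total_papers / institutions, 2)

print(total_papers, share_top20, share_other, mean_per_institution)
```

Under these assumptions the total comes to 3320 papers, and the percentages and the mean match the values quoted above.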
[12]. Jiawei Han, Kevin Chen-Chuan Chang, "Data Mining for Web Intelligence," IEEE International Conference on Data Mining, 2002. [13]. Qingyu Zhang and Richard S. Segall, "Web mining: a survey of current research, techniques, and software," International Journal of Information Technology & Decision Making, Vol. 7, No. 4 (2008). [14]. J.
This paper provides a comprehensive survey of the current situation and recent trends in web content mining (WCM) and its applications, thereby contributing to upcoming research in WCM. In recent years, the emergence of the WWW (World Wide Web) has led to the accumulation of a huge amount of information and data. Hence the web is found to consist of unstructured and structured ...
This paper defines Web mining and presents an overview of the various research issues, techniques, and development efforts, and briefly describes WEBMINER, a system for Web usage mining, and concludes the paper by listing research issues. Expand. 1,507. PDF.
Through this systematic review, we review the research on web usage mining (WUM) techniques from 2014 to 2019 in order to understand the current state of WUM research and answer our research questions: (RQ1) what data sources are used in web usage mining, (RQ2) what data analysis methods are used to extract the knowledge, (RQ3) what are the applications of Web usage mining, and (RQ4) what future ...
Volume-2, Issue-8, August-2019 www.ijresm.com | ISSN (Online): 2581-5792 222 Abstract: Web mining is a newly emerging research area concerned with analyzing the World Wide Web. It is concerned mainly with its content, structure and usage. ... This paper presents an overview on web mining techniques. References [1] Manoj Manuja and Deepak Garg ...
Arkansas 72467-0130, USA. [email protected]. The purpose of this paper is to provide a more current evaluation and update of web mining research and techniques available. Current advances in ...
21.1.1 Web Content Mining Web content mining is the process of extracting useful information from the contents of web documents. Content data is the collection of facts a web page is designed to contain. It may consist of text, images, audio, video, or structured records such as lists and tables. Application of text mining to web content has ...
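As a rough illustration of pulling content data out of a web document, the sketch below separates free text from list records using Python's standard `html.parser` module. The class name `ContentExtractor` and the sample page are hypothetical, and real pages would need more handling (tables, nested lists, malformed markup).

```python
from html.parser import HTMLParser

class ContentExtractor(HTMLParser):
    """Collect visible text and list items, skipping script/style content."""

    def __init__(self):
        super().__init__()
        self.text_parts = []   # free text outside list items
        self.list_items = []   # one string per <li> element
        self._in_li = False
        self._skip = 0         # depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "li":
            self._in_li = True
            self.list_items.append("")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
        elif tag == "li":
            self._in_li = False

    def handle_data(self, data):
        chunk = data.strip()
        if self._skip or not chunk:
            return
        if self._in_li:
            self.list_items[-1] = (self.list_items[-1] + " " + chunk).strip()
        else:
            self.text_parts.append(chunk)

# Hypothetical page used only for this example.
page = (
    "<html><head><style>p{}</style></head><body>"
    "<p>Web mining overview</p>"
    "<ul><li>content</li><li>structure</li><li>usage</li></ul>"
    "</body></html>"
)
extractor = ContentExtractor()
extractor.feed(page)
extractor.close()
print(extractor.text_parts, extractor.list_items)
```

The extracted text could then be passed to ordinary text mining steps, while the list items stay available as simple structured records.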
Web structure mining is the process of discovering structure information from the web. The structure of a typical web graph consists of Web pages as nodes and hyperlinks as edges connecting two related pages.
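The web graph described here can be sketched as a dictionary that maps each page to the pages it links to. The example below counts incoming links and then runs a basic PageRank power iteration; PageRank is not named in this passage and is used only as one well-known link analysis technique. The page names, the damping factor of 0.85, and the assumption that every link target also appears as a key are illustrative choices.

```python
# Hypothetical three-page web graph: page -> pages it links to.
links = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
}

def in_degrees(graph):
    """Count incoming hyperlinks for every page."""
    degree = {page: 0 for page in graph}
    for targets in graph.values():
        for target in targets:
            degree[target] = degree.get(target, 0) + 1
    return degree

def pagerank(graph, damping=0.85, iterations=50):
    """Basic power iteration; assumes every link target is a key of graph."""
    pages = list(graph)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1 - damping) / n for page in pages}
        for page, targets in graph.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank

degrees = in_degrees(links)
ranks = pagerank(links)
top_page = max(ranks, key=ranks.get)
print(degrees, top_page)
```

In this toy graph `c.html` receives links from both other pages, so it has the highest in-degree and also the highest rank.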
Wang Bo, Xu Jing. "Research on Web Data Mining Hadoop Simulation Platform Based on Cloud Computing", Electronic Design Engineering. 2018;26(2):22-25. Chen L, Lian W, Chue W. Using web structure and summarization techniques for web content mining, Inform. Process. Management: Int. J. 2005;41(5):1225-1242.
Web usage mining. Furthermore, we survey some of the emerging tools and techniques, and identify several future research directions. 2 A Taxonomy of Web Mining In this section we present a taxonomy of Web mining, i.e. Web content mining and Web usage mining. We also describe and categorize some of the recent
With the huge amount of information available online, the World Wide Web is a fertile area for data mining research. The Web mining research is at...
Web mining is an emerging field of data mining used to provide personalization on the web. It consists of three major categories, i.e., Web Content Mining, Web Usage Mining, and Web Structure Mining. This paper focuses on web usage mining and the algorithms used for providing personalization on the web.
In order to do that, the IS group helps organizations to: (i) understand the business needs and value propositions and accordingly design the required business and information system architecture; (ii) design, implement, and improve the operational processes and supporting (information) systems that address the business need, and (iii) use advanced data analytics methods and techniques to ...
The present study examined 3320 global Web Mining research publications, as indexed in the Web of Science database during 2009-18, with a view to understanding their growth rate, prolific authors ...
Academia.edu is a platform for academics to share research papers ... websites, forums and social media (Sobkowicz et al. 2012; Sobkowicz and Sobkowicz 2012). There are also attempts to use web mining in health research: for instance content mining of website discussion forums to detect concern levels for HIV/AIDS (Sung et al. 2013) and mining ...
In this system, the proposed work describes web mining as an application of data mining that uses various techniques to discover data. Web mining can use websites or documents as a resource ...