Classification, the task of assigning objects to one of a given set of categories, is used in almost every field. One important branch of classification consists of methods that learn classification functions from example data. This chapter provides an overview of the most basic concepts and methods of this type of data-driven classification. We first highlight the basic ideas behind classification, along with some examples related to tourism. We then introduce measures of classification performance, which are needed to guide the data-driven training of classification functions and to evaluate classification results. As an essential part of the chapter, we provide self-contained, yet stripped-down, descriptions of the most important data-driven classification methods: nearest neighbor classifiers, logistic regression, Naïve Bayes, decision trees and ensemble variants thereof, support vector machines, and finally, artificial neural networks. All of the concepts and methods are then applied to a specific use case in an accompanying Jupyter notebook, which demonstrates their practical implementation using Python and the machine learning framework scikit-learn.
Title: Applied Data Science in Tourism: Interdisciplinary Approaches, Methodologies, and Applications
Editors: Roman Egger
Publisher: Springer
ISBN (Print): 978-3-030-88389-8
Publication status: Published - Jan 2022


Name: Tourism on the Verge
Volume: Part F1051
ISSN (Print): 2366-2611
ISSN (Electronic): 2366-262X


  • Classification
  • Machine learning
  • Logistic regression
  • Naïve Bayes
  • Decision tree
  • Random forest
  • Gradient tree boosting
  • Support vector machine
  • Neural network
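
The abstract mentions applying the listed methods with Python and scikit-learn. As a rough illustration of what such a comparison might look like, the following sketch trains one classifier per method family and reports test accuracy. It is not taken from the chapter's accompanying notebook: the synthetic dataset from `make_classification`, the hyperparameters, and the choice of accuracy as the performance measure are all assumptions made for this example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: a synthetic two-class dataset stands in for real data.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

# Hold out a stratified test set so performance is measured on unseen examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# One model per method family; hyperparameters are illustrative, not tuned.
# Scale-sensitive methods are wrapped in a pipeline with feature standardization.
classifiers = {
    "Nearest neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Logistic regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Naive Bayes": GaussianNB(),
    "Decision tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient tree boosting": GradientBoostingClassifier(random_state=0),
    "Support vector machine": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Neural network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    ),
}

# Fit each classifier on the training split and score it on the test split.
scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))

# Print the classifiers from highest to lowest test accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<25s} accuracy = {acc:.3f}")
```

Accuracy is only one possible performance measure. For imbalanced classes, measures such as precision, recall, or F1 (available in `sklearn.metrics`) may be more informative.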