Abstract

Regression is the task of calculating a numerical value based on an object's set of characteristics. One important sub-branch consists of methods that learn regression functions from example data. This chapter provides an overview of the most basic concepts and methods of this type of data-driven regression. For the basic principles of supervised machine learning, we refer to the previous chapter; here we introduce the measures of regression performance that are needed to direct the data-driven training of regression functions and to evaluate regression results. Regarding regression methods, we concentrate on linear regression, regression trees and their ensemble variants, support vector regression, and artificial neural networks. These concepts and methods are then applied to a tourism-related case that demonstrates their practical implementation using Python and the machine learning framework scikit-learn. The use case exemplifies how to model the prediction of the total sum of bookings for a hotel based on web tracking data. Predictions with linear regression, support vector regression, decision trees, random forests, and neural networks are calculated and evaluated.
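The workflow summarized in the abstract can be pictured with a minimal sketch. It is not the chapter's code: the data are synthetic stand-ins (three invented features and an invented target, not the hotel's web tracking data), and the hyperparameters and the choice of mean absolute error and R² as performance measures are assumptions made for illustration. It fits the five model families named above with scikit-learn and scores them on a held-out split.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): three made-up features and a
# made-up numerical target. This is NOT the chapter's hotel data set.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 3))
y = 50 * X[:, 0] + 20 * X[:, 1] ** 1.5 + rng.normal(0, 25, size=500)

# Hold out a test set so performance is measured on unseen examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# One representative per model family; hyperparameters are illustrative.
# SVR and the neural network are scale-sensitive, hence the scaler.
models = {
    "linear regression": LinearRegression(),
    "support vector regression": make_pipeline(StandardScaler(), SVR(C=100.0)),
    "decision tree": DecisionTreeRegressor(max_depth=5, random_state=0),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "neural network": make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
    ),
}

# Fit each model and record two common regression performance measures.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "MAE": mean_absolute_error(y_test, pred),
        "R2": r2_score(y_test, pred),
    }
    print(f"{name}: MAE={results[name]['MAE']:.1f}, R2={results[name]['R2']:.3f}")
```

Because the data are invented, the resulting scores say nothing about how these methods rank on real booking data; the sketch only shows the shape of the train, predict, and evaluate loop.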
Original language: English
Title of host publication: Applied Data Science in Tourism: Interdisciplinary Approaches, Methodologies, and Applications
Editors: Roman Egger
Place of Publication: Cham
Publisher: Springer
Pages: 209-229
Number of pages: 21
ISBN (Print): 978-3-030-88389-8
Publication status: Published - Jan 2022

Publication series

Name: Tourism on the Verge
Volume: Part F1051
ISSN (Print): 2366-2611
ISSN (Electronic): 2366-262X

Keywords

  • Gradient tree boosting
  • Linear regression
  • Machine learning
  • Neural network
  • Random forest
  • Regression
  • Regression tree
  • Support vector machine
