Turun yliopisto

Capstone

Objective

The goal of this project was to explore whether large volumes of real-world transport data could be leveraged to improve a transport company’s price quotation process. In a highly competitive industry operating on thin margins, even a minor technological edge can offer significant business value.

A multidisciplinary student team from software development and computer science backgrounds set out to develop a software tool that applies machine learning (ML) techniques to generate transport price quotations based on historical data. The challenge required data analysis, machine learning expertise, technical implementation skills, and close collaboration with the client.

Approach

The core idea was to build a predictive model capable of estimating an appropriate price for new transport orders. The model would consider various parameters such as distance, product type, addresses, and historical pricing. The working assumption was that including more relevant variables would make the predictions more accurate and useful.
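As a rough illustration of how such parameters might be passed to a model, the sketch below turns a single order into a numeric feature vector. The field names (distance_km, product_type), the product categories, and the encode_order helper are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass

# Hypothetical product categories; the client's real categories are not
# given in the source.
PRODUCT_TYPES = ["general", "refrigerated", "bulk"]


@dataclass
class Order:
    distance_km: float
    product_type: str


def encode_order(order: Order) -> list:
    """Return [distance, one-hot product type...] for one order."""
    one_hot = [1.0 if order.product_type == p else 0.0 for p in PRODUCT_TYPES]
    return [order.distance_km] + one_hot


# An unknown product type simply yields an all-zero one-hot part.
features = encode_order(Order(distance_km=165.0, product_type="refrigerated"))
```

Free-text fields such as addresses would need their own preprocessing (for example, geocoding) before they could be encoded this way.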

Results

The team successfully developed a working software tool utilizing a neural network model. The system allows the client to estimate transport prices for shipments within Finland, based primarily on distance.
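The architecture of the team's neural network is not described here, so the sketch below does not reproduce it. It shows a much simpler stand-in, an ordinary least-squares baseline that maps distance to price, only to make the input and output of a distance-based quotation concrete. The data points and the fit_distance_baseline and quote helpers are invented for illustration.

```python
def fit_distance_baseline(distances, prices):
    """Ordinary least-squares fit of price = intercept + slope * distance."""
    n = len(distances)
    mean_d = sum(distances) / n
    mean_p = sum(prices) / n
    cov = sum((d - mean_d) * (p - mean_p) for d, p in zip(distances, prices))
    var = sum((d - mean_d) ** 2 for d in distances)
    slope = cov / var
    intercept = mean_p - slope * mean_d
    return intercept, slope


# Made-up history for illustration: (distance in km, price in EUR).
history = [(50, 120.0), (100, 190.0), (200, 330.0), (400, 610.0)]
intercept, slope = fit_distance_baseline(
    [d for d, _ in history], [p for _, p in history]
)


def quote(distance_km):
    """Estimate a price for a new order from its distance alone."""
    return intercept + slope * distance_km
```

A baseline like this is also a useful yardstick: a more complex model is only worth its cost if it quotes more accurately than the straight line does.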

However, it soon became clear that price prediction alone was not sufficient. To deliver more practical value, the tool was extended to include a data exploration interface. This feature gives the client real-time insights into the data based on the user’s current query, such as the most similar past transport cases, frequently transported product categories, and order volumes. This shift toward interpretability significantly enhanced the usability of the tool.
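A lookup of this kind could be sketched as follows. This is a minimal example under stated assumptions: the records, the field names, and the similarity rule (same product type first, then closest distance) are all hypothetical, since the source does not say how the tool measures similarity.

```python
from collections import Counter

# Made-up past orders; the field names are hypothetical.
past_orders = [
    {"id": 1, "distance_km": 160, "product_type": "refrigerated", "price_eur": 280.0},
    {"id": 2, "distance_km": 420, "product_type": "general", "price_eur": 610.0},
    {"id": 3, "distance_km": 170, "product_type": "general", "price_eur": 265.0},
    {"id": 4, "distance_km": 155, "product_type": "refrigerated", "price_eur": 290.0},
]


def most_similar(query, orders, k=2):
    """Return the k past orders closest to the query.

    Orders with the same product type rank first; ties are broken by the
    absolute difference in distance.
    """
    def score(order):
        type_penalty = 0 if order["product_type"] == query["product_type"] else 1
        return (type_penalty, abs(order["distance_km"] - query["distance_km"]))

    return sorted(orders, key=score)[:k]


query = {"distance_km": 165, "product_type": "refrigerated"}
similar = most_similar(query, past_orders)

# How often each product category appears in the history.
category_counts = Counter(o["product_type"] for o in past_orders)
```

Showing the matched orders next to a predicted price lets the user check the estimate against real past cases.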

Future Development

The solution could be extended to handle international transport pricing, although this would require substantial data preprocessing and standardization. Additional machine learning models, such as Random Forests, show promise for identifying the most influential parameters in price determination. Preliminary tests with such models produced encouraging results.

Nevertheless, data quality remains a concern. Integrating more parameters into the model proved difficult due to inconsistencies in the available data. For meaningful progress in the future, improved data collection practices and standardization within the client’s operations would be essential.