Abstract
The number of vehicles in Kenya grows at a rate of 12% annually, with the national registered fleet standing at 4 million as of 2018. All these vehicles have to be valued regularly for a variety of reasons, including insurance, resale, leasing and accounting. It is therefore important to have an easy-to-use, reliable and readily available system that can determine the value of a vehicle from its properties. The variation in the values that different valuers assign to identical vehicles exposes irregularities in contemporary automobile valuation systems. The lack of consistent, accurate and readily available valuation tools is most glaring when a quick valuation is needed, because the primary way to get an automobile valued is to contact an expert from a licensed valuation firm or an insurance agent. Existing car valuation mechanisms rely chiefly on expert opinion and on formulae that calculate a used car’s compound annual depreciation, which is subtracted from the price at zero mileage, adjusted for inflation over the years. Earlier attempts to automate vehicle valuation using machine learning yielded promising results. Multiple regression analysis has been employed to identify the vehicle properties that have the greatest bearing on value, and to predict the price from the values of those properties. This approach has also been applied successfully to the valuation of other assets, such as land and FMCGs. For this study, a multi-agent systems architecture was employed to encapsulate three regression models for vehicle value prediction, together with a natural language processing model that extracts vehicle features from unstructured text descriptions.
The three models were built and trained to generate predictions, each using one of the following: the SVM-based regression or the artificial neural network (ANN) implementation in WEKA, or the deep learning regression provided by WekaDeeplearning4j version 3.8.5. The best-performing model provided a reliable option for vehicle valuation, with an 11% relative mean error after being trained on only 1,000 rows of data out of a possible 200,000 records, and was therefore used in the design of the functional prototype. Given the time, budget and computing constraints on this study, there is great potential for improving the performance of the prediction models with more time, data and computing power.
List of important Abbreviations
AMS – JADE platform's in-built Agent Management System
DF – JADE platform’s Directory Facilitator
RMS – JADE platform's in-built Remote Management System
ANNs – Artificial Neural Networks
CUDA – NVIDIA's Compute Unified Device Architecture
GPU – Graphics processing unit
JADE – Open-source Java Agent DEvelopment Framework.
JDK – Java Development Kit, Standard edition
MAS – Multi-agent system
MLP – Multi-layer Perceptron
NLU – Natural language understanding
OpenNLP – Apache's open-source OpenNLP project for natural language processing
RSE – Relative Squared Error
SMOreg/SVMreg – SVM-based regression implementation
WEKA – Waikato Environment for Knowledge Analysis
Table of contents
Declaration
Acknowledgement
Abstract
List of important Abbreviations
Table of contents
List of Tables and Diagrams
Definition of Important Terms
Chapter 1: Introduction
1.1 Background of the study
1.2 Statement of the problem
1.3 Study Objectives
1.3.1 General Objective
1.3.2 Specific Objectives
1.4 Goals
1.5 Limitations of the study
1.6 Scope of the study
1.7 Expected Contributions
1.8 Proposal Organisation
Chapter 2. Literature Review
2.1 Theoretical literature review
2.1.1 Introduction
2.1.2 Linear Regression
2.1.3 Natural Language Processing
2.1.4 Agents
2.1.4.1 Multi-Agent Systems
2.1.5 Machine Learning
2.1.6 Feature Extraction from Natural Language
2.1.6.1 Natural language understanding
2.1.6.2 Named-entity recognition
2.1.7 Artificial Neural Networks
2.1.7.1 ANNs and Deep Neural Networks
2.1.8 WEKA, Waikato Environment for Knowledge Analysis
2.1.9 Asset Valuation
2.1.9.1 Machine Learning for Car Valuation
2.1.9 Web Scraping
2.2 Empirical literature review
2.2.1 Raphael Kieti M.
2.2.2 Hammad Hai & Haydn Ramanna Sonnad
2.2.3 Kaneeka Vidanage & Amjadh Ifthikar
2.2.4 Zhang Yuquan & Chang Jiangxue
2.2.5 Sandbhor & Chaphalkar
2.3 Opportunities for improvement
2.4 Conceptual model
2.5 Chapter Summary: Literature Review
Chapter 3: Research Methodology
3.1 Introduction
3.2 Feasibility study
3.2.1 Time feasibility
3.2.2 Technical feasibility
3.2.3 Financial feasibility
3.2.4 Functional feasibility
3.3 Prometheus Design Methodology for Multi-Agent Systems
3.3.1 Iterative Development Process
3.3.2 Phase 1: specification of the system
3.3.3 Phase 2: Agent specification and architectural design
3.3.3.1 Agent descriptor for the User interaction agent
3.2.3.2 Monitor and Restoration agent’s descriptor
3.2.3.3 Descriptor for structured dataset pre-processing agents:
3.2.3.4 Descriptor for the text pre-processing agent:
3.2.3.5 Agent descriptor for Model trainer agents
3.2.3.6 Agent descriptor for Instance Prediction Agents
3.2.4 Phase 3: The detailed design
3.2.4.1 Common Agent-Actions and Percepts
3.2.4.2 Behaviours common to all agents
3.2.4.3 Common functionality
3.2.4.4 System Event Descriptors
event descriptor: Performance evaluation
GUI Action
Instance valuation event descriptor
event descriptor: Raw Text Preprocessing
Model training Action
3.4 Research procedure
3.4.1 Resource aggregation stage
3.4.2 Methods Used to Collect Data
3.4.2.1 Considerations for web scraping
3.4.3 Data Pre-processing
3.5 Unprocessed data repository
3.6 Algorithm training and comparison
Chapter 4: Data Analysis, Prototype Design and Implementation
4.1 Overview
4.2 Current vehicle valuation practice analysis
4.2.1 Contemporary used car valuation methods
4.2.2 Inspection of unprocessed data repository
4.3 Selection of attributes
4.4 Data pre-processing
4.5 Splitting the dataset
4.6 Algorithm testing and selection
4.7 Implementation
4.7.1 Tools
4.7.2 Graphical User Interface Design
4.7.3 Agent Platform
Chapter 5: Results and Discussion
5.1 Results
5.2 Discussion
5.2.1 Quality of data
5.2.2 Nature of data
5.2.3 Dimensionality reduction
5.2.4 CPU vs GPU
5.2.5 Suitability of the Multi-agent Architecture
5.2.6 Ethical Considerations for Collecting Data by Web Scraping
Chapter 6: Conclusion and Recommendations
6.1 Conclusion
6.2 Contributions
6.3 Future Work
References
Appendices
Flow chart of machine learning centred research
Sample SQL queries deployed to cleanse unwanted data from the repository
Approximate project schedule
Requirements
Financial plan
List of Tables and Diagrams
List of Figures
Figure 1. A Deep Neural Network
Figure 2. A high-level diagram of the conceptual design.
Figure 3. Common notation used for Prometheus design elements
Figure 4. A stratified depiction of the 3 phases of Prometheus methodology
Figure 5. General flow of the research process
Figure 6. design of the repository database
Figure 7. screenshot of algorithm parameters on WEKA workbench
Figure 8. phpMyAdmin Visualization of Raw Data Table
Figure 9. WEKA pre-processor visualization window for all attributes
Figure 10. Attribute-Relation File Format supported by WEKA
Figure 11. SMOreg algorithm performance after preliminary testing
Figure 12. Visualization of ANN layers during preliminary testing
Figure 13. Out of memory error caused by training Weka on the full training dataset.
Figure 14. Prototype UI: Model Training Tab
Figure 15. Prototype UI: Prediction Tab
Figure 16. Prototype UI: Text Feature Extraction and Prediction Window
Figure 17. Prototype UI: System Logs Tab
Figure 18. Remote Monitoring Agent URI showing the main-container
Figure 19. Communication between agents shown by a Sniffer Agent
Figure 20. Research process, highlighting machine learning steps
Figure 21. Expected project duration
List of Tables
Table 1. Expected input data and corresponding actions.
Table 2. Sample agent-descriptor: User-interaction agent
Table 3. Sample agent-descriptor: System Runner
Table 4. Sample agent-descriptor: pre-processing agent
Table 5. sample agent-descriptor: Text pre-processing agents
Table 6. sample agent-descriptors: Model training agents
Table 7. sample agent-descriptor: Instance valuation
Table 8. event descriptor: Model evaluation
Table 9. event descriptor: User interface event
Table 10. event descriptor: Vehicle instance valuation
Table 11. event descriptor: NLP event
Table 12. event descriptor: Model training
Table 13. Sample features for Text based model
Table 14. Sample features used in main model
Table 15. The ignored attributes
Table 16. Algorithm performance: 4433 rows, 3 features.
Table 17. Algorithm performance; 500 rows, 1000 training cycles
Table 18. Algorithm performance; all attributes, 10 training cycles
Table 19. Expected costs
Definition of Important Terms
Autonomous: Capable of acting independently and of exercising control over an internal state.
Semantics: In the context of this research, the meaning extracted from text using NLP techniques.
Social agent: An agent with the ability to interact with other agents
Software agent: In the context of this research, a software agent (or an agent) is an intelligent autonomous social agent existing within a software multi-agent environment tasked with a particular function.
SMOreg: Regression algorithm based on the support vector machine algorithm.
Chapter 1
Introduction
1.1 Background of the study
A report published by the United Nations Environment Programme in 2020, the Used Vehicles and Environment report, stated that second-hand vehicles account for nearly 95% of vehicles imported into Kenya annually. As of 2018, there were 3,280,934 registered units, as reported by the Kenya National Bureau of Statistics. This corresponds to an annual growth rate of 12%, as cited in a 2016 report.
These figures suggest that hundreds of thousands of cars are added annually to the millions of older cars already operating within Kenya. Prospective buyers, sellers, insurance firms, property valuers and numerous other stakeholders must estimate the value of these cars with the utmost precision.
According to the Automobile Association of Kenya, reasons for vehicle valuation include:
Pre-insurance valuation: This valuation is conducted to ensure that vehicle owners pay accurate premiums when taking out insurance policies for their vehicles, and that insurers set accurate premium rates and minimise the revenue lost to miscalculated values.
Technical Brief Valuation: This refers to the precise determination of the price of a car or a machine when purchasing, selling or disposing of old equipment, or when an automobile is offered as security for a loan.
Full Mechanical Valuation: This is a preventative evaluation done to identify possible mechanical faults and reduce the chance of a vehicle breakdown.
Pre-Theft/Pre-fire valuation: When a vehicle is stolen, this procedure must be conducted by the insuring body. To compensate the owner correctly, the value of the vehicle before it was stolen has to be calculated.
Accident Assessment: This inspection is carried out after the loss of a vehicle, once a claim has been raised for settlement. Where the vehicle is difficult to fix because of high replacement costs or extensive damage, advice is given on the restoration and salvage values.
Examining of automotive components: This is performed when an unbiased perspective is sought regarding an automotive failure. Failure could encompass accidents or automotive damage occurring during a repair.
According to Bennett (2016), the globally accepted method for valuing a used car consists of determining the vehicle’s value when it was new and deducting the depreciation accumulated over the years the vehicle has been in use.
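The method described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a constant annual depreciation rate that compounds over the years of use and an optional constant inflation rate applied to the price when new. The function name, parameter names and the rates in the example are hypothetical and are not taken from Bennett (2016) or from this study.

```python
def used_car_value(new_price, years_in_use, annual_depreciation, annual_inflation=0.0):
    """Estimate a used car's value from its price when new.

    Rates are fractions per year (0.15 means 15%). They are assumptions
    supplied by the caller, not figures reported in the study.
    """
    if new_price < 0 or years_in_use < 0:
        raise ValueError("new_price and years_in_use must be non-negative")
    if not 0.0 <= annual_depreciation < 1.0:
        raise ValueError("annual_depreciation must be in [0, 1)")

    # Price of the same car when new, restated in today's money.
    adjusted_new_price = new_price * (1.0 + annual_inflation) ** years_in_use

    # Depreciation accumulated by compounding the annual rate over the years in use.
    accumulated_depreciation = adjusted_new_price * (
        1.0 - (1.0 - annual_depreciation) ** years_in_use
    )

    # Value = price when new (inflation-adjusted) minus accumulated depreciation.
    return adjusted_new_price - accumulated_depreciation


# Illustrative figures only: a car bought new for 2,000,000 and used for 5 years,
# with an assumed 15% annual depreciation and no inflation adjustment.
value = used_car_value(2_000_000, 5, 0.15)
print(f"Estimated value: {value:,.2f}")
```

A formula of this kind depends entirely on the chosen rates, which is one reason why different valuers can arrive at different figures for the same vehicle.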
The resale value of a vehicle is also affected by other factors, such as:
Mileage: This refers to the distance covered by a vehicle. The greater the distance covered, the lower the car’s worth, because higher mileage increases depreciation.
Vehicle Condition: This refers to any damage to the exterior or interior of a used vehicle, which naturally has a negative impact on the vehicle’s value.
Make and Model: Makes and models that offer lower fuel consumption over the distance covered are generally popular and fetch higher resale prices. The pricing and availability of spare parts and servicing, and whether a model is still in production, also determine the resale value.
Vehicle Age: For most automobiles, the sale price falls with age: the older the vehicle, the lower the resale price. This is because vehicles have a limited lifespan and their functional usefulness is expected to erode with time.
Ownership chain: The more times an automobile is sold, the further its sale price falls, as a consequence of the increasing maintenance costs borne by later owners.
After-Sale Service: The quality of post-sale services varies from brand to brand, thus affecting the condition of an automobile over time.
Features and Options: The more widely available a car’s features and options are in the market, the likelier its price is to fall; however, it has been noted that additional safety features increase a car’s value.
Colour of the car: Basic colours such as silver, black or white are preferred by potential purchasers. A uniquely coloured car is harder to sell, so its value could be negatively affected.
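To show how factors such as those listed above could be presented to a regression model, the sketch below converts a vehicle record into a numeric feature vector. It is an illustration only: the class, field names and encoding choices are hypothetical and do not describe the attributes or pre-processing actually used in this study.

```python
from dataclasses import dataclass
from typing import List

# Colours named in the text as preferred by buyers.
COMMON_COLOURS = ("silver", "black", "white")


@dataclass
class Vehicle:
    """Hypothetical record holding a few of the factors listed above."""
    make: str
    model: str
    year_of_manufacture: int
    mileage: float
    previous_owners: int
    colour: str


def to_features(vehicle: Vehicle, valuation_year: int) -> List[float]:
    """Encode a vehicle as numbers that a regression model can consume."""
    age = valuation_year - vehicle.year_of_manufacture
    common_colour = 1.0 if vehicle.colour.lower() in COMMON_COLOURS else 0.0
    # Make and model are categorical; a fuller encoding would one-hot encode
    # them. They are left out here to keep the sketch short.
    return [float(age), float(vehicle.mileage), float(vehicle.previous_owners), common_colour]


car = Vehicle("Toyota", "Premio", 2012, 85000.0, 2, "Silver")
print(to_features(car, valuation_year=2020))
```

Each position in the resulting list corresponds to one factor, so a regression model can learn how much weight each factor carries in the final price.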
Despite numerous pre-existing vehicle valuation mechanisms, value determination is still a major challenge in Kenya. Similar car models can fetch exceedingly different prices from valuers. Standardised vehicle valuation tools that would minimise such inconsistencies are yet to be put in place. Stakeholders use different methods and data to inform their valuation processes and therefore obtain different results, making it hard to get a standard market price for a vehicle. Hence the importance of a readily available, accurate and consistent vehicle valuation mechanism that ensures an automobile’s worth can be determined accurately.
Natural language processing, artificial neural network algorithms and multi-agent systems are some of the proposed tools that will be used to estimate the value of a car accurately.
1.2 Statement of the problem
The irregularities in contemporary vehicle valuation processes are evident in the variation of the values that different valuers provide for similar vehicles. When an acceptable value for a vehicle must be determined, the lack of an easily accessible and accurate valuation instrument is clear, since the only way to obtain a vehicle assessment is to contact a valuer, an insurance broker or a valuation firm. Without a method to cross-validate the figures presented, this can result in incorrect valuations and in unwary individuals paying inflated vehicle costs and insurance premiums. Furthermore, the majority of people lack the technical expertise required to construct, understand and apply valuation formulae to arrive at car value estimates (Kieti, 2005).
1.3 Study Objectives
1.3.1 General Objective
To create a mixed-strategy vehicle valuation prototype that is simple to use, multi-agent-based and fully functional, and that uses neural networks for numerical regression and natural language processing for feature extraction to forecast the value of a vehicle from precisely defined parameters or textual descriptions.
1.3.2 Specific Objectives
● To aggregate and analyse data using the existing vehicle valuation models from varied domains, and to identify areas of inconsistency and discrepancy within the current vehicle valuation procedures.
● To design a mixed-strategy prototype that considers the findings of the analysis and the deductions of other researchers on related topics.
● To implement a functional mixed-strategy vehicle valuation prototype.
● To amass data and examine the reliability of the evaluation prototype by establishing its performance on real vehicle data.
1.4 Goals
The study aims to produce an accurate, easy-to-use vehicle valuation tool for public use, as well as for industrial and professional functions such as insurance and car sales.
1.5 Limitations of the study
The accuracy of the models created in this research is subject to inconsistency, because different car dealers quote different prices for identical cars. Car dealers have different profit and cost considerations, so their prices are expected to differ and to compromise predictions of the real value.
Automobile valuation statistics provided by different dealers are based on different valuation techniques leading to different degrees of accuracy.
Supply and demand, monetary inflation and other financial and market factors have an impact on the valuation of a vehicle. When a specific model is in high demand, its value tends to be greater than that of a less preferred model, and these factors will be reflected in the dataset available for this research.
Import taxes on foreign vehicles affect the buying and selling prices, which proves challenging when it comes to accurately identifying the intrinsic value of the vehicle.
1.6 Scope of the study
The scope of this report is confined to the value estimation of cars in Kenya. This study aims to produce an accurate vehicle and machinery valuation model that could be applied in different fields of vehicle valuation, such as inspection of motor vehicles, pre-insurance valuation, accident assessment, pre-fire and pre-theft assessments, full mechanical assessment and technical brief assessment.
1.7 Expected Contributions
1) By the close of this study, an operational automobile valuation framework based on natural language processing, regression algorithms and multi-agent systems is expected to have been developed to determine the value of a vehicle.
The framework is expected to serve as a readily available instrument that provides a close estimate of an automobile’s market value, taking into account various characteristics of the vehicle.
2) Finally, this study aims to direct future research by laying a solid platform for future researchers to build on.
1.8 Proposal Organisation
Chapter 1: Introduction – Describes the definition of the problem, scope, the study's objectives, research questions, and constraints.
Chapter 2: Literature review - This section contains general facts about the relevant work, the suggested solution's design and the problem domain.
Chapter 3: Methodology - This chapter covers the whole study process, including the specific methods to be used, a planned timeline, the budget and the resource requirements.
Chapter 4: Data Analysis, Prototype Design and Implementation - This segment presents the analysis, design, and execution of the proposed solution.
Chapter 5: Results and Discussion – In this segment, the outcomes of this research are presented and analysed.
Chapter 6: Conclusion and Recommendation - This chapter offers the conclusions of the researcher and provides recommendations arising from the preceding work.