DESIGN AND IMPLEMENTATION OF A PREDICTIVE MODEL FOR STUDENT'S ACADEMIC PERFORMANCE USING MACHINE LEARNING

No of Pages: 60

No of Chapters: 1-5

File Format: Microsoft Word

ABSTRACT

The academic performance and retention of students are critical concerns for higher education institutions. Identifying students at risk of failure early enough for effective intervention remains a significant challenge, particularly in institutions such as The Polytechnic Ibadan, which rely on reactive, manual systems. This study addresses the problem by designing and implementing a machine learning-based system that predicts student academic performance proactively. The research applies educational data mining techniques, using classification algorithms such as Logistic Regression on a dataset compiled from academic records. The methodology encompasses data collection, preprocessing, exploratory data analysis, model training, and evaluation. A comprehensive system analysis of the existing processes at The Polytechnic Ibadan was conducted, identifying key limitations such as delayed identification of at-risk students and inefficient use of available data. The proposed system aims to shift student monitoring from a reactive to a proactive approach by providing early warnings, enabling educators and advisors to implement timely support measures. By comparing multiple machine learning models, this research seeks to identify the most effective and generalizable predictor of student performance, thereby contributing to improved educational outcomes, resource allocation, and institutional decision-making.



TABLE OF CONTENTS

i      TITLE PAGE / COVER PAGE
ii     CERTIFICATION
iii    DEDICATION
iv     ACKNOWLEDGEMENT
v      ABSTRACT

CHAPTER ONE    INTRODUCTION
1.1    INTRODUCTION
1.2    STATEMENT OF THE PROBLEM
1.3    JUSTIFICATION OF STUDY
1.4    AIM AND OBJECTIVES
1.5    SCOPE OF STUDY
1.6    METHODOLOGY
1.7    SIGNIFICANCE OF THE STUDY
1.8    DEFINITION OF TERMS

CHAPTER TWO    LITERATURE REVIEW
2.1    BACKGROUND THEORY OF STUDY
2.1.1    MACHINE LEARNING
2.1.1.1    DECISION TREE
2.1.1.2    SUPPORT VECTOR MACHINE (SVM)
2.1.1.3    RANDOM FOREST
2.2    RELATED WORKS
2.3    CURRENT METHODS IN USE
2.4    APPROACH TO BE USED IN THIS STUDY

CHAPTER THREE    SYSTEM INVESTIGATION AND ANALYSIS
3.1    BACKGROUND INFORMATION ON CASE STUDY
3.2    OPERATION OF EXISTING SYSTEM
3.3    ANALYSIS OF FINDINGS
         (a)  OUTPUT FROM THE SYSTEM
         (b)  INPUTS TO THE SYSTEM
         (c)  PROCESSING ACTIVITIES CARRIED OUT BY THE SYSTEM
         (d)  ADMINISTRATION/MANAGEMENT OF THE SYSTEM
         (e)  CONTROLS USED BY THE SYSTEM
         (f)  HOW DATA AND INFORMATION ARE BEING STORED BY THE SYSTEM
         (g)  MISCELLANEOUS
3.4    PROBLEMS IDENTIFIED FROM ANALYSIS
3.5    SUGGESTED SOLUTIONS TO PROBLEMS IDENTIFIED

CHAPTER FOUR    SYSTEM DEVELOPMENT
4.1    SYSTEM DESIGN
4.1.1    OUTPUT DESIGNS
         (a)  REPORTS TO BE GENERATED
         (b)  SCREEN FORMS OF REPORTS
         (c)  FILES USED TO PRODUCE REPORTS
4.1.2    INPUT DESIGN
         (a)  LIST OF INPUT ITEMS REQUIRED
         (b)  DATA CAPTURE SCREEN FORMS FOR INPUT
4.1.3    PROCESS DESIGN
         (a)  LIST ALL PROGRAMMING ACTIVITIES NECESSARY
         (b)  PROGRAM MODULES TO BE DEVELOPED
         (c)  VTOC
4.1.4    STORAGE DESIGN
         (a)  DESCRIPTION OF DATABASE USED
4.1.5    DESIGN SUMMARY
         (a)  SYSTEM FLOWCHART
         (b)  HIPO CHART
4.2    SYSTEM IMPLEMENTATION
4.2.1    PROGRAM DEVELOPMENT ACTIVITIES
         (a)  PROGRAMMING LANGUAGE USED
         (b)  ENVIRONMENT USED FOR DEVELOPMENT
         (c)  SOURCE CODE
4.2.2    PROGRAM TESTING
         (a)  CODING PROBLEMS ENCOUNTERED
         (b)  USE OF SAMPLE DATA
4.2.3    SYSTEM DEPLOYMENT
         (a)  SYSTEM REQUIREMENTS
         (b)  TASKS PRIOR TO DEPLOYMENT
                (i)   HARDWARE/SOFTWARE ACQUISITION
                (ii)  PROGRAM INSTALLATION
         (c)  USER TRAINING
4.3    SYSTEM DOCUMENTATION
4.3.1    FUNCTION OF PROGRAM MODULES
4.3.2    USER MANUAL

CHAPTER FIVE    SUMMARY, CONCLUSION AND RECOMMENDATION
5.1    SUMMARY
5.2    CONCLUSION
5.3    RECOMMENDATION

REFERENCES

APPENDICES
(a)  PROGRAM FLOWCHART
(b)  PROGRAM LISTING
(c)  TEST DATA
(d)  SAMPLE OUTPUT


CHAPTER ONE

INTRODUCTION

1.1       Introduction

The use of machine learning (ML) algorithms in academia has gained significant attention in recent years due to the increasing availability of educational data and advances in ML techniques (Yagcı, 2022). Using ML algorithms to predict students’ academic performance can give educators valuable insights, allowing them to identify at-risk students who may need additional support, modify instructional techniques, improve learning outcomes, tailor teaching approaches to individual students’ needs, and increase student retention rates (Adnan et al., 2021). This supports the growth of the educational system at higher institutions, because educators and policymakers can intervene early to prevent students from falling behind and increase their chances of success. Applying ML algorithms to predict student academic achievement can substantially improve educational results and reveal the factors that contribute to academic success (Alyahyan and Düştegör, 2020). It is therefore critical to assess the possible benefits and limitations of these algorithms carefully and to ensure they are used appropriately. The investigation criteria for this study included the type of ML algorithm employed, the variables analyzed, and the assessment metrics used to determine prediction accuracy. Applying ML algorithms in education can transform how teaching and learning are approached, using qualitative analysis, quantitative analysis, or a mix of the two to give an overall assessment of results (Zhai, 2021). The potential benefits of using ML algorithms to predict academic performance extend beyond individual students and can positively impact society (Waheed et al., 2020). With better educational outcomes, individuals are better equipped to contribute to the workforce and society, leading to economic growth and social development.

The vast majority of work in educational data analysis has been devoted to developing machine learning models capable of accurately predicting students’ performance in specific contexts. However, the existing literature often overlooks the evaluation of models for their ability to generalize beyond their original training settings to diverse student populations and learning environments. This oversight raises concerns about the biases introduced by relying solely on ‘best-performing’ models and neglecting the search for models that generalize better. There is therefore a pressing need to address this research gap by identifying and investigating the machine learning model best suited to serve as a predictive tool for assessing students’ performance. This pursuit emphasizes ensuring that the identified model makes accurate predictions and avoids bias stemming from feature selection, so that it remains applicable and effective across various educational contexts. This study contributes to educational data analysis practice by addressing these challenges and encouraging a shift towards holistic and unbiased model evaluation and selection.

With the increased availability of data from numerous sources, including learning management systems, online platforms, and student records, ML algorithms can give significant insights into student behavior, performance, and learning patterns (Yu et al., 2020). A thorough examination of the literature on ML algorithms for forecasting students’ academic achievement can provide a comprehensive understanding of the ML approaches employed, the parameters examined, and the prediction accuracy achieved (Rastrollo et al., 2020). Institutions can benefit from accurately anticipating student performance by concentrating on students who are more likely to perform poorly and helping them improve (Batool et al., 2023). ML algorithms used to predict students’ academic achievement can give significant insights to academics, instructors, and educational policymakers (Waheed et al., 2020; Alyahyan and Düştegör, 2020). By analyzing academic and non-academic criteria such as previous grades, attendance records, socioeconomic background, and student behavior, ML algorithms can predict students’ academic achievement effectively (Batool et al., 2023).

There has been growing interest in using ML algorithms to predict students’ academic performance, and several studies have explored ML in this area with promising results. Several schools of thought have emerged. One focuses on traditional statistical methods, such as linear regression and logistic regression. These methods assume a linear relationship between the predictor variables and the outcome (in logistic regression, the log-odds of the outcome). For example, Yaacob et al. (2019) and Waheed et al. (2020) used logistic regression, while El Aissaoui et al. (2020) used linear regression to predict students’ academic performance. Another school of thought revolves around decision trees and random forests. Decision trees are hierarchical models that predict the outcome variable through a series of binary decisions. Random forests, on the other hand, are an ensemble learning approach that combines numerous decision trees. Vijayalakshmi and Venkatachalapathy (2019), Altabrawee et al. (2019), and Zhang et al. (2022), for example, employed decision trees and random forests to predict students’ academic performance. A third school of thought uses neural networks, computational models inspired by the structure and function of the human brain, to predict students’ academic performance (Baashar et al., 2022; Liu et al., 2022). Neural networks are beneficial when dealing with complex, non-linear relationships between variables. Hybrid approaches combine multiple ML methods to improve prediction accuracy. For instance, Francis and Babu (2019) used a hybrid approach that combined logistic regression, decision trees, and neural networks to predict academic performance based on students’ demographic information, prior academic performance, and study habits. Each approach has its strengths and weaknesses, and the choice of method depends on the research question and the nature of the data.
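
As a brief illustration of the linearity assumption mentioned above, the standard logistic regression model can be written as follows. The notation is an assumption introduced here for illustration and does not appear in the study: y = 1 denotes a pass, x_1, ..., x_p are the predictor variables, and the beta terms are the fitted coefficients.

```latex
\[
\log\frac{P(y = 1 \mid \mathbf{x})}{1 - P(y = 1 \mid \mathbf{x})}
  = \beta_0 + \sum_{j=1}^{p} \beta_j x_j
\quad\Longleftrightarrow\quad
P(y = 1 \mid \mathbf{x})
  = \frac{1}{1 + \exp\!\left(-\beta_0 - \sum_{j=1}^{p} \beta_j x_j\right)}
\]
```

The left-hand form shows that the model is linear in the log-odds of passing rather than in the probability itself.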

Numerous factors can impact a student’s academic performance, including individual factors, such as motivation and self-regulation, and environmental factors, such as socioeconomic status and school resources (de la Fuente et al., 2021). Motivation is one of the most significant factors impacting students’ academic performance. Deci et al. (2020) have shown that students who are intrinsically motivated to learn, meaning that they are motivated by their interest in and enjoyment of the material, are more likely to perform well academically. Extrinsically motivated students, who are motivated by external rewards such as grades or praise, may be less likely to perform well if these rewards are not provided. Another individual factor that can impact academic performance is self-regulation, which refers to the ability to manage one’s learning and behaviors. Students who can effectively regulate their learning by setting goals, monitoring their progress, and seeking help when needed are more likely to perform well academically (Feeney et al., 2023). Environmental factors can also have a significant impact on academic performance (Asvio et al., 2022). For example, students from lower socioeconomic backgrounds may have less access to resources such as high-quality schools, educational materials, and extracurricular activities, which can negatively impact their academic performance. Additionally, students who attend schools with fewer resources, such as low-income schools, may be less likely to have access to experienced teachers or advanced coursework. Family support and involvement in education can also affect student performance. Research has shown that students whose families are involved in their education, for example by providing support and encouragement, attending parent-teacher conferences, and monitoring homework, are more likely to perform well academically (Garbacz et al., 2021).

In a related study on improving student concentration and collaboration in response to the COVID-19 pandemic, Nyarko et al. (2023) used a Discrete Choice Experiment to investigate university instructors’ preferences for current teaching strategies.

1.2       Statement of the Problem

Student success is of utmost importance for educational institutions and society. Dropouts, failures, and declining educational standards in higher institutions are on the rise. Identifying students who are at risk of dropping out and intervening in time can greatly improve graduation rates and academic success.

1.3       Justification of Study

The justification for this study is multi-faceted, stemming from critical gaps in current educational practices and the transformative potential of machine learning:

  1. Addressing Systemic Inefficiencies: The current system at The Polytechnic Ibadan for monitoring student performance is reactive and manual. It identifies problems only after examinations, when opportunities for meaningful intervention are limited. This study is justified by the urgent need to computerize and automate this process, making it proactive and data-driven. This shift will allow the institution to support students before they fail, rather than afterward.
  2. Improving Student Outcomes and Retention: A primary goal of any educational institution is to ensure student success. By accurately predicting at-risk students, the proposed system enables early and targeted interventions, such as additional tutoring, counseling, or academic advising. This proactive approach is expected to directly contribute to reducing dropout rates, improving graduation rates, and enhancing overall academic standards.
  3. Optimizing Resource Allocation: Academic advisors and lecturers have limited time and resources. This system will help them focus their efforts on the students who need help the most, thereby increasing the efficiency and effectiveness of student support services.
  4. Contributing to Educational Data Science: While many studies have developed predictive models, a significant research gap exists concerning the generalizability of these models across different contexts. This study aims not only to build an accurate model but also to investigate which algorithm (e.g., Random Forest vs. Logistic Regression) offers the most robust and reliable predictions, contributing valuable insights to the field of educational data mining.
  5. Enhancing Institutional Decision-Making: The predictive analytics generated by the system can provide departmental and institutional leaders with insights into course difficulty, curriculum effectiveness, and trends in student performance. This data-driven intelligence can inform strategic planning, curriculum review, and quality assurance processes, ultimately strengthening the institution's academic offerings.

In summary, this study is justified by its potential to transform the educational experience at The Polytechnic Ibadan by leveraging technology to foster student success, optimize institutional resources, and advance the application of machine learning in education.

 

1.4       Aims and Objectives of the Study

Aim

The aim of this project is to predict students’ academic performance using random forest and logistic regression algorithms.

Objectives

i.    To collect and compile relevant academic datasets, including students’ grades, demographic information, and behavioral records, from kaggle.com, in order to help identify the various factors that affect students’ academic performance.

ii.    To preprocess the collected data into a format suitable for machine learning by handling missing values, normalizing variables, and encoding categorical features.

iii.    To explore and analyze the data to identify key academic and non-academic factors that influence student performance.

iv.    To evaluate multiple machine learning models (e.g., logistic regression and random forest) to determine the one most suitable for predicting students’ academic outcomes.
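
The preprocessing named in objective (ii) could be sketched roughly as follows, assuming scikit-learn and pandas are available. The toy records and the column names (test_score, attendance_pct, gender) are hypothetical and are not taken from the study's dataset; the choice of median and most-frequent imputation is likewise an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy records; column names are illustrative only.
records = pd.DataFrame(
    {
        "test_score": [55.0, 72.0, np.nan, 90.0],
        "attendance_pct": [80.0, np.nan, 60.0, 95.0],
        "gender": ["F", "M", "M", np.nan],
    }
)

numeric_cols = ["test_score", "attendance_pct"]
categorical_cols = ["gender"]

# Numeric columns: fill gaps with the median, then standardize.
numeric_steps = Pipeline(
    [
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]
)

# Categorical columns: fill gaps with the most frequent value, then one-hot encode.
categorical_steps = Pipeline(
    [
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ]
)

preprocessor = ColumnTransformer(
    [
        ("num", numeric_steps, numeric_cols),
        ("cat", categorical_steps, categorical_cols),
    ]
)

features = preprocessor.fit_transform(records)
# The transformer may return a sparse matrix; convert to a dense array either way.
features = features.toarray() if hasattr(features, "toarray") else np.asarray(features)
print(features.shape)
```

Wrapping the steps in a single transformer means the same imputation and scaling parameters learned on training data can later be applied unchanged to new student records.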

 

1.5       Scope of the study

The scope of this study is the design and implementation of an ML model that predicts student academic performance from historical academic data, demographics, and behavioral factors. The data used for this research was obtained from Kaggle, an online platform that hosts datasets for users to explore, analyze, and use in data science and machine learning applications. The model will assist educators in identifying struggling students and improving retention rates.

 

1.6       Methodology

1.6.1    Objective 1: To predict students’ final GPA classified as pass or fail, given the grades of all the mandatory courses

·       Collect academic records, including students’ grades in all mandatory courses and their final GPA.

·       Preprocess the data: normalize grade values, remove missing values, encode the GPA classification into binary labels (e.g., Pass = 1, Fail = 0).

·       Use classification algorithms (Logistic Regression, Decision Tree, and Random Forest) to model the data.

·       Train the models on the preprocessed dataset.

·       Evaluate using classification metrics like Accuracy, Precision, Recall, and F1-Score.
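
The steps above can be sketched as follows with scikit-learn. This is a minimal illustration under stated assumptions, not the study's implementation: the grades are synthetic, the number of courses (5), the pass mark (a mean grade of 60), and all hyperparameter values are assumptions chosen only to make the example run.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for real records: 300 students, 5 mandatory courses (0-100).
rng = np.random.default_rng(42)
grades = rng.uniform(20, 100, size=(300, 5))
# Assumed labelling rule: Pass = 1 if the mean grade is at least 60, else Fail = 0.
passed = (grades.mean(axis=1) >= 60).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    grades, passed, test_size=0.25, stratify=passed, random_state=42
)

models = {
    # Grades are normalized before logistic regression; trees do not need scaling.
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predicted = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, predicted),
        "precision": precision_score(y_test, predicted, zero_division=0),
        "recall": recall_score(y_test, predicted, zero_division=0),
        "f1": f1_score(y_test, predicted, zero_division=0),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Evaluating all three models on the same held-out split keeps the comparison fair; on real data, recall on the Fail class may matter most, since a missed at-risk student is the costlier error.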

 

1.6.2    Objective 2: To identify the various factors that affect students’ academic performance

·       Gather additional academic and non-academic data such as demographic data, attendance, family background, motivation levels, study habits, etc.

·       Perform exploratory data analysis (EDA) to identify patterns and correlations.

·       Use feature importance techniques (e.g., Gini importance for Decision Tree/Random Forest, coefficients in Logistic Regression) to rank the impact of different variables.

·       Optionally apply dimensionality reduction techniques (e.g., PCA) to visualize key influencing factors.
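
A rough sketch of the feature-ranking step, assuming scikit-learn, is shown below. The three features and the rule that generates the outcome are invented for illustration (family_size is deliberately given no effect), so the ranking says nothing about the real dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500
feature_names = ["attendance_pct", "study_hours", "family_size"]  # hypothetical names

attendance = rng.uniform(40, 100, n)
study_hours = rng.uniform(0, 20, n)
family_size = rng.integers(2, 9, n).astype(float)
X = np.column_stack([attendance, study_hours, family_size])

# Assumed ground truth for the toy data: only attendance and study hours matter.
score = 0.05 * (attendance - 70) + 0.3 * (study_hours - 10) + rng.normal(0, 0.5, n)
y = (score > 0).astype(int)

# Gini (impurity-based) importance from a random forest; values sum to 1.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
gini_importance = dict(zip(feature_names, forest.feature_importances_))

# Logistic regression coefficients are comparable only after standardizing the inputs.
X_scaled = StandardScaler().fit_transform(X)
logit = LogisticRegression(max_iter=1000).fit(X_scaled, y)
coefficients = dict(zip(feature_names, logit.coef_[0]))

ranked = sorted(gini_importance, key=gini_importance.get, reverse=True)
print("Ranking by Gini importance:", ranked)
print("Standardized coefficients:", coefficients)
```

Comparing the two rankings is a useful sanity check: a feature that scores highly under one method but not the other deserves a closer look before it is reported as influential.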

 

1.6.3    Objective 3: To implement a model to predict academic performance using Decision Tree classifiers

·       Use preprocessed training data (features such as grades, attendance, demographics).

·       Implement a Decision Tree classifier using scikit-learn or a similar ML library.

·       Tune hyperparameters like maximum depth, min samples split, etc., to avoid overfitting.

·       Train the model and validate its performance using techniques such as k-fold cross-validation.

·       Use a confusion matrix and tree visualization to interpret the results and support explainability.
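
The tuning and validation steps above might look like the following sketch with scikit-learn. The data is synthetic, and the two features, the labelling rule, the 5% label noise, and the hyperparameter grid are all assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(7)
feature_names = ["mean_grade", "attendance_pct"]  # hypothetical feature names
X = np.column_stack([rng.uniform(20, 100, 400), rng.uniform(40, 100, 400)])

# Assumed rule: pass if mean grade >= 50 and attendance >= 60, with 5% of labels flipped.
y = ((X[:, 0] >= 50) & (X[:, 1] >= 60)).astype(int)
flip = rng.random(400) < 0.05
y = np.where(flip, 1 - y, y)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=7
)

# 5-fold cross-validated grid search over depth and split size to limit overfitting.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=7),
    param_grid={"max_depth": [2, 3, 5, None], "min_samples_split": [2, 10, 30]},
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=7),
    scoring="f1",
)
search.fit(X_train, y_train)

best_tree = search.best_estimator_
matrix = confusion_matrix(y_test, best_tree.predict(X_test))

print("Best hyperparameters:", search.best_params_)
print("Confusion matrix (rows = actual, columns = predicted):")
print(matrix)
# Text rendering of the learned decision rules, as a simple form of tree visualization.
print(export_text(best_tree, feature_names=feature_names))
```

The printed rules can be read directly by advisors, which is one reason a shallow tree may be preferred even when a deeper one scores slightly higher.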

 

1.7       Significance of the Study

The system offers enormous benefits to the following users:

1.     Lecturers/Academic Advisors: The prediction model will help lecturers and tutors identify weak and strong students, so that they can place more emphasis on instructions and procedures when dealing with weak students and thereby enhance overall academic performance. An academic advisor can refer to the prediction results when advising students who perform poorly, so that preventive measures can be taken much earlier. In addition, lecturers can improve their teaching approach and plan interventions and support services for weak students.

2.     Institutions: Academic performance is an important factor people consider before applying to any tertiary institution. An institution known for producing low-performing students is at risk of low intake. A performance prediction system helps identify weak students early so that they can focus on their weak areas.

3.     Parents/Guardians/Partners: Research has shown that parents, guardians, and partners affect students’ academic performance. The study helps to analyze the influence of family background on students’ predicted performance. Attributes such as family size, encouragement and motivation from parents, spouses, or siblings, and the highest qualification of the sponsor will help determine the factors that affect performance.

 

1.8       Definition of Terms

  • Academic Performance: A measure of a student’s achievement in educational activities, often represented by grades, GPA, exam scores, or course completion rates.
  • Classification Algorithm: A type of machine learning algorithm used to categorize student outcomes, such as predicting whether a student will pass, fail, or drop out.
  • Decision Trees: A machine learning algorithm that predicts student performance by learning decision rules based on input features such as attendance, scores, and study behavior, structured like a flowchart.
  • Dropout Prediction: The process of identifying students at risk of discontinuing their studies before completion, using predictive models trained on academic, behavioral, and demographic data.
  • Feature Engineering: The process of creating, transforming, or selecting relevant variables (features) from raw student data to enhance the performance of predictive models.
  • Grade Point Average (GPA): A standardized measure of academic achievement calculated by averaging the grades received across courses, commonly used as a performance indicator.
  • K-Nearest Neighbors (KNN): A machine learning algorithm that classifies a student’s likely performance by comparing it to the 'k' most similar students in the training dataset.
  • Learning Management System (LMS): A software platform used by educational institutions to deliver, track, and manage student learning activities and data, which can be used as input for predictive models.
  • Linear Regression: A statistical technique used to predict continuous academic outcomes (e.g., GPA or final exam scores) based on one or more independent variables like study hours or attendance.
  • Machine Learning (ML): A subset of artificial intelligence that allows systems to learn from educational data and make predictions about student outcomes without being explicitly programmed for each scenario.
  • Predictive Analytics: The use of data analysis, statistical models, and machine learning to make informed predictions about students’ future academic performance or likelihood of success.
  • Student Dataset: A structured collection of data related to students, including demographic information, attendance, grades, and behavioral logs, used for training and testing machine learning models.
  • Support Vector Machine (SVM): A supervised learning algorithm that classifies student performance into distinct categories by finding the optimal boundary that separates different outcome groups.
