
dc.contributor.author: Mutungi, Gilbert
dc.date.accessioned: 2022-04-12T12:47:11Z
dc.date.available: 2022-04-12T12:47:11Z
dc.date.issued: 2021-05
dc.identifier.citation: Mutungi, G. (2021). A classification algorithm for delinquent invoices in Uganda (Unpublished master's dissertation). Makerere University, Kampala, Uganda [en_US]
dc.identifier.uri: http://hdl.handle.net/10570/10083
dc.description: A dissertation submitted to the Directorate of Research and Graduate Training in partial fulfilment of the requirements for the award of Master of Statistics Degree of Makerere University [en_US]
dc.description.abstract: This study aimed at developing a classification algorithm for delinquent invoices in Uganda. Data on 2028 invoices was extracted from Patasente, an e-procurement platform in Uganda. Gain ratios for different attributes were used to determine each attribute’s importance in determining the payment outcome of an invoice. C4.5 decision tree, random forest and logistic regression models were developed to classify the invoices into two categories: those paid on time and those paid late. All three models were tested using the 0.632 bootstrap method in order to compare their performance levels. Results showed that 34% of the invoices were paid late. Invoice base amount (0.021), proportion of previously delayed invoices (0.0166), customer location (0.01226) and product or service offered (0.01223) were the most important attributes in determining whether an invoice was paid on time. The random forest algorithm had the highest classification accuracy, with a rate of 83.76%, while the C4.5 decision tree and logistic regression models had accuracy rates of 71.15% and 66.27% respectively. The Kappa statistics for the three models were 0.621, 0.336 and 0.085 respectively. The study concluded that the random forest classification algorithm (83.76%) provides higher accuracy than both the decision tree algorithm (71.15%) and logistic regression (66.27%) in classifying the payment outcome of an invoice. The study recommends using a larger dataset covering more years, as this may increase accuracy rates. Furthermore, incorporating a cost matrix in the model that penalizes wrongly predicting late invoices as on time (False Positives) may improve the model’s relevance to businesses and is thus recommended. [en_US]
dc.language.iso: en [en_US]
dc.publisher: Makerere University [en_US]
dc.subject: Classification Algorithm [en_US]
dc.subject: Delinquent Invoices [en_US]
dc.title: A classification algorithm for delinquent invoices in Uganda [en_US]
dc.type: Thesis [en_US]
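
The abstract reports two evaluation quantities, a Kappa statistic per model and accuracy estimated with the 0.632 bootstrap. The following is a minimal sketch of how such quantities are commonly computed. It is not the thesis's implementation, which this record does not describe. The function names `cohen_kappa` and `bootstrap_632_error` are hypothetical, the labels are made-up illustration data rather than Patasente invoices, and the 0.368/0.632 weighting is the standard form of the 0.632 estimator, assumed here to be the one the study used.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    # Fraction of invoices where the predicted label matches the actual one.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance, from the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Degenerate case: both vectors hold one identical class.
        # Returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def bootstrap_632_error(resubstitution_error, out_of_bag_error):
    """Standard 0.632 bootstrap error: a weighted mix of the optimistic
    training (resubstitution) error and the pessimistic out-of-bag error."""
    return 0.368 * resubstitution_error + 0.632 * out_of_bag_error


# Illustrative labels only (not data from the study).
actual = ["late", "late", "on_time", "on_time"]
predicted = ["late", "on_time", "on_time", "on_time"]
print(cohen_kappa(actual, predicted))  # 0.5
```

Accuracy under the 0.632 bootstrap would then be one minus the value returned by `bootstrap_632_error`, with the out-of-bag error averaged over the bootstrap resamples.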

