4.3 DESCRIPTIVE STATISTICS

4.4.6 Visualizing the most important differences between variables

To bring out clearer differences between late and repaid loans, the SOM was trained once more, this time dividing the data set containing all variables into two parts by loan status rather than by gender. The results for all variables can be seen from the U-matrices and the component planes in Appendices 3 and 4. The distance U-matrices, the clusters, the component planes of each variable, and the map unit labels are presented in Appendices 5 and 6. The analysis concentrates only on the variables whose differences are clear enough to be seen directly from the figures. The variables with the clearest differences are presented next.
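The idea of training one map per loan status can be sketched as follows. This is only a minimal illustration, not the tool or settings used in the study: the `train_som` helper, the record layout (`status`, `features`), the toy feature values, the 3 x 2 grid, and the linearly decaying learning rate and neighbourhood radius are all assumptions made for the example.

```python
import math
import random


def train_som(rows, n_rows, n_cols, epochs=20, seed=0):
    """Train a tiny rectangular SOM online; returns weights[r][c][k]."""
    rng = random.Random(seed)
    dim = len(rows[0])
    # Initialise every node with a copy of a randomly chosen training vector.
    weights = [[list(rng.choice(rows)) for _ in range(n_cols)]
               for _ in range(n_rows)]
    total_steps = epochs * len(rows)
    step = 0
    for _ in range(epochs):
        order = rows[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = 0.5 * (1.0 - frac)  # learning rate decays linearly
            radius = max(n_rows, n_cols) / 2.0 * (1.0 - frac) + 0.5
            # Best-matching unit: the node whose weights are closest to x.
            br, bc = min(
                ((r, c) for r in range(n_rows) for c in range(n_cols)),
                key=lambda rc: math.dist(weights[rc[0]][rc[1]], x),
            )
            # Pull every node towards x, weighted by grid distance to the BMU.
            for r in range(n_rows):
                for c in range(n_cols):
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * radius ** 2))
                    node = weights[r][c]
                    for k in range(dim):
                        node[k] += lr * h * (x[k] - node[k])
            step += 1
    return weights


# Hypothetical, already scaled toy records (not taken from the Bondora data).
loans = [
    {"status": "late", "features": [0.10, 0.90]},
    {"status": "late", "features": [0.20, 0.80]},
    {"status": "late", "features": [0.15, 0.85]},
    {"status": "repaid", "features": [0.80, 0.20]},
    {"status": "repaid", "features": [0.90, 0.10]},
    {"status": "repaid", "features": [0.85, 0.30]},
]

# Divide the data by loan status, then train one map per part.
by_status = {"late": [], "repaid": []}
for loan in loans:
    by_status[loan["status"]].append(loan["features"])

soms = {status: train_som(rows, n_rows=3, n_cols=2)
        for status, rows in by_status.items()}
```

Because each map sees only one status, its U-matrix and component planes describe that group alone, which is what allows the side-by-side comparison of late and repaid loans.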

Figure 35 presents the U-matrices of late and repaid loans. The shapes of the two U-matrices are almost identical. In both matrices the largest distances between neurons are located in the middle, forming a border. The grid size was 27 x 22 for late loans and 25 x 21 for repaid loans.

Figure 35. U-matrices for late and repaid loans.
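What a U-matrix value expresses can be shown with a small sketch: for each node, take the average distance between its weight vector and those of its grid neighbours. The `u_matrix` helper, the rectangular 4-neighbour topology and the toy weights below are assumptions for illustration; the maps in the study may use a different lattice (for example hexagonal) and a different averaging rule.

```python
import math


def u_matrix(weights):
    """Mean Euclidean distance from each node to its 4-connected neighbours."""
    n_rows, n_cols = len(weights), len(weights[0])
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    dists.append(math.dist(weights[r][c], weights[rr][cc]))
            # A 1 x 1 map has no neighbours; report zero distance there.
            result[r][c] = sum(dists) / len(dists) if dists else 0.0
    return result


# Hypothetical 2 x 4 map with two clusters split down the middle.
toy_weights = [
    [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]],
]
um = u_matrix(toy_weights)
```

In this toy map the two middle columns receive the largest values, because they sit next to nodes from the other cluster, while the outer columns are zero. A band of high values in the middle of a U-matrix is read in the same way, as a border between two regions of similar neurons.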

The component planes of the variable use of loan are presented in figure 36. As can be seen, late loans were more likely than repaid loans to have been taken for loan consolidation, travelling or a vehicle, whereas repaid loans were more likely to have been taken for education or home improvement. As a conclusion, an investor should avoid funding loans whose purpose is loan consolidation or a vehicle.

Figure 36. Component planes for use of loan.

The component planes of employment status are presented in figure 37. The most interesting observation is that, among borrowers whose employment status is something other than fully employed, entrepreneurs are more likely to have late loans than repaid ones. Retired borrowers, in contrast, have mostly repaid their loans.

Figure 37. Component planes for employment status.

Comparing the education levels of borrowers in figure 38, it seems that borrowers who have paid their loans on time have more often had higher or secondary education (red colored nodes) than borrowers who have been late with their instalments.

Figure 38. Component planes for education.

As expected, borrowers with a higher amount of free cash after compulsory expenses have performed better. This can be seen clearly from figure 39, where small amounts of free cash are visualized with blue and higher amounts with brighter colors.

Figure 39. Component planes for free cash.
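A component plane is simply one variable's slice of the map: the value of that variable's weight at every node. The sketch below extracts such a plane from two maps and compares the plane averages as a rough numeric counterpart to the visual comparison. The helper names, the column index `FREE_CASH` and all weight values are made up for illustration, and summarizing a plane by its unweighted mean is an added simplification, not a step from the study.

```python
def component_plane(weights, k):
    """Return the grid of the k-th weight component, one value per node."""
    return [[node[k] for node in row] for row in weights]


def plane_mean(plane):
    """Unweighted average over all nodes of a component plane."""
    values = [v for row in plane for v in row]
    return sum(values) / len(values)


FREE_CASH = 1  # hypothetical column index of the scaled free cash variable

# Hypothetical 2 x 2 maps; each node holds [other_variable, free_cash].
late_weights = [
    [[0.2, 0.1], [0.3, 0.2]],
    [[0.1, 0.1], [0.4, 0.2]],
]
repaid_weights = [
    [[0.2, 0.6], [0.3, 0.7]],
    [[0.1, 0.5], [0.4, 0.8]],
]

late_plane = component_plane(late_weights, FREE_CASH)
repaid_plane = component_plane(repaid_weights, FREE_CASH)
difference = plane_mean(repaid_plane) - plane_mean(late_plane)
```

A positive `difference` in this toy setup corresponds to a repaid map whose free cash plane is drawn in brighter colors than the late map. The mean ignores how many loans hit each node, so it is only a coarse summary of what the figure shows.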

The last comparison concerns the component planes of borrowers’ marital status in figure 40. The component plane of late loans contains more single and divorced borrowers (red and orange colored nodes) than the component plane of repaid loans. As a conclusion, borrowers who are married or cohabiting are more likely to pay their liabilities on time.

Figure 40. Component planes for marital status.

4.4.7 Discussion

This subsection presented a brief comparison of late and repaid loans. The SOM was trained for each status separately. The main focus was to find the variables that differ clearly between loans with different statuses. The loan purpose strongly indicates delay when the loan has been used for loan consolidation or a vehicle. Moreover, among borrowers whose employment status was something other than fully employed, entrepreneurs and self-employed borrowers were more likely to default on their loans. Retired borrowers, in contrast, were more likely to pay their loans on time.

When the education level of the borrowers was considered, borrowers with higher or secondary education were more likely to be on the side of repaid loans, whereas borrowers with late loans more often had vocational or basic education. In general, borrowers with more free cash left after mandatory expenses had a higher probability of paying their liabilities on time. Marital status showed clear differences as well: borrowers who were married or cohabiting repaid their loans on time more often than single or divorced borrowers.

5 SUMMARY AND CONCLUSIONS

This final chapter summarizes the findings of the SOMs and answers the research questions set earlier. The conclusions of the study are then described. We evaluate reliability and limitations and bring up suggestions for further research. Validity and reliability are considered in terms of whether the information collected and the results achieved can be generalized.

At the beginning of this study, the concept of peer-to-peer lending and information asymmetry as the theory behind it were introduced. The literature review described the methods used and the variables previously found to predict possible default in peer-to-peer lending. The study then continued to the methodology, where the methods used, Principal Component Analysis and the Self-Organizing Map, were introduced.

The empirical part started by briefly presenting the results of the Principal Component Analysis, which identified the most important variables explaining most of the variance in the data. Then the case of exploring the Bondora data was introduced and analyzed using the SOM. The goal was to learn to use the SOM and to find out whether it is a suitable tool for investors to analyze data on potential borrowers in support of lending decisions.