The goal of this project is to make a model which will predict a customer's loan status given a number of attributes such as the current loan amount and whether the loan is a long term loan or a short term loan.
Import packages
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv('LoansTrainingSet.csv')
df.head()
C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\IPython\core\interactiveshell.py:2698: DtypeWarning: Columns (16) have mixed types. Specify dtype option on import or set low_memory=False. interactivity=interactivity, compiler=compiler, result=result)
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 000025bb-5694-4cff-b17d-192b1a98ba44 | 5ebc8bb1-5eb9-4404-b11b-a6eebc401a19 | Fully Paid | 11520 | Short Term | 741.0 | 10+ years | Home Mortgage | 33694.0 | Debt Consolidation | $584.03 | 12.3 | 41.0 | 10 | 0 | 6760 | 16056 | 0.0 | 0.0 |
1 | 00002c49-3a29-4bd4-8f67-c8f8fbc1048c | 927b388d-2e01-423f-a8dc-f7e42d668f46 | Fully Paid | 3441 | Short Term | 734.0 | 4 years | Home Mortgage | 42269.0 | other | $1,106.04 | 26.3 | NaN | 17 | 0 | 6262 | 19149 | 0.0 | 0.0 |
2 | 00002d89-27f3-409b-aa76-90834f359a65 | defce609-c631-447d-aad6-1270615e89c4 | Fully Paid | 21029 | Short Term | 747.0 | 10+ years | Home Mortgage | 90126.0 | Debt Consolidation | $1,321.85 | 28.8 | NaN | 5 | 0 | 20967 | 28335 | 0.0 | 0.0 |
3 | 00005222-b4d8-45a4-ad8c-186057e24233 | 070bcecb-aae7-4485-a26a-e0403e7bb6c5 | Fully Paid | 18743 | Short Term | 747.0 | 10+ years | Own Home | 38072.0 | Debt Consolidation | $751.92 | 26.2 | NaN | 9 | 0 | 22529 | 43915 | 0.0 | 0.0 |
4 | 0000757f-a121-41ed-b17b-162e76647c1f | dde79588-12f0-4811-bab0-e2b07f633fcd | Fully Paid | 11731 | Short Term | 746.0 | 4 years | Rent | 50025.0 | Debt Consolidation | $355.18 | 11.5 | NaN | 12 | 0 | 17391 | 37081 | 0.0 | 0.0 |
Check for duplicated rows
df[df.duplicated()==True]
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
14 | 00029f9f-0cc5-4d4e-aabc-ea4a7fe74e12 | afbc2fa3-3bad-4d48-b691-829aed78bad5 | Charged Off | 17961 | Short Term | 701.0 | < 1 year | Own Home | 53053.0 | Debt Consolidation | $596.85 | 9.9 | 43.0 | 7 | 0 | 6810 | 22775 | 0.0 | 0.0 |
19 | 00035328-2636-4390-8322-5841de482d2b | 28eb75ac-6a09-4057-b73f-726c72ebf933 | Charged Off | 11604 | Short Term | 729.0 | 6 years | Home Mortgage | 35315.0 | Debt Consolidation | $662.16 | 15.0 | NaN | 12 | 0 | 17092 | 20743 | 0.0 | 0.0 |
32 | 0006572c-9567-484d-b49b-bfe06536aa96 | c1a7ba66-9f36-4f5c-86cd-1aa7ad68b954 | Charged Off | 14727 | Short Term | 740.0 | 3 years | Home Mortgage | 70690.0 | Debt Consolidation | $1,083.91 | 18.0 | 27.0 | 12 | 0 | 43375 | 225911 | 0.0 | 0.0 |
37 | 000795b0-d767-42bf-837a-80d5b78c1e50 | b533c78c-3096-4914-acb1-5453c0ff76a4 | Charged Off | 19731 | Long Term | 623.0 | 3 years | Rent | 43846.0 | Buy House | $738.08 | 16.2 | 55.0 | 13 | 1 | 9031 | 17955 | 1.0 | 0.0 |
40 | 0008a65c-3054-4b5d-931d-8352021f813d | 04513d99-d809-4eb6-b4bb-6abc767d861e | Charged Off | 7066 | Short Term | 741.0 | 10+ years | Own Home | 60564.0 | Debt Consolidation | $706.58 | 14.6 | 56.0 | 4 | 0 | 6229 | 9242 | 0.0 | 0.0 |
106 | 001b2e80-1220-40f4-87ad-63fc4814bdfe | f7d50460-c0a5-45d0-ae25-891947b4c7b9 | Charged Off | 11316 | Long Term | 703.0 | 6 years | Rent | 80832.0 | Debt Consolidation | $714.02 | 17.8 | 71.0 | 5 | 1 | 3666 | 7376 | 1.0 | 0.0 |
118 | 001df440-3495-4ea7-b519-b7b85dd7ad0e | c550dd85-dccb-4441-b2fe-a06e90bf48de | Charged Off | 22059 | Long Term | 679.0 | 10+ years | Home Mortgage | 200540.0 | Debt Consolidation | $578.23 | 24.1 | 38.0 | 10 | 0 | 17642 | 41904 | 0.0 | 0.0 |
131 | 0021a33d-62ce-44fa-9c8f-7b5396a1be81 | 9681f4f0-4d57-48e2-917d-a5f6d3accaab | Charged Off | 11945 | Short Term | NaN | 5 years | Rent | NaN | Debt Consolidation | $675.21 | 18.6 | NaN | 8 | 0 | 5258 | 22278 | 0.0 | 0.0 |
157 | 0027a983-b2d4-4363-8191-2e5480a92549 | 0c8991e7-43fe-478b-94cc-ba1185f3eb05 | Charged Off | 10057 | Short Term | NaN | 3 years | Rent | NaN | Debt Consolidation | $1,089.87 | 16.4 | 15.0 | 11 | 0 | 11615 | 14234 | 0.0 | 0.0 |
167 | 002a0be9-3c47-4471-bfac-130d33cdb038 | e6dad351-ac76-4453-827a-7841ceaf0914 | Charged Off | 19173 | Short Term | 609.0 | < 1 year | Home Mortgage | 80728.0 | Other | $905.50 | 19.6 | 14.0 | 5 | 0 | 12594 | 12619 | 0.0 | 0.0 |
182 | 002d9230-a4a2-4a53-b2e6-95df00866341 | a96d08d2-cbb3-42d8-9718-909f1ccfc9ad | Charged Off | 9795 | Short Term | 687.0 | 6 years | Rent | 27496.0 | other | $788.22 | 6.6 | 40.0 | 4 | 0 | 2290 | 4626 | 0.0 | 0.0 |
201 | 00336793-40a5-448c-81a6-528bf04a04ba | 7b182b4a-d3ed-419a-b889-83f44fff2203 | Charged Off | 7886 | Short Term | 741.0 | 5 years | Rent | 23854.0 | Debt Consolidation | $415.45 | 6.2 | NaN | 6 | 0 | 5394 | 6801 | 0.0 | 0.0 |
207 | 0033eb55-3702-4a17-8e68-ef95f3f57053 | 58a5de0c-b628-4730-a25f-5239ab65e191 | Charged Off | 2297 | Short Term | 708.0 | 5 years | Rent | 27564.0 | Debt Consolidation | $243.48 | 6.9 | NaN | 10 | 0 | 3859 | 12449 | 0.0 | 0.0 |
221 | 00393195-42fd-4b0f-a019-10cb0c86dce4 | 3fc2799b-87d2-4068-b4d9-abfba07be5be | Charged Off | 15958 | Short Term | 713.0 | 10+ years | Own Home | 48800.0 | Debt Consolidation | $727.93 | 17.0 | NaN | 9 | 1 | 11523 | 18146 | 0.0 | 0.0 |
228 | 003a346b-6ede-4179-8991-c63c4ef9af55 | ecad702f-d39e-46c9-9418-ac291588bc2d | Charged Off | 15018 | Long Term | 732.0 | 9 years | Home Mortgage | 45054.0 | Debt Consolidation | $1,006.21 | 17.0 | NaN | 15 | 0 | 15460 | 28058 | 0.0 | 0.0 |
230 | 003a4f3b-7599-4b5a-88a6-f569bb8bc955 | 32c9ce22-5628-4c4f-8a30-3c1e7df3e817 | Charged Off | 17304 | Short Term | 657.0 | 5 years | Own Home | 46823.0 | Debt Consolidation | $1,069.13 | 13.0 | NaN | 12 | 0 | 16000 | 16394 | 0.0 | 0.0 |
272 | 0044a2d5-3ffa-4103-bc5a-d1c9b77e64ac | 70d9d527-c3dc-45dc-9661-de97eb5ac94f | Charged Off | 16356 | Short Term | 709.0 | 10+ years | Home Mortgage | 60486.0 | Debt Consolidation | $1,048.42 | 20.4 | 36.0 | 8 | 1 | 11073 | 19563 | 1.0 | 0.0 |
307 | 004d4be5-c8b9-42cc-871c-313a58feb263 | 2937ddf0-c778-459f-b602-3e0a402f68a4 | Charged Off | 10799 | Long Term | 708.0 | 2 years | Rent | 29451.0 | Debt Consolidation | $912.98 | 24.1 | NaN | 12 | 0 | 14298 | 18545 | 0.0 | 0.0 |
315 | 004ee15f-da2a-4590-8f9a-ac139929f867 | c5afc0ee-482b-441a-8c81-23165bc95eef | Charged Off | 12072 | Short Term | 743.0 | < 1 year | Home Mortgage | 39420.0 | Debt Consolidation | $683.28 | 9.5 | NaN | 13 | 0 | 4096 | 22628 | 0.0 | 0.0 |
321 | 0051664e-e2f1-4f60-8ccb-f59fba657148 | 0ce8d48e-9584-4736-97ef-348e95a3e536 | Charged Off | 8891 | Short Term | 732.0 | 3 years | Rent | 61250.0 | Debt Consolidation | $1,260.73 | 12.9 | NaN | 12 | 0 | 8959 | 17397 | 0.0 | 0.0 |
341 | 0054b992-5d03-4624-83c3-b156ee89f393 | b7693a5e-da84-441e-bad8-b0508ab95884 | Charged Off | 10747 | Long Term | 6600.0 | 6 years | Rent | 80352.0 | Debt Consolidation | $1,459.72 | 14.0 | 66.0 | 7 | 0 | 2480 | 2914 | 0.0 | 0.0 |
349 | 005610f4-3fcd-4344-baa1-c53111dea319 | b283c93c-337f-4cb8-a3ea-2435ed5fb0c4 | Charged Off | 23312 | Short Term | 6760.0 | 8 years | Rent | 76853.0 | Debt Consolidation | $954.25 | 9.2 | NaN | 10 | 0 | 14038 | 18643 | 0.0 | 0.0 |
381 | 005fdf32-84ea-4e1c-9802-2d8d418c4655 | d8076865-2835-40ce-9326-7b231464bb27 | Charged Off | 3020 | Short Term | 699.0 | 3 years | Own Home | 59389.0 | Debt Consolidation | $673.08 | 6.5 | NaN | 17 | 0 | 17159 | 28503 | 0.0 | 0.0 |
418 | 006a2c4f-30a0-43cc-ab8f-dd152a1c3c4b | 6624426a-79bc-4d53-b32b-536a284cac5f | Charged Off | 35053 | Short Term | NaN | 3 years | Rent | NaN | Business Loan | $809.13 | 13.3 | NaN | 25 | 0 | 5153 | 42585 | 0.0 | 0.0 |
420 | 006a5d3b-d802-4874-a13e-0df0ff838b5b | 173f207d-8d77-406f-b596-3f641e42ff4f | Charged Off | 27359 | Short Term | 7460.0 | 10+ years | Home Mortgage | 390840.0 | Debt Consolidation | $5,243.77 | 27.3 | NaN | 22 | 0 | 257406 | 325008 | 0.0 | 0.0 |
432 | 006be99e-94e5-4c24-84af-c6106771a26c | cc35b2b5-97b3-4257-b15f-7c66fe7d45f9 | Charged Off | 18336 | Long Term | 703.0 | 10+ years | Home Mortgage | 66398.0 | Debt Consolidation | $1,101.10 | 15.4 | NaN | 11 | 0 | 47925 | 86978 | 0.0 | 0.0 |
437 | 006d81cf-6ced-4840-9a33-9783fc02d841 | e1fbde58-5f55-42ed-af5a-cafe9d7e22c9 | Charged Off | 7662 | Long Term | 726.0 | 10+ years | Home Mortgage | 26469.0 | Debt Consolidation | $467.63 | 19.8 | 79.0 | 6 | 0 | 4607 | 5339 | 0.0 | 0.0 |
443 | 006f090c-fbd5-466c-ae1e-0a2c1df1f31d | 1d2a9045-5e06-43ff-865f-5e393d65873f | Charged Off | 11933 | Long Term | 722.0 | 8 years | Home Mortgage | 74580.0 | Debt Consolidation | $1,280.29 | 25.9 | 20.0 | 15 | 0 | 17883 | 31429 | 0.0 | 0.0 |
457 | 0074c307-71da-4daa-8c25-83751b831101 | 77bbd683-dd15-438f-a902-2dbb04375f45 | Charged Off | 3564 | Short Term | 712.0 | 8 years | Rent | 45545.0 | Debt Consolidation | $565.52 | 15.8 | 50.0 | 8 | 0 | 3763 | 5176 | 0.0 | 0.0 |
460 | 00750cae-1b87-45cd-9ce5-ee6e002bf2ad | fea3d406-4070-45cf-932e-6e26c8666023 | Charged Off | 31709 | Long Term | NaN | 2 years | Rent | NaN | other | $1,307.99 | 16.8 | NaN | 15 | 0 | 3624 | 27246 | 0.0 | 0.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
256417 | ff6d32f8-d4f2-4a9e-a4e0-690781eaeb97 | a9a46824-4f43-49c6-aaca-8b2f9509f7f0 | Charged Off | 12248 | Short Term | NaN | 10+ years | Rent | NaN | Debt Consolidation | $772.36 | 17.1 | NaN | 6 | 0 | 6808 | 9796 | 0.0 | 0.0 |
256445 | ff73bf26-fd77-4433-9246-987173784524 | 68d6ec31-a319-447b-9190-af208793f9f0 | Charged Off | 16133 | Long Term | 702.0 | 5 years | Rent | 60498.0 | Debt Consolidation | $630.69 | 16.0 | NaN | 6 | 0 | 14663 | 20281 | 0.0 | 0.0 |
256469 | ff79d91e-661c-4da4-8ab4-acd0c73995a3 | 98558920-bb9b-493c-827e-b41e805bc705 | Charged Off | 4914 | Short Term | NaN | NaN | Rent | NaN | Debt Consolidation | $250.61 | 18.0 | NaN | 6 | 1 | 5768 | 7077 | 1.0 | 0.0 |
256472 | ff7a814f-a952-4950-9eea-7779b33e3a8d | c8362588-cb64-4b17-804d-95ba7b461ee3 | Charged Off | 10093 | Short Term | NaN | 8 years | Home Mortgage | NaN | other | $1,396.19 | 17.1 | 38.0 | 12 | 0 | 59702 | 64334 | 0.0 | 0.0 |
256474 | ff7a8fe7-888e-49c3-9305-9c6e34e6f6f2 | 03c22e05-969a-4673-aa26-9621a355f458 | Charged Off | 24490 | Long Term | 739.0 | 1 year | Home Mortgage | 63674.0 | Debt Consolidation | $986.95 | 19.6 | NaN | 13 | 0 | 5416 | 24288 | 0.0 | 0.0 |
256476 | ff7b269c-cacf-4440-a86b-ab64ebb0d436 | ea18d4d2-cb2a-4043-839d-808ea3e7a11b | Charged Off | 10156 | Short Term | 746.0 | 4 years | Home Mortgage | 73123.0 | Debt Consolidation | $1,413.72 | 28.9 | NaN | 9 | 0 | 23791 | 31470 | 0.0 | 0.0 |
256502 | ff7fce5a-3f95-4f03-9796-a9e60086e9c5 | aa619302-49c8-472b-bf76-934a14f65254 | Charged Off | 15656 | Long Term | 725.0 | 8 years | Rent | 43054.0 | Debt Consolidation | $681.69 | 11.5 | NaN | 5 | 0 | 3632 | 6167 | 0.0 | 0.0 |
256512 | ff81879a-8511-4b4a-a3ac-6bd51aaa6293 | 34707ab0-51dc-4cd8-a0c8-61b19bc45839 | Charged Off | 7859 | Short Term | 721.0 | 1 year | Rent | 29472.0 | Debt Consolidation | $397.87 | 11.9 | 72.0 | 16 | 0 | 3350 | 12454 | 0.0 | 0.0 |
256523 | ff847f57-6670-4b8e-b0ac-aa6a9b2efa69 | 647be725-9aff-4d7d-8b4f-7c8a6d7a2f91 | Charged Off | 16179 | Short Term | 747.0 | 7 years | Home Mortgage | 96064.0 | Debt Consolidation | $976.65 | 20.2 | NaN | 12 | 0 | 6065 | 55644 | 0.0 | 0.0 |
256563 | ff8f89a7-e168-4bcb-b678-bf11ace43813 | 4c516175-c4b0-4e4a-8ce1-b460bfa4c08f | Charged Off | 10045 | Short Term | NaN | 1 year | Home Mortgage | NaN | Debt Consolidation | $738.31 | 8.1 | NaN | 11 | 0 | 6479 | 8436 | 0.0 | 0.0 |
256566 | ff8fc7eb-0237-47b2-bfed-8bf1151f13df | 35f5f178-e329-4e90-b565-8203ddfd1992 | Charged Off | 15781 | Short Term | NaN | 1 year | Rent | NaN | Debt Consolidation | $1,307.83 | 11.2 | NaN | 17 | 0 | 18653 | 25835 | 0.0 | 0.0 |
256575 | ff921f7d-1ef0-4f12-9a86-d6cc8fce3a3d | 2278a185-7ec7-4769-a8af-205bf068c99e | Charged Off | 30579 | Long Term | NaN | 5 years | Home Mortgage | NaN | Debt Consolidation | $1,048.23 | 23.5 | NaN | 9 | 0 | 19244 | 82950 | 0.0 | 0.0 |
256589 | ff946889-eac4-4cb1-b245-7cf6d76c1a5e | 4973925c-a4e0-40a7-aa21-98caae67949c | Charged Off | 13834 | Long Term | 699.0 | 8 years | Rent | 39384.0 | Debt Consolidation | $347.89 | 13.5 | 16.0 | 19 | 0 | 7334 | 26008 | 0.0 | 0.0 |
256611 | ff9b0800-2363-4dbb-88a9-8a7b51f24a75 | 472c643f-e992-4a2f-b705-8357c78afdab | Charged Off | 14076 | Short Term | 712.0 | 2 years | Rent | 53688.0 | Debt Consolidation | $760.59 | 12.5 | NaN | 6 | 0 | 6343 | 9049 | 0.0 | 0.0 |
256652 | ffa5b141-790f-4547-a9e0-f9ba3d44e405 | edeb8731-845e-4f9e-8ae7-cdf7aa262dac | Charged Off | 24700 | Short Term | 715.0 | 3 years | Own Home | 138320.0 | other | $1,613.73 | 17.0 | NaN | 9 | 0 | 5305 | 19646 | 0.0 | 0.0 |
256690 | ffb05285-380a-4aa8-8180-b55f7aac388b | f82e4d81-36b3-40d6-942d-3e701f0e6650 | Charged Off | 12047 | Short Term | 744.0 | 2 years | Home Mortgage | 44172.0 | Debt Consolidation | $585.27 | 14.8 | NaN | 5 | 0 | 8708 | 9839 | 0.0 | 0.0 |
256723 | ffb844e3-73ef-451d-b084-06dddaf8bd56 | c7c1469e-3e56-4c9d-9eb6-cc94ff4f6c82 | Charged Off | 5858 | Short Term | 740.0 | NaN | Rent | 29624.0 | Debt Consolidation | $419.68 | 16.7 | 74.0 | 13 | 1 | 9339 | 15335 | 1.0 | 0.0 |
256777 | ffc6dd5c-ace9-41b0-bec3-4e5ae8567b49 | f27c9d87-5e7a-4527-99f4-6e8de41e9a8c | Charged Off | 7855 | Short Term | 735.0 | 3 years | Home Mortgage | 52190.0 | Debt Consolidation | $1,365.64 | 15.9 | NaN | 16 | 0 | 18617 | 25329 | 0.0 | 0.0 |
256815 | ffd081d8-0ce7-41cf-8318-baccaabf681a | 011bbf15-53a7-4962-a5ee-f2276a0d55f2 | Charged Off | 4780 | Short Term | 748.0 | 3 years | Rent | 27885.0 | Debt Consolidation | $418.28 | 13.9 | 80.0 | 12 | 0 | 7222 | 32829 | 0.0 | 0.0 |
256825 | ffd2c305-c78e-4dc2-b27c-2e46281020de | 4038da05-5677-4535-a00b-1322ff9b0060 | Charged Off | 5553 | Short Term | 741.0 | 10+ years | Rent | 71395.0 | Other | $1,181.00 | 15.5 | NaN | 11 | 0 | 17761 | 24099 | 0.0 | 0.0 |
256844 | ffda0100-49f8-4205-a853-72fd7f709328 | ac055546-80ce-4f32-9d32-b2c0ca132d33 | Charged Off | 24978 | Long Term | 6890.0 | 5 years | Home Mortgage | 144870.0 | Debt Consolidation | $1,943.67 | 15.9 | 34.0 | 10 | 2 | 5563 | 9287 | 1.0 | 0.0 |
256857 | ffdfbaba-e39f-46bd-b6b0-2e71f29d0f3d | a814b7cf-cac0-44f8-bc33-169a107eee0a | Charged Off | 4893 | Short Term | 729.0 | NaN | Home Mortgage | 47951.0 | Debt Consolidation | $439.56 | 15.0 | NaN | 5 | 0 | 6272 | 9784 | 0.0 | 0.0 |
256859 | ffdfd7b0-bfa9-4b5f-bccf-cef97575011b | 2e59e451-338e-45c3-8488-3656bc145506 | Charged Off | 14855 | Long Term | NaN | 4 years | Own Home | NaN | Debt Consolidation | $728.65 | 10.9 | 28.0 | 22 | 0 | 15825 | 28411 | 0.0 | 0.0 |
256869 | ffe22ad2-1f4c-4ad8-84b4-4c2ed333dcf9 | e6bdcc35-8f9c-4691-bbcc-fb35cb489d10 | Charged Off | 13682 | Long Term | 6600.0 | 1 year | Rent | 39092.0 | Debt Consolidation | $616.68 | 9.0 | NaN | 9 | 0 | 4464 | 10455 | 0.0 | 0.0 |
256875 | ffe33e62-83b6-4bc2-9ec7-7f865621542e | 307dc053-1357-40ac-a394-ad916ecca8ca | Charged Off | 1480 | Short Term | 7350.0 | 4 years | Rent | 34528.0 | Debt Consolidation | $417.21 | 6.5 | NaN | 13 | 0 | 2792 | 6997 | 0.0 | 0.0 |
256892 | ffe9a992-ed0a-4a80-bb8e-91e8fd778bec | 53fbf0c5-0852-47a8-9860-6418fe21dd56 | Charged Off | 10784 | Short Term | 721.0 | < 1 year | Rent | 24581.0 | Debt Consolidation | $559.21 | 14.8 | NaN | 9 | 0 | 18710 | 20674 | 0.0 | 0.0 |
256917 | ffef32fb-8f16-496f-88fa-08929936267d | 507f6a07-19f9-4e73-9064-046c445c45b4 | Charged Off | 18331 | Short Term | 733.0 | 3 years | Own Home | 101840.0 | Debt Consolidation | $3,190.98 | 18.3 | NaN | 19 | 0 | 14696 | 20874 | 0.0 | 0.0 |
256969 | fffc4837-d6f7-48c7-bc61-78422af92eb6 | 01511702-6e9f-490e-9fef-0a9dbf1c0e20 | Charged Off | 10782 | Short Term | 738.0 | 5 years | Rent | 36787.0 | Debt Consolidation | $732.67 | 9.5 | NaN | 8 | 0 | 13415 | 16440 | 0.0 | 0.0 |
256972 | fffd5218-faf0-4b82-becf-11664fa064a3 | 301954fb-2381-4d68-a3b0-478870cd49f3 | Charged Off | 5968 | Short Term | 746.0 | 8 years | Own Home | 79568.0 | Debt Consolidation | $1,412.33 | 22.1 | 63.0 | 10 | 1 | 4237 | 0 | 0.0 | 0.0 |
256976 | fffe5140-8725-4fd1-b3bf-9e59cb7834eb | df856c29-ce44-4df7-94a6-bd758a1c1332 | Charged Off | 34710 | Short Term | 700.0 | 9 years | Rent | 148755.0 | Business Loan | $47.11 | 34.0 | 1.0 | 3 | 0 | 2590 | 37004 | 0.0 | 0.0 |
16610 rows × 19 columns
df[df['Loan ID']=='00029f9f-0cc5-4d4e-aabc-ea4a7fe74e12']
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
12 | 00029f9f-0cc5-4d4e-aabc-ea4a7fe74e12 | afbc2fa3-3bad-4d48-b691-829aed78bad5 | Charged Off | 17980 | Short Term | NaN | < 1 year | Own Home | NaN | Debt Consolidation | $597.50 | 9.9 | 43.0 | 7 | 0 | 6817 | 22800 | 0.0 | 0.0 |
13 | 00029f9f-0cc5-4d4e-aabc-ea4a7fe74e12 | afbc2fa3-3bad-4d48-b691-829aed78bad5 | Charged Off | 17961 | Short Term | 701.0 | < 1 year | Own Home | 53053.0 | Debt Consolidation | $596.85 | 9.9 | 43.0 | 7 | 0 | 6810 | 22775 | 0.0 | 0.0 |
14 | 00029f9f-0cc5-4d4e-aabc-ea4a7fe74e12 | afbc2fa3-3bad-4d48-b691-829aed78bad5 | Charged Off | 17961 | Short Term | 701.0 | < 1 year | Own Home | 53053.0 | Debt Consolidation | $596.85 | 9.9 | 43.0 | 7 | 0 | 6810 | 22775 | 0.0 | 0.0 |
15 | 00029f9f-0cc5-4d4e-aabc-ea4a7fe74e12 | afbc2fa3-3bad-4d48-b691-829aed78bad5 | Charged Off | 17980 | Short Term | 701.0 | < 1 year | Own Home | 53112.0 | Debt Consolidation | $597.50 | 9.9 | 43.0 | 7 | 0 | 6817 | 22800 | 0.0 | 0.0 |
no_dup = df.copy()
Remove duplicate rows
no_dup.drop_duplicates(keep = 'first',inplace = True)
no_dup[no_dup['Loan ID']=='00029f9f-0cc5-4d4e-aabc-ea4a7fe74e12']
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
12 | 00029f9f-0cc5-4d4e-aabc-ea4a7fe74e12 | afbc2fa3-3bad-4d48-b691-829aed78bad5 | Charged Off | 17980 | Short Term | NaN | < 1 year | Own Home | NaN | Debt Consolidation | $597.50 | 9.9 | 43.0 | 7 | 0 | 6817 | 22800 | 0.0 | 0.0 |
13 | 00029f9f-0cc5-4d4e-aabc-ea4a7fe74e12 | afbc2fa3-3bad-4d48-b691-829aed78bad5 | Charged Off | 17961 | Short Term | 701.0 | < 1 year | Own Home | 53053.0 | Debt Consolidation | $596.85 | 9.9 | 43.0 | 7 | 0 | 6810 | 22775 | 0.0 | 0.0 |
15 | 00029f9f-0cc5-4d4e-aabc-ea4a7fe74e12 | afbc2fa3-3bad-4d48-b691-829aed78bad5 | Charged Off | 17980 | Short Term | 701.0 | < 1 year | Own Home | 53112.0 | Debt Consolidation | $597.50 | 9.9 | 43.0 | 7 | 0 | 6817 | 22800 | 0.0 | 0.0 |
no_dup[no_dup['Loan ID'].duplicated()==True]
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
7 | 0000afa6-8902-4f8f-b870-25a8fdad0aeb | e49c1a82-a0f7-45e8-9f46-2f75c43f9fbc | Charged Off | 24613 | Long Term | NaN | 6 years | Rent | NaN | Business Loan | $542.29 | 17.6 | 73.0 | 7 | 0 | 14123 | 16954 | 0.0 | 0.0 |
13 | 00029f9f-0cc5-4d4e-aabc-ea4a7fe74e12 | afbc2fa3-3bad-4d48-b691-829aed78bad5 | Charged Off | 17961 | Short Term | 701.0 | < 1 year | Own Home | 53053.0 | Debt Consolidation | $596.85 | 9.9 | 43.0 | 7 | 0 | 6810 | 22775 | 0.0 | 0.0 |
15 | 00029f9f-0cc5-4d4e-aabc-ea4a7fe74e12 | afbc2fa3-3bad-4d48-b691-829aed78bad5 | Charged Off | 17980 | Short Term | 701.0 | < 1 year | Own Home | 53112.0 | Debt Consolidation | $597.50 | 9.9 | 43.0 | 7 | 0 | 6817 | 22800 | 0.0 | 0.0 |
21 | 00038a08-f058-4add-a8ed-497b91672a9e | 727bb429-dfa1-41c4-a347-23230e23949f | Charged Off | 16929 | Long Term | 734.0 | 3 years | Home Mortgage | 40308.0 | Debt Consolidation | $745.70 | 14.9 | NaN | 10 | 0 | 28513 | 101110 | 0.0 | 0.0 |
47 | 000a08fa-e669-46dc-9a96-9da6b96570a0 | 4b7782a6-b84c-4dd9-ac62-f927a68eeaa9 | Charged Off | 21160 | Long Term | 636.0 | 5 years | Home Mortgage | 77770.0 | Debt Consolidation | $1,290.98 | 20.8 | NaN | 28 | 0 | 28175 | 51227 | 0.0 | 0.0 |
56 | 000c0c27-1961-4336-b760-a300e1bce9b0 | 881edf98-387c-4eef-ab71-317a7a5e70c1 | Charged Off | 6463 | Short Term | NaN | 1 year | Own Home | NaN | Debt Consolidation | $633.29 | 19.9 | 74.0 | 9 | 0 | 8025 | 24317 | 0.0 | 0.0 |
72 | 000f1140-ef9f-42f9-9898-f9b3162bf0da | 442c2d9b-3d95-44a9-bcb4-0dc80a941e3c | Charged Off | 6007 | Short Term | NaN | 10+ years | Home Mortgage | NaN | Debt Consolidation | $538.15 | 23.6 | 13.0 | 6 | 0 | 5077 | 7811 | 0.0 | 0.0 |
74 | 000f5c35-dc3e-415f-9c22-e355ef96d78b | 0ecf29b6-93eb-4c13-b853-a9cbdfc7f85b | Charged Off | 12899 | Long Term | 667.0 | 2 years | Rent | 133947.0 | Debt Consolidation | $1,931.07 | 18.4 | 7.0 | 16 | 0 | 39417 | 90199 | 0.0 | 0.0 |
80 | 001227a1-4490-47c6-ada7-20faf636e9a3 | 3d4a8985-0b72-4084-8b20-f441d9868483 | Charged Off | 5060 | Short Term | 727.0 | < 1 year | Home Mortgage | 62738.0 | Other | $711.03 | 20.4 | 22.0 | 6 | 0 | 10777 | 12041 | 0.0 | 0.0 |
85 | 00158423-d7a4-4fef-8e0d-ad8de93e8ba1 | 11e878dc-0982-4c6b-a32a-ef5fca6ed6f4 | Charged Off | 11546 | Long Term | 707.0 | 9 years | Rent | 33551.0 | Debt Consolidation | $522.84 | 22.7 | NaN | 17 | 0 | 12113 | 15630 | 0.0 | 0.0 |
89 | 00163613-1360-4e43-8fbc-7818032510b7 | 2eba693e-f810-459d-8df7-9a2376138466 | Charged Off | 5041 | Short Term | 737.0 | 2 years | Rent | 51147.0 | Debt Consolidation | $613.77 | 10.9 | NaN | 11 | 0 | 3644 | 4428 | 0.0 | 0.0 |
113 | 001ce813-522a-4a34-bbcc-21eaeae79aea | cc73a15c-7ad4-4101-a2af-57306c335285 | Charged Off | 15191 | Long Term | NaN | 7 years | Rent | NaN | Debt Consolidation | $385.07 | 14.0 | NaN | 11 | 0 | 15655 | 24810 | 0.0 | 0.0 |
114 | 001ce813-522a-4a34-bbcc-21eaeae79aea | cc73a15c-7ad4-4101-a2af-57306c335285 | Charged Off | 15191 | Long Term | 706.0 | 7 years | Rent | 44863.0 | Debt Consolidation | $385.07 | 14.0 | NaN | 11 | 0 | 15655 | 24810 | 0.0 | 0.0 |
115 | 001ce813-522a-4a34-bbcc-21eaeae79aea | cc73a15c-7ad4-4101-a2af-57306c335285 | Charged Off | 15306 | Long Term | 706.0 | 7 years | Rent | 45204.0 | Debt Consolidation | $388.00 | 14.0 | NaN | 11 | 0 | 15774 | 24999 | 0.0 | 0.0 |
125 | 00201656-d70d-49cf-b5b0-118280383288 | 0d419e30-7336-4e52-980d-8c17365acc67 | Charged Off | 9663 | Short Term | 743.0 | 2 years | Home Mortgage | 42277.0 | Debt Consolidation | $806.79 | 13.3 | 81.0 | 14 | 0 | 11990 | 16008 | 0.0 | 0.0 |
127 | 00205bb2-b17c-4001-84ac-c6d928e4eaa6 | 45aa9267-04f2-49b5-8991-7bb1ffc8f669 | Charged Off | 14834 | Short Term | 741.0 | 3 years | Home Mortgage | 98890.0 | Debt Consolidation | $2,398.08 | 16.0 | 12.0 | 8 | 0 | 5810 | 5736 | 0.0 | 0.0 |
134 | 00221528-22cc-422d-ace3-71a4c5a43e59 | 79791672-07e1-423e-bc3e-beecee96e4a6 | Charged Off | 3278 | Short Term | 733.0 | NaN | Rent | 39948.0 | Medical Bills | $581.90 | 21.5 | 78.0 | 10 | 0 | 10392 | 14042 | 0.0 | 0.0 |
151 | 0026a1b5-8a22-478d-a6fe-5931fa8ed673 | 88e7f72d-e0f0-423b-86bc-0b05f023ca7e | Charged Off | 11017 | Short Term | NaN | 2 years | Rent | NaN | Debt Consolidation | $1,280.23 | 11.7 | NaN | 9 | 0 | 7140 | 18788 | 0.0 | 0.0 |
152 | 0026a1b5-8a22-478d-a6fe-5931fa8ed673 | 88e7f72d-e0f0-423b-86bc-0b05f023ca7e | Charged Off | 11017 | Short Term | 740.0 | 2 years | Rent | 61206.0 | Debt Consolidation | $1,280.23 | 11.7 | NaN | 9 | 0 | 7140 | 18788 | 0.0 | 0.0 |
162 | 00291dfe-e40a-4791-acc8-a322ee3d9b72 | 835f15b4-ccea-46fb-9a1f-1ccda42ae737 | Charged Off | 6017 | Short Term | NaN | 4 years | Rent | NaN | other | $448.63 | 8.9 | NaN | 12 | 0 | 7263 | 19736 | 0.0 | 0.0 |
171 | 002ae016-b028-483d-91fa-ab351bd326ce | 2bd8d9b0-989f-40ae-be5a-9bee5cf645f5 | Charged Off | 16122 | Long Term | 699.0 | 10+ years | Rent | 45342.0 | Debt Consolidation | $1,031.53 | 20.4 | 52.0 | 16 | 1 | 7780 | 17327 | 0.0 | 0.0 |
176 | 002bd679-cca8-4637-9e21-f07117547ade | ce4134c2-3656-4b45-8a98-7c90ae95ef2e | Charged Off | 6840 | Short Term | NaN | < 1 year | Rent | NaN | Buy House | $2,208.06 | 26.4 | 60.0 | 20 | 0 | 7983 | 10951 | 0.0 | 0.0 |
204 | 0033afa0-9ee0-4b80-8583-84fd5df32cdd | cdfc5fc0-0107-4770-b66d-dcfac1f08a38 | Charged Off | 9931 | Long Term | 700.0 | NaN | Rent | 34560.0 | other | $743.04 | 23.0 | NaN | 9 | 0 | 3711 | 12011 | 0.0 | 0.0 |
226 | 003a3241-adbe-4337-b73a-5085c6834a74 | 08117601-9319-41e3-9fa6-6da917c70fa5 | Charged Off | 7253 | Short Term | 744.0 | < 1 year | Rent | 45329.0 | Debt Consolidation | $30.22 | 15.9 | 0.0 | 4 | 0 | 3356 | 10968 | NaN | 0.0 |
234 | 003b2a04-c602-4685-bbe1-b049c6d208c8 | c281b763-fd5e-4fe1-a294-27965ffba250 | Charged Off | 32889 | Short Term | 664.0 | 2 years | Rent | 188811.0 | Home Improvements | $2,722.02 | 8.0 | NaN | 8 | 0 | 3304 | 5715 | 0.0 | 0.0 |
252 | 0041f6e0-7b30-45db-b796-44fe0f5defa8 | c9046850-92e6-4024-8a43-a7bdab580065 | Charged Off | 1530 | Short Term | 672.0 | 8 years | Rent | 43856.0 | other | $1,180.45 | 16.0 | NaN | 11 | 0 | 12135 | 17638 | 0.0 | 0.0 |
261 | 00430797-c7fa-4cdc-b1f5-1f777de86d4a | e39d2d77-a389-44b9-a0bd-82f112bf3777 | Charged Off | 29061 | Long Term | 700.0 | 5 years | Rent | 66281.0 | Debt Consolidation | $1,231.72 | 14.5 | NaN | 13 | 0 | 37839 | 59402 | 0.0 | 0.0 |
263 | 00430b50-f36d-4abd-9e96-29998ee4150a | 29ebed6e-e0d7-444b-8f9d-af6727d6b5c0 | Charged Off | 14423 | Short Term | 719.0 | 2 years | Rent | 34354.0 | Debt Consolidation | $755.78 | 14.5 | NaN | 22 | 0 | 14608 | 16562 | 0.0 | 0.0 |
270 | 0044a2d5-3ffa-4103-bc5a-d1c9b77e64ac | 70d9d527-c3dc-45dc-9661-de97eb5ac94f | Charged Off | 16381 | Short Term | 709.0 | 10+ years | Home Mortgage | 60576.0 | Debt Consolidation | $1,049.98 | 20.4 | 36.0 | 8 | 1 | 11089 | 19592 | 1.0 | 0.0 |
271 | 0044a2d5-3ffa-4103-bc5a-d1c9b77e64ac | 70d9d527-c3dc-45dc-9661-de97eb5ac94f | Charged Off | 16356 | Short Term | 709.0 | 10+ years | Home Mortgage | 60486.0 | Debt Consolidation | $1,048.42 | 20.4 | 36.0 | 8 | 1 | 11073 | 19563 | 1.0 | 0.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
256735 | ffbc24f2-d57b-4196-9787-78016f0e435a | c01d9949-9ec5-429d-8064-e007e0b20c67 | Charged Off | 20723 | Long Term | 695.0 | 10+ years | Home Mortgage | 52785.0 | Debt Consolidation | $967.73 | 28.3 | NaN | 11 | 0 | 14730 | 17391 | 0.0 | 0.0 |
256742 | ffbe5e46-dd9b-4a5e-937a-cb173c667cfa | 3bebc2f0-8618-41b4-bd6c-b98bc48f096a | Charged Off | 30000 | Long Term | 686.0 | 10+ years | Own Home | 89109.0 | Debt Consolidation | $2,398.52 | 20.4 | 16.0 | 8 | 0 | 41401 | 48029 | 0.0 | 0.0 |
256747 | ffbf871d-5f1d-49e2-b5f7-f9a0995e5089 | 62a7273b-6bd6-44b0-9b3a-327754b063ec | Charged Off | 24502 | Long Term | 718.0 | < 1 year | Home Mortgage | 98009.0 | Debt Consolidation | $1,462.79 | 19.6 | NaN | 4 | 0 | 25823 | 28283 | 0.0 | 0.0 |
256750 | ffc07baf-5656-49df-80a6-d753a7a71bcf | fa57bad2-238f-4172-afa5-2fae4deff911 | Charged Off | 10225 | Long Term | 717.0 | 10+ years | Home Mortgage | 86913.0 | Home Improvements | $811.18 | 21.2 | 18.0 | 7 | 0 | 5502 | 6443 | 0.0 | 0.0 |
256760 | ffc3506e-4f06-4233-abb2-85efff9b412b | bd1310e2-e9e4-471f-bf8d-1f3e72e010d5 | Charged Off | 18580 | Long Term | 723.0 | 7 years | Rent | 47917.0 | Debt Consolidation | $690.81 | 29.5 | 5.0 | 12 | 0 | 15800 | 30620 | 0.0 | 0.0 |
256765 | ffc41c13-790b-43c6-8aa8-b4bb96ec53ce | 313e03f7-487c-4e66-9c65-e2fbd625baf1 | Charged Off | 30129 | Long Term | 690.0 | 10+ years | Own Home | 75323.0 | Debt Consolidation | $1,060.79 | 13.5 | NaN | 15 | 0 | 22114 | 37930 | 0.0 | 0.0 |
256781 | ffc782cf-549a-4b7b-b930-c5da2191154d | 899532d0-ae51-49b7-ba08-b66ead574374 | Charged Off | 8110 | Short Term | 727.0 | 1 year | Rent | 24331.0 | Debt Consolidation | $490.68 | 9.4 | NaN | 5 | 0 | 9817 | 17752 | 0.0 | 0.0 |
256793 | ffcaa10f-5560-4737-b72a-8fb4c91d63c9 | 13d36ee8-a841-43d5-be8e-60ce1e27bc22 | Charged Off | 7888 | Short Term | 712.0 | 10+ years | Own Home | 59160.0 | other | $739.50 | 20.8 | NaN | 14 | 2 | 5464 | 7784 | 2.0 | 0.0 |
256801 | ffcd5c4c-c146-450e-b535-3f8fee1bbaf4 | 27019cfd-28a8-4a94-8a2a-33bba21e37e9 | Charged Off | 20012 | Short Term | 719.0 | 2 years | Rent | 82705.0 | Debt Consolidation | $1,026.92 | 19.4 | NaN | 11 | 0 | 36471 | 80866 | 0.0 | 0.0 |
256804 | ffce37ae-4877-4316-b985-ca66e9942861 | 93330384-1e1d-4f78-8d33-766b40040231 | Charged Off | 10605 | Short Term | NaN | 10+ years | Home Mortgage | NaN | Debt Consolidation | $1,164.08 | 14.5 | NaN | 16 | 0 | 18313 | 21750 | 0.0 | 0.0 |
256807 | ffce7776-d6f5-43d7-8a51-766ddad34b8a | df8ba181-9a11-4f13-8fc0-2af26fc1fb48 | Charged Off | 1589 | Short Term | 726.0 | 1 year | Rent | 64167.0 | Debt Consolidation | $320.30 | 15.2 | NaN | 4 | 0 | 2745 | 6460 | 0.0 | 0.0 |
256822 | ffd1aa2b-4231-4cae-82fc-9f182e9471b4 | 3e1dd868-496c-4893-b323-3e908163aa6e | Charged Off | 17509 | Long Term | 676.0 | 2 years | Rent | 37392.0 | Debt Consolidation | $709.52 | 18.5 | NaN | 9 | 1 | 22965 | 23748 | 1.0 | 0.0 |
256864 | ffe07979-874c-4f31-a8c3-550acaa6922e | b36cdf08-5a97-4073-b4da-abeb56f9d0fa | Charged Off | 11569 | Short Term | 739.0 | NaN | Home Mortgage | 50427.0 | Debt Consolidation | $748.00 | 11.0 | 71.0 | 10 | 0 | 6587 | 10292 | 0.0 | 0.0 |
256879 | ffe48901-0341-436b-bc10-8bb50b0e2b7f | 1f942698-0001-4676-b3f6-8b21bd7ca191 | Charged Off | 15352 | Short Term | 653.0 | 8 years | Rent | 300030.0 | Business Loan | $3,200.32 | 13.7 | NaN | 9 | 2 | 6798 | 8898 | 0.0 | 2.0 |
256894 | ffea733e-2ec3-45cd-ba92-ccf97037f5e6 | 869118c0-5868-4acc-8bd2-7ccfc8695cc3 | Charged Off | 35025 | Long Term | 652.0 | 6 years | Home Mortgage | 78055.0 | Home Improvements | $780.55 | 10.8 | NaN | 9 | 0 | 14180 | 19425 | 0.0 | 0.0 |
256901 | ffebcaee-5c82-4557-87dd-01f4b8ffc932 | 6a8860b3-b2eb-49c5-942d-2ef276baee92 | Charged Off | 20260 | Long Term | 718.0 | 9 years | Home Mortgage | 65845.0 | Debt Consolidation | $905.37 | 11.6 | NaN | 18 | 0 | 31082 | 41005 | 0.0 | 0.0 |
256903 | ffebec60-7520-43d6-a392-1f5c30b6d07a | 9b2098ea-7ac4-4479-a524-ce208004c70f | Charged Off | 16201 | Long Term | 699.0 | NaN | Own Home | 35221.0 | Debt Consolidation | $1,091.84 | 32.9 | NaN | 19 | 0 | 31101 | 88104 | 0.0 | 0.0 |
256911 | ffedd0f5-9ce5-4136-a72d-78e7310deea0 | 64278634-4e52-4b4d-beb7-d90f533a77a4 | Charged Off | 12089 | Short Term | 712.0 | 4 years | Rent | 110814.0 | Debt Consolidation | $1,708.38 | 11.4 | 25.0 | 10 | 0 | 6925 | 9978 | 0.0 | 0.0 |
256924 | fff08eca-26b5-4d02-b618-c1cf46736896 | 09106d18-88cd-401a-ab07-9c93599138c6 | Charged Off | 24595 | Long Term | 669.0 | 1 year | Home Mortgage | 59028.0 | Debt Consolidation | $855.91 | 21.0 | 21.0 | 10 | 0 | 17282 | 18040 | 0.0 | 0.0 |
256928 | fff142bd-1430-4bb9-9eac-a5b45436a5b4 | 819e9ff5-f8d4-4c6f-b5c1-9d274b93f159 | Charged Off | 24658 | Long Term | NaN | 9 years | Rent | NaN | Debt Consolidation | $1,060.27 | 20.1 | 21.0 | 12 | 0 | 3718 | 18317 | 0.0 | 0.0 |
256929 | fff142bd-1430-4bb9-9eac-a5b45436a5b4 | 819e9ff5-f8d4-4c6f-b5c1-9d274b93f159 | Charged Off | 24503 | Long Term | 696.0 | 9 years | Rent | 98010.0 | Debt Consolidation | $1,053.61 | 20.1 | 21.0 | 12 | 0 | 3695 | 18201 | 0.0 | 0.0 |
256930 | fff142bd-1430-4bb9-9eac-a5b45436a5b4 | 819e9ff5-f8d4-4c6f-b5c1-9d274b93f159 | Charged Off | 24658 | Long Term | 696.0 | 9 years | Rent | 98630.0 | Debt Consolidation | $1,060.27 | 20.1 | 21.0 | 12 | 0 | 3718 | 18317 | 0.0 | 0.0 |
256933 | fff1c0cc-49ce-47f6-a0c1-e74a7d43fbe1 | cb142142-3ff3-4ffe-87ed-4c0dd9fabb67 | Charged Off | 9681 | Short Term | 732.0 | 8 years | Home Mortgage | 33759.0 | Other | $953.68 | 16.0 | NaN | 17 | 0 | 24900 | 29857 | 0.0 | 0.0 |
256937 | fff2779e-9fe7-4db5-b566-48f43b58aa68 | 7ddd16c2-bf65-4b01-8b48-bbda8793c206 | Charged Off | 9976 | Short Term | 733.0 | 1 year | Home Mortgage | 124700.0 | Debt Consolidation | $1,340.53 | 10.4 | 16.0 | 16 | 0 | 6539 | 17074 | 0.0 | 0.0 |
256947 | fff5f918-967f-4cae-b8e8-0ec41b9d0a69 | 92093f79-1142-496c-8aef-0bc152ed8dd9 | Charged Off | 20658 | Short Term | 745.0 | 10+ years | Rent | 45963.0 | Debt Consolidation | $792.86 | 17.4 | NaN | 10 | 0 | 13192 | 36143 | 0.0 | 0.0 |
256949 | fff61503-051a-4bb8-8342-287939fa738a | b36620f7-d086-45c4-bfdc-b14c2187d0c6 | Charged Off | 20492 | Short Term | 722.0 | 5 years | Rent | 52511.0 | Debt Consolidation | $1,163.99 | 19.0 | NaN | 12 | 0 | 18732 | 24391 | 0.0 | 0.0 |
256966 | fffc297d-74f0-4d42-ad90-cdea2bd9aac0 | b8cae6f0-6c9b-4982-bef2-e3cdf48add59 | Charged Off | 11524 | Short Term | 725.0 | 6 years | Home Mortgage | 50990.0 | Debt Consolidation | $790.35 | 21.4 | NaN | 23 | 0 | 12818 | 19160 | 0.0 | 0.0 |
256974 | fffe36ec-97a6-460a-a236-640b7e0669f0 | 6520de3e-5961-47b1-bced-6a8132acb5cc | Charged Off | 10575 | Long Term | 722.0 | < 1 year | Rent | 30854.0 | Debt Consolidation | $707.08 | 14.8 | NaN | 18 | 0 | 20183 | 23496 | 0.0 | 0.0 |
256978 | fffe84cf-67a1-468d-ab7b-cf06a9c4ee14 | 460c6cf5-4d12-4e3c-a3d0-46dd67c78cce | Charged Off | 11953 | Short Term | 717.0 | 10+ years | Home Mortgage | 39844.0 | Debt Consolidation | $982.82 | 11.7 | NaN | 9 | 1 | 4176 | 4783 | 1.0 | 0.0 |
256982 | ffffcb2e-e48e-4d2c-a0d6-ed6bce5bfdbe | 971a6682-183b-4a52-8bce-1d3429ade295 | Charged Off | 12116 | Short Term | 746.0 | 9 years | Home Mortgage | 52504.0 | Debt Consolidation | $297.96 | 15.1 | 82.0 | 8 | 0 | 3315 | 20090 | 0.0 | 0.0 |
24674 rows × 19 columns
df['Loan Status'].unique()
array(['Fully Paid', 'Charged Off'], dtype=object)
no_dup[no_dup['Loan ID']=='0000afa6-8902-4f8f-b870-25a8fdad0aeb']
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6 | 0000afa6-8902-4f8f-b870-25a8fdad0aeb | e49c1a82-a0f7-45e8-9f46-2f75c43f9fbc | Charged Off | 24613 | Long Term | 6640.0 | 6 years | Rent | 49225.0 | Business Loan | $542.29 | 17.6 | 73.0 | 7 | 0 | 14123 | 16954 | 0.0 | 0.0 |
7 | 0000afa6-8902-4f8f-b870-25a8fdad0aeb | e49c1a82-a0f7-45e8-9f46-2f75c43f9fbc | Charged Off | 24613 | Long Term | NaN | 6 years | Rent | NaN | Business Loan | $542.29 | 17.6 | 73.0 | 7 | 0 | 14123 | 16954 | 0.0 | 0.0 |
no_dup['null_count'] = no_dup.isnull().sum(axis=1)
no_dup = no_dup.sort_values('null_count', ascending=True).drop('null_count', axis=1)
no_dup.head(20)
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 000025bb-5694-4cff-b17d-192b1a98ba44 | 5ebc8bb1-5eb9-4404-b11b-a6eebc401a19 | Fully Paid | 11520 | Short Term | 741.0 | 10+ years | Home Mortgage | 33694.0 | Debt Consolidation | $584.03 | 12.3 | 41.0 | 10 | 0 | 6760 | 16056 | 0.0 | 0.0 |
128265 | 7f99634c-cc18-4667-92b4-b9e887b7c296 | e52aac79-e3f0-44cf-b538-539c0a01c849 | Fully Paid | 7055 | Short Term | 742.0 | 10+ years | Home Mortgage | 90711.0 | Debt Consolidation | $1,322.87 | 31.9 | 11.0 | 21 | 0 | 18290 | 19752 | 0.0 | 0.0 |
128263 | 7f99378c-b36e-4711-9af2-33c7239c00e7 | 32bdbd23-ca78-4c14-a0da-afc2a4e7a33a | Fully Paid | 99999999 | Long Term | 694.0 | 8 years | Home Mortgage | 110549.0 | Debt Consolidation | $1,824.06 | 21.6 | 13.0 | 18 | 0 | 40907 | 52513 | 0.0 | 0.0 |
128259 | 7f96d054-5de3-4f5f-9a21-d54ee9af703b | 784a48cb-96ee-494c-aa7b-eb388fbf3003 | Fully Paid | 2745 | Short Term | 743.0 | 2 years | Home Mortgage | 34318.0 | Debt Consolidation | $506.18 | 16.5 | 68.0 | 6 | 0 | 5429 | 10501 | 0.0 | 0.0 |
128256 | 7f9558b3-4ed7-4fe0-8662-8021c53c0016 | 0971186f-b0a7-4d11-ba55-4b41348673a2 | Fully Paid | 3008 | Short Term | 750.0 | 2 years | Rent | 37104.0 | Debt Consolidation | $556.55 | 13.8 | 34.0 | 11 | 0 | 732 | 12621 | 0.0 | 0.0 |
128254 | 7f95588d-57ba-4e00-988e-6bca2f2d32ec | 3faf854b-3530-4922-96d7-92d95ad9ded0 | Charged Off | 7077 | Short Term | 7400.0 | 3 years | Home Mortgage | 47179.0 | Debt Consolidation | $1,336.74 | 12.9 | 5.0 | 11 | 0 | 16689 | 18879 | 0.0 | 0.0 |
128253 | 7f943f01-36ea-4660-81da-82a2a3ae333f | f07fc1c0-6ceb-4bf2-8066-99d485d21eee | Fully Paid | 7267 | Short Term | 717.0 | 10+ years | Home Mortgage | 78032.0 | Debt Consolidation | $994.91 | 22.4 | 55.0 | 12 | 1 | 6919 | 7775 | 1.0 | 0.0 |
128251 | 7f943e2c-33e5-444a-bfce-ff83b0d44036 | b3173654-f680-4d83-9e18-451acd7cdd2d | Charged Off | 23522 | Long Term | 6910.0 | 10+ years | Home Mortgage | 70567.0 | Debt Consolidation | $882.09 | 13.1 | 17.0 | 12 | 0 | 6401 | 21196 | 0.0 | 0.0 |
128244 | 7f901e9c-0e09-4561-9b4d-949ef88a32f0 | 20f4d09f-f271-4e31-a23f-77c4f7ac82bf | Fully Paid | 19554 | Short Term | 740.0 | 10+ years | Rent | 53774.0 | Debt Consolidation | $1,030.66 | 29.0 | 14.0 | 10 | 0 | 26450 | 59706 | 0.0 | 0.0 |
128242 | 7f8fea60-ef3a-409b-84b8-f0f9e501293b | f3c67c03-d825-451d-a1b1-a3c9ec3757a3 | Fully Paid | 99999999 | Short Term | 745.0 | 6 years | Home Mortgage | 55258.0 | other | $915.90 | 25.0 | 81.0 | 9 | 0 | 16559 | 19760 | 0.0 | 0.0 |
128241 | 7f8fdf3e-54e6-46c6-b9e7-8578d376438b | fe9a45e1-d046-454d-8373-8e6597036316 | Fully Paid | 17707 | Short Term | 743.0 | 10+ years | Home Mortgage | 79434.0 | Debt Consolidation | $807.58 | 26.9 | 36.0 | 10 | 0 | 15111 | 23501 | 0.0 | 0.0 |
128239 | 7f8ede6b-7b07-4ef6-b32f-5de56b39ad90 | 047bcc6d-cf8a-448c-aba9-80fe88ab9e37 | Fully Paid | 10026 | Short Term | 731.0 | 1 year | Home Mortgage | 65169.0 | Debt Consolidation | $667.98 | 20.8 | 40.0 | 13 | 0 | 17769 | 28160 | 0.0 | 0.0 |
128266 | 7f997992-62a9-4c5f-a611-68ad0f655469 | c831a6d5-d78f-42ca-8592-db7ce4a0ee62 | Fully Paid | 99999999 | Long Term | 694.0 | 10+ years | Own Home | 864258.0 | Debt Consolidation | $5,912.97 | 18.4 | 6.0 | 21 | 0 | 63466 | 93746 | 0.0 | 0.0 |
128234 | 7f8de056-3a6d-4ace-901b-27fb473e04a4 | d4cc2de5-c43f-4941-bedb-5178e26d4b9a | Fully Paid | 9621 | Short Term | 740.0 | 10+ years | Rent | 64731.0 | Debt Consolidation | $760.59 | 23.9 | 12.0 | 11 | 0 | 5243 | 11625 | 0.0 | 0.0 |
128228 | 7f8bd8c2-c195-4425-8e27-a8b0f390b386 | 0562ab32-f8cd-4334-963b-16d6352168c3 | Fully Paid | 7806 | Short Term | 732.0 | 3 years | Home Mortgage | 29742.0 | Debt Consolidation | $401.52 | 9.5 | 13.0 | 12 | 0 | 5143 | 21703 | 0.0 | 0.0 |
128227 | 7f8b8fbf-37b7-4ee1-a873-7c85dfcd1f2f | 492f37b2-91e6-4e81-b9f3-5cce922c758e | Fully Paid | 11121 | Short Term | 732.0 | 2 years | Rent | 49034.0 | Debt Consolidation | $976.59 | 13.3 | 78.0 | 9 | 0 | 19484 | 31941 | 0.0 | 0.0 |
128226 | 7f8ababf-72ab-4119-b1b6-428b336c69df | 0b6cf0a4-baa3-4082-a6c8-3f2215e41611 | Fully Paid | 7429 | Short Term | 741.0 | 10+ years | Home Mortgage | 54478.0 | Debt Consolidation | $1,457.27 | 32.5 | 40.0 | 21 | 2 | 8296 | 32535 | 0.0 | 2.0 |
128217 | 7f8a04ad-214a-474f-b6bd-aeafd2acab8c | ebcec38c-a064-404b-8bc7-16dc616f5920 | Fully Paid | 11461 | Short Term | 750.0 | 10+ years | Own Home | 102330.0 | Other | $1,227.96 | 38.5 | 8.0 | 12 | 0 | 61587 | 249341 | 0.0 | 0.0 |
128215 | 7f89d5df-1397-431f-8075-f97d7010e932 | 30d2930b-6d86-4d27-afb1-ae7fcd61d2ec | Fully Paid | 99999999 | Short Term | 743.0 | 5 years | Rent | 58277.0 | Debt Consolidation | $845.01 | 14.4 | 18.0 | 10 | 0 | 10417 | 21932 | 0.0 | 0.0 |
128212 | 7f893d8b-d5b1-42ee-856b-1036b81642ad | dc92390e-6486-4c28-929a-3a48ff48037f | Fully Paid | 9909 | Short Term | 740.0 | 4 years | Own Home | 43600.0 | Debt Consolidation | $1,046.39 | 13.8 | 28.0 | 11 | 0 | 5353 | 11389 | 0.0 | 0.0 |
no_dup.drop_duplicates(subset = 'Loan ID',keep = 'first',inplace = True)
no_dup[no_dup['Loan ID'].duplicated()==True]
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens |
---|
Analyze the new dataframe with duplicates removed
no_dup.info()
<class 'pandas.core.frame.DataFrame'> Int64Index: 215700 entries, 0 to 25243 Data columns (total 19 columns): Loan ID 215700 non-null object Customer ID 215700 non-null object Loan Status 215700 non-null object Current Loan Amount 215700 non-null int64 Term 215700 non-null object Credit Score 171202 non-null float64 Years in current job 206710 non-null object Home Ownership 215700 non-null object Annual Income 171202 non-null float64 Purpose 215700 non-null object Monthly Debt 215700 non-null object Years of Credit History 215700 non-null float64 Months since last delinquent 97438 non-null float64 Number of Open Accounts 215700 non-null int64 Number of Credit Problems 215700 non-null int64 Current Credit Balance 215700 non-null int64 Maximum Open Credit 215700 non-null object Bankruptcies 215248 non-null float64 Tax Liens 215678 non-null float64 dtypes: float64(6), int64(4), object(9) memory usage: 32.9+ MB
#looking at the null values in my dataframe
no_dup.isnull().sum()
Loan ID 0 Customer ID 0 Loan Status 0 Current Loan Amount 0 Term 0 Credit Score 44498 Years in current job 8990 Home Ownership 0 Annual Income 44498 Purpose 0 Monthly Debt 0 Years of Credit History 0 Months since last delinquent 118262 Number of Open Accounts 0 Number of Credit Problems 0 Current Credit Balance 0 Maximum Open Credit 0 Bankruptcies 452 Tax Liens 22 dtype: int64
no_dup.describe()
Current Loan Amount | Credit Score | Annual Income | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Bankruptcies | Tax Liens | |
---|---|---|---|---|---|---|---|---|---|---|
count | 2.157000e+05 | 171202.000000 | 1.712020e+05 | 215700.000000 | 97438.000000 | 215700.000000 | 215700.000000 | 2.157000e+05 | 215248.000000 | 215678.000000 |
mean | 1.633515e+07 | 1035.265114 | 7.280452e+04 | 18.359758 | 35.031949 | 11.077566 | 0.155424 | 1.545723e+04 | 0.110556 | 0.026331 |
std | 3.695307e+07 | 1377.767153 | 5.730483e+04 | 7.057999 | 21.810714 | 4.971825 | 0.457972 | 1.975044e+04 | 0.336853 | 0.240853 |
min | 5.050000e+02 | 585.000000 | 0.000000e+00 | 3.400000 | 0.000000 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000 | 0.000000 |
25% | 8.328000e+03 | 715.000000 | 4.476600e+04 | 13.600000 | 17.000000 | 8.000000 | 0.000000 | 5.932000e+03 | 0.000000 | 0.000000 |
50% | 1.464200e+04 | 733.000000 | 6.210500e+04 | 17.000000 | 32.000000 | 10.000000 | 0.000000 | 1.104200e+04 | 0.000000 | 0.000000 |
75% | 2.520800e+04 | 743.000000 | 8.779200e+04 | 21.800000 | 51.000000 | 14.000000 | 0.000000 | 1.932400e+04 | 0.000000 | 0.000000 |
max | 1.000000e+08 | 7510.000000 | 8.713547e+06 | 70.500000 | 176.000000 | 76.000000 | 11.000000 | 1.731412e+06 | 7.000000 | 11.000000 |
Fill in missing values
filledScore = no_dup[no_dup['Credit Score'].notnull()]
plt.hist(filledScore['Credit Score'], bins = 50)
plt.show()
df_cleaned =no_dup.copy()
df_cleaned['Credit Score']= no_dup['Credit Score'].fillna(no_dup['Credit Score'].median())
df_cleaned.head()
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 000025bb-5694-4cff-b17d-192b1a98ba44 | 5ebc8bb1-5eb9-4404-b11b-a6eebc401a19 | Fully Paid | 11520 | Short Term | 741.0 | 10+ years | Home Mortgage | 33694.0 | Debt Consolidation | $584.03 | 12.3 | 41.0 | 10 | 0 | 6760 | 16056 | 0.0 | 0.0 |
128265 | 7f99634c-cc18-4667-92b4-b9e887b7c296 | e52aac79-e3f0-44cf-b538-539c0a01c849 | Fully Paid | 7055 | Short Term | 742.0 | 10+ years | Home Mortgage | 90711.0 | Debt Consolidation | $1,322.87 | 31.9 | 11.0 | 21 | 0 | 18290 | 19752 | 0.0 | 0.0 |
128263 | 7f99378c-b36e-4711-9af2-33c7239c00e7 | 32bdbd23-ca78-4c14-a0da-afc2a4e7a33a | Fully Paid | 99999999 | Long Term | 694.0 | 8 years | Home Mortgage | 110549.0 | Debt Consolidation | $1,824.06 | 21.6 | 13.0 | 18 | 0 | 40907 | 52513 | 0.0 | 0.0 |
128259 | 7f96d054-5de3-4f5f-9a21-d54ee9af703b | 784a48cb-96ee-494c-aa7b-eb388fbf3003 | Fully Paid | 2745 | Short Term | 743.0 | 2 years | Home Mortgage | 34318.0 | Debt Consolidation | $506.18 | 16.5 | 68.0 | 6 | 0 | 5429 | 10501 | 0.0 | 0.0 |
128256 | 7f9558b3-4ed7-4fe0-8662-8021c53c0016 | 0971186f-b0a7-4d11-ba55-4b41348673a2 | Fully Paid | 3008 | Short Term | 750.0 | 2 years | Rent | 37104.0 | Debt Consolidation | $556.55 | 13.8 | 34.0 | 11 | 0 | 732 | 12621 | 0.0 | 0.0 |
df_cleaned['Years in current job'].unique()
array(['10+ years', '8 years', '2 years', '3 years', '6 years', '1 year', '5 years', '4 years', '7 years', '< 1 year', '9 years', nan], dtype=object)
df_cleaned['Years in current job'].mode()
0 10+ years dtype: object
df_cleaned[df_cleaned['Years in current job'].isnull()==True]
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
173280 | ac42556a-621c-4efe-accd-81b0f8b88763 | a2a872f7-7f3c-42c1-b293-c47d1c8dcaf1 | Fully Paid | 14413 | Short Term | 742.0 | NaN | Home Mortgage | 62056.0 | Home Improvements | $1,054.95 | 44.3 | 40.0 | 11 | 0 | 15763 | 20209 | 0.0 | 0.0 |
173534 | ac805fbc-eed9-4358-acfe-0483ab2bdc4f | 5e18f2ba-6b6b-4d8d-8d17-5e7505f5f2fc | Fully Paid | 99999999 | Short Term | 743.0 | NaN | Rent | 64333.0 | Take a Trip | $456.23 | 20.3 | 35.0 | 7 | 0 | 4361 | 8954 | 0.0 | 0.0 |
174092 | ad06660e-ea3c-4065-9a5d-fb5a48a844af | f3f9b937-67ba-4e75-8090-131bc7f8cbb7 | Fully Paid | 99999999 | Short Term | 748.0 | NaN | Home Mortgage | 87084.0 | Debt Consolidation | $957.92 | 17.0 | 27.0 | 17 | 0 | 12442 | 57869 | 0.0 | 0.0 |
26896 | 1ae0e1f4-60c6-4057-b7f0-ff7d658c9d8d | 80b73ac5-4596-4fdd-a8b6-c5fc6a549100 | Charged Off | 11797 | Short Term | 746.0 | NaN | Own Home | 49155.0 | Debt Consolidation | $1,151.04 | 27.0 | 61.0 | 10 | 0 | 8812 | 20930 | 0.0 | 0.0 |
173682 | aca2383d-d3cb-46e7-97bd-bc0805105db0 | fa638ab0-0f90-40a2-8538-bd6822d4fb13 | Fully Paid | 99999999 | Short Term | 721.0 | NaN | Rent | 63104.0 | Debt Consolidation | $678.36 | 20.0 | 21.0 | 8 | 1 | 8763 | 20145 | 1.0 | 0.0 |
173329 | ac4f9423-eb2c-490f-94a5-0c1effa6a797 | 9475ae94-3944-41b0-a533-d925d58e4f23 | Fully Paid | 4080 | Short Term | 676.0 | NaN | Rent | 45900.0 | Business Loan | $284.58 | 27.1 | 61.0 | 7 | 1 | 4969 | 13359 | 1.0 | 0.0 |
26937 | 1aed3d3c-2900-4d7c-9d60-9c91c1bb950a | 37d905fe-cbc7-4df3-bf38-deff00d68cf6 | Fully Paid | 2371 | Short Term | 744.0 | NaN | Rent | 8179.0 | Debt Consolidation | $55.35 | 23.5 | 34.0 | 4 | 0 | 1898 | 12822 | 0.0 | 0.0 |
173448 | ac6c1ced-3d87-4e5a-937a-30087175e003 | 3ebc9323-fd95-4e37-aed2-1e233dbe1b32 | Charged Off | 9836 | Short Term | 7000.0 | NaN | Home Mortgage | 39344.0 | Debt Consolidation | $439.34 | 22.0 | 0.0 | 10 | 0 | 5255 | 21450 | 0.0 | 0.0 |
173426 | ac68d02f-03c0-47bd-9913-4072a6c41bb6 | 902b24c3-47bc-4a24-adee-cc2b2fc4b0ec | Fully Paid | 99999999 | Short Term | 735.0 | NaN | Home Mortgage | 82002.0 | Business Loan | $1,366.70 | 25.8 | 32.0 | 12 | 1 | 14573 | 24826 | 1.0 | 0.0 |
173472 | ac732b14-0923-40ee-adc3-5ebf40547c94 | 38414da0-02e4-495e-a681-05d85bf2eec0 | Fully Paid | 15302 | Short Term | 742.0 | NaN | Home Mortgage | 65286.0 | Debt Consolidation | $935.77 | 21.5 | 47.0 | 10 | 0 | 45585 | 51802 | 0.0 | 0.0 |
173367 | ac584085-6307-4dc2-8ed5-4cf00a6dfd3c | 93b132cd-104b-4792-8c2f-a28017e87a3e | Fully Paid | 99999999 | Short Term | 731.0 | NaN | Home Mortgage | 28035.0 | Debt Consolidation | $523.33 | 20.8 | 26.0 | 12 | 1 | 3080 | 10441 | 0.0 | 0.0 |
173545 | ac838e64-261b-411f-90b0-163c9fef02ac | cf2e394a-863c-4a7c-a51d-a6933db8aa01 | Fully Paid | 20146 | Short Term | 739.0 | NaN | Home Mortgage | 71090.0 | Debt Consolidation | $493.49 | 47.9 | 31.0 | 6 | 1 | 3598 | 11682 | 1.0 | 0.0 |
174068 | ad02d96e-b2a2-47ec-8e6f-4db0a77b90f6 | 0f65afd7-d784-4cd0-886f-0f66bbe7ec06 | Fully Paid | 5977 | Short Term | 737.0 | NaN | Home Mortgage | 51404.0 | Debt Consolidation | $582.58 | 19.9 | 16.0 | 8 | 0 | 9561 | 12257 | 0.0 | 0.0 |
174080 | ad04a669-3177-450c-adab-f52050f87228 | 5d05676e-9178-4141-be2e-686346c8ec1d | Fully Paid | 15682 | Short Term | 700.0 | NaN | Rent | 73946.0 | Debt Consolidation | $1,429.61 | 28.6 | 32.0 | 9 | 0 | 9543 | 10487 | 0.0 | 0.0 |
26952 | 1aef5ca7-71a1-4c6e-af48-3126f0625120 | 4b9dfe28-69ed-41d6-9331-7bdbfe3956cc | Fully Paid | 2928 | Short Term | 746.0 | NaN | Home Mortgage | 24398.0 | Home Improvements | $82.95 | 23.1 | 31.0 | 6 | 1 | 2208 | 10368 | 1.0 | 0.0 |
26978 | 1af65a97-17d8-432f-b8b4-17096aa50511 | ad99750e-961d-4e30-8bda-fb2ce53bae3a | Fully Paid | 99999999 | Short Term | 724.0 | NaN | Home Mortgage | 51903.0 | Debt Consolidation | $648.79 | 14.6 | 56.0 | 8 | 0 | 1950 | 3234 | 0.0 | 0.0 |
174036 | acfb2377-afe4-49ea-8713-3b778ff88d0a | c8ad7a90-20ab-48a4-b727-4bbe02afca54 | Charged Off | 28003 | Long Term | 644.0 | NaN | Rent | 67978.0 | Other | $1,648.47 | 20.1 | 1.0 | 14 | 0 | 32953 | 40089 | 0.0 | 0.0 |
175793 | aec1ac05-fa56-46e7-b56a-cf99f54c10ec | 703e03dd-4975-44bb-9c5a-c829e66a426e | Charged Off | 4996 | Short Term | 714.0 | NaN | Home Mortgage | 21707.0 | Debt Consolidation | $464.90 | 13.6 | 76.0 | 8 | 2 | 7276 | 11987 | 0.0 | 0.0 |
248985 | f7f98052-5983-490f-87ef-9dad239d8218 | cbd0ec21-01e5-460f-b4b9-1ca2f9004992 | Fully Paid | 99999999 | Short Term | 701.0 | NaN | Home Mortgage | 44222.0 | other | $420.11 | 32.1 | 23.0 | 10 | 0 | 16845 | 21350 | 0.0 | 0.0 |
175826 | aec8a60c-8ce9-4fb1-a010-0e345ea29b82 | 291d3bc3-fc82-4828-9a35-acbaab037520 | Fully Paid | 7800 | Short Term | 720.0 | NaN | Own Home | 68250.0 | Debt Consolidation | $1,302.43 | 26.0 | 18.0 | 7 | 0 | 25542 | 29631 | 0.0 | 0.0 |
2505 | 0291642a-3431-42d9-91d2-f840e7edcef0 | 33192245-25ba-4ca5-b19b-733bd54073b1 | Fully Paid | 15899 | Short Term | 733.0 | NaN | Home Mortgage | 70553.0 | Debt Consolidation | $391.57 | 25.7 | 18.0 | 12 | 0 | 14592 | 20757 | 0.0 | 0.0 |
175933 | aee5937a-0d6f-408c-a608-eb0f2ca1fb07 | ac49f838-1aff-4066-8b95-66f7aa79a4d1 | Fully Paid | 5125 | Short Term | 719.0 | NaN | Rent | 39210.0 | Other | $229.70 | 31.8 | 41.0 | 9 | 1 | 1186 | 9563 | 1.0 | 0.0 |
248932 | f7ec5f42-545d-47f0-9122-ed7de3c98ae5 | 154a15d1-7467-4f93-af69-b139c470239f | Fully Paid | 5765 | Short Term | 744.0 | NaN | Home Mortgage | 51015.0 | Debt Consolidation | $880.01 | 39.6 | 27.0 | 21 | 0 | 44922 | 123751 | 0.0 | 0.0 |
175919 | aee07838-218a-4711-8959-2ed1833027a1 | 1ee0c5d8-4232-4a50-8e56-be35d208c57f | Fully Paid | 11942 | Short Term | 739.0 | NaN | Home Mortgage | 89568.0 | other | $1,492.80 | 25.5 | 18.0 | 8 | 0 | 34856 | 283381 | 0.0 | 0.0 |
249066 | f80d7079-1cff-4670-a829-f2f7a9ac571f | e1dfc891-91d4-4d73-9e42-b914eb293189 | Fully Paid | 12011 | Short Term | 738.0 | NaN | Home Mortgage | 58921.0 | Debt Consolidation | $819.99 | 20.9 | 18.0 | 11 | 0 | 15807 | 23076 | 0.0 | 0.0 |
249095 | f8141615-1977-4e65-98e3-4b873efe077e | ac19299a-9247-4032-80df-6ed7a0684573 | Fully Paid | 3991 | Short Term | 734.0 | NaN | Own Home | 14975.0 | Buy a Car | $49.92 | 17.0 | 44.0 | 4 | 0 | 738 | 23818 | 0.0 | 0.0 |
175635 | ae9909fc-3fd1-4cc2-b60d-39b5dc7739de | 6c3f8d54-e607-4b8e-906a-217946623698 | Charged Off | 7838 | Short Term | 720.0 | NaN | Home Mortgage | 37866.0 | Debt Consolidation | $798.34 | 48.5 | 57.0 | 9 | 1 | 3084 | 6295 | 1.0 | 0.0 |
26152 | 1a1a418d-fe50-43c5-a415-52ec77aada88 | 0e7c0a01-244a-43e1-9d71-d8d1237ac7d2 | Fully Paid | 99999999 | Short Term | 715.0 | NaN | Rent | 15321.0 | Debt Consolidation | $90.90 | 11.3 | 25.0 | 6 | 0 | 2719 | 2859 | 0.0 | 0.0 |
176405 | af5a650b-af73-41fa-a9a0-86e480e1aa84 | 18918911-3c03-4f3f-a679-a3d30880a261 | Fully Paid | 9846 | Short Term | 734.0 | NaN | Own Home | 25600.0 | Debt Consolidation | $509.86 | 18.1 | 14.0 | 7 | 0 | 5848 | 7776 | 0.0 | 0.0 |
26075 | 1a07527b-d426-4df9-a90d-561a0e77e121 | 1af45f46-23b3-44c2-918f-d3ea94de0ecd | Charged Off | 23402 | Short Term | 7260.0 | NaN | Own Home | 63382.0 | Debt Consolidation | $1,061.64 | 45.6 | 57.0 | 8 | 0 | 29071 | 31092 | 0.0 | 0.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
210653 | d199168d-74e1-4a91-8977-4a92abd3b7e6 | ff0ec39b-ed73-4621-b3d0-37ca5ebdd781 | Fully Paid | 4660 | Short Term | 733.0 | NaN | Home Mortgage | NaN | Debt Consolidation | $410.56 | 24.6 | NaN | 9 | 1 | 4027 | 6601 | 0.0 | 1.0 |
196939 | c3d122fb-82f9-49aa-a53b-fcaa0c5005cc | db5157da-7f78-467e-9ad6-ac59a02ba37a | Fully Paid | 4021 | Short Term | 733.0 | NaN | Home Mortgage | NaN | Debt Consolidation | $714.70 | 17.6 | NaN | 10 | 1 | 7124 | 12054 | 1.0 | 0.0 |
73942 | 49754fb6-c9f7-47e1-82bc-8e36a0601e34 | d7568176-b747-45ac-a37c-5f1ac79a3e88 | Fully Paid | 12289 | Long Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $410.03 | 8.9 | NaN | 13 | 0 | 4700 | 13663 | 0.0 | 0.0 |
183394 | b65c390f-f654-4fd3-aee3-0d5a90ceaa57 | cdadacf2-762c-4d72-9b71-45d8daa3b57f | Fully Paid | 10664 | Short Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $554.43 | 16.0 | NaN | 14 | 0 | 17322 | 27760 | 0.0 | 0.0 |
5598 | 0598ffd1-01b8-4cf8-a189-1aa921e04cc9 | 5b07ab32-4213-4641-8848-8b8909e978f9 | Fully Paid | 4052 | Short Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $196.00 | 11.0 | NaN | 6 | 0 | 5491 | 7595 | 0.0 | 0.0 |
121922 | 792b4481-31b8-4ab8-82d3-0a884d97b3ef | a671bd7f-633b-4dd5-8e8d-e6857f71f26f | Fully Paid | 28398 | Long Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $1,899.93 | 42.4 | NaN | 16 | 0 | 28759 | 43182 | 0.0 | 0.0 |
188655 | bba2d454-4163-4a97-a951-ee7a1c6b0671 | 40b9912e-5fff-4da0-b09a-6a7cd7fe4b64 | Fully Paid | 9996 | Short Term | 733.0 | NaN | Home Mortgage | NaN | Debt Consolidation | $823.00 | 24.4 | NaN | 16 | 0 | 8674 | 37386 | 0.0 | 0.0 |
5602 | 059a48b2-92a0-4419-92b9-58e0c76a73d7 | 034ddee5-6b1f-4967-a482-3992d25180d0 | Charged Off | 19548 | Short Term | 733.0 | NaN | Home Mortgage | NaN | Debt Consolidation | $626.76 | 29.0 | NaN | 11 | 0 | 21502 | 36819 | 0.0 | 0.0 |
134137 | 85670bd6-c686-48ca-b2f6-0d54d0eafb3b | 2af61bd7-d7dc-4c1d-846f-f8e9be45c420 | Fully Paid | 1803 | Short Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $295.44 | 19.7 | NaN | 11 | 0 | 14797 | 26236 | 0.0 | 0.0 |
225482 | e04e59f3-45da-41b8-9103-33d9f78287ee | 08a02c9a-4dc1-4c1f-91fa-e2782132551c | Charged Off | 5042 | Short Term | 733.0 | NaN | Home Mortgage | NaN | other | $1,329.27 | 13.4 | NaN | 13 | 0 | 21420 | 21924 | 0.0 | 0.0 |
192595 | bf86c30a-6bc9-4c3e-92e4-d0f28ddf3c3f | 5e6e1943-cb63-4e56-8850-7b45945b3574 | Fully Paid | 15428 | Short Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $659.17 | 22.1 | NaN | 7 | 0 | 9843 | 11236 | 0.0 | 0.0 |
83235 | 52a7f035-1628-4d2f-acfb-0e18102900c5 | 6fcd2b59-3263-4978-9f24-541ae5cd1d85 | Fully Paid | 5001 | Short Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $458.09 | 9.4 | NaN | 9 | 1 | 4573 | 7597 | 1.0 | 0.0 |
161673 | a0d13d77-5adc-4d4b-b5c4-54e027e53b9f | e09b780c-295d-4770-b62f-b1b9b0d45b12 | Fully Paid | 19904 | Short Term | 733.0 | NaN | Home Mortgage | NaN | Debt Consolidation | $442.50 | 38.2 | NaN | 9 | 0 | 22304 | 58085 | 0.0 | 0.0 |
112872 | 7017e2bf-addf-4921-8a5c-83033aa4ba5f | 55f04f74-db14-45a7-a93b-b650d887f3a6 | Fully Paid | 14792 | Short Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $1,756.26 | 18.0 | NaN | 8 | 0 | 33082 | 53617 | 0.0 | 0.0 |
62660 | 3e5bb348-a284-4677-9ca0-41d2d11d2296 | 7adca112-4d41-446f-9a32-3d9b62d86c31 | Fully Paid | 5996 | Short Term | 733.0 | NaN | Own Home | NaN | Buy House | $25.05 | 16.9 | NaN | 4 | 0 | 324 | 19048 | 0.0 | 0.0 |
237038 | ebda9fc3-2adc-4686-ab07-0b07a62d2f37 | 20f60f2f-8185-4948-9bc6-cff6d694cf45 | Fully Paid | 3334 | Short Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $357.22 | 29.0 | NaN | 11 | 0 | 3551 | 72460 | 0.0 | 0.0 |
37017 | 24de2115-2b17-4f7c-a83a-de4e01719087 | 65bdda98-3efd-4c62-ade5-17413dfd00ab | Fully Paid | 5090 | Short Term | 733.0 | NaN | Home Mortgage | NaN | Debt Consolidation | $267.16 | 10.3 | NaN | 4 | 0 | 4582 | 6005 | 0.0 | 0.0 |
37000 | 24dac85e-c66d-47ff-8114-babaee373c09 | 2d2f7df5-602b-4db4-aebc-5ab579232e6d | Fully Paid | 5928 | Short Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $101.74 | 19.9 | NaN | 5 | 1 | 3439 | 10421 | 1.0 | 0.0 |
183515 | b67a2be4-b538-417c-9ff0-52dae0f57255 | effb3d59-f5b1-4249-968f-a507bc1547b0 | Fully Paid | 6837 | Short Term | 733.0 | NaN | Home Mortgage | NaN | Debt Consolidation | $935.33 | 17.7 | NaN | 9 | 0 | 16108 | 25287 | 0.0 | 0.0 |
247357 | f65e7d75-a80c-4e1f-b0e0-49e55726b2ca | 550a58d6-5961-44e5-94a7-09c0e296a4b6 | Fully Paid | 2020 | Short Term | 733.0 | NaN | Home Mortgage | NaN | Home Improvements | $168.70 | 24.0 | NaN | 4 | 0 | 6497 | 6767 | 0.0 | 0.0 |
80610 | 5010f786-7b3d-4aac-bc53-01d1bf673abd | 1f410f88-3edd-4036-9038-628260ec9eab | Fully Paid | 11598 | Short Term | 733.0 | NaN | Home Mortgage | NaN | Debt Consolidation | $420.63 | 39.3 | NaN | 9 | 0 | 3865 | 9884 | 0.0 | 0.0 |
112851 | 7011f892-0c94-4249-bf76-1d037cea6439 | 7a92e24d-10f3-496b-8dd9-b9f6bde65d36 | Fully Paid | 3058 | Short Term | 733.0 | NaN | HaveMortgage | NaN | Take a Trip | $1,166.91 | 27.4 | NaN | 8 | 0 | 35654 | 61791 | 0.0 | 0.0 |
99591 | 62f59cb7-d5e2-48e0-a641-a34f12966770 | 113b1587-82c1-4460-87fa-1948108ce818 | Fully Paid | 11046 | Short Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $508.99 | 12.7 | NaN | 10 | 0 | 16524 | 18319 | 0.0 | 0.0 |
35740 | 239eddda-a017-4052-a54f-ddf602c53055 | 9190a4e4-edcc-4529-99f3-e06b8e60c42f | Fully Paid | 7878 | Short Term | 733.0 | NaN | Own Home | NaN | Debt Consolidation | $307.08 | 41.7 | NaN | 12 | 0 | 9432 | 82022 | 0.0 | 0.0 |
222347 | dd43d4cf-f75b-4496-ae50-5db390b11c14 | e5d9d904-6bb2-4475-8e4a-e7538955793e | Charged Off | 18011 | Short Term | 733.0 | NaN | Home Mortgage | NaN | Debt Consolidation | $790.45 | 20.0 | NaN | 7 | 0 | 27619 | 34567 | 0.0 | 0.0 |
136736 | 880944f8-112e-477b-aba4-607adf7354ae | b1b314ce-f59d-4a02-917c-af1dd1d3c0f8 | Fully Paid | 10470 | Long Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $254.19 | 9.9 | NaN | 4 | 0 | 8419 | 8617 | 0.0 | 0.0 |
210603 | d18e1dcf-b1b0-48ea-b618-0f8e68b78d95 | e02cd691-2e14-4160-8f48-60722ef13e56 | Fully Paid | 7997 | Short Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $487.97 | 9.1 | NaN | 15 | 0 | 10573 | 25114 | 0.0 | 0.0 |
201198 | c827d04f-37c2-4ea5-8ea7-07941ee84c56 | 4d5379f1-4e8f-42fa-89cb-66a5a721f4dd | Fully Paid | 11860 | Short Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $193.69 | 16.2 | NaN | 5 | 0 | 8339 | 17089 | 0.0 | 0.0 |
128143 | 7f78cf95-560e-4c5a-9363-badced6777fa | 76dcd93e-3fa8-4af3-a975-c32eca83059e | Fully Paid | 2018 | Short Term | 733.0 | NaN | Home Mortgage | NaN | Medical Bills | $800.68 | 14.5 | NaN | 17 | 0 | 21697 | 41249 | 0.0 | 0.0 |
130205 | 818f9bb4-15a4-4c64-a18f-eb403a20bb74 | f3bf0af3-2746-4419-bed1-dbc488eef769 | Fully Paid | 1976 | Short Term | 733.0 | NaN | Rent | NaN | Debt Consolidation | $166.00 | 9.6 | NaN | 4 | 0 | 5175 | 10780 | 0.0 | 0.0 |
8990 rows × 19 columns
df_cleaned['Null years in current job'] = df_cleaned['Years in current job'].isnull()
df_cleaned.head()
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | Null years in current job | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 000025bb-5694-4cff-b17d-192b1a98ba44 | 5ebc8bb1-5eb9-4404-b11b-a6eebc401a19 | Fully Paid | 11520 | Short Term | 741.0 | 10+ years | Home Mortgage | 33694.0 | Debt Consolidation | $584.03 | 12.3 | 41.0 | 10 | 0 | 6760 | 16056 | 0.0 | 0.0 | False |
128265 | 7f99634c-cc18-4667-92b4-b9e887b7c296 | e52aac79-e3f0-44cf-b538-539c0a01c849 | Fully Paid | 7055 | Short Term | 742.0 | 10+ years | Home Mortgage | 90711.0 | Debt Consolidation | $1,322.87 | 31.9 | 11.0 | 21 | 0 | 18290 | 19752 | 0.0 | 0.0 | False |
128263 | 7f99378c-b36e-4711-9af2-33c7239c00e7 | 32bdbd23-ca78-4c14-a0da-afc2a4e7a33a | Fully Paid | 99999999 | Long Term | 694.0 | 8 years | Home Mortgage | 110549.0 | Debt Consolidation | $1,824.06 | 21.6 | 13.0 | 18 | 0 | 40907 | 52513 | 0.0 | 0.0 | False |
128259 | 7f96d054-5de3-4f5f-9a21-d54ee9af703b | 784a48cb-96ee-494c-aa7b-eb388fbf3003 | Fully Paid | 2745 | Short Term | 743.0 | 2 years | Home Mortgage | 34318.0 | Debt Consolidation | $506.18 | 16.5 | 68.0 | 6 | 0 | 5429 | 10501 | 0.0 | 0.0 | False |
128256 | 7f9558b3-4ed7-4fe0-8662-8021c53c0016 | 0971186f-b0a7-4d11-ba55-4b41348673a2 | Fully Paid | 3008 | Short Term | 750.0 | 2 years | Rent | 37104.0 | Debt Consolidation | $556.55 | 13.8 | 34.0 | 11 | 0 | 732 | 12621 | 0.0 | 0.0 | False |
df_cleaned['Years in current job'] = df_cleaned['Years in current job'].fillna('0')
df_cleaned[df_cleaned['Years in current job'].isnull()==True]
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | Null years in current job |
---|
df_cleaned.iloc[9]
Loan ID 7f8fea60-ef3a-409b-84b8-f0f9e501293b Customer ID f3c67c03-d825-451d-a1b1-a3c9ec3757a3 Loan Status Fully Paid Current Loan Amount 99999999 Term Short Term Credit Score 745 Years in current job 6 years Home Ownership Home Mortgage Annual Income 55258 Purpose other Monthly Debt $915.90 Years of Credit History 25 Months since last delinquent 81 Number of Open Accounts 9 Number of Credit Problems 0 Current Credit Balance 16559 Maximum Open Credit 19760 Bankruptcies 0 Tax Liens 0 Null years in current job False Name: 128242, dtype: object
df_cleaned['Years in current job'][df_cleaned['Years in current job']=='< 1 year']='0'
C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\ipykernel_launcher.py:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy """Entry point for launching an IPython kernel.
df_cleaned['Years in current job'].unique()
array(['10+ years', '8 years', '2 years', '3 years', '6 years', '1 year', '5 years', '4 years', '7 years', '0', '9 years'], dtype=object)
df_cleaned['Over ten years'] = df_cleaned['Years in current job']==('10+ years')
df_cleaned.head()
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | ... | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | Null years in current job | Over ten years | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 000025bb-5694-4cff-b17d-192b1a98ba44 | 5ebc8bb1-5eb9-4404-b11b-a6eebc401a19 | Fully Paid | 11520 | Short Term | 741.0 | 10+ years | Home Mortgage | 33694.0 | Debt Consolidation | ... | 12.3 | 41.0 | 10 | 0 | 6760 | 16056 | 0.0 | 0.0 | False | True |
128265 | 7f99634c-cc18-4667-92b4-b9e887b7c296 | e52aac79-e3f0-44cf-b538-539c0a01c849 | Fully Paid | 7055 | Short Term | 742.0 | 10+ years | Home Mortgage | 90711.0 | Debt Consolidation | ... | 31.9 | 11.0 | 21 | 0 | 18290 | 19752 | 0.0 | 0.0 | False | True |
128263 | 7f99378c-b36e-4711-9af2-33c7239c00e7 | 32bdbd23-ca78-4c14-a0da-afc2a4e7a33a | Fully Paid | 99999999 | Long Term | 694.0 | 8 years | Home Mortgage | 110549.0 | Debt Consolidation | ... | 21.6 | 13.0 | 18 | 0 | 40907 | 52513 | 0.0 | 0.0 | False | False |
128259 | 7f96d054-5de3-4f5f-9a21-d54ee9af703b | 784a48cb-96ee-494c-aa7b-eb388fbf3003 | Fully Paid | 2745 | Short Term | 743.0 | 2 years | Home Mortgage | 34318.0 | Debt Consolidation | ... | 16.5 | 68.0 | 6 | 0 | 5429 | 10501 | 0.0 | 0.0 | False | False |
128256 | 7f9558b3-4ed7-4fe0-8662-8021c53c0016 | 0971186f-b0a7-4d11-ba55-4b41348673a2 | Fully Paid | 3008 | Short Term | 750.0 | 2 years | Rent | 37104.0 | Debt Consolidation | ... | 13.8 | 34.0 | 11 | 0 | 732 | 12621 | 0.0 | 0.0 | False | False |
5 rows × 21 columns
df_cleaned['Years in current job'][df_cleaned['Years in current job']=='10+ years']='0'
C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\ipykernel_launcher.py:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy """Entry point for launching an IPython kernel.
df_cleaned['Years in current job'].unique()
array(['0', '8 years', '2 years', '3 years', '6 years', '1 year', '5 years', '4 years', '7 years', '9 years'], dtype=object)
df_cleaned['Years in current job'] = df_cleaned['Years in current job'].apply(lambda x:x[0])
df_cleaned.head()
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | ... | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | Null years in current job | Over ten years | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 000025bb-5694-4cff-b17d-192b1a98ba44 | 5ebc8bb1-5eb9-4404-b11b-a6eebc401a19 | Fully Paid | 11520 | Short Term | 741.0 | 0 | Home Mortgage | 33694.0 | Debt Consolidation | ... | 12.3 | 41.0 | 10 | 0 | 6760 | 16056 | 0.0 | 0.0 | False | True |
128265 | 7f99634c-cc18-4667-92b4-b9e887b7c296 | e52aac79-e3f0-44cf-b538-539c0a01c849 | Fully Paid | 7055 | Short Term | 742.0 | 0 | Home Mortgage | 90711.0 | Debt Consolidation | ... | 31.9 | 11.0 | 21 | 0 | 18290 | 19752 | 0.0 | 0.0 | False | True |
128263 | 7f99378c-b36e-4711-9af2-33c7239c00e7 | 32bdbd23-ca78-4c14-a0da-afc2a4e7a33a | Fully Paid | 99999999 | Long Term | 694.0 | 8 | Home Mortgage | 110549.0 | Debt Consolidation | ... | 21.6 | 13.0 | 18 | 0 | 40907 | 52513 | 0.0 | 0.0 | False | False |
128259 | 7f96d054-5de3-4f5f-9a21-d54ee9af703b | 784a48cb-96ee-494c-aa7b-eb388fbf3003 | Fully Paid | 2745 | Short Term | 743.0 | 2 | Home Mortgage | 34318.0 | Debt Consolidation | ... | 16.5 | 68.0 | 6 | 0 | 5429 | 10501 | 0.0 | 0.0 | False | False |
128256 | 7f9558b3-4ed7-4fe0-8662-8021c53c0016 | 0971186f-b0a7-4d11-ba55-4b41348673a2 | Fully Paid | 3008 | Short Term | 750.0 | 2 | Rent | 37104.0 | Debt Consolidation | ... | 13.8 | 34.0 | 11 | 0 | 732 | 12621 | 0.0 | 0.0 | False | False |
5 rows × 21 columns
df_cleaned['Years in current job']=df_cleaned['Years in current job'].astype(int)
df_cleaned.info()
<class 'pandas.core.frame.DataFrame'> Int64Index: 215700 entries, 0 to 25243 Data columns (total 21 columns): Loan ID 215700 non-null object Customer ID 215700 non-null object Loan Status 215700 non-null object Current Loan Amount 215700 non-null int64 Term 215700 non-null object Credit Score 215700 non-null float64 Years in current job 215700 non-null int32 Home Ownership 215700 non-null object Annual Income 171202 non-null float64 Purpose 215700 non-null object Monthly Debt 215700 non-null object Years of Credit History 215700 non-null float64 Months since last delinquent 97438 non-null float64 Number of Open Accounts 215700 non-null int64 Number of Credit Problems 215700 non-null int64 Current Credit Balance 215700 non-null int64 Maximum Open Credit 215700 non-null object Bankruptcies 215248 non-null float64 Tax Liens 215678 non-null float64 Null years in current job 215700 non-null bool Over ten years 215700 non-null bool dtypes: bool(2), float64(6), int32(1), int64(4), object(8) memory usage: 32.5+ MB
filledIncome = df_cleaned[df_cleaned['Annual Income'].notnull()]
plt.hist(filledIncome['Annual Income'], bins = 10)
plt.show()
df_cleaned['Annual Income'].median()
62105.0
df_cleaned['Annual Income']= df_cleaned['Annual Income'].fillna(df_cleaned['Annual Income'].median())
df_cleaned['Annual Income'].isnull().sum()
0
df_cleaned['Months since last delinquent'].unique()
array([ 41., 11., 13., 68., 34., 5., 55., 17., 14., 81., 36., 40., 6., 12., 78., 8., 18., 28., 35., 16., 49., 24., 53., 15., 63., 70., 52., 43., 22., 30., 29., 74., 10., 4., 48., 67., 7., 73., 3., 42., 21., 31., 79., 44., 54., 27., 61., 65., 25., 19., 80., 57., 51., 32., 75., 62., 2., 23., 64., 37., 26., 66., 59., 82., 46., 9., 20., 76., 58., 47., 60., 56., 45., 71., 77., 115., 50., 39., 69., 33., 72., 38., 1., 0., 85., 89., 96., 83., 140., 91., 114., 143., 94., 95., 86., 88., 84., 98., 135., 100., 97., 93., 99., 130., 106., 87., 122., 92., 101., 152., 139., 148., 109., 134., 113., 131., 119., 133., 110., 120., 90., 102., 151., 176., 149., 116., 104., 107., nan, 118., 112., 108.])
filledMon = df_cleaned[df_cleaned['Months since last delinquent'].notnull()]
plt.hist(filledMon['Months since last delinquent'], bins = 100)
plt.show()
df_cleaned['Months since last delinquent'].median()
32.0
df_cleaned['Months since last delinquent']=df_cleaned['Months since last delinquent'].fillna(32.0)
#seperate column for null
df_cleaned['Bankruptcies'].unique()
array([ 0., 1., 3., 2., 4., 5., 7., 6., nan])
filledBank = df_cleaned[df_cleaned['Bankruptcies'].notnull()]
plt.hist(filledBank['Bankruptcies'], bins = 10)
plt.show()
df_cleaned['Bankruptcies'].median()
0.0
df_cleaned['Bankruptcies'] = df_cleaned['Bankruptcies'].fillna(0.0)
df_cleaned.head()
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | ... | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | Null years in current job | Over ten years | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 000025bb-5694-4cff-b17d-192b1a98ba44 | 5ebc8bb1-5eb9-4404-b11b-a6eebc401a19 | Fully Paid | 11520 | Short Term | 741.0 | 0 | Home Mortgage | 33694.0 | Debt Consolidation | ... | 12.3 | 41.0 | 10 | 0 | 6760 | 16056 | 0.0 | 0.0 | False | True |
128265 | 7f99634c-cc18-4667-92b4-b9e887b7c296 | e52aac79-e3f0-44cf-b538-539c0a01c849 | Fully Paid | 7055 | Short Term | 742.0 | 0 | Home Mortgage | 90711.0 | Debt Consolidation | ... | 31.9 | 11.0 | 21 | 0 | 18290 | 19752 | 0.0 | 0.0 | False | True |
128263 | 7f99378c-b36e-4711-9af2-33c7239c00e7 | 32bdbd23-ca78-4c14-a0da-afc2a4e7a33a | Fully Paid | 99999999 | Long Term | 694.0 | 8 | Home Mortgage | 110549.0 | Debt Consolidation | ... | 21.6 | 13.0 | 18 | 0 | 40907 | 52513 | 0.0 | 0.0 | False | False |
128259 | 7f96d054-5de3-4f5f-9a21-d54ee9af703b | 784a48cb-96ee-494c-aa7b-eb388fbf3003 | Fully Paid | 2745 | Short Term | 743.0 | 2 | Home Mortgage | 34318.0 | Debt Consolidation | ... | 16.5 | 68.0 | 6 | 0 | 5429 | 10501 | 0.0 | 0.0 | False | False |
128256 | 7f9558b3-4ed7-4fe0-8662-8021c53c0016 | 0971186f-b0a7-4d11-ba55-4b41348673a2 | Fully Paid | 3008 | Short Term | 750.0 | 2 | Rent | 37104.0 | Debt Consolidation | ... | 13.8 | 34.0 | 11 | 0 | 732 | 12621 | 0.0 | 0.0 | False | False |
5 rows × 21 columns
df_cleaned['Bankruptcies'].isnull().sum()
0
df_cleaned['Tax Liens'].unique()
array([ 0., 2., 1., 3., 5., 4., 9., 6., 7., 8., 10., 11., nan])
filledLiens = df_cleaned[df_cleaned['Tax Liens'].notnull()]
plt.hist(filledLiens['Tax Liens'], bins = 10)
plt.show()
df_cleaned['Tax Liens'] = df_cleaned['Tax Liens'].fillna(df_cleaned['Tax Liens'].median())
df_cleaned.head()
Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | ... | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | Null years in current job | Over ten years | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 000025bb-5694-4cff-b17d-192b1a98ba44 | 5ebc8bb1-5eb9-4404-b11b-a6eebc401a19 | Fully Paid | 11520 | Short Term | 741.0 | 0 | Home Mortgage | 33694.0 | Debt Consolidation | ... | 12.3 | 41.0 | 10 | 0 | 6760 | 16056 | 0.0 | 0.0 | False | True |
128265 | 7f99634c-cc18-4667-92b4-b9e887b7c296 | e52aac79-e3f0-44cf-b538-539c0a01c849 | Fully Paid | 7055 | Short Term | 742.0 | 0 | Home Mortgage | 90711.0 | Debt Consolidation | ... | 31.9 | 11.0 | 21 | 0 | 18290 | 19752 | 0.0 | 0.0 | False | True |
128263 | 7f99378c-b36e-4711-9af2-33c7239c00e7 | 32bdbd23-ca78-4c14-a0da-afc2a4e7a33a | Fully Paid | 99999999 | Long Term | 694.0 | 8 | Home Mortgage | 110549.0 | Debt Consolidation | ... | 21.6 | 13.0 | 18 | 0 | 40907 | 52513 | 0.0 | 0.0 | False | False |
128259 | 7f96d054-5de3-4f5f-9a21-d54ee9af703b | 784a48cb-96ee-494c-aa7b-eb388fbf3003 | Fully Paid | 2745 | Short Term | 743.0 | 2 | Home Mortgage | 34318.0 | Debt Consolidation | ... | 16.5 | 68.0 | 6 | 0 | 5429 | 10501 | 0.0 | 0.0 | False | False |
128256 | 7f9558b3-4ed7-4fe0-8662-8021c53c0016 | 0971186f-b0a7-4d11-ba55-4b41348673a2 | Fully Paid | 3008 | Short Term | 750.0 | 2 | Rent | 37104.0 | Debt Consolidation | ... | 13.8 | 34.0 | 11 | 0 | 732 | 12621 | 0.0 | 0.0 | False | False |
5 rows × 21 columns
Double check to make sure that there are no more missing values
df_cleaned.isnull().sum()
Loan ID 0 Customer ID 0 Loan Status 0 Current Loan Amount 0 Term 0 Credit Score 0 Years in current job 0 Home Ownership 0 Annual Income 0 Purpose 0 Monthly Debt 0 Years of Credit History 0 Months since last delinquent 0 Number of Open Accounts 0 Number of Credit Problems 0 Current Credit Balance 0 Maximum Open Credit 0 Bankruptcies 0 Tax Liens 0 Null years in current job 0 Over ten years 0 dtype: int64
Analye the unique values of each column to check for spelling mistakes and other mistakes that might affect the model
for item in df_cleaned.columns:
print(item + ' '+ df[item].dtype.kind)
print(df_cleaned[item].unique())
Loan ID O ['000025bb-5694-4cff-b17d-192b1a98ba44' '7f99634c-cc18-4667-92b4-b9e887b7c296' '7f99378c-b36e-4711-9af2-33c7239c00e7' ... '7f78cf95-560e-4c5a-9363-badced6777fa' '818f9bb4-15a4-4c64-a18f-eb403a20bb74' '193d7f29-89f9-4c59-9e44-c1ca8ea36209'] Customer ID O ['5ebc8bb1-5eb9-4404-b11b-a6eebc401a19' 'e52aac79-e3f0-44cf-b538-539c0a01c849' '32bdbd23-ca78-4c14-a0da-afc2a4e7a33a' ... '76dcd93e-3fa8-4af3-a975-c32eca83059e' 'f3bf0af3-2746-4419-bed1-dbc488eef769' '3b462912-a081-4407-a1b1-3553a29bb974'] Loan Status O ['Fully Paid' 'Charged Off'] Current Loan Amount i [ 11520 7055 99999999 ... 20512 1089 2216] Term O ['Short Term' 'Long Term'] Credit Score f [ 741. 742. 694. 743. 750. 7400. 717. 6910. 740. 745. 731. 732. 687. 689. 716. 736. 724. 690. 734. 680. 723. 726. 7410. 692. 730. 739. 703. 7500. 735. 662. 747. 722. 713. 738. 659. 711. 658. 702. 696. 718. 681. 709. 720. 728. 729. 695. 733. 708. 748. 697. 626. 701. 725. 700. 649. 698. 714. 712. 749. 746. 707. 634. 710. 627. 744. 6890. 668. 652. 706. 640. 751. 737. 647. 721. 715. 7270. 633. 641. 7260. 705. 686. 727. 671. 675. 663. 699. 693. 679. 664. 6700. 644. 666. 676. 7080. 669. 684. 665. 7150. 645. 691. 683. 719. 660. 7430. 6960. 667. 7240. 639. 704. 597. 7300. 7210. 6800. 638. 688. 7420. 651. 656. 620. 6990. 7330. 643. 7450. 677. 590. 661. 7360. 630. 6920. 7130. 685. 608. 674. 648. 673. 7040. 617. 625. 7010. 7470. 6660. 7370. 678. 618. 637. 7280. 6650. 607. 655. 602. 682. 654. 599. 6300. 603. 6740. 6980. 7250. 7390. 610. 7380. 7140. 670. 6480. 646. 5960. 672. 6860. 7440. 636. 7290. 600. 657. 624. 7070. 7350. 7310. 6940. 6540. 6830. 6080. 6470. 6770. 7020. 7320. 635. 585. 596. 621. 611. 7090. 6580. 605. 632. 6350. 6930. 7000. 586. 7100. 6790. 6750. 7170. 6870. 6780. 7480. 5920. 6730. 6710. 653. 7340. 6490. 629. 7120. 7220. 650. 7180. 615. 7200. 642. 619. 6970. 6950. 6070. 7460. 623. 622. 614. 631. 595. 6760. 6430. 616. 7060. 7190. 7110. 7050. 6280. 612. 589. 6570. 7230. 592. 6250. 6810. 6230. 6560. 7160. 6610. 628. 6690. 6320. 6220. 6240. 606. 6270. 6820. 6850. 6510. 588. 6340. 613. 604. 6460. 6550. 601. 6590. 7490. 6880. 6840. 5970. 6160. 587. 7510. 5910. 6630. 6020. 6900. 6400. 594. 6190. 609. 6410. 6420. 6720. 591. 6450. 6680. 6370. 6600. 6010. 6380. 598. 6360. 6090. 7030. 6670. 6390. 5990. 6030. 6290. 6330. 6180. 6110. 6200. 6500. 6620. 5930. 593. 6440. 6150. 6520. 6100. 6060. 6530. 5980. 6170. 6260. 6210. 5950. 6050. 6310. 6640. 5870. 5890. 6130. 6140. 6000. 5850. 5940. 5900. 5860. 6040. 6120. 5880.] Years in current job O [0 8 2 3 6 1 5 4 7 9] Home Ownership O ['Home Mortgage' 'Rent' 'Own Home' 'HaveMortgage'] Annual Income f [ 33694. 90711. 110549. ... 42653. 45661. 144275.] Purpose O ['Debt Consolidation' 'other' 'Other' 'Take a Trip' 'Business Loan' 'Home Improvements' 'Medical Bills' 'Buy House' 'Buy a Car' 'Educational Expenses'] Monthly Debt O ['$584.03' '$1,322.87 ' '$1,824.06 ' ... '$1,166.91 ' '$254.19' '$166.00'] Years of Credit History f [12.3 31.9 21.6 16.5 13.8 12.9 22.4 13.1 29. 25. 26.9 20.8 18.4 23.9 9.5 13.3 32.5 38.5 14.4 19.8 15. 15.7 23.3 14.5 13.7 19.9 19.2 23.8 22.3 11.3 12. 12.5 21.5 11.9 20.1 22.8 22.5 28.6 18.2 15.4 17.2 14.7 43.3 31.1 18.1 18.8 15.1 16.9 12.2 26.6 40.7 17. 28. 13.9 12.6 21. 13. 17.1 7.5 16. 20.2 12.4 13.2 13.5 21.8 17.7 14.9 32.4 14. 19. 29.1 20.3 22.1 17.5 22.6 11.6 9.4 10.4 24.2 14.6 16.7 28.7 25.7 16.8 23. 15.6 26.2 23.5 20.5 22. 25.9 34.4 12.8 14.8 26. 8. 21.3 10.3 20. 9. 19.1 16.6 19.4 15.5 17.9 42.8 30. 11.2 17.4 20.6 21.4 21.9 10.8 24. 27. 14.1 19.6 19.5 18. 29.8 31.8 15.2 19.3 21.2 11. 17.3 20.7 26.8 25.3 23.2 26.3 14.3 30.1 15.3 33.6 16.2 11.5 18.7 18.6 25.4 17.8 23.7 24.5 15.9 30.3 18.9 26.4 27.5 19.7 12.1 10.6 21.1 18.5 34.9 29.2 20.4 31. 16.3 23.4 32.7 21.7 29.6 32.2 10.5 9.8 30.4 22.2 30.8 27.2 27.6 31.4 18.3 8.9 10. 26.7 14.2 22.9 24.9 29.9 36.4 23.6 32. 27.4 17.6 11.7 22.7 16.4 28.4 23.1 25.2 6.7 30.9 8.3 29.5 9.2 16.1 42.5 9.3 8.2 37. 24.7 28.5 33.9 44.5 11.4 27.3 39.1 33.8 12.7 34.6 33.5 28.8 10.7 11.8 35. 13.4 27.1 10.9 28.3 24.1 35.5 26.1 36.5 24.4 30.6 39. 41.9 30.2 25.8 33. 25.5 28.9 7.2 35.4 29.3 31.7 24.8 29.7 6.9 31.6 10.2 9.1 35.6 7.7 38.3 20.9 43.9 5.4 7.3 24.6 42. 53.5 8.5 13.6 34.5 33.4 15.8 38.4 33.1 35.9 37.2 37.7 5. 11.1 6. 26.5 31.5 30.5 38.2 9.6 7.6 42.1 48.2 8.4 35.1 31.3 9.9 24.3 5.8 32.8 25.1 25.6 28.1 27.7 32.9 36. 38.7 34.3 39.2 8.7 37.3 9.7 38.1 28.2 36.3 32.1 32.3 44.6 27.9 70.5 10.1 29.4 37.1 34.1 37.6 5.7 33.7 39.5 41.6 54.2 30.7 8.1 41.3 31.2 46.8 45.1 40.1 39.4 32.6 33.2 27.8 50.9 34.2 6.3 40. 36.6 49.2 34.7 37.5 47.5 34. 38.6 37.8 36.2 8.6 41. 33.3 8.8 41.1 5.5 38. 7.1 6.2 39.3 51.7 35.8 46.4 7.4 35.2 41.8 40.6 35.3 7. 42.3 36.1 40.5 44.4 7.9 34.8 43.4 45.2 36.8 40.2 37.9 44.9 6.8 36.7 6.4 38.8 48.7 39.6 50.2 5.9 46.3 47.6 41.5 4.6 41.4 40.4 49. 41.7 4.7 39.8 37.4 39.9 44.1 43.7 41.2 45.5 47.3 4.5 48. 44.8 48.9 49.1 45.9 36.9 38.9 6.6 43.1 47.4 5.1 6.1 53.4 39.7 42.4 50.8 4.8 50.7 40.8 46.1 43.6 47.7 44.2 42.6 46.5 42.7 46.7 49.9 43.2 5.6 50. 42.2 44.7 55.4 5.3 49.5 47.9 6.5 48.3 7.8 40.9 45. 46.2 48.1 51. 40.3 46.6 43. 47.1 43.8 43.5 51.8 48.5 45.3 44.3 46. 44. 35.7 51.4 49.3 58.8 47. 50.3 4.9 50.5 4.2 55.8 48.8 52.5 61.7 42.9 55.7 5.2 52.2 48.4 53.1 46.9 51.1 53.6 4.1 45.8 47.2 49.7 59.1 51.9 54.1 49.4 56.8 54. 52.6 55.5 55.6 51.6 45.6 50.1 4.4 48.6 49.8 52.3 60.5 50.6 45.4 45.7 49.6 54.5 53.9 4.3 53. 4. 51.3 52.9 3.9 60.7 51.5 3.8 57.8 3.7 57.6 54.6 59.7 56.9 66. 3.4 50.4 56.1 57.5 59.9 54.3 57.7 55. 47.8 61.5 61.3 57.1 56. 53.8 62.5 52.4 53.7 65.8 54.7 55.3 64.6 65. 58. 52.1 56.5 52.8 61.8 57. ] Months since last delinquent f [ 41. 11. 13. 68. 34. 5. 55. 17. 14. 81. 36. 40. 6. 12. 78. 8. 18. 28. 35. 16. 49. 24. 53. 15. 63. 70. 52. 43. 22. 30. 29. 74. 10. 4. 48. 67. 7. 73. 3. 42. 21. 31. 79. 44. 54. 27. 61. 65. 25. 19. 80. 57. 51. 32. 75. 62. 2. 23. 64. 37. 26. 66. 59. 82. 46. 9. 20. 76. 58. 47. 60. 56. 45. 71. 77. 115. 50. 39. 69. 33. 72. 38. 1. 0. 85. 89. 96. 83. 140. 91. 114. 143. 94. 95. 86. 88. 84. 98. 135. 100. 97. 93. 99. 130. 106. 87. 122. 92. 101. 152. 139. 148. 109. 134. 113. 131. 119. 133. 110. 120. 90. 102. 151. 176. 149. 116. 104. 107. 118. 112. 108.] Number of Open Accounts i [10 21 18 6 11 12 9 13 20 8 4 14 5 7 15 16 32 17 33 22 19 3 26 2 24 36 28 37 25 31 34 23 27 29 30 55 35 38 39 40 42 41 0 44 43 46 45 1 51 50 47 53 48 56 54 52 76 49 58] Number of Credit Problems i [ 0 1 2 3 5 4 9 8 6 7 10 11] Current Credit Balance i [ 6760 18290 40907 ... 33838 34838 40652] Maximum Open Credit O [16056 19752 52513 ... '25287' 61791 82022] Bankruptcies f [0. 1. 3. 2. 4. 5. 7. 6.] Tax Liens f [ 0. 2. 1. 3. 5. 4. 9. 6. 7. 8. 10. 11.]
--------------------------------------------------------------------------- KeyError Traceback (most recent call last) C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance) 3062 try: -> 3063 return self._engine.get_loc(key) 3064 except KeyError: pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc() pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc() pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item() pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item() KeyError: 'Null years in current job' During handling of the above exception, another exception occurred: KeyError Traceback (most recent call last) <ipython-input-59-0a4322840b28> in <module>() 1 for item in df_cleaned.columns: ----> 2 print(item + ' '+ df[item].dtype.kind) 3 print(df_cleaned[item].unique()) C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\pandas\core\frame.py in __getitem__(self, key) 2683 return self._getitem_multilevel(key) 2684 else: -> 2685 return self._getitem_column(key) 2686 2687 def _getitem_column(self, key): C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key) 2690 # get column 2691 if self.columns.is_unique: -> 2692 return self._get_item_cache(key) 2693 2694 # duplicate columns & possible reduce dimensionality C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item) 2484 res = cache.get(item) 2485 if res is None: -> 2486 values = self._data.get(item) 2487 res = self._box_item_values(item, values) 2488 cache[item] = res C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\pandas\core\internals.py in get(self, item, fastpath) 4113 4114 if not isna(item): -> 4115 loc = self.items.get_loc(item) 4116 else: 4117 indexer = np.arange(len(self.items))[isna(self.items)] C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance) 3063 return self._engine.get_loc(key) 3064 except KeyError: -> 3065 return self._engine.get_loc(self._maybe_cast_indexer(key)) 3066 3067 indexer = self.get_indexer([key], method=method, tolerance=tolerance) pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc() pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc() pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item() pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item() KeyError: 'Null years in current job'
Fix spelling errors in the dataset
df_cleaned['Home Ownership'][df_cleaned['Home Ownership']=='HaveMortgage']='Home Mortgage'
C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\ipykernel_launcher.py:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy """Entry point for launching an IPython kernel.
df_cleaned['Purpose'][df_cleaned['Purpose']=='other']='Other'
C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\ipykernel_launcher.py:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy """Entry point for launching an IPython kernel.
df_cleaned['Purpose'].unique()
array(['Debt Consolidation', 'Other', 'Take a Trip', 'Business Loan', 'Home Improvements', 'Medical Bills', 'Buy House', 'Buy a Car', 'Educational Expenses'], dtype=object)
Convert the "Monthly Debt" column into numerical form
df_cleaned['Monthly Debt'].head()
0 $584.03 128265 $1,322.87 128263 $1,824.06 128259 $506.18 128256 $556.55 Name: Monthly Debt, dtype: object
df_cleaned['Monthly Debt'] = df_cleaned['Monthly Debt'].apply(lambda x:x[x.find('$')+1:])
df_cleaned['Monthly Debt']=df_cleaned['Monthly Debt'].str.replace(',','')
df_cleaned['Monthly Debt']=df_cleaned[['Monthly Debt']].astype(float)
Check the data types of each column
df_cleaned.info()
<class 'pandas.core.frame.DataFrame'> Int64Index: 215700 entries, 0 to 25243 Data columns (total 21 columns): Loan ID 215700 non-null object Customer ID 215700 non-null object Loan Status 215700 non-null object Current Loan Amount 215700 non-null int64 Term 215700 non-null object Credit Score 215700 non-null float64 Years in current job 215700 non-null int32 Home Ownership 215700 non-null object Annual Income 215700 non-null float64 Purpose 215700 non-null object Monthly Debt 215700 non-null float64 Years of Credit History 215700 non-null float64 Months since last delinquent 215700 non-null float64 Number of Open Accounts 215700 non-null int64 Number of Credit Problems 215700 non-null int64 Current Credit Balance 215700 non-null int64 Maximum Open Credit 215700 non-null object Bankruptcies 215700 non-null float64 Tax Liens 215700 non-null float64 Null years in current job 215700 non-null bool Over ten years 215700 non-null bool dtypes: bool(2), float64(7), int32(1), int64(4), object(7) memory usage: 32.5+ MB
Look for the values in "Maximum Open Credit" that cannot be converted to floats
df_cleaned['Maximum Open Credit'][df_cleaned['Maximum Open Credit']=='#VALUE!']
184663 #VALUE! 187513 #VALUE! Name: Maximum Open Credit, dtype: object
Drop the two rows with "#VALUE!" in its "Maximum Open Credit" column
df_cleaned.drop(184663,axis = 0, inplace = True)
df_cleaned.iloc[184663]
Loan ID 191696df-dfb3-4059-9bbe-bb803e08af16 Customer ID c736d42a-a2a5-419d-824b-f4a1ceddd2b1 Loan Status Fully Paid Current Loan Amount 18484 Term Short Term Credit Score 720 Years in current job 0 Home Ownership Home Mortgage Annual Income 46149 Purpose Debt Consolidation Monthly Debt 1126.81 Years of Credit History 21.5 Months since last delinquent 32 Number of Open Accounts 12 Number of Credit Problems 0 Current Credit Balance 21871 Maximum Open Credit 23168 Bankruptcies 0 Tax Liens 0 Null years in current job True Over ten years False Name: 25090, dtype: object
df_cleaned['Maximum Open Credit'][df_cleaned['Maximum Open Credit']=='#VALUE!']
187513 #VALUE! Name: Maximum Open Credit, dtype: object
df_cleaned.drop(187513,axis = 0, inplace = True)
df_cleaned['Maximum Open Credit']=df_cleaned['Maximum Open Credit'].astype(int)
#check for giant numbers
df_cleaned.describe()
Current Loan Amount | Credit Score | Years in current job | Annual Income | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 2.156980e+05 | 215698.000000 | 215698.000000 | 2.156980e+05 | 215698.000000 | 215698.000000 | 215698.000000 | 215698.000000 | 215698.000000 | 2.156980e+05 | 2.156980e+05 | 215698.000000 | 215698.000000 |
mean | 1.633530e+07 | 972.911524 | 2.536903 | 7.059750e+04 | 960.396162 | 18.359797 | 33.369660 | 11.077627 | 0.155426 | 1.545737e+04 | 3.698387e+04 | 0.110326 | 0.026328 |
std | 3.695320e+07 | 1233.537371 | 2.883904 | 5.123631e+04 | 634.781931 | 7.058020 | 14.736628 | 4.971806 | 0.457974 | 1.975048e+04 | 6.011805e+05 | 0.336539 | 0.240842 |
min | 5.050000e+02 | 585.000000 | 0.000000 | 0.000000e+00 | 0.000000 | 3.400000 | 0.000000 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000e+00 | 0.000000 | 0.000000 |
25% | 8.328250e+03 | 721.000000 | 0.000000 | 4.935000e+04 | 527.660000 | 13.600000 | 32.000000 | 8.000000 | 0.000000 | 5.932000e+03 | 1.296100e+04 | 0.000000 | 0.000000 |
50% | 1.464200e+04 | 733.000000 | 2.000000 | 6.210500e+04 | 840.545000 | 17.000000 | 32.000000 | 10.000000 | 0.000000 | 1.104200e+04 | 2.206000e+04 | 0.000000 | 0.000000 |
75% | 2.520800e+04 | 741.000000 | 5.000000 | 7.960000e+04 | 1249.942500 | 21.800000 | 32.000000 | 14.000000 | 0.000000 | 1.932400e+04 | 3.681000e+04 | 0.000000 | 0.000000 |
max | 1.000000e+08 | 7510.000000 | 9.000000 | 8.713547e+06 | 22939.120000 | 70.500000 | 176.000000 | 76.000000 | 11.000000 | 1.731412e+06 | 1.763322e+08 | 7.000000 | 11.000000 |
Split my dataset into a dataset for the independent variables and a dataset for the dependent variables
x = df_cleaned.drop('Loan Status', axis = 1)
y = pd.get_dummies(df_cleaned['Loan Status'], drop_first = True)
x.head()
Loan ID | Customer ID | Current Loan Amount | Term | Credit Score | Years in current job | Home Ownership | Annual Income | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | Null years in current job | Over ten years | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 000025bb-5694-4cff-b17d-192b1a98ba44 | 5ebc8bb1-5eb9-4404-b11b-a6eebc401a19 | 11520 | Short Term | 741.0 | 0 | Home Mortgage | 33694.0 | Debt Consolidation | 584.03 | 12.3 | 41.0 | 10 | 0 | 6760 | 16056 | 0.0 | 0.0 | False | True |
128265 | 7f99634c-cc18-4667-92b4-b9e887b7c296 | e52aac79-e3f0-44cf-b538-539c0a01c849 | 7055 | Short Term | 742.0 | 0 | Home Mortgage | 90711.0 | Debt Consolidation | 1322.87 | 31.9 | 11.0 | 21 | 0 | 18290 | 19752 | 0.0 | 0.0 | False | True |
128263 | 7f99378c-b36e-4711-9af2-33c7239c00e7 | 32bdbd23-ca78-4c14-a0da-afc2a4e7a33a | 99999999 | Long Term | 694.0 | 8 | Home Mortgage | 110549.0 | Debt Consolidation | 1824.06 | 21.6 | 13.0 | 18 | 0 | 40907 | 52513 | 0.0 | 0.0 | False | False |
128259 | 7f96d054-5de3-4f5f-9a21-d54ee9af703b | 784a48cb-96ee-494c-aa7b-eb388fbf3003 | 2745 | Short Term | 743.0 | 2 | Home Mortgage | 34318.0 | Debt Consolidation | 506.18 | 16.5 | 68.0 | 6 | 0 | 5429 | 10501 | 0.0 | 0.0 | False | False |
128256 | 7f9558b3-4ed7-4fe0-8662-8021c53c0016 | 0971186f-b0a7-4d11-ba55-4b41348673a2 | 3008 | Short Term | 750.0 | 2 | Rent | 37104.0 | Debt Consolidation | 556.55 | 13.8 | 34.0 | 11 | 0 | 732 | 12621 | 0.0 | 0.0 | False | False |
x.drop('Loan ID', axis = 1, inplace = True)
x.drop('Customer ID', axis = 1, inplace = True)
X = pd.get_dummies(x, drop_first = True)
X.info()
<class 'pandas.core.frame.DataFrame'> Int64Index: 215698 entries, 0 to 25243 Data columns (total 26 columns): Current Loan Amount 215698 non-null int64 Credit Score 215698 non-null float64 Years in current job 215698 non-null int32 Annual Income 215698 non-null float64 Monthly Debt 215698 non-null float64 Years of Credit History 215698 non-null float64 Months since last delinquent 215698 non-null float64 Number of Open Accounts 215698 non-null int64 Number of Credit Problems 215698 non-null int64 Current Credit Balance 215698 non-null int64 Maximum Open Credit 215698 non-null int32 Bankruptcies 215698 non-null float64 Tax Liens 215698 non-null float64 Null years in current job 215698 non-null bool Over ten years 215698 non-null bool Term_Short Term 215698 non-null uint8 Home Ownership_Own Home 215698 non-null uint8 Home Ownership_Rent 215698 non-null uint8 Purpose_Buy House 215698 non-null uint8 Purpose_Buy a Car 215698 non-null uint8 Purpose_Debt Consolidation 215698 non-null uint8 Purpose_Educational Expenses 215698 non-null uint8 Purpose_Home Improvements 215698 non-null uint8 Purpose_Medical Bills 215698 non-null uint8 Purpose_Other 215698 non-null uint8 Purpose_Take a Trip 215698 non-null uint8 dtypes: bool(2), float64(7), int32(2), int64(4), uint8(11) memory usage: 24.1 MB
y.head()
Fully Paid | |
---|---|
0 | 1 |
128265 | 1 |
128263 | 1 |
128259 | 1 |
128256 | 1 |
Train random forest classifier, gradient boosting classifier, logistic regression, and k-nearest neighbors classifier models
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y,train_size = 0.8, random_state = 0)
C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\sklearn\model_selection\_split.py:2026: FutureWarning: From version 0.21, test_size will always complement train_size unless both are specified. FutureWarning)
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier(n_estimators = 20)
rfc=rfc.fit(X_train,y_train)
C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\ipykernel_launcher.py:3: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel(). This is separate from the ipykernel package so we can avoid doing imports until
from sklearn.ensemble import GradientBoostingClassifier
gbc = GradientBoostingClassifier(n_estimators=50,learning_rate=.1)
model_gbc = gbc.fit(X_train,y_train)
C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\sklearn\utils\validation.py:578: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()
logmodel = lr.fit(X_train,y_train)
C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\sklearn\utils\validation.py:578: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
scaler = StandardScaler()
data_scaled = scaler.fit_transform(X_train)
neigh = KNeighborsClassifier(n_neighbors = 21)
model = neigh.fit(data_scaled,y_train)
C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\ipykernel_launcher.py:6: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
rfc_pred = rfc.predict(X_test)
gbc_pred = model_gbc.predict(X_test)
log_pred = logmodel.predict(X_test)
test_scaled = scaler.transform(X_test)
k_pred = model.predict(test_scaled)
from sklearn.metrics import accuracy_score, f1_score, recall_score, precision_score
Examine the accuracy, f1, recall, and precision scores of each model
pred_list = [rfc_pred,gbc_pred,log_pred,k_pred]
for i in range(4):
print(i)
print(accuracy_score(y_test,pred_list[i]))
print(f1_score(y_test,pred_list[i]))
print(recall_score(y_test,pred_list[i]))
print(precision_score(y_test,pred_list[i]))
0 0.8512517385257302 0.9140423023857045 0.9676120249574589 0.8660929606782931 1 0.856583217431618 0.918993937965618 0.9953204764605785 0.8535398983388865 2 0.8530366249420491 0.917407050363461 0.9986103233125354 0.8484169437617464 3 0.8543347241539174 0.9177788245145758 0.9946681792399319 0.8519238243295764
The gradient boosting model was selected as our predictive model since it had the highest scores (accuracy: 85.7%, f1: 91.9%, recall: 00.5%, precision: 85.3%)
Create a csv file to showcase my results
results = df_cleaned[['Loan ID']]
results['Customer ID'] = df_cleaned['Customer ID']
C:\User_Files\Lucy_Wan\Programming\Anaconda2\lib\site-packages\ipykernel_launcher.py:2: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
predict = pd.DataFrame(model_gbc.predict_proba(X))
predict.head()
0 | 1 | |
---|---|---|
0 | 0.178703 | 0.821297 |
1 | 0.100057 | 0.899943 |
2 | 0.009637 | 0.990363 |
3 | 0.141215 | 0.858785 |
4 | 0.113016 | 0.886984 |
predict = predict.rename(columns = {0:"Probability of Charged Off", 1:"Probability of Fully Paid"})
results = pd.merge(results,predict, left_index = True, right_index = True)
model_pred = pd.DataFrame(model_gbc.predict(X))
model_pred[model_pred==1]='Fully Paid'
model_pred[model_pred==0]='Charged Off'
results['Loan Status Predicted By Model'] = model_pred
results['Actual Loan Status'] = df_cleaned['Loan Status']
results.head()
Loan ID | Customer ID | Probability of Charged Off | Probability of Fully Paid | Loan Status Predicted By Model | Actual Loan Status | |
---|---|---|---|---|---|---|
0 | 000025bb-5694-4cff-b17d-192b1a98ba44 | 5ebc8bb1-5eb9-4404-b11b-a6eebc401a19 | 0.178703 | 0.821297 | Fully Paid | Fully Paid |
128265 | 7f99634c-cc18-4667-92b4-b9e887b7c296 | e52aac79-e3f0-44cf-b538-539c0a01c849 | 0.104818 | 0.895182 | Fully Paid | Fully Paid |
128263 | 7f99378c-b36e-4711-9af2-33c7239c00e7 | 32bdbd23-ca78-4c14-a0da-afc2a4e7a33a | 0.089349 | 0.910651 | Fully Paid | Fully Paid |
128259 | 7f96d054-5de3-4f5f-9a21-d54ee9af703b | 784a48cb-96ee-494c-aa7b-eb388fbf3003 | 0.009822 | 0.990178 | Fully Paid | Fully Paid |
128256 | 7f9558b3-4ed7-4fe0-8662-8021c53c0016 | 0971186f-b0a7-4d11-ba55-4b41348673a2 | 0.145427 | 0.854573 | Fully Paid | Fully Paid |
results.to_csv('results.csv')
features = pd.DataFrame(model_gbc.feature_importances_)
features['feature']=X_train.columns
Examine which features were the most important to my model
features
0 | feature | |
---|---|---|
0 | 0.203020 | Current Loan Amount |
1 | 0.444981 | Credit Score |
2 | 0.000000 | Years in current job |
3 | 0.173322 | Annual Income |
4 | 0.040899 | Monthly Debt |
5 | 0.000000 | Years of Credit History |
6 | 0.008935 | Months since last delinquent |
7 | 0.003361 | Number of Open Accounts |
8 | 0.000000 | Number of Credit Problems |
9 | 0.004552 | Current Credit Balance |
10 | 0.006288 | Maximum Open Credit |
11 | 0.000000 | Bankruptcies |
12 | 0.000000 | Tax Liens |
13 | 0.028454 | Null years in current job |
14 | 0.000000 | Over ten years |
15 | 0.054749 | Term_Short Term |
16 | 0.000000 | Home Ownership_Own Home |
17 | 0.031438 | Home Ownership_Rent |
18 | 0.000000 | Purpose_Buy House |
19 | 0.000000 | Purpose_Buy a Car |
20 | 0.000000 | Purpose_Debt Consolidation |
21 | 0.000000 | Purpose_Educational Expenses |
22 | 0.000000 | Purpose_Home Improvements |
23 | 0.000000 | Purpose_Medical Bills |
24 | 0.000000 | Purpose_Other |
25 | 0.000000 | Purpose_Take a Trip |