
View Full Version : Harmoney




Vagabond47
03-10-2018, 06:46 PM
It happens fairly often due to system errors on Harmoney's side.

They don't contact you about it or charge fees as it's their fault so just consider it a bonus.

Okay, another first for me then, I think I like earning interest on money I don't have more than write-downs !

leesal
03-10-2018, 06:58 PM
Okay, another first for me then, I think I like earning interest on money I don't have more than write-downs !

From older newbie to newer newbie I'd say get used to the charge-offs. You'll get plenty more.

Albeit with limited info, I'm finding that most loans that go past 60 days eventually go pear-shaped :(

Vagabond47
03-10-2018, 09:15 PM
At the moment there is just the one loan over 60 days, I expect that to be written down any day now, everything else in arrears is 0-30 days, so nothing to worry about yet.

humvee
03-10-2018, 10:30 PM
I assume that is for myles? I didn't get or expect anything.

Anyway, next question.. how have I ended up with a negative funds available amount? I did invest in a few loans today, but I thought the system was supposed to stop you from investing money you don't have? -$38.09 available funds.

I wonder if you were invested in any of the loans with the errors I pointed out to Harmoney?

Loans with negative outstanding balances.

Completed loans that still owed money

Open loans with $0 balances

beacon
04-10-2018, 08:35 AM
I wonder if you were invested in any of the loans with the errors I pointed out to Harmoney?

Loans with negative outstanding balances.

Completed loans that still owed money

Open loans with $0 balances

You're not alone humvee. I am experiencing these errors too, and some of them are now chronic. I noticed other posters mentioning similar problems too, earlier in this thread. Question is, are they just reporting errors, or more problematically, computing errors? If the summary shows 2+2 is 5, then it is quite likely that somewhere in the data 2+2 was computed as 5 (manually or algorithmically). Reports are just the final output. GIGO - Garbage In, garbage out.

Harmoney's dataset is huge and growing. The number of errors generally increases with dataset size, but then Harmoney is no different from banks with large financial datasets. Would you forgive your bank if they had such reporting standards? I remain invested for the moment, due to a paucity of suitable alternatives...

777
04-10-2018, 08:50 AM
I wonder if you were invested in any of the loans with the errors I pointed out to Harmoney?

Loans with negative outstanding balances.

Completed loans that still owed money

Open loans with $0 balances

I have three of those. Calculated to overpaying 1 cent each. A bit untidy as they should be just scrubbed off active loans.

humvee
04-10-2018, 09:38 AM
You're not alone humvee. I am experiencing these errors too, and some of them are now chronic. I noticed other posters mentioning similar problems too, earlier in this thread. Question is, are they just reporting errors, or more problematically, computing errors? If the summary shows 2+2 is 5, then it is quite likely that somewhere in the data 2+2 was computed as 5 (manually or algorithmically). Reports are just the final output. GIGO - Garbage In, garbage out.

Harmoney's dataset is huge and growing. The number of errors generally increases with dataset size, but then Harmoney is no different from banks with large financial datasets. Would you forgive your bank if they had such reporting standards? I remain invested for the moment, due to a paucity of suitable alternatives...


It does look like harmoney have fixed some of the errors in each category since I reported them

Here are the 4 worst of each type of error that remain in my loans
All of the Active but $0 outstanding loans I reported appear to be fixed

10016
10017

myles
04-10-2018, 06:42 PM
Have you looked at differences in default rates between borrowers who make comments and those that don't?

Pretty much what I expected:

10020

myles
06-10-2018, 11:26 PM
Below is another data set from a long term Harmoney Lender - total loans is just over 10,000. Many thanks for sharing:t_up:

I don't know the history or details of this data set, but a few observations worth considering:

This data set includes early loans that had much higher interest and default rates - some of the early loans, from what I've been able to work out, were somewhat 'dodgy', so this needs to be considered i.e. older loans may not be representative of newer loans (I think Harmoney are doing a much better job now on loan selection - just my opinion).
I've not looked at the data set, but just doing the graphs, it appears that the early lending was weighted towards significantly higher risk loans that may not have had a good return. There has been a significant change to lower risk loans, which is where most of the current loans appear to be.


http://i634.photobucket.com/albums/uu65/mylesau/Harmoney/timeage_1.gif

First a time lapse, which shows the period and investment level. One point to note here is that towards the end there is a huge jump in defaults - I can only guess that Harmoney did a major 'tidy up' of old loans at this point, but I'm not sure. Unfortunately it makes the details of actual timing of defaults of lesser value.

10028

The 'hazard' curve is 'broken' due to the jump in defaults. Huge 'churn' of loans very obvious.

10029

If you can work this chart out, you can see the significant amount of higher risk loans that have been paid off on the right half of the chart (light blue). The Population line shows where most of the loans are now - lower risk.

10030

Data can be beautiful - shows that Harmoney have the mix of loans right. Note that this data set includes earlier, much higher risk loans, so this is probably not an accurate representation of current loans and their default rates, which have changed numerous times over the period of this data set. (This applies to all the details below to some extent as well.)

10031

10032

Limit to 5 images - continued next post...

myles
06-10-2018, 11:28 PM
10033

10034

10035

10036

Enjoy!

myles
06-10-2018, 11:48 PM
I meant to add that the tiny peak at the 36 month point, where a couple (...) of loans actually get to is, well, tiny...:ohmy:

Just thinking about the Payment Protect numbers - possibly not meaningful as they include a large portion of loans when PP was not available? Could probably redo this for loans after whenever PP started...

Quick look at the data shows that the large 'jump' in defaults was due to a heap of loans debt being sold off early February 2018.

The 'Enquiries (6 months)' chart has a bar with no label (far right - count of 404), I'm guessing that this info was not recorded in early days, so was just blank - I'd suggest just ignoring this bar.

Everyone's loan set is different, so these numbers can only be considered a guide, nothing more.

Cool Bear
07-10-2018, 10:22 AM
Fantastic work Myles. Many thanks for the effort. Thanks too to the investor for sharing.

One quick conclusion is that it is much better to invest in rewrites. None of my almost 400 defaults were rewrites too.

alundracloud
07-10-2018, 11:59 AM
Fantastic work Myles. Many thanks for the effort. Thanks too to the investor for sharing.

One quick conclusion is that it is much better to invest in rewrites. None of my almost 400 defaults were rewrites too.

Quite a lot to digest in there- you're right though Cool Bear the Re-Writes is absolutely staggering.

Another that stood out to me was Home Improvement loans defaulting at just a tick over 1%.

One of the selection criteria that I use is that I will only go into a Home Improvement loan if the residential status is "Fully Owned" or "Paying Mortgage". Would you mind plotting this Myles?

y-axis = number of Home Improvement Defaults
x-axis = Residential status

Huge thanks to both the investor & Myles. Fantastic effort! :cool::cool:

myles
07-10-2018, 01:43 PM
y-axis = number of Home Improvement Defaults
x-axis = Residential status


I think this is what you want? You're on the money - looks like most are mortgagors anyways.

10038

alundracloud
07-10-2018, 01:53 PM
Exactly what I was after, and looks similar to how I expected. Thank-you!
I have some data which you might be interested in also, how best to get it to you?

Cool Bear
07-10-2018, 02:27 PM
I think this is what you want? You're on the money - looks like most are mortgagors anyways.

10038
Thanks again.

The single default in "boarding", resulting in the 6.25% (1 out of 16), could be an anomaly due to the low population size. Similarly for "Living with Parents" and "others". However, when you combine them, it is a total of 10 defaults out of 88 or 11.36%. And 88 is a credible population size. Even if you add the 2 "unknowns" and 8 "supplied by Employers", it is still a high 10.1% compared to 0% for fully owned, 0.5ish% for mortgagors and even the 'high-ish' 3+% for renters.

Valuable confirmation, Myles.
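
The small-sample caveat above can be put into numbers with a confidence interval. This is only a minimal sketch: the Wilson score interval is an assumed choice (nobody in the thread used it), and the only inputs are the counts quoted above (1 default out of 16 boarders, 10 out of 88 for the combined group).

```python
import math

def wilson_interval(defaults, loans, z=1.96):
    """Approximate 95% Wilson score interval for a default rate."""
    if loans <= 0:
        raise ValueError("need at least one loan")
    p = defaults / loans
    denom = 1 + z * z / loans
    centre = (p + z * z / (2 * loans)) / denom
    half = z * math.sqrt(p * (1 - p) / loans + z * z / (4 * loans * loans)) / denom
    return centre - half, centre + half

# Counts taken from the post above.
for label, d, n in [("boarding", 1, 16), ("combined", 10, 88)]:
    low, high = wilson_interval(d, n)
    print(f"{label}: {d}/{n} = {d / n:.2%}, interval {low:.1%} to {high:.1%}")
```

The interval for 16 loans comes out much wider than the one for 88 loans, which is the point being made about combining the small categories.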

beacon
08-10-2018, 10:20 AM
Below is another data set from a long term Harmoney Lender - total loans is just over 10,000. Many thanks for sharing:t_up:

Many thanks Myles and InTheRearWithTheGear for sharing. Interesting to see your lending volume up substantially in the last six months. Confidence from past lending results and learning, no doubt.

Thanks also Cool Bear for your insights:)

myles
09-10-2018, 01:44 AM
There are a few people interested in pooling data and I've got most of what needs to be done to merge and clean the data. I'll not include any data I currently have so if you have given me some previously can you please generate a new report and upload that.

The good thing is that the data in the orders-report-###.csv file provided, is free of lender info - so it will be safe to upload the csv and not have the data associated with the lender. However, the name of the csv file does include an ID number - for anonymity it will be best to rename the file to just orders.csv.

Please generate a new report, rename the file to orders.csv and submit that as it is (don't zip or anything else).

When asked for a name and email address to upload the file just use a fake one, I typically pick on poor old fred:

10040

I'm just using a Drop Box file request, which is secure and safe:

Upload orders.csv (https://www.dropbox.com/request/K5kFe1loGKQnaIUwi0XS)


There are some issues with timeliness of data i.e. it becomes 'stale' if not updated with any new details of defaults. So I'll only keep this link live for the rest of this week (late Sunday) and then shut it down.

Then there is the question of what to do with the data. I can clean it up and make it available for those who want to do their own analysis and perhaps come up with one 'tidy' set of charts and summary data for those who don't/can't. Some thought needs to be put in around older loans and the 9 month default gap which affects the 'real' values - I'll think about that a bit more.

The 'mechanics' of doing it aren't a problem I don't think i.e. once I've merged and cleaned the data I'll just provide a link to download the compiled data as a csv in the same format as it currently is.

If anyone has suggestions/thoughts please jump in.

IntheRearWithTheGear
09-10-2018, 08:41 AM
You may wish to strip borrower information from it as well - use your own key.

ie LAI-00140637 becomes some sort of new id tracked across multiple data sources - that way the loan is not identified.

But at the end of the day, we can't beat what they provide, which is the loan grade as a loan risk; this loan grade takes into account all the hidden data points ie credit score.

You would be wrong to associate risk with an individual loan dimension such as "board" or "living with parents"

A better way to visualise things is to use k-means clustering on each dimension - there will be python tools to do it -

And then you could spin your graphs by dimension

https://www.youtube.com/watch?v=9qnMHAZaD7k

https://www.youtube.com/watch?v=l98GVNwdTks

is an example.

Its kinda interesting stuff and food for thought.

(you would also need a loan from Harmoney for the pc specs to work it out)


We should call your data dump idea the #TheHarmoning you know after the #thefappening :)

beacon
09-10-2018, 08:47 AM
If anyone has suggestions/thoughts please jump in.


Neat idea. Thanks for making the time to do this, Myles. I have some pooled data too, which I am happy to upload to help with this initiative.


Data labels on bar charts can be very helpful. Eg., in plotting default by coborrower, you currently include sample sizes on x-axis (1079 co-borrowers and 9909 single borrower). Both bars are between 0 to 5%. Could you also include labels on top of the bars like 3% or better still 32, if including them is no trouble to you.


Protect loans might be better compared against non-protect loans (control) after the date protect loans became available, to be more truly representative. So if I were plotting these, I'd begin counting the control sample from the day the protect sample began. This means the control sample will be a subset of the Harmoney loan population, and the control population will be smaller than the Harmoney total population. Of course, the result may not be too different.

humvee
09-10-2018, 09:02 AM
You may wish to strip borrower information from it as well - use your own key.

ie LAI-00140637 becomes some sort of new id tracked across multiple data sources - that way the loan is not identified.


I don't see the need to do this - Harmoney has already done this - the loan ID is already Harmoney's key that does not identify the borrower.

IntheRearWithTheGear
09-10-2018, 09:08 AM
From my view point.

If I had a loan which turned bad - and my Harmoney documents had LAI-00140633 written on them as the loan identity - somebody could google that at a later point and find the loan outcome from one of our uploaded datasets - maybe an ex-husband or something - at the moment the ex-husband can't really do that. So by stripping it out - and replacing it with a number which does link to the Harmoney loan id - we get the best of both worlds - anonymous but linked.

Like a phone number on its own doesn't identify you - but once you have the phone number you can find where a person has worked, companies they own etc.
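
One way to get "anonymous but linked" is a keyed hash of the loan ID. The sketch below assumes HMAC-SHA256 from the Python standard library; the key, the "P-" prefix and the 16-character token length are all made up for illustration and are not part of anything Harmoney provides.

```python
import hashlib
import hmac

def pseudonymise(loan_id: str, secret_key: bytes) -> str:
    """Map a loan ID to a stable token that cannot be recomputed without the key."""
    digest = hmac.new(secret_key, loan_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # Hypothetical token format: prefix plus the first 16 hex characters.
    return "P-" + digest[:16]

# Hypothetical key, held only by whoever compiles the pooled data.
key = b"example-key-known-only-to-the-compiler"
print(pseudonymise("LAI-00140633", key))
```

The same loan always maps to the same token, so data sets compiled with the same key still link up, but the token is only useful to someone who holds the key. A plain unkeyed hash would be weaker here, since loan IDs look sequential and could simply be enumerated.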

humvee
09-10-2018, 09:42 AM
From my view point.

If I had a loan which turned bad - and my Harmoney documents had LAI-00140633 written on them as the loan identity - somebody could google that at a later point and find the loan outcome from one of our uploaded datasets - maybe an ex-husband or something - at the moment the ex-husband can't really do that. So by stripping it out - and replacing it with a number which does link to the Harmoney loan id - we get the best of both worlds - anonymous but linked.

Like a phone number on its own doesn't identify you - but once you have the phone number you can find where a person has worked, companies they own etc.

Unlike a phone number which is linked to a person or location, the loan ID relates to a single loan - not a person - ie all re-writes get a new loan id. This means it has a maximum life expectancy of 3-5 years - but as past data has shown, in general this is more like 1-3 years

The disadvantage of changing the loan ID is to a small extent extra work and complexity. But mostly it makes it more difficult for anyone else to build upon the data, add to it, or if myles ever stopped doing it for whatever reason - carry on the project.

I also see value in comparisons made in 6 months time of things like what % of loans in arrears today are written off vs in arrears vs current 6 months later - to do this you need to link data sets 6 months apart - that may or may not be processed by the same person

The only reason I really see for changing the loan ID would be to stop someone who did not contribute to the data pool from getting the data, then adding their own to it (and not sharing), and so having a more complete data set than anyone else. But I'm putting my data into the pool and I still don't think this is a good enough reason to change the ID
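
The six-month comparison described above only works if both snapshots share the loan ID as a key. A rough sketch of what that join could look like, assuming each snapshot has been reduced to a loan ID -> status mapping; the status labels "In Arrears" and "Current" and the loan IDs are invented for the example.

```python
from collections import Counter

def status_transitions(earlier, later):
    """Count (old status, new status) pairs for loans present in both snapshots."""
    return Counter(
        (old_status, later[loan_id])
        for loan_id, old_status in earlier.items()
        if loan_id in later
    )

# Hypothetical snapshots taken six months apart.
april = {"LAI-1": "In Arrears", "LAI-2": "In Arrears", "LAI-3": "Current"}
october = {"LAI-1": "Charged Off", "LAI-2": "Current",
           "LAI-3": "Current", "LAI-4": "Current"}

for (old, new), count in sorted(status_transitions(april, october).items()):
    print(f"{old} -> {new}: {count}")
```

Loans that appear in only one snapshot (LAI-4 here) are ignored, so the counts describe only loans that can actually be tracked across both dates.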

myles
09-10-2018, 09:47 AM
You may wish to strip borrower information from it as well - use your own key.

I tend to agree with Humvee - this has already been done by Harmoney when they give the details to us - the LAI-# would not be shown on any borrower records (i.e. link to borrower would be internal only - well it should be?).


But at the end of the day, we can't beat what they provide, which is the loan grade as a loan risk; this loan grade takes into account all the hidden data points ie credit score.

Some of us are doing much better than what they provide, that's the point of doing this.


You would be wrong to associate risk with an individual loan dimension such as "board" or "living with parents"

Disagree with this, sure you've got to take multiple things into consideration, but you've only got to look at the re-write numbers to see that one dimension can be enough to lower default rates significantly.

If you want to splice and dice the data, a spreadsheet pivot table will do it for you, nothing too flash is really needed, once you have the data set to play with.

myles
09-10-2018, 09:48 AM
Data labels on bar charts can be very helpful. Eg., in plotting default by coborrower, you currently include sample sizes on x-axis (1079 co-borrowers and 9909 single borrower). Both bars are between 0 to 5%. Could you also include labels on top of the bars like 3% or better still 32, if including them is no trouble to you.

Good idea, if I forget, please remind me.

myles
09-10-2018, 10:02 AM
Just to expand on what I suggested about timeliness before - the data set is really a snapshot in time - as a whole, it can't be built on because it will potentially contain 'stale' loan data. To try to explain: If someone uploads a particular loan that no one else has in the initial data set, and then never uploads again, then that particular loan will just sit with no further detail updated - it will become 'stale' and impact on the data set.

The dates stored with the data i.e. LAST_PAYMENT_DATE, don't allow, with confidence, for these 'stale' records to be removed (they may just be in arrears). So the best way to get a refreshed set of data will be to do this whole thing again - I think once every 6 months or so would be frequent enough - I don't see this as a 'continuous' process.

However, the detail of loans that are 'Paid Off', 'Charged Off', 'Debt Sold', will not change and could be built on. Something to think about a bit later on I think.

Point I'm trying to make is that this is a one off at this stage, that could be repeated, but not too frequently.
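
The split between records that can go stale and records that can't could be sketched like this. The three final statuses are the ones named above; the column name STATUS and the sample rows are assumptions, since the actual csv headings aren't listed in this thread.

```python
# Statuses named in the post as final outcomes that will not change.
FINAL_STATUSES = {"Paid Off", "Charged Off", "Debt Sold"}

def split_by_finality(records, status_field="STATUS"):
    """Separate records with a final outcome from those that may still change."""
    final, still_open = [], []
    for record in records:
        target = final if record[status_field] in FINAL_STATUSES else still_open
        target.append(record)
    return final, still_open

# Hypothetical rows; "Current" and "In Arrears" are assumed labels.
rows = [
    {"LOAN_ID": "LAI-1", "STATUS": "Paid Off"},
    {"LOAN_ID": "LAI-2", "STATUS": "Current"},
    {"LOAN_ID": "LAI-3", "STATUS": "Debt Sold"},
    {"LOAN_ID": "LAI-4", "STATUS": "In Arrears"},
]
final, still_open = split_by_finality(rows)
print(len(final), "final,", len(still_open), "may go stale")
```

Only the `final` list would be safe to carry forward into a later pooled data set; the rest would need a fresh export.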

IntheRearWithTheGear
09-10-2018, 10:12 AM
Another option could be to just trust myles - all send him our raw data. Have him create a google sheet with all the data dumps and then share the sheet with all the data consolidated to all worthy contributors - we can then create our own graphs - and if a graph is any good publicise the summary data, which we can reference for discussions or improve on.

So one source of data controlled by myles, with many google sheets using that data as the source for pivot tables etc.

humvee
09-10-2018, 10:29 AM
Myles - What is your cut off date for receiving data? I've uploaded mine - but I can re-export and upload just before the cut off to make the data more current if this is helpful?

myles
09-10-2018, 10:41 AM
Another option could be to just trust myles - all send him our raw data. Have him create a google sheet with all the data dumps and then share the sheet with all the data consolidated to all worthy contributors - we can then create our own graphs - and if a graph is any good publicise the summary data, which we can reference for discussions or improve on.

So one source of data controlled by myles, with many google sheets using that data as the source for pivot tables etc.

Yeah, nah - too much work for me and I don't want control :p Everyone will want to look at the data differently, that's why the easy solution of just providing the whole set back seemed to be the best way to go. I'll do up the basics for those who don't have the skills to generate graphs etc. which, with a bit of tidy up I've already scripted so no drama.

myles
09-10-2018, 10:43 AM
Myles - What is your cut off date for receiving data, Ive uploaded mine - but I can re export and upload just before cut off to make the data more current if this is helpful?

I had that in the original post - late Sunday (I'll close the link sometime Monday morning - probably early early am).

beacon
09-10-2018, 11:50 AM
data set is really a snapshot in time - as a whole, it can't be built on because it will potentially contain 'stale' loan data. To try to explain: If someone uploads a particular loan that no one else has in the initial data set, and then never uploads again, then that particular loan will just sit with no further detail updated - it will become 'stale' and impact on the data set.

The dates stored with the data i.e. LAST_PAYMENT_DATE, don't allow, with confidence, for these 'stale' records to be removed (they may just be in arrears). ...

I agree, Myles.
Humvee, I think that while more data and more people sharing data is definitely better, Myles probably already has over 75% of the unique loan records in the Harmoney population with his own dataset, yours, mine and InTheRearWithTheGear's. If Cool Bear chooses to share too, I suspect we might end up with over 80% n. No statistician can argue purity of data on those numbers.


the detail of loans that are 'Paid Off', 'Charged Off', 'Debt Sold', will not change and could be built on..

Also agree. These don't become stale - as a final outcome has been already achieved. The only place this gets stuffed is where Harmoney has noted different outcomes in different portfolios for the same LAI. I came across this when we were pooling some Harmoney data earlier.

myles
09-10-2018, 11:58 AM
Also agree. These don't become stale - as a final outcome has been already achieved. The only place this gets stuffed is where Harmoney has noted different outcomes in different portfolios for the same LAI. I came across this when we were pooling some Harmoney data earlier.

Yep, will just have to 'fix' data errors that are obvious and hope they all come out in the wash...

The 'jump' in defaults that showed up in that last time lapse chart I did was surprising to me - I would not likely have found it if I hadn't done that time lapse... It appears Harmoney updated the LAST_PAYMENT_DATE field when the debt was sold, which broke the data; why they did that I don't know. I would have liked to have developed a 'hazard' curve based on real data, but can't because of this :(

myles
09-10-2018, 12:19 PM
Is there any additional data that anyone can think of that might be of value when pulling this together?

An example is that I could create a 'number of lenders' column, which would be a count of how many of the data sets included the loan - not overly meaningful but might be interesting to see which loans are taken more frequently? Hmm, might be easier to just release the 'raw' data and the 'clean/unique' data sets. Then anyone can do their own thing.

I'll just keep the main merge simple by taking the latest unique loan record as uploaded from the data sets - I'll add mine as a fresh set at the end.

Any suggestions or thoughts on this welcome.
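
For anyone curious, the "latest record wins" merge and the 'number of lenders' count could look roughly like the sketch below. It is not the actual script: it assumes one upload per lender, uploads processed in the order received, and hypothetical column names LOAN_ID, STATUS and NUM_LENDERS.

```python
from collections import Counter

def merge_uploads(uploads):
    """Merge uploads (oldest first); later uploads overwrite earlier rows.

    Returns {loan_id: row}, each row carrying a NUM_LENDERS count of how many
    uploads contained that loan.
    """
    merged = {}
    lender_counts = Counter()
    for rows in uploads:
        seen_in_upload = set()
        for row in rows:
            loan_id = row["LOAN_ID"]
            merged[loan_id] = dict(row)  # copy so the input is not modified
            seen_in_upload.add(loan_id)
        lender_counts.update(seen_in_upload)  # count each upload once per loan
    for loan_id, row in merged.items():
        row["NUM_LENDERS"] = lender_counts[loan_id]
    return merged

# Hypothetical uploads from two lenders.
uploads = [
    [{"LOAN_ID": "LAI-1", "STATUS": "Current"},
     {"LOAN_ID": "LAI-2", "STATUS": "Current"}],
    [{"LOAN_ID": "LAI-1", "STATUS": "Paid Off"}],
]
merged = merge_uploads(uploads)
print(merged["LAI-1"])
```

Upload order stands in for freshness here. If a per-record timestamp were available, comparing timestamps instead of relying on order would be the more robust tie breaker.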

beacon
09-10-2018, 12:47 PM
... It appears harmoney updated the LAST_PAYMENT_DATE field when the debt was sold which broke the data, why they did that I don't know. I would have liked to have developed a 'hazard' curve based on real data, but can't because of this :(

The least Harmoney can do is update the hazard curve annually, especially as it is still evolving. It still has the 15 month old one up (hazard-curve-jul-2017-583x260)

alundracloud
09-10-2018, 12:48 PM
Is there any additional data that anyone can think of that might be of value when pulling this together?

An example is that I could create a 'number of lenders' column, which would be a count of how many of the data sets included the loan - not overly meaningful but might be interesting to see which loans are taken more frequently? Hmm, might be easier to just release the 'raw' data and the 'clean/unique' data sets. Then anyone can do their own thing.

I'll just keep the main merge simple by taking the latest unique loan record as uploaded from the data sets - I'll add mine as a fresh set at the end.

Any suggestions or thoughts on this welcome.

I think your proposal to clean the data, then release a 'tidy' version of unique loans, is the best option. Often times with data, you don't develop your question until mucking around with the data.

I think keeping the 'LAI-' as a key is useful, I'd prefer that over InTheRearWithTheGear's suggestion.

beacon
09-10-2018, 12:53 PM
I think keeping the 'LAI-' as a key is useful, I'd prefer that over InTheRearWithTheGear's suggestion.

My vote for keeping LAI too. It is easier, cleaner, portable, scalable, discrete ...

IntheRearWithTheGear
09-10-2018, 12:56 PM
Add a record creation date/time - could be used as a tie breaker for determining freshness for duplicate rows.

beacon
09-10-2018, 01:17 PM
I have about 2500 active loans ... With 50% annual churn/repayment ...!

Data pool would benefit from contributions by early lenders like RMJH, 777, harvey specter, or Halebop, if they are still around.

Cool Bear
09-10-2018, 01:36 PM
My vote for keeping LAI too. It is easier, cleaner, portable, scalable, discrete ...
agreed too

Cool Bear
09-10-2018, 01:53 PM
I find that when analysing defaults, it is better to leave out all recent loans as they will lower the actual default rates. Based on my loans, the average time from date of loan to default is just over 13 months. The median is just short of 12 months. So my suggestion to Myles is to (in addition to the main set of graphs), also do one set (relating to defaults) ignoring all loans less than (say) 12 months old.
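
To illustrate why leaving out recent loans matters, here is a minimal sketch that computes the default rate with and without a minimum loan age. The loan dates and outcomes are invented, and age is measured in whole calendar months, which is a simplification.

```python
from datetime import date

def months_between(start, end):
    """Whole calendar months from start to end (day of month ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def default_rate(loans, as_of, min_age_months=0):
    """loans: list of (start_date, defaulted). Returns None if nothing qualifies."""
    outcomes = [defaulted for start, defaulted in loans
                if months_between(start, as_of) >= min_age_months]
    if not outcomes:
        return None
    return sum(outcomes) / len(outcomes)

# Hypothetical loan set: two seasoned loans, two recent ones.
loans = [
    (date(2017, 1, 15), True),
    (date(2017, 6, 1), False),
    (date(2018, 8, 1), False),
    (date(2018, 9, 1), False),
]
as_of = date(2018, 10, 9)
print("all loans:", default_rate(loans, as_of))
print("12+ months old:", default_rate(loans, as_of, min_age_months=12))
```

In this made-up set the recent loans have not had time to default, so including them halves the apparent rate (0.25 against 0.5 for the seasoned loans only).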

RMJH
09-10-2018, 03:33 PM
Data pool would benefit from contributions by early lenders like RMJH, 777, harvey specter, or Halebop, if they are still around.
Very happy to share, must be about 7000 loans including closed in my history. Do I just download the entire history to CSV and use the link to dropbox provided by Myles? Presumably it's easy enough to weed out duplicate loans found in the other data sets. cheers R

777
09-10-2018, 03:45 PM
Data pool would benefit from contributions by early lenders like RMJH, 777, harvey specter, or Halebop, if they are still around.

What data do you want? I can send in the report as produced by Harmoney if that would help.

I invested from 30/10/14 and started extracting myself from the system in July 2016. 160 loans averaging $250/loan

Only one loan left. A 60 month one which has 13 months to go.

myles
09-10-2018, 03:49 PM
Very happy to share, must be about 7000 loans including closed in my history. Do I just download the entire history to CSV and use the link to dropbox provided by Myles? Presumably it's easy enough to weed out duplicate loans found in the other data sets. cheers R

From the main Harmoney dashboard select [REPORTS] across the top and when the [Lending] page comes up you should see a big blue [Export] button above the counter on the right. When you click on that you'll get an email shortly after (could be 10 minutes) with a csv file attached. Just download that csv file (something like orders-report-1234567890-123456.csv) and rename it to just orders.csv. Then select the Drop Box link from my original post and follow those details to upload the orders.csv file.


At this point in time, with my loans included there are around 24000 loans in total with around 16000 unique loans.

Note: One of the uploads had a heap of extra commas at the end of each row - I'm guessing it was loaded into a spreadsheet first and then exported again. Best to just upload the raw csv (this didn't create a problem as the import drops any extra columns).
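
An import that tolerates those trailing commas could be sketched as below, using Python's csv module. This is not the actual import script, and the column names in the sample are hypothetical.

```python
import csv
import io

def read_rows(text):
    """Parse csv text, ignoring empty trailing columns added by a spreadsheet."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    # Drop empty header cells produced by trailing commas.
    while header and header[-1] == "":
        header.pop()
    rows = []
    for row in reader:
        if not any(cell.strip() for cell in row):
            continue  # skip completely blank lines
        # zip stops at the shorter sequence, so extra columns are dropped.
        rows.append(dict(zip(header, row)))
    return rows

# Hypothetical upload with extra commas at the end of each row.
sample = "LOAN_ID,GRADE,,,\nLAI-1,A1,,,\nLAI-2,B2,,,\n"
print(read_rows(sample))
```

This only handles the extra columns. Values a spreadsheet has rewritten, such as reformatted dates or percentages, would still come through changed, which is why the raw csv is the safer thing to upload.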

RMJH
09-10-2018, 04:30 PM
Thanks Myles. Data should be in Dropbox now.

alundracloud
09-10-2018, 05:27 PM
I've been looking around to see what others (academics) have done to try to determine default rates - some are using a 'derived' weighting on arrears so it looks at more recent data.

Has anyone else looked into this and come up with any good options?


Hi Myles,

This blog post at Lending Robot might be of interest..
http://blog.lendingrobot.com/research/predicting-the-number-of-payments-in-peer-lending/

And this one:
http://blog.lendingrobot.com/research/predicting-returns-for-ongoing-loans/

I'm hopeful (perhaps naively?) that we might be able to recreate something similar to the 'hazard curve' / 'survival function' graph with the compiled loan data.
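
As a starting point, a survival function can be estimated with the Kaplan-Meier method. The sketch below is a plain-Python version under stated assumptions: each loan is reduced to (months observed, defaulted), and loans that were repaid or are still running are treated as censored, which is a simplification. The observations are invented.

```python
def kaplan_meier(observations):
    """observations: list of (months_observed, defaulted).

    Returns [(month, survival probability)] at each month with a default.
    """
    event_times = sorted({t for t, defaulted in observations if defaulted})
    survival = 1.0
    curve = []
    for t in event_times:
        # Loans still being observed at month t.
        at_risk = sum(1 for months, _ in observations if months >= t)
        # Loans that defaulted exactly at month t.
        events = sum(1 for months, defaulted in observations
                     if defaulted and months == t)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical loans: (months observed, defaulted?)
obs = [(3, True), (6, False), (6, True), (12, False)]
for month, s in kaplan_meier(obs):
    print(f"month {month}: survival {s:.2f}")
```

The estimate is only as good as the default timing in the data, so a date field that does not reliably mark when a loan actually defaulted would distort the curve.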

myles
09-10-2018, 08:08 PM
I'm hopeful (perhaps naively?) that we might be able to recreate something similar to the 'hazard curve' / 'survival function' graph with the compiled loan data.

Thanks for those links, some good info in there particularly the second one.

Unfortunately I think Harmoney have stuffed the data by using the LAST_PAYMENT_DATE for dual purposes when they started selling off debt. Will wait and see when this data set comes together, but at the end of the day I'm of the thinking that I'd prefer to base my defaults on my own loan set, which is more representative of what I'm likely to see in future loans - work in progress, would love to try to integrate arrears somehow...

myles
09-10-2018, 09:19 PM
The csv file that was submitted just before lunch (11:55) today, labelled 'fred fred - orders 2 myles.csv', which I found had a heap of extra commas on the end of each row, also has a number of other issues (dates reversed, formatted % signs etc.) in the data.

I really can't use this as it would take too long to go through and check what has been changed.

If this is you can you please re-upload a new csv at some stage, otherwise I'll have to drop it out of the set as I don't want to introduce unknown errors.

Just the raw csv as it's emailed directly from Harmoney is the go, don't open it in any viewers etc. first.

Thanks.

Over 20,000 unique loans to date!

Cool Bear
09-10-2018, 11:32 PM
The csv file that was submitted just before lunch (11:55) today, labelled 'fred fred - orders 2 myles.csv', which I found had a heap of extra commas on the end of each row, also has a number of other issues (dates reversed, formatted % signs etc.) in the data.

I really can't use this as it would take too long to go through and check what has been changed.

If this is you can you please re-upload a new csv at some stage, otherwise I'll have to drop it out of the set as I don't want to introduce unknown errors.

Just the raw csv as it's emailed directly from Harmoney is the go, don't open it in any viewers etc. first.

Thanks.

Over 20,000 unique loans to date!
Myles, how many lines of data (number of loans) in the above mentioned csv file?

myles
10-10-2018, 01:46 AM
Myles, how many lines of data (number of loans) in the above mentioned csv file?
5769 including 1 line for the headings

leesal
10-10-2018, 08:57 AM
uploaded my data if any use.

Are you able to generate default rate by cohort(6 monthly)/and grade ? And same thing for early repayments.
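
A breakdown like that could be produced along these lines. This is a sketch only: it assumes each loan can be reduced to (start date, grade, defaulted), cohorts are calendar half-years, and the grades and dates shown are made up.

```python
from collections import defaultdict
from datetime import date

def cohort_label(start):
    """Six-monthly cohort, e.g. 2016-H1 for January to June 2016."""
    half = 1 if start.month <= 6 else 2
    return f"{start.year}-H{half}"

def default_rate_by_cohort_and_grade(loans):
    """loans: iterable of (start_date, grade, defaulted)."""
    totals = defaultdict(lambda: [0, 0])  # (cohort, grade) -> [loans, defaults]
    for start, grade, defaulted in loans:
        cell = totals[(cohort_label(start), grade)]
        cell[0] += 1
        cell[1] += int(defaulted)
    return {key: defaults / count for key, (count, defaults) in totals.items()}

# Hypothetical loans.
loans = [
    (date(2016, 3, 1), "A", False),
    (date(2016, 5, 1), "A", True),
    (date(2016, 9, 1), "E", True),
]
for key, rate in sorted(default_rate_by_cohort_and_grade(loans).items()):
    print(key, f"{rate:.0%}")
```

Swapping the `defaulted` flag for an "repaid early" flag would give the same table for early repayments.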

beacon
10-10-2018, 09:12 AM
What data do you want? I can send in the report as produced by Harmoney if that would help.

Yes 777, that is the one. As Myles has written : From the main Harmoney dashboard select [REPORTS] across the top and when the [Lending] page comes up you should see a big blue [Export] button above the counter on the right. When you click on that you'll get an email shortly after (could be 10 minutes) with a csv file attached. Just download that csv file (something like orders-report-1234567890-123456.csv) and rename it to just orders.csv. Then select the Drop Box link from my original post and follow those detail to upload the orders.csv file.

beacon
10-10-2018, 09:15 AM
I find that when analysing defaults, it is better to leave out all recent loans as they will lower the actual default rates. Based on my loans, the average time from date of loan to default is just over 13 months. The median is just short of 12 months. So my suggestion to Myles is to (in addition to the main set of graphs), also do one set (relating to defaults) ignoring all loans less than (say) 12 months old.

Perhaps the date of the most recent default as cutoff would be better, instead of say 12 months. I have had noticeable defaults in the <12 month period. By the way, thanks for sharing your dataset too, Cool Bear.
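A minimal Python sketch of the cutoff idea discussed here, assuming each loan is reduced to a (start date, status) pair. The tuple layout, the function name `default_rate` and the sample loans are made up for illustration and are not the Harmoney export format.

```python
from datetime import date

def default_rate(loans, as_of, min_age_days=365):
    """Share of sufficiently old loans that were charged off.

    loans: iterable of (start_date, status) tuples (hypothetical layout).
    Loans younger than min_age_days at as_of are left out, because they
    have not had time to default yet and would dilute the rate.
    """
    mature = [status for start, status in loans
              if (as_of - start).days >= min_age_days]
    if not mature:
        return None  # nothing old enough to judge
    charged_off = sum(1 for status in mature if status == "Charged Off")
    return charged_off / len(mature)

# Made-up sample: two seasoned loans and two recent ones.
loans = [
    (date(2016, 3, 1), "Charged Off"),
    (date(2016, 9, 1), "Paid Off"),
    (date(2018, 6, 1), "Active"),
    (date(2018, 8, 1), "Active"),
]
as_of = date(2018, 10, 10)
print(default_rate(loans, as_of, min_age_days=0))    # 0.25, all loans
print(default_rate(loans, as_of, min_age_days=365))  # 0.5, mature loans only
```

The same function could take the age of the most recent default as `min_age_days` instead of a flat 12 months, which is the alternative suggested above.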

beacon
10-10-2018, 09:18 AM
Thanks Myles. Data should be in Dropbox now.

Thanks RMJH.

Thanks leesal, and also thanks to all the data contributors. 20,000 unique records could very well be 95% of the Harmoney loan universe. :t_up:

777
10-10-2018, 09:35 AM
Yes 777, that is the one. As Myles has written: From the main Harmoney dashboard select [REPORTS] across the top and when the [Lending] page comes up you should see a big blue [Export] button above the counter on the right. When you click on that you'll get an email shortly after (could be 10 minutes) with a csv file attached. Just download that csv file (something like orders-report-1234567890-123456.csv) and rename it to just orders.csv. Then select the Drop Box link from my original post and follow those details to upload the orders.csv file.

Uploaded last night.

beacon
10-10-2018, 09:41 AM
20,000 unique records could very well be 95% of the Harmoney loan universe. :t_up:

My bad! Harmoney universe has had 47,000+ unique loans to date. So, we might be shy of 50% of the universe, but still statistically very significant numbers.

Thanks 777

myles
10-10-2018, 10:23 AM
Just a quick update on numbers - new uploads aren't adding too many new records which is a good sign:

Unique Loans: 20446 (First dated: 2014-10-29)
Raw Loans: 32494
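For anyone wondering how "Raw" becomes "Unique": several lenders hold a slice of the same loan, so the same loan turns up in more than one upload. A rough Python sketch of one way to collapse them, assuming the loan identifier sits in a column called "Loan ID" (a hypothetical name) and that keeping the first record seen is acceptable. Neither assumption is confirmed as the method actually used here, and the IDs below are made up.

```python
def unique_loans(rows, key="Loan ID"):
    """Keep the first record seen for each loan identifier.

    "Loan ID" is a hypothetical column name; substitute whatever the
    export actually calls the LAI-... identifier.
    """
    seen = {}
    for row in rows:
        seen.setdefault(row[key], row)  # later duplicates are ignored
    return list(seen.values())

# Made-up rows standing in for the merged uploads.
raw = [
    {"Loan ID": "LAI-00000001", "Grade": "B"},
    {"Loan ID": "LAI-00000002", "Grade": "A"},
    {"Loan ID": "LAI-00000001", "Grade": "B"},  # same loan, second lender
]
unique = unique_loans(raw)
print(f"Unique Loans: {len(unique)}")
print(f"Raw Loans: {len(raw)}")
```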

humvee
10-10-2018, 10:43 AM
Just a quick update on numbers - new uploads aren't adding too many new records which is a good sign:

Unique Loans: 20446 (First dated: 2014-10-29)
Raw Loans: 32494


What loan ID was on 2014-10-29? Must be very low

2015-01-31 was LAI-00009718

Cool Bear
10-10-2018, 01:18 PM
Are you able to generate default rates by cohort (6 monthly) and grade? And the same thing for early repayments.
Myles, by 6 monthly cohort (sorted by grade) would be fantastic if you can spare the time to do it. Thanks a million for doing this. CB

777
10-10-2018, 01:19 PM
What loan ID was on 2014-10-29? Must be very low

2015-01-31 was LAI-00009718

My earliest filled loan was 5531 issued on 30/10/14. I had an earlier numbered at 5493 but it did not get filled until 24/12/18.

myles
10-10-2018, 04:32 PM
Myles, by 6 monthly cohort (sorted by grade) would be fantastic if you can spare the time to do it. Thanks a million for doing this. CB

This will be a very 'wide' graph. Are you wanting by grade as in A, B,...F (50 bars) or as in grade A1, A2,...F5 (300 bars)?

Perhaps it needs to be plotted differently (just a line graph)?

Will wait till the combined data comes together and then have a look.

Cool Bear
10-10-2018, 05:09 PM
This will be a very 'wide' graph. Are you wanting by grade as in A, B,...F (50 bars) or as in grade A1, A2,...F5 (300 bars)?

Perhaps it needs to be plotted differently (just a line graph)?

Will wait till the combined data comes together and then have a look.
I was thinking of one graph for each cohort (a1 to f5) but if that is too time consuming, then one overall graph with just A to F as a start

leesal
10-10-2018, 06:37 PM
This will be a very 'wide' graph. Are you wanting by grade as in A, B,...F (50 bars) or as in grade A1, A2,...F5 (300 bars)?

Perhaps it needs to be plotted differently (just a line graph)?

Will wait till the combined data comes together and then have a look.

I was thinking of one graph for each cohort (a1 to f5) but if that is too time consuming, then one overall graph with just A to F as a start

You'd probably want to run it as a line graph, with a separate line for grade (A, B... F) [imagine plotting by cohort would overlap too much]. Alternatively if using a bar graph, may need to run a separate graph for each cohort.

Thanks Myles :) Must be interesting being able to work all that data - just under 50% of the loans out there!

myles
10-10-2018, 08:00 PM
Must be interesting being able to work all that data

Haven't had much time to play :(

But here is the latest:

Unique Loans: 21249

Total Loan Value: $432,596,325.00

10047

Number of loans that defaulted for each bar has been added, which is a good addition :t_up:

Just adding this one as it is significantly different from what was shown in the past:

10048
Calculated by determining the minimum date where a loan has a PP and only included loans on and after that date. More like what most expect it to be.

Cool Bear
10-10-2018, 11:00 PM
Haven't had much time to play :(

But here is the latest:

Unique Loans: 21249

Total Loan Value: $432,596,325.00

10047

Number of loans that defaulted for each bar has been added, which is a good addition :t_up:

Just adding this one as it is significantly different from what was shown in the past:

10048
Calculated by determining the minimum date where a loan has a PP and only included loans on and after that date. More like what most expect it to be.
I wonder if the PP are taken up more by the borrowers in the riskier grades and thus have more defaults. Could you do one showing % of PP taken by the various grades? If the percentages are about even, then we can safely conclude to stay away from PP loans. But if it shows that more borrowers in the riskier grades take PP then that conclusion may not be valid.

myles
11-10-2018, 12:18 AM
I wonder if the PP are taken up more by the borrowers in the riskier grades and thus have more defaults. Could you do one showing % of PP taken by the various grades? If the percentages are about even, then we can safely conclude to stay away from PP loans. But if it shows that more borrowers in the riskier grades take PP then that conclusion may not be valid.
Probably need to plot this differently but the details are shown:

10050

The ratios are pretty even across all grades, but more PP taken up in mid risk grades.

RMJH
11-10-2018, 07:59 AM
Great work Myles, adding the sample size is very helpful. Some interesting patterns there. Have you got graphs by 6m cohort too?

beacon
11-10-2018, 08:06 AM
Number of loans that defaulted for each bar has been added, which is a good addition :t_up:

Just adding this one as it is significantly different from what was shown in the past:

10048
Calculated by determining the minimum date where a loan has a PP and only included loans on and after that date. More like what most expect it to be.

Thanks Myles. Part covers defaulting more than full covers - I thought it was an anomaly just in my much smaller dataset, but not so. How interesting!

Cool Bear
11-10-2018, 10:24 AM
Probably need to plot this differently but the details are shown:

10050

The ratios are pretty even across all grades, but more PP taken up in mid risk grades.

Great work Myles. Thank you.

I think we can safely conclude that PP loans are a greater risk than non-PP loans (although not to the extent that the earlier aggregate chart shows). With partial PP performing a bit worse than full PP.

With us having to pay HM for the sales commission even for failed PP loans, that makes the loss to us investors even higher. Another factor to take into account when investing manually.

Thanks again.

myles
11-10-2018, 10:53 AM
The hard part is determining if the increased income outweighs the increased risk from PP. I've never looked at the numbers but 'trusted' Harmoney's suggested 1% gain?

Need to correct what I said earlier too, now looking again at that last PP graph:

There is more PP taken up in higher risk grades in proportion to loans in that grade (not what I said before about mid grades - that was volume of PP not proportion)...needed to be plotted differently to highlight that...but the detail is there.

humvee
11-10-2018, 12:00 PM
Probably need to plot this differently but the details are shown:

10050

The ratios are pretty even across all grades, but more PP taken up in mid risk grades.


It is kind of surprising that defaults on payment protect loans are higher, as I would have thought a number of the causes of defaults on normal loans should not cause a default on loans with payment protect, but should instead cause a claim against payment protect.

Vagabond47
11-10-2018, 12:42 PM
It is kind of surprising that defaults on payment protect loans are higher, as I would have thought a number of the causes of defaults on normal loans should not cause a default on loans with payment protect, but should instead cause a claim against payment protect.

But payment protect is a prime example of adverse selection bias. Those that have good stable jobs and are confident of their ability to pay don't get PP, while those that know work has been a bit slow lately etc do get PP.

humvee
11-10-2018, 01:02 PM
But payment protect is a prime example of adverse selection bias. Those that have good stable jobs and are confident of their ability to pay don't get PP, while those that know work has been a bit slow lately etc do get PP.

I agree with you that that must be what's happening, but it's interesting: if you look at the figures for full payment protect in particular, this group of people must be predisposed to events or problems that cause defaults that are NOT Death, Terminal Illness, Disability or Involuntary Redundancy, because if the cause was any of these it would be a payment protect claim rather than a default.

myles
11-10-2018, 03:08 PM
You'd probably want to run it as a line graph, with a separate line for grade (A, B... F) [imagine plotting by cohort would overlap too much]. Alternatively if using a bar graph, may need to run a separate graph for each cohort.
Had wanted to work out how to create a 'Fence' chart (finicky to get right) so thought I'd give it a go for this. Result below:

10054

Not sure if there is enough detail to be meaningful? Would running it for individual grades (i.e. A1, A2) be useful? Let me know what you think.

Perhaps there is a better way to show what you want - or it might be best to use a Pivot Table once you have the data and slice and dice as you want?

myles
11-10-2018, 06:25 PM
Argh, lost two posts :( stupid mobile interface has far too big a Delete button...

Anyway, 3rd time lucky.

I realised I got my wires crossed with what you guys are asking for - I thought you wanted 'Enquiries last 6 months' vs Grades rather than 6 monthly period borrower cohort vs Grades.

Unfortunately, I don't think we are going to be able to generate this. When Harmoney started selling off debt they used the 'Last Payment Date' field for some unknown reason. By doing this they overwrote the date when the loan actually defaulted (i.e. when the last payment was made).

Will have to have a look and see...but that last time-lapse showed a very large chunk of loans being sold off and effectively having their default date wiped into the future.

Cool Bear
11-10-2018, 07:34 PM
Argh, lost two posts :( stupid mobile interface has far too big a Delete button...

Anyway, 3rd time lucky.

I realised I got my wires crossed with what you guys are asking for - I thought you wanted 'Enquiries last 6 months' vs Grades rather than 6 monthly period borrower cohort vs Grades.

Unfortunately, I don't think we are going to be able to generate this. When Harmoney started selling off debt they used the 'Last Payment Date' field for some unknown reason. By doing this they overwrote the date when the loan actually defaulted (i.e. when the last payment was made).

Will have to have a look and see...but that last time-lapse showed a very large chunk of loans being sold off and effectively having their default date wiped into the future.
Myles,

No, I meant the date the loan originates. So the cohorts are loans originating in 1h2015, 2h2015, 1h2016, 2h2016 etc.

So that, for example, we can see out of all the loans originating in 1h2015, how many A, B,...F have been sold/charged off etc..
X axis 1h2015A, 1h2015B...etc
Y axis percentage of loans that are charged off.
No time lapse, just bar charts.

This is so that we can see the percentage charge off rates for older cohorts compared to younger cohorts.
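A small Python sketch of the cohort calculation described above, assuming each loan is a (start date, grade, status) triple and that both 'Charged Off' and 'Debt Sold' count as charged off. The triple layout, the function name and the sample loans are illustrative assumptions, not the export format.

```python
from collections import defaultdict
from datetime import date

CHARGED_OFF = {"Charged Off", "Debt Sold"}  # assumption: both count as a loss

def charge_off_by_cohort(loans):
    """Return {label: percent charged off}, oldest cohort first.

    Labels look like "1h2015A": half-year of origination plus letter grade.
    """
    totals = defaultdict(int)
    losses = defaultdict(int)
    for start, grade, status in loans:
        half = 1 if start.month <= 6 else 2
        key = (start.year, half, grade[0])  # grade[0] maps "A1" to "A"
        totals[key] += 1
        if status in CHARGED_OFF:
            losses[key] += 1
    return {
        f"{half}h{year}{letter}":
            100.0 * losses[(year, half, letter)] / totals[(year, half, letter)]
        for year, half, letter in sorted(totals)
    }

# Made-up sample loans.
sample = [
    (date(2015, 2, 1), "A1", "Paid Off"),
    (date(2015, 5, 1), "A3", "Charged Off"),
    (date(2015, 8, 1), "B2", "Debt Sold"),
    (date(2016, 1, 15), "A1", "Active"),
]
for label, pct in charge_off_by_cohort(sample).items():
    print(f"{label}: {pct:.1f}%")
```

The keys are sorted by (year, half, grade) rather than by the label text, so that 2h2015 comes before 1h2016 on the X axis.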

leesal
11-10-2018, 08:50 PM
Had wanted to work out how to create a 'Fence' chart (finicky to get right) so thought I'd give it a go for this. Result below:

10054

Not sure if there is enough detail to be meaningful? Would running it for individual grades (i.e. A1, A2) be useful? Let me know what you think.

Perhaps there is a better way to show what you want - or it might be best to use a Pivot Table once you have the data and slice and dice as you want?

Despite not requesting this data :), the result of grade vs enquiry is interesting.

Shows that grade is far more important than # of enquiries in predicting a default. Hard to glean much (as there may be confounders such as the cohort mix), but it seems to suggest that 6+ enquiries at A-C grade spikes defaults; at D to E grades it doesn't seem to matter, and at F there appears to be an inverse relationship, i.e. more enquiries results in fewer defaults. Fascinating!

leesal
11-10-2018, 09:01 PM
Myles,

No, I meant the date the loan originates. So the cohorts are loans originating in 1h2015, 2h2015, 1h2016, 2h2016 etc.

So that, for example, we can see out of all the loans originating in 1h2015, how many A, B,...F have been sold/charged off etc..
X axis 1h2015A, 1h2015B...etc
Y axis percentage of loans that are charged off.
No time lapse, just bar charts.

This is so that we can see the percentage charge off rates for older cohorts compared to younger cohorts.

Yep, that's exactly what was requested.

I'm guessing with your IT skills you don't need me to tell you. However I run cohort off a vlookup on column B "Date". Run a column on the combined export, and you should have all the data you need. Like the Fence post graph btw, very sweet presentation!

10057

myles
12-10-2018, 12:11 AM
Early release of basic chart set for comment:

summary.pdf (https://www.dropbox.com/s/e02rj5kc8d6fgt9/summary.pdf?raw=1)

I'll remove or break this link when the final data set and summary are done.

Looking for comments on the graphs etc. Will probably add to this over time, but this is just to get it out there for some initial comments so I can fix most of what needs fixing before the final data set comes together. Decided on pdf to keep it all together and for ease of viewing.

Notes:

The charts are all based on the full set of data (except where stated), which means the default values are watered down a little due to the approx. 9 month gap between when a loan starts and when a default is flagged as having occurred (possibly ~7 months in the past). Do we just want to live with this or make some arbitrary cut off? (or run another set)
Some of the earlier loans, I feel, don't reflect the more recent loans, i.e. there were some 'dodgy', poorly graded loans in the early days. These will likely be inflating the more 'current' default rates. Should we consider dropping some of the earlier loans out (some are still current)?
Perhaps the above two cancel each other out to some extent?
Any time-based default detail is broken due to 'incorrect' use of the 'Last Payment Date' field, so some caution needs to be taken around this type of data.
Dollar values of loans i.e. 'Outstanding Principal' etc., are of no value, since they are only for the particular portion of the unique loan record in the data set - so it needs to be understood that using these values would result in meaningless detail. (A couple of values like the 'Total Loan Value' can be used - ratios might be okay.)
I've limited quite a few data sets to a population of at least 10, just to remove outliers so core detail in charts etc. aren't influenced.

alundracloud
12-10-2018, 08:33 AM
This is great Myles- awesome work putting it all together.

Can I get you to double-check the Rewrite graph? (Am I reading it correctly that 0/6610 have defaulted?)

Included in the dataset should be this loan, which although it had 0 successful payments before being re-written, still got classified by Harmoney as a Re-write. So there should be at least 1 on that graph? (It should have been included in the orders.csv uploaded by 'Wilma@Bedrock.Flinstones', yesterday afternoon)

10058



**edit**

I've gone and looked at the csv, and it looks as though Harmoney have adjusted the 'Previous Loan Pay-off (re-write)' column to N/A, whereas the Harmoney portal/dashboard under reports shows this loan was in fact a re-write. I get the feeling that the re-write info on charged-off loans is unreliable: it looks to me as though the re-write info shown in the csv changes once the status goes from in-arrears -> charged off.

10059

RMJH
12-10-2018, 08:42 AM
Many thanks Myles. This is great. First thoughts after a brief squizz: 1. Some cohort based data would be helpful or maybe just exclude loans below a certain age eg 12 months. This could be at letter grade level if that helps. 2. Some measure of net return would be of interest though of course with changes in rates and algorithms there will be comparability issues and the time factor would need to be accommodated. Maybe apply to just completed loans? 3. When you say full data set I assume you are not including duplicates.

Cool Bear
12-10-2018, 09:09 AM
This is great Myles- awesome work putting it all together.

Can I get you to double-check the Rewrite graph? (Am I reading it correctly that 0/6610 have defaulted?)

Included in the dataset should be this loan, which although it had 0 successful payments before being re-written, still got classified by Harmoney as a Re-write. So there should be at least 1 on that graph? (It should have been included in the orders.csv uploaded by 'Wilma@Bedrock.Flinstones', yesterday afternoon)

10058



**edit**

I've gone and looked at the csv, and it looks as though Harmoney have adjusted the 'Previous Loan Pay-off (re-write)' column to N/A, whereas the Harmoney portal/dashboard under reports shows this loan was in fact a re-write. I get the feeling that the re-write info on charged-off loans is unreliable: it looks to me as though the re-write info shown in the csv changes once the status goes from in-arrears -> charged off.

10059

I just had a quick look at my defaults in HM reports on the website. Of the last 10 charge-offs, 6 had indicated that they were re-writes but had 0 successful payments and 0 remaining payments. Maybe HM just "zeroizes" the re-writes when loans are charged off. So maybe the conclusion that re-writes are safer investments is not correct after all.

Cool Bear
12-10-2018, 09:14 AM
Early release of basic chart set for comment:

summary.pdf (https://www.dropbox.com/s/e02rj5kc8d6fgt9/summary.pdf?dl=0)

I'll remove or break this link when the final data set and summary are done.

Looking for comments on the graphs etc. Will probably add to this over time, but this is just to get it out there for some initial comments so I can fix most of what needs fixing before the final data set comes together. Decided on pdf to keep it all together and for ease of viewing.

Notes:

The charts are all based on the full set of data (except where stated), which means the default values are watered down a little due to the approx. 9 month gap between when a loan starts and when a default is flagged as having occurred (possibly ~7 months in the past). Do we just want to live with this or make some arbitrary cut off? (or run another set)
Some of the earlier loans, I feel, don't reflect the more recent loans, i.e. there were some 'dodgy', poorly graded loans in the early days. These will likely be inflating the more 'current' default rates. Should we consider dropping some of the earlier loans out (some are still current)?
Perhaps the above two cancel each other out to some extent?
Any time-based default detail is broken due to 'incorrect' use of the 'Last Payment Date' field, so some caution needs to be taken around this type of data.
Dollar values of loans i.e. 'Outstanding Principal' etc., are of no value, since they are only for the particular portion of the unique loan record in the data set - so it needs to be understood that using these values would result in meaningless detail. (A couple of values like the 'Total Loan Value' can be used - ratios might be okay.)
I've limited quite a few data sets to a population of at least 10, just to remove outliers so core detail in charts etc. aren't influenced.


Thanks Myles for taking so much time and effort in this. Fantastic work indeed!

myles
12-10-2018, 09:39 AM
I just had a quick look at my defaults in HM reports on the website. Of the last 10 charge-offs, 6 had indicated that they were re-writes but had 0 successful payments and 0 remaining payments. Maybe HM just "zeroizes" the re-writes when loans are charged off. So maybe the conclusion that re-writes are safer investments is not correct after all.

It was too good to be true. That's why I asked about it a little while back and when someone indicated that they had only invested in re-writes and had no defaults I wrongly took that as confirmation. :( Good to know, with confirmation now. :)

Oh well, looks like this stat can go in the bin, as I can't see how a 'Charged Off' re-write can be differentiated from a 'Charged Off' non re-write...

From what I've read, Harmoney have had a good few Software Developers/Coders influence their interface over time, I'm guessing they had a few hacks go through who don't know the value of data integrity :(

myles
12-10-2018, 09:45 AM
1. Some cohort based data would be helpful or maybe just exclude loans below a certain age eg 12 months. This could be at letter grade level if that helps. 2. Some measure of net return would be of interest though of course with changes in rates and algorithms there will be comparability issues and the time factor would need to be accommodated. Maybe apply to just completed loans? 3. When you say full data set I assume you are not including duplicates.

1. Next step I think - have to validate the basics first before getting deeper into it.
2. Would have to think long and hard on this one - details like how much did 'you' invest in the loan, fee rate, tax rate etc. come into it, which this data set does not contain (individual data only - so it would be up to you to do that for your own data).
3. Yep, I won't be reporting anything on the 'RAW' data (except total records), only the unique loan set is in the charts etc.

myles
12-10-2018, 10:00 AM
Ignore the 'Investment' chart. It is using one of those value columns that are at individual level which has no meaning in a combined data set (I use this on my loans, but no value here - I'll remove it).

beacon
12-10-2018, 11:39 AM
Looking for comments on the graphs etc.


Great effort Myles. Don't understand your last graph though. Perhaps a short explanation would help here, something like: HM estimates are roughly on the mark, as actual defaults appear to average out at HM estimates, etc.
I think some more information on early repaid by loan grade and protect would be very valuable too. At the moment the aggregate paid off of this data set is 52.6%, but my own data shows Bs repaying much faster than average, and protect loans repaying much slower than average.



default values are watered down a little due to the approx. 9 month gap between when a loan starts and when a default is flagged as having occurred (possibly ~7 months in the past). Do we just want to live with this or make some arbitrary cut off? (or run another set)


Perhaps cutoff at the date of most recent default, to include all chargeoffs regardless of how soon or late they were booked to individual accounts. This will exclude some early repayments but we can live with that, unless you want to produce a full set chart too to compare.



Some of the earlier loans, I feel, don't reflect the more recent loans, i.e. there were some 'dodgy', poorly graded loans in the early days. These will likely be inflating the more 'current' default rates. Should we consider dropping some of the earlier loans out (some are still current)?
Perhaps the above two cancel each other out to some extent?


If I were charting, I'd leave them in, as grading evolution as well as application are subjective and will continue to change with time. Even if Harmoney has become smarter at vetting, the economy is still headed out of the Goldilocks zone, so leaving a buffer might not hurt. Charge-offs for this dataset (lean in Fs, and maybe As) stand at 4.8%, which is in the recent ballpark declared by HM.



Dollar values of loans i.e. 'Outstanding Principal' etc., are of no value, since they are only for the particular portion of the unique loan record in the data set - so it needs to be understood that using these values would result in meaningless detail. (A couple of values like the 'Total Loan Value' can be used - ratios might be okay.)


I think this is a progression of your earlier idea when you wanted to see how many had invested in a certain loan, to determine desirable loan characteristics in some sort of a covariant analysis. There is merit in aggregating the invested amounts per unique LAI, if this can be done easily. One can just follow the money then, to see what the contributors liked/invested in.



Oh well, looks like this stat can go in the bin, as I can't see how a 'Charged Off' re-write can be differentiated from a 'Charged Off' non re-write... From what I've read, Harmoney have had a good few Software Developers/Coders influence their interface over time, I'm guessing they had a few hacks go through who don't know the value of data integrity :(


Amen. Well spotted alundracloud and Cool Bear. I also just had a quick look at my defaults in HM reports on the website. Of the last dozen charge-offs, half were re-writes with 0 successful payments and 0 remaining payments. So, HM is wiping off the re-writes in summaries, when loans are charged-off. This brings into question the sanctity of not just their initial listings data, but also the subsequent changes they make to it. :eek2:

myles
12-10-2018, 01:11 PM
Just for info: Latest 'Charged Off'/'Debt Sold' loans in data set looks like:

Month|Count
-----------------
2017-04|22
2017-05|20
2017-06|23
2017-07|16
2017-08|15
2017-09|14
2017-10|8
2017-11|9
2017-12|8
2018-01|3
2018-02|5
2018-03|1
2018-04|1

So the latest one is 6 months old.
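A monthly tally like the one above can be produced with a few lines of Python. This is only a sketch: the dates are made up, and which export field carries a usable charge-off date is an open question given the 'Last Payment Date' issue discussed earlier.

```python
from collections import Counter
from datetime import date

def monthly_counts(dates):
    """Count events per calendar month, returned in chronological order."""
    counts = Counter(d.strftime("%Y-%m") for d in dates)
    return sorted(counts.items())  # "YYYY-MM" strings sort chronologically

# Made-up charge-off dates.
sample = [date(2018, 3, 14), date(2018, 2, 2), date(2018, 2, 20), date(2018, 4, 5)]
print("Month|Count")
for month, count in monthly_counts(sample):
    print(f"{month}|{count}")
```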

myles
12-10-2018, 01:20 PM
And to help complete the picture these are the 'Paid Off' numbers:

Month|Count
---------------
2017-04|364
2017-05|420
2017-06|325
2017-07|310
2017-08|345
2017-09|303
2017-10|295
2017-11|282
2017-12|243
2018-01|263
2018-02|203
2018-03|171
2018-04|73
2018-05|49
2018-06|10
2018-07|9
2018-08|3
2018-09|7

So a fair amount of early 'Paid Off' loans would be left out...hmm...

beacon
12-10-2018, 01:36 PM
And to help complete the picture these are the 'Paid Off' numbers: ...

Month|Count
---------------
2017-04|364
2017-05|420
2017-06|325
2017-07|310
2017-08|345
2017-09|303
2017-10|295
2017-11|282
2017-12|243
2018-01|263
2018-02|203
2018-03|171
2018-04|73
2018-05|49
2018-06|10
2018-07|9
2018-08|3
2018-09|7

So a fair amount of early 'Paid Off' loans would be left out...hmm...

So, if I was worried about the default rate being watered down, I'd include data up to the end of April/date of last default, because I'd want to include that last default for its effect on variate percentages. A cutoff at the end of May will get me the 50-odd early paid too. I'd definitely start data from the first loan on the books - given substantial numbers for both early repayments and defaults.

And I'd do the same for the separate PP dataset beginning from the date of the first PP loan.

RMJH
12-10-2018, 01:45 PM
When you quote default rates, are these %'s by loan numbers or $ values ?

myles
12-10-2018, 01:49 PM
And I'd do the same for the separate PP dataset beginning from the date of the first PP loan.

Have done the PP one - see note on side of graph.

I'm tempted to just go with all loans from start to current; the watering down effect is likely to be minimal and one would expect it to affect everything by the same proportion (assuming similar loan characteristics throughout the year)?

Happy to go with the flow with this, just don't want to over complicate things too early :)

myles
12-10-2018, 01:57 PM
When you quote default rates, are these %'s by loan numbers or $ values ?

Numbers, can't do value on a merged data set. (everyone invests different amounts in individual loans - the merged unique dataset is made up of values from different lenders)

RMJH
12-10-2018, 02:14 PM
Numbers, can't do value on a merged data set. (everyone invests different amounts in individual loans - the merged unique dataset is made up of values from different lenders)
Thanks, I'm clearly a bit behind some of you guys! I guess to an extent using numbers compensates for the young loans included though the figures should be viewed as comparative (for setting filters) rather than absolute (for calculating returns) and are not directly comparable with HM's.

humvee
12-10-2018, 02:38 PM
Hi Myles

How many loans exist in the combined data set with each of the following problems?

Negative principal outstanding

Paid Off/Cancelled - but still owe money

Active/Arrears but $0.00 principal owing

As I assume others have the same problems as I have had with data quality - Or maybe even more concerning - if this data is correct

myles
12-10-2018, 03:59 PM
How many loans exist in the combined data set with each of the following problems?


Negative principal outstanding: 49 (-$152.13)
Paid Off/Cancelled - but still owe money: 248 ($1350.66)
Active/Arrears but $0.00 principal owing: 8

Some of this could be due to timing, small rounding type errors, and there will be duplicates? Doesn't look excessive for the size of the data set?
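For anyone wanting to run the same three checks on their own export, here is a hedged Python sketch. It assumes each loan is a (status, outstanding principal) pair with the principal as a plain decimal string, and that the status labels are spelled as written in this thread; the real export may format either differently.

```python
from decimal import Decimal

def quality_issues(loans):
    """Count the three data-quality problems listed above.

    loans: iterable of (status, outstanding_principal) pairs, where the
    principal is a string such as "12.34" (hypothetical layout).
    """
    issues = {"negative_principal": 0, "closed_but_owing": 0, "open_but_zero": 0}
    for status, principal in loans:
        amount = Decimal(principal)  # Decimal avoids float rounding on cents
        if amount < 0:
            issues["negative_principal"] += 1
        if status in ("Paid Off", "Cancelled") and amount > 0:
            issues["closed_but_owing"] += 1
        if status in ("Active", "Arrears") and amount == 0:
            issues["open_but_zero"] += 1
    return issues

# Made-up sample rows.
sample = [
    ("Active", "1250.00"),
    ("Paid Off", "5.42"),    # closed but still owing
    ("Arrears", "0.00"),     # open with nothing owing
    ("Paid Off", "-3.10"),   # negative principal
    ("Paid Off", "0.00"),    # clean
]
print(quality_issues(sample))
```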

joker
12-10-2018, 04:07 PM
I just had a quick look at my defaults in HM reports on the website. Of the last 10 charge-offs, 6 had indicated that they were re-writes but had 0 successful payments and 0 remaining payments. Maybe HM just "zeroizes" the re-writes when loans are charged off. So maybe the conclusion that re-writes are safer investments is not correct after all.

Yes, all Harmoney charged-off loans that were a rewrite seem to have had all their rewrite data corrupted or changed. All mine have been reset to '0' in the charged-off loan details and N/A in the reports...

10062

myles
12-10-2018, 11:38 PM
Thoughts on this style:

10063

leesal
13-10-2018, 12:48 AM
Fantastic effort so far Myles. And a very good pickup by alundracloud on the rewrite data glitch.

Will however comment that many of the graphs lack predictive validity, as you haven't controlled for confounders :) I'm guessing there are significantly more AB grades in home improvement and 50-59 age group loans, and more EF grades in 20-29yo and household items (for example).

The question is whether a "C" grade 20-29yo is more likely to default than a "C" grade 50-59yo, and similarly with Home Improvements vs HH Items or Used Cars. If not, those categories may already be fully explained by the grading category assigned by HM.
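To illustrate the confounding point with entirely made-up numbers: in the Python sketch below the younger group looks worse overall, yet within each grade the two age groups default at the same rate, because the young borrowers are concentrated in the riskier grade. The field names ("grade", "age", "defaulted") and the counts are hypothetical, not taken from the data set.

```python
from collections import defaultdict

def rates_by(loans, *fields):
    """Default rate for each combination of the given fields."""
    totals = defaultdict(int)
    bad = defaultdict(int)
    for loan in loans:
        key = tuple(loan[f] for f in fields)
        totals[key] += 1
        bad[key] += loan["defaulted"]  # 1 if defaulted, else 0
    return {key: bad[key] / totals[key] for key in totals}

def make(grade, age, n, defaults):
    """Build n hypothetical loans, the first `defaults` of which defaulted."""
    return [{"grade": grade, "age": age, "defaulted": int(i < defaults)}
            for i in range(n)]

# Made-up mix: older borrowers mostly grade A, younger mostly grade E.
loans = (make("A", "50-59", 8, 2) + make("A", "20-29", 4, 1)
         + make("E", "50-59", 2, 1) + make("E", "20-29", 6, 3))

print(rates_by(loans, "age"))           # age alone: 0.3 vs 0.4
print(rates_by(loans, "grade", "age"))  # within grade: identical rates
```

In this toy case age adds nothing once grade is known, which is the situation described above where a category is already fully explained by the grade.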

Commend you on your efforts so far though, top work!

10064

leesal
13-10-2018, 01:03 AM
Thoughts on this style:

10063

Really good. Thanks Myles.

Appears that HM had trouble grading C and D grades early on in '14. Otherwise looks to be running consistent. Wonder whether there's early signs of improvement in DEF grade defaults out of platform 1.5

myles
13-10-2018, 02:08 AM
Will however comment that many of the graphs lack predictive validity, as you haven't controlled for confounders
It was never my intent to run numbers for all combinations, nor to validate the grading system...

By putting together some high level detail, individuals can drill down to 'discover' what they think are key predictors and perhaps share what they find if it is of value. As I said before, a pivot table can be used to do much of that type of work. There is plenty of research available that has covered much of this ground before - whether it applies is the question.

RMJH
13-10-2018, 08:58 AM
Thoughts on this style:

10063
That one I really like. Big picture and time based. And as time moves on it should give increasing information on default variability/predictability. If you could show which risk model was being used for each cohort so much the better.

myles
13-10-2018, 10:43 AM
I've updated the summary document with a few more charts: summary.pdf (https://www.dropbox.com/s/e02rj5kc8d6fgt9/summary.pdf?raw=1)

Would appreciate thoughts on the last chart - Estimated vs Actual Default Rate - which I've tidied up and made more informative (I hope). Harmoney (https://www.harmoney.co.nz/investors/default-rates), in their definition of Estimated Default Rate, make no mention of limiting the 'window' of loans by ignoring the first period where loans don't default/aren't flagged as defaults. This is the approach I've taken to be consistent with their figures (i.e. no window).

However, other platforms, e.g. Lending Club, limit the window to loans older than 120 days:

"Annualized Charge Off Rate is calculated by dividing the total amount of loans in charge off by the total amount of loans issued for more than 120 days, divided by the number of months loans in charge off have been outstanding and multiplied by twelve. The loans issued for less than 120 days are excluded from the calculation because loans are unlikely to charge off during the first 120 days."
Source: Lending Club - How We Measure Annualized Charge Off Rate (https://www.lendingclub.com/public/annualizedDefaultRateHelpPop.action)

If anyone has thoughts on this, I'd appreciate them.
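For anyone who wants to try the Lending Club style window on their own numbers, the quoted definition can be sketched roughly as below. This is only my reading of the wording, not Lending Club's or Harmoney's actual code. The `Loan` fields, the `months_outstanding` input and the example figures are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Loan:
    amount: float      # amount issued (hypothetical field)
    age_days: int      # days since the loan was issued (hypothetical field)
    charged_off: bool  # True if the loan is in charge off


def annualized_charge_off_rate(loans, months_outstanding, window_days=120):
    """(amount in charge off / amount issued more than window_days ago)
    / months outstanding * 12, following the quoted wording."""
    issued = sum(l.amount for l in loans if l.age_days > window_days)
    if issued == 0 or months_outstanding <= 0:
        return 0.0
    charged_off = sum(l.amount for l in loans if l.charged_off)
    return charged_off / issued / months_outstanding * 12


# Made-up example: the 30 day old loan is ignored by the 120 day window.
example_loans = [
    Loan(1000, 400, False),
    Loan(1000, 300, True),
    Loan(2000, 200, False),
    Loan(5000, 30, False),
]
windowed = annualized_charge_off_rate(example_loans, months_outstanding=6)
no_window = annualized_charge_off_rate(example_loans, 6, window_days=0)
print(f"120 day window: {windowed:.2%}, no window: {no_window:.2%}")
```

Setting `window_days=0` gives the "no window" variant, which makes the effect of the exclusion easy to compare on the same loans.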

beacon
13-10-2018, 11:20 AM
Thoughts on this style:

10063

Great graph, Myles. Illuminating to see defaults reaching a quarter of the loans already, in the earlier cohorts for DEF.

BJ1
13-10-2018, 11:20 AM
10065

How many willing buyers are there for this risk? I bet it gets filled and defaults - but none of my money is going to repayment ratios like this..

beacon
13-10-2018, 11:30 AM
Harmoney (https://www.harmoney.co.nz/investors/default-rates) ... make no mention of limiting the 'window' of loans by ignoring the first period where loans don't default/aren't flagged as defaults. This is the approach I've taken to be consistent with their figures (i.e. no window). ... Lending Club, limit the window to loans older than 120 days:

"Annualized Charge Off Rate is calculated by dividing the total amount of loans in charge off by the total amount of loans issued for more than 120 days, divided by the number of months loans in charge off have been outstanding and multiplied by twelve. The loans issued for less than 120 days are excluded from the calculation because loans are unlikely to charge off during the first 120 days."
Source: Lending Club - How We Measure Annualized Charge Off Rate (https://www.lendingclub.com/public/annualizedDefaultRateHelpPop.action)

If anyone has thoughts on this, I'd appreciate them.

I agree that using the all-loans approach is valuable for comparative purposes (with Harmoney) and easier, but the Lending Club approach of excluding the <120 day loans and annualizing the rest is more technically correct. I would have used the former.

beacon
13-10-2018, 11:54 AM
... there's significantly more AB grades in home improvement and 50-59 age group loans; and more EF grades in 20-29yo and household items (for example).

The question begs whether a "C" grade 20-29yo is more likely to default compared to a "C" grade 50-59, and similar with Home Improvements vs HH Items or Used Cars. If not those categories may already be fully explained by the grading category assigned by HM.

Commend you on your efforts so far though, top work!

10064

I had observed in my data too that riskier grades were predominantly youngsters and safer grades elders, so there is an element of safety with age already built into HM grading, I think. But interesting insights on household items vs home improvements, leesal.

beacon
13-10-2018, 01:41 PM
I've updated the summary document with a few more charts: summary.pdf (https://www.dropbox.com/s/e02rj5kc8d6fgt9/summary.pdf?raw=1)

Would appreciate thoughts on the last chart - Estimated vs Actual Default Rate - which I've tidied up and made more informative (I hope).

Thanks Myles, I still don't get what you are plotting here - and what it is showing? Also, the samples (bottom numbers on x-axis) in borrower profile loan characteristic charts don't seem to add up to 21492. Can you re check please?

myles
13-10-2018, 02:51 PM
I still don't get what you are plotting here - and what it is showing?

The Orange line is the Harmoney suggested default rate - this is plotted against itself to produce the line. So one of the points that make up this line is 3.15% vs 3.15% - this is the Scorecard 1.5 estimated default rate for an E1 risk grade (details here (https://www.harmoney.co.nz/investors/investment-risks)). There is a similar point for each estimated default rate for all risk grades (Scorecard 1.0 and 1.5 - sourced from the data set) that makes up the line.

The scatter graph is plotting the recorded estimated default rate for each data set record (grouped by estimated default rate) against what the actual default rate for that group is. This puts a point on the graph at x=Estimated Default Rate vs y=Actual Default Rate. The circle is proportional to the size of the population of loans with that estimated default rate.

Hmm, not sure how clear that is...

Basically showing what Harmoney say the default is estimated to be vs what it actually is (grouped by risk grade).............
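If it helps, the grouping behind the circles can be sketched like this. It is a rough illustration only: the input is assumed to be one (estimated default rate, defaulted?) pair per loan, the 10% rate and the loan counts are invented, and no annualising is done here.

```python
from collections import defaultdict


def bubble_points(loans):
    """loans: iterable of (estimated_default_rate, defaulted) pairs.
    Returns (x=estimated rate, y=actual rate, size=population) per group."""
    totals = defaultdict(lambda: [0, 0])  # estimated rate -> [defaults, population]
    for estimated, defaulted in loans:
        totals[estimated][0] += int(defaulted)
        totals[estimated][1] += 1
    return [(est, d / n, n) for est, (d, n) in sorted(totals.items())]


# Invented sample: ten loans at the 3.15% estimate, four at a 10% estimate.
sample = (
    [(0.0315, False)] * 9 + [(0.0315, True)]
    + [(0.10, False)] * 3 + [(0.10, True)]
)
points = bubble_points(sample)
for estimated, actual, population in points:
    side = "above" if actual > estimated else "on or below"
    print(f"estimated {estimated:.2%} actual {actual:.2%} n={population} ({side} the line)")
```

A point sits on the orange line when actual equals estimated, above it when the group has defaulted more than estimated.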

myles
13-10-2018, 03:42 PM
Also, the samples (bottom numbers on x-axis) in borrower profile loan characteristic charts don't seem to add up to 21492. Can you re check please?
Shortfall will be Cancelled loans ... a quick check shows them adding up to 21308? Database count confirms 21308 as loans not Cancelled.

Summary at top of report may not be completely correct - will be when I finalise the data set.

myles
13-10-2018, 03:46 PM
Just so there is a comparison of re-write vs non re-write default rates (based on my own loan set only, as this detail can't be captured from the csv):

10066

beacon
13-10-2018, 03:53 PM
Basically showing what Harmoney say the default is estimated to be vs what it actually is (grouped by risk grade).............

I see it now, thanks.


Summary at top of report may not be completely correct - will be when I finalise the data set.

Okay, thanks.

So, is it safe to assume that (barring the jump over 30% due to debt sold) the default line on page 20 is the up-to-date hazard curve of our data pool?

myles
13-10-2018, 05:35 PM
So, is it safe to assume that (barring the jump over 30% due to debt sold) the default line on page 20 is the up-to-date hazard curve of our data pool?

Nope, it's broken. If you go back to that last time-lapse graph you'll see that when those loans were sold off it affected loans across time, not at a specific time, i.e. the loans all actually defaulted at different times, but were sold off together. When they were sold off the date used to determine when they defaulted was set to when they were sold - most appeared in the 25 - 35 'ish month period (100-150 week on time-lapse). Just looking at it now, I don't think it's annualised anyway. I'll remove it to avoid confusion.

myles
14-10-2018, 03:47 AM
Have updated the summary.pdf (https://www.dropbox.com/s/e02rj5kc8d6fgt9/summary.pdf?raw=1) document with quite a few characteristic by grade chart sets where they make sense. Some of these appear to offer some hints of how the Harmoney grading process may be working.

At this stage I'll take a break from making new charts unless someone asks for something specific or I think of something useful - if I've missed anything, let me know.

I'll put up the final summary document and unique.csv and raw.csv sometime late tonight. If you find any errors or have problems with these let me know.

Time to digest some of this info ;)

leesal
14-10-2018, 10:43 AM
Have updated the summary.pdf (https://www.dropbox.com/s/e02rj5kc8d6fgt9/summary.pdf?raw=1) document with quite a few characteristic by grade chart sets where they make sense. Some of these appear to offer some hints of how the Harmoney grading process may be working.

At this stage I'll take a break from making new charts unless someone asks for something specific or I think of something useful - if I've missed anything, let me know.

I'll put up the final summary document and unique.csv and raw.csv sometime late tonight. If you find any errors or have problems with these let me know.

Time to digest some of this info ;)

Fantastic work Myles. Lots of info to make sense of. Reinforces that previous default is something to avoid and to consider scaling down pp loans. Owning a home seems to perform better than renting, and surprisingly the 20-29 age group performs comparatively well.

The only other thing that could be done (but really getting quite pedantic) is to determine the average age of loan in each category to ensure comparing apples with apples. But it's probably safe to assume simplistically the loan age mix would be similar.

How are you calculating HM estimated default rate per group in the bubble graph? In "Half yearly paid off loans", does "paid off" exclude part paid?

Thanks again!

myles
14-10-2018, 11:02 AM
How are you calculating HM estimated default rate per group in the bubble graph? In "Half yearly paid off loans", does "paid off" exclude part paid?
Attempted explanation a few posts back - calculated by grouping loans with same estimated default values (i.e. risk grade) - it is not time based. Different Scorecard risk grades had different estimated default values so make up different circles.

In case it's not clear, each loan has the Estimated Default Rate recorded with it - so calculation is similar to all other bar charts, but annualised as per Harmoney's estimated rate.

joker
14-10-2018, 12:08 PM
Excellent document myles. Just a query about co-borrower defaults. The chart on p6 shows loans with a co-borrower default at half the rate of single borrower loans yet on p25 co-borrower defaults seem to be almost always at a higher rate than single borrower loans. It may be the way I'm reading it but I would appreciate your view on this. Cheers and thanks for what is obviously a huge amount of time-consuming work on your part.

Cool Bear
14-10-2018, 01:53 PM
Excellent document myles. Just a query about co-borrower defaults. The chart on p6 shows loans with a co-borrower default at half the rate of single borrower loans yet on p25 co-borrower defaults seem to be almost always at a higher rate than single borrower loans. It may be the way I'm reading it but I would appreciate your view on this. Cheers and thanks for what is obviously a huge amount of time-consuming work on your part.

Myles, I will try to answer this.

Good spotting Joker.

The difference is in the scale of things.

On page 25, for grades where the co-borrowers' defaults are higher, the numbers (as in population) are not significant - for example F, where only 2 defaults in co-borrowers out of 8 gives it a 25% default rate. OR the absolute difference in percentages is not significant, as in A where the single is 0.45% (14/3111) and co-borrowers 0.97% (5/517) - an absolute difference of just 0.52%. Both population size and absolute percentage difference are important when you combine the grades.

Whereas, where the singles are higher, e.g. E, both the numbers and absolute percentage difference are significant. So E has single 291/2336 = 12.46% and co-borrowers 7/72 = 9.72%, a difference of 2.74%.

As an example, if you add E and F, the numbers are single (291+250)/(2336+1203) = 15.29% and co-borrowers (7+2)/(72+8) = 11.25%. So, singles are still way higher than co-borrowers. But looking at the two charts (without noticing the numbers) one would have thought that they even out.

An analogy would be if you add a pot of boiling water to a pot of cold water, the temperature of the cold water would rise considerably. But if you add a spoonful of boiling water to a pot of cold water, it does not make much difference to the cold water. Or if you add a pot of slightly warmer water to the pot of cold water, there is not much difference to the cold water either.
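For anyone who wants to check the arithmetic, here is a small sketch using only the A, E and F counts quoted in this post (the function names are just for illustration). It shows a per-grade rate next to the pooled rate, so you can see how a tiny group like F co-borrowers (2 of 8) barely moves the combined figure.

```python
# (defaults, loans) per grade, as quoted in the post above.
single = {"A": (14, 3111), "E": (291, 2336), "F": (250, 1203)}
co_borrower = {"A": (5, 517), "E": (7, 72), "F": (2, 8)}


def rate(defaults, loans):
    """Default rate for one grade."""
    return defaults / loans


def pooled_rate(groups, grades):
    """Default rate after adding the raw counts of several grades together."""
    defaults = sum(groups[g][0] for g in grades)
    loans = sum(groups[g][1] for g in grades)
    return defaults / loans


for grade in ("A", "E", "F"):
    print(
        grade,
        f"single {rate(*single[grade]):.2%}",
        f"co-borrower {rate(*co_borrower[grade]):.2%}",
    )

ef_single = pooled_rate(single, ("E", "F"))
ef_co = pooled_rate(co_borrower, ("E", "F"))
print("E+F", f"single {ef_single:.2%}", f"co-borrower {ef_co:.2%}")
```

The pooled rate is a population-weighted average of the grade rates, which is why the big E single group dominates while the small co-borrower groups hardly register.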

alundracloud
14-10-2018, 01:57 PM
Would anyone that knows a bit more about tax than myself care to provide some advice / peer-review my work?

I've put together a Double Entry cashbook in Excel for my Harmoney Investing, and I'd like a second pair of eyes to check my work. I've never done any form of accounting before, so this was all pretty new to me...

All of my Harmoney investing has been carried out through a NZ Registered Company. I intend to claim back the Harmoney Fees, as the company is "in the business of lending"

This is how I've coded the different kinds of transactions that I've carried out- Does this look right? I wasn't really too sure about the Payment Protect unfunded... :confused:

10069

myles
14-10-2018, 02:05 PM
The chart on p6 shows loans with a co-borrower default at half the rate of single borrower loans yet on p25 co-borrower defaults seem to be almost always at a higher rate than single borrower loans. It may be the way I'm reading it but I would appreciate your view on this.

If you add up all the individual numbers (population and defaults) you see that they all add up, so no smoke and mirrors ;)

You need to consider the weighting/proportion of each grade bar to the whole - for example the E grade 'No' bar (appears a little higher) represents 2458 loans at ~12% default rate (305 defaults), compared to say the A Grade 'No' bar (appears much lower) that represents 3158 loans at ~0.4% default rate (14 defaults). The E swamps the A on defaults so at the overall level drives up the number of 'No' defaults.

The left and right bars don't represent the same base number of loans, it is the ratio to the base number of loans.

Harder to explain than it should be...?

leesal
14-10-2018, 03:54 PM
If you add up all the individual numbers (population and defaults) you see that they all add up, so no smoke and mirrors ;)

You need to consider the weighting/proportion of each grade bar to the whole - for example the E grade 'No' bar (appears a little higher) represents 2458 loans at ~12% default rate (305 defaults), compared to say the A Grade 'No' bar (appears much lower) that represents 3158 loans at ~0.4% default rate (14 defaults). The E swamps the A on defaults so at the overall level drives up the number of 'No' defaults.

The left and right bars don't represent the same base number of loans, it is the ratio to the base number of loans.

Harder to explain than it should be...?

Yep hard to explain - all to do with mix. Cool Bear's teaspoon of boiling water was a good analogy.

Co-Borrower is a good example of adverse selection. Two average borrowers (D+D grade) given a B grade loan, does not equal one good borrower (B grade)

leesal
14-10-2018, 04:05 PM
Attempted explanation a few posts back - calculated by grouping loans with same estimated default values (i.e. risk grade) - it is not time based. Different Scorecard risk grades had different estimated default values so make up different circles.

In case it's not clear, each loan has the Estimated Default Rate recorded with it - so calculation is similar to all other bar charts, but annualised as per Harmoney's estimated rate.

Gotcha, and on the other side - the actual default would be the total # of defaults / cumulative # of days of all the loans.

How do you add up paid off / charge offs? Is it taken to current date, or date of repayment/default?

myles
14-10-2018, 04:36 PM
Have a read of the Harmoney and Lending Club definitions - I linked to them above.

No cumulative # of days, just number of days / 365.25.

Defaults are all 'Charge Off' + 'Debt Sold' vs everything except 'Cancelled' loans.

The final date was the query I was asking about before - Harmoney don't appear to do the same as Lending Club. Lending Club takes it back an arbitrary 120 days, Harmoney take from first to last (that's my interpretation of their wording). So first loan start date to last loan payment date is what I'm using. I actually had max(start date) - min(start date), but have 'fixed' that to max(last payment date) - min(start date) and it has shifted the scatter pretty much under the line in lower risk grades - high risk grades are off the line (i.e. lower default rate on high risk grades than estimated).

It seems a pretty rudimentary way of calculating it, but it appears to be the way it is done...
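As a rough sketch of the calculation described above (my reading of it, not Harmoney's actual method): defaults are 'Charge Off' plus 'Debt Sold', the base is everything except 'Cancelled', and the period is max(last payment date) - min(start date) in days / 365.25. The dict keys and the sample loans are made up for illustration.

```python
from datetime import date

DEFAULT_STATUSES = {"Charge Off", "Debt Sold"}


def annualised_default_rate(loans):
    """loans: list of dicts with hypothetical keys 'status', 'start', 'last_payment'."""
    counted = [l for l in loans if l["status"] != "Cancelled"]
    if not counted:
        return 0.0
    defaults = sum(1 for l in counted if l["status"] in DEFAULT_STATUSES)
    first_start = min(l["start"] for l in counted)
    last_payment = max(l["last_payment"] for l in counted)
    years = (last_payment - first_start).days / 365.25
    if years <= 0:
        return 0.0
    return defaults / len(counted) / years


# Made-up loans: one default out of three counted loans over roughly two years.
sample_loans = [
    {"status": "Paid Off", "start": date(2016, 1, 1), "last_payment": date(2017, 6, 1)},
    {"status": "Charge Off", "start": date(2016, 3, 1), "last_payment": date(2016, 9, 1)},
    {"status": "Active", "start": date(2017, 1, 1), "last_payment": date(2018, 1, 1)},
    {"status": "Cancelled", "start": date(2015, 1, 1), "last_payment": date(2015, 1, 1)},
]
sample_rate = annualised_default_rate(sample_loans)
print(f"annualised default rate: {sample_rate:.2%}")
```

Swapping `last_payment` for `start` in the `max(...)` line reproduces the earlier max(start date) - min(start date) version, which shortens the period and so pushes the annualised rate up.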

myles
14-10-2018, 08:02 PM
Putting this up a bit early. It includes the two updated csv files that were put up today. The data set summary on page 4 should be correct for the unique.csv file which is what all of the charts are based on.

I've dropped off the last chart as it seemed to be causing some confusion, will revisit it at a later time.

Please note the warning in the summary about comparing values to annualised values.

I can claim the largest loan :) There is a story behind it - it wasn't meant to be - moral of the story - don't get distracted when purchasing loans. I've given up sweating over it, if it defaults it will hurt a bit, but not too much now :)

summary.pdf (https://www.dropbox.com/s/e02rj5kc8d6fgt9/summary.pdf?raw=1)

unique.csv (https://www.dropbox.com/s/i4dm1mfh5wni7m4/unique.csv?raw=1)

raw.csv (https://www.dropbox.com/s/lsrpvihjppiub25/raw.csv?raw=1)

I've deliberately not compressed the csv files (they aren't overly big), just to avoid problems. These are big enough that they may cause some spreadsheet software to 'chug' so be prepared for that.

If you find any errors in the data or corrections to the numbers in the summary, please let me know and I'll correct them when I can.

If there are charts that you think would be useful to everyone and would like me to try to generate, I'm happy to do that within reason.

If you find the secret to selecting the perfect loans, please share.

If you see a loan for a Caravan, you might be onto a good thing ;)

Enjoy!

humvee
15-10-2018, 09:50 AM
As it's currently proposed Harmoney could end up caught up in this too

https://www.interest.co.nz/personal-finance/96271/loan-shark-crackdown-includes-interest-fee-caps-fit-proper-person-test-and

I cannot find a definition of who they define as "high-cost lenders"

However if we just look at interest and fees alone in the combined data set there are 11 unique loans where interest paid to date is greater than amount invested - key point to remember is this is only what has been paid to date - the real number will be a lot higher once loans reach full term. Also as proposed the 100% cap includes all fees as well as default charges.
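A check like this can be sketched as below, assuming each loan record carries the amount invested plus interest and fees paid to date. The key names and the sample figures are made up, they are not the csv column names.

```python
def over_cap(loans, cap=1.0):
    """Return loans whose interest plus fees paid to date exceed cap x principal."""
    return [
        loan
        for loan in loans
        if loan["interest_paid"] + loan["fees_paid"] > cap * loan["principal"]
    ]


# Made-up records with hypothetical keys.
records = [
    {"id": 1, "principal": 500.0, "interest_paid": 480.0, "fees_paid": 40.0},
    {"id": 2, "principal": 500.0, "interest_paid": 300.0, "fees_paid": 0.0},
    {"id": 3, "principal": 25.0, "interest_paid": 25.0, "fees_paid": 0.0},
]
flagged = over_cap(records)
print([loan["id"] for loan in flagged])
```

Loan 3 sits exactly at 100% and is not flagged, since the comparison is strictly greater than the cap.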



"Interest and fees on high-cost loans will be limited to 100% of the amount borrowed (the loan principal). Thus if an individual borrows $500, they will never have to pay the lender back more than $1000, including all fees and interest, the Government says. This will only apply to "high-cost lenders" with the aim being to prevent unmanageable debt and financial hardship from accumulating large debts from a small loan.The idea is that even if the borrower defaults, they would repay no more than twice the original loan principal, including interest, default interest, and all fees. "

IntheRearWithTheGear
15-10-2018, 10:02 AM
Heard on the radio (but haven't seen in writing) that high cost was defined as interest rates greater than 50% per annum.

Highest at Harmoney was 39%, don't know if that is the case now..

leesal
15-10-2018, 10:06 AM
Putting this up a bit early. It includes the two updated csv files that were put up today. The data set summary on page 4 should be correct for the unique.csv file which is what all of the charts are based on.

I've dropped off the last chart as it seemed to be causing some confusion, will revisit it at a later time.

Please note the warning in the summary about comparing values to annualised values.

I can claim the largest loan :) There is a story behind it - it wasn't meant to be - moral of the story - don't get distracted when purchasing loans. I've given up sweating over it, if it defaults it will hurt a bit, but not too much now :)

summary.pdf (https://www.dropbox.com/s/e02rj5kc8d6fgt9/summary.pdf?raw=1)

unique.csv (https://www.dropbox.com/s/i4dm1mfh5wni7m4/unique.csv?raw=1)

raw.csv (https://www.dropbox.com/s/lsrpvihjppiub25/raw.csv?raw=1)


Enjoy!

Fantastic effort Myles. Lots of top notch analysis which will help shed valuable insights, to help optimise our loan selections, even should the economy head towards more turbulent times.

Am comparatively an excel hack compared to your DB charting skills, but here's a lone graph that some may find useful.

It tracks lifecycle stats by month of initiation. Stats are all as a % of initial capital invested. eg int% = total interest earned to date for mmmyy loan/ total investment made in mmmyy

Also the "arrears" figure I've taken is my own calculation of "principal at risk" (being the entire principal outstanding of loans more than half a payment in arrears). In addition I don't rely on HM judgement on arrears. Happy to share how I calculate if requested.

Second graph shows remaining principal in $ value. Use this as a measure of future potential (eg potential for further interest / further defaults). Months with less than 10% principal outstanding should have minimal change to Int/Charge off.

Thirdly is a measure of clear interest %. Being Interest earned less principal in arrears less defaults (no HM fee). Note it is not annualised, but over the period of the investment (which given the high early repayment would probably work out as annual anyway!).

10071
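For anyone wanting to reproduce these lifecycle stats outside Excel, a rough sketch of the per-month calculation follows. The dict keys ('cohort', 'invested', 'interest', 'charged_off', 'at_risk') and the sample loans are assumptions for illustration, and 'at_risk' stands in for the "principal at risk" figure, however you choose to calculate it.

```python
from collections import defaultdict

FIELDS = ("invested", "interest", "charged_off", "at_risk")


def cohort_stats(loans):
    """Sum each field per month of initiation, then express as a % of capital invested."""
    sums = defaultdict(lambda: dict.fromkeys(FIELDS, 0.0))
    for loan in loans:
        for field in FIELDS:
            sums[loan["cohort"]][field] += loan[field]

    stats = {}
    for cohort, s in sums.items():
        invested = s["invested"]
        if invested == 0:
            continue  # nothing invested that month, so no ratios to report
        stats[cohort] = {
            "int_pct": s["interest"] / invested,
            "charge_off_pct": s["charged_off"] / invested,
            "arrears_pct": s["at_risk"] / invested,
            # clear interest: interest less principal in arrears less defaults
            "clear_int_pct": (s["interest"] - s["at_risk"] - s["charged_off"]) / invested,
        }
    return stats


# Made-up loans: two $25 notes bought in the same month.
my_loans = [
    {"cohort": "Jan17", "invested": 25.0, "interest": 5.0, "charged_off": 0.0, "at_risk": 0.0},
    {"cohort": "Jan17", "invested": 25.0, "interest": 3.0, "charged_off": 2.0, "at_risk": 0.0},
]
jan17 = cohort_stats(my_loans)["Jan17"]
print({name: f"{value:.1%}" for name, value in jan17.items()})
```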

Cool Bear
15-10-2018, 10:33 AM
Once again, many thanks Myles for your time and effort and producing a fantastic top-notch and very professional report.

Cool Bear
15-10-2018, 10:37 AM
Fantastic effort Myles. Lots of top notch analysis which will help shed valuable insights, to help optimise our loan selections, even should the economy head towards more turbulent times.

Am comparatively an excel hack compared to your DB charting skills, but here's a lone graph that some may find useful.

It tracks lifecycle stats by month of initiation. Stats are all as a % of initial capital invested. eg int% = total interest earned to date for mmmyy loan/ total investment made in mmmyy

Also the "arrears" figure I've taken is my own calculation of "principal at risk" (being the entire principal outstanding of loans more than half a payment in arrears). In addition I don't rely on HM judgement on arrears. Happy to share how I calculate if requested.

Second graph shows remaining principal in $ value. Use this as a measure of future potential (eg potential for further interest / further defaults). Months with less than 10% principal outstanding should have minimal change to Int/Charge off.

Thirdly is a measure of clear interest %. Being Interest earned less principal in arrears less defaults (no HM fee). Note it is not annualised, but over the period of the investment (which given the high early repayment would probably work out as annual anyway!).

10071

Thanks for sharing Leesal. Interesting charts

beacon
15-10-2018, 11:49 AM
If you find the secret to selecting the perfect loans, please share.

If you see a loan for a Caravan, you might be onto a good thing ;)

Enjoy!

Watch out for owners doing home improvements :)

leesal
15-10-2018, 12:28 PM
E4 with partial pp, just up.

Previously would have mulled this over, and on a day with few loans would have taken it. Thanks to Myles's data will leave this one :)

10072

humvee
15-10-2018, 04:59 PM
E4 with partial pp, just up.

Previously would have mulled this over, and on a day with few loans would have taken it. Thanks to Myles's data will leave this one :)

10072

I was looking at this one too but what caught my eye was actually the level of income on a benefit. By my calculations, to get that level of after-tax income would require a yearly before-tax income of around $78,000. Are we really paying benefits that high? No wonder I pay so much tax (or is Harmoney income data wrong again?)

That is equivalent of 1 person working full time for $37.50/hour or 2 people working full time at $18.75/hour or 91 hours of work/ week at minimum wage
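The working behind those equivalents, as a quick sketch. The 40 hour week, 52 weeks a year and the $16.50 minimum wage are my assumptions, they are not stated in the post.

```python
GROSS_INCOME = 78_000   # before-tax income estimated in the post
HOURS_PER_WEEK = 40     # assumed full-time week
WEEKS_PER_YEAR = 52     # assumed, no unpaid weeks
MINIMUM_WAGE = 16.50    # assumed hourly minimum wage, not stated in the post

hourly_one_person = GROSS_INCOME / (HOURS_PER_WEEK * WEEKS_PER_YEAR)
hourly_two_people = hourly_one_person / 2
min_wage_hours_per_week = GROSS_INCOME / WEEKS_PER_YEAR / MINIMUM_WAGE

print(f"one person full time: ${hourly_one_person:.2f}/hour")
print(f"two people full time: ${hourly_two_people:.2f}/hour each")
print(f"hours per week at minimum wage: {min_wage_hours_per_week:.0f}")
```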

leesal
16-10-2018, 01:25 AM
I was looking at this one too but what caught my eye was actually the level of income on a benefit. By my calculations, to get that level of after-tax income would require a yearly before-tax income of around $78,000. Are we really paying benefits that high? No wonder I pay so much tax (or is Harmoney income data wrong again?)

That is equivalent of 1 person working full time for $37.50/hour or 2 people working full time at $18.75/hour or 91 hours of work/ week at minimum wage

That got me too for a moment, but the beneficiary seems to be earning only $30k per year. Still, if that's the case, my wife can pack in her job and bunk up on the dole.

myles
16-10-2018, 01:47 AM
How useful/useless is this mapping of 2nd level loan criteria (https://www.dropbox.com/s/bqu2dmjaasu92c9/level_2_defaults.pdf?raw=1)?

What is that you ask...I've mapped what I think are the more practical criteria against all others (2nd level) and then sorted (descending) by default rate (and then population - higher population means more significant). Limited to a minimum population of 10, just to get rid of the chaff.

It takes a little bit to generate, but I should be able to run it for individual grades and either report it or dump it out to csv. However, population numbers might be too low at the grade level, even with this data set (>20,000)...

I've not spent a lot of time going through it, but it contains some key info - I think?

Added: Just to explain the first row in case it's not obvious:

This is taking all loans with a 'Residential Status' of 'Fully Owned - No Mortgage' AND a 'Loan Purpose' of 'Debt Consolidation' and calculating the number of defaults, the total number of loans and the default rate. In this case there are 118 loans that fit these two criteria, none of which have defaulted.

This is the top of the chart, which is a fairly obvious set of criteria to take that lead place...
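For anyone wanting to run something similar on the csv themselves, a rough sketch of the 2nd level mapping is below. It is not the code behind the pdf: it pairs every criterion with every other one (rather than only the "practical" ones), it puts the lowest default rate first so a zero-default row leads the list as in the example row, and the 'defaulted' key and sample loans are made up.

```python
from collections import defaultdict
from itertools import combinations


def level_2_defaults(loans, criteria, min_population=10):
    """Rows of (crit1, value1, crit2, value2, defaults, population, rate %)."""
    counts = defaultdict(lambda: [0, 0])  # pair of criteria values -> [defaults, population]
    for loan in loans:
        for c1, c2 in combinations(criteria, 2):
            key = (c1, loan[c1], c2, loan[c2])
            counts[key][0] += int(loan["defaulted"])
            counts[key][1] += 1
    rows = [
        (*key, d, n, 100 * d / n)
        for key, (d, n) in counts.items()
        if n >= min_population
    ]
    # Lowest default rate first; larger populations first within the same rate.
    rows.sort(key=lambda row: (row[6], -row[5]))
    return rows


def make_loans(count, defaults, status, purpose):
    """Build made-up loans, the first `defaults` of which have defaulted."""
    return [
        {"Residential Status": status, "Loan Purpose": purpose, "defaulted": i < defaults}
        for i in range(count)
    ]


sample = (
    make_loans(12, 0, "Fully Owned - No Mortgage", "Debt Consolidation")
    + make_loans(10, 3, "Renting", "Purchase New Vehicle")
    + make_loans(4, 4, "Boarding", "Other")  # below the minimum population, dropped
)
table = level_2_defaults(sample, ["Residential Status", "Loan Purpose"])
for row in table:
    print(row)
```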

leesal
16-10-2018, 02:08 AM
How useful/useless is this mapping of 2nd level loan criteria (https://www.dropbox.com/s/bqu2dmjaasu92c9/level_2_defaults.pdf?raw=1)?

What is that you ask...I've mapped what I think are the more practical criteria against all others (2nd level) and then sorted (descending) by default rate (and then population - higher population means more significant). Limited to a minimum population of 10, just to get rid of the chaff.

It takes a little bit to generate, but I should be able to run it for individual grades and either report it or dump it out to csv. However, population numbers might be too low at the grade level, even with this data set (>20,000)...

I've not spent a lot of time going through it, but it contains some key info - I think?

Added: Just to explain the first row in case it's not obvious:

This is taking all loans with a 'Residential Status' of 'Fully Owned - No Mortgage' AND a 'Loan Purpose' of 'Debt Consolidation' and calculating the number of defaults, the total number of loans and the default rate. In this case there are 118 loans that fit these two criteria, none of which have defaulted.

This is the top of the chart, which is a fairly obvious set of criteria to take that lead place...

If it shows the same types of categories occurring again and again, then useful as a data mining tool! Being able to compare against some form of level of return however would be helpful; it's difficult to contextualise the risk/reward benefit without.

leesal
16-10-2018, 02:16 AM
Further to Myles efforts, couldn't resist doing a multivariate model.

comparing two loan sets:
1. Debt to income < 20%, enquiries 3 or less, time in job of 3 or more years, exclude new vehicle - new boat - tax - wedding expense - other loan purpose, exclude borrowers aged over 60, exclude loans having 2 or more defaults, exclude partial PP, exclude B and C grade who aren't homeowners, exclude all A and all F grades.
2. loans that don't match the above criteria

Method:
* Included only loans dated prior to 1/1/2017.
* All $ values are common sized. That means all Interest, Charge-Off,Arrears, Investment, brought down to the size of one $25 note.

The results show that the model selection produced an average default of 3.6%, while the control had average default of 7.5%. Grade results, showed defaults were approx half in BCD, while E was only slightly improved compared to the control.

Need to catch up on sleep , time to give data-modelling a break..

10077

10078
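In case anyone wants to try the same split, here is a rough sketch of the first loan set as a filter, plus the common sizing. All the dict keys and the purpose labels are hypothetical (they are not the csv column names), and the way the criteria are coded is my reading of the list above.

```python
EXCLUDED_PURPOSES = {"New Vehicle", "New Boat", "Tax", "Wedding Expense", "Other"}


def matches_model(loan):
    """True if the loan passes every filter in loan set 1."""
    grade = loan["grade"]
    return (
        loan["debt_to_income"] < 0.20
        and loan["enquiries"] <= 3
        and loan["years_in_job"] >= 3
        and loan["purpose"] not in EXCLUDED_PURPOSES
        and loan["age"] <= 60
        and loan["prior_defaults"] < 2
        and not loan["partial_pp"]
        and grade not in ("A", "F")
        and not (grade in ("B", "C") and not loan["homeowner"])
    )


def common_size(loan, note=25.0):
    """Scale a loan's $ values down to one note, so every loan weighs the same."""
    scale = note / loan["invested"]
    return {key: loan[key] * scale for key in ("invested", "interest", "charged_off")}


# Made-up D grade loan that passes; variations of it show the exclusions.
candidate = {
    "grade": "D", "debt_to_income": 0.15, "enquiries": 2, "years_in_job": 5,
    "purpose": "Debt Consolidation", "age": 45, "prior_defaults": 0,
    "partial_pp": False, "homeowner": False,
    "invested": 100.0, "interest": 12.0, "charged_off": 10.0,
}
sized = common_size(candidate)
print(matches_model(candidate), sized)
```

Loans that fail `matches_model` form the control set, and averaging the common-sized `charged_off` over each set gives the two default figures to compare.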

BJ1
16-10-2018, 08:48 AM
10079

This loan tells us that when the borrower moved into the new home (presumably with a new mortgage) s/he had a large Harmoney loan in place already committing near to 20% of after tax income. I have difficulty accepting that the bank knew about it as the total expense to income ratio would have been unacceptable. Why didn't s/he use the bank to refinance Harmoney? The loan will fill reasonably quickly even though it doesn't make sense to rate this B1.

While I don't deny there is value in understanding more about the greater Harmoney portfolio, keeping loans like this out of one's own is the first step to reducing defaults.

RMJH
16-10-2018, 08:51 AM
How useful/useless is this mapping of 2nd level loan criteria (https://www.dropbox.com/s/bqu2dmjaasu92c9/level_2_defaults.pdf?raw=1)?

Interesting but too much detail for such a thin market to be of practical benefit? And isn't that what Harmoney's AI would be designed to capture? I think all of this stuff should give us confidence that HM's grading system has been good in the prevailing economic conditions. It certainly has challenged some of my filter prejudices. If you cherry pick too much you could end up with low diversification and too much cash. Thanks again for all your hard work and for engaging our minds and shining a light into some dark corners.

myles
16-10-2018, 09:56 AM
Interesting but too much detail for such a thin market to be of practical benefit?

At grade level it can pull out some anomalies worth considering. Some examples (last column is default rate):

Potential Avoids:

Grade | Criterion 1 | Criterion 2 | Defaults | Loans | Default rate (%)
B | Age Band: Above 60 | Co-Borrower: Yes | 3 | 42 | 7.14
C | Age Band: 20-29 | Residential Status: Living with Parents | 5 | 36 | 13.89
C | Residential Status: Other | Loan Purpose: Other | 5 | 24 | 20.83
D | Marital Status: Married | Residential Status: Boarding | 10 | 63 | 15.87

Potential Picks:

Grade | Criterion 1 | Criterion 2 | Defaults | Loans | Default rate (%)
D | Income Type: Employment or Self Employed | Loan Purpose: Education Expenses | 0 | 86 | 0.00
D | Loan Purpose: Education Expenses | Co-Borrower: No | 0 | 78 | 0.00
D | Residential Status: Renting | Loan Purpose: Education Expenses | 0 | 67 | 0.00

and some significant repeating of particular criteria at the high default rate end e.g. Purchase New Vehicle, helps spot some combinations to avoid...

Some of these are clearly obvious, but they can help support assumptions, or not.

I'll keep playing with it and see if anything comes of it.

RMJH
16-10-2018, 10:47 AM
Yes, keep up the good work Myles. One thing I would really like to see (from Harmoney) is employment vs self employment. Intuitively it seems bizarre not to distinguish between these two but maybe there is data to support consolidation on a portfolio basis. I don't like to invest in business loans but wonder whether many are hidden in other categories by the self employed.

leesal
16-10-2018, 11:06 AM
At grade level it can pull out some anomalies worth considering. Some examples (bold is default rate):

Potential Avoids:

Grade|Criterion 1|Criterion 2|Defaults|Loans|Default rate (%)
B|Age Band: Above 60|Co-Borrower: Yes|3|42|7.14
C|Age Band: 20-29|Residential Status: Living with Parents|5|36|13.89
C|Residential Status: Other|Loan Purpose: Other|5|24|20.83
D|Marital Status: Married|Residential Status: Boarding|10|63|15.87

Potential Picks:

Grade|Criterion 1|Criterion 2|Defaults|Loans|Default rate (%)
D|Income Type: Employment or Self Employed|Loan Purpose: Education Expenses|0|86|0.00
D|Loan Purpose: Education Expenses|Co-Borrower: No|0|78|0.00
D|Residential Status: Renting|Loan Purpose: Education Expenses|0|67|0.00



and some significant repeating of particular criteria at the high default rate end e.g. Purchase New Vehicle, helps spot some combinations to avoid...

Some of these are clearly obvious, but they can help support assumptions, or not.

I'll keep playing with it and see if anything comes of it.

Definitely very useful. Over 60 with a co-borrower; married and boarding! Both categories jump out as being awkward. The data is pretty thin, but can be insightful. You're mining for raw diamonds.

For example Purchase new vehicle coming up bad, makes sense but didn't consider. On the other hand Business Cash Flow comes up pretty good. I wonder if the 25,000 loans the data set is missing is causing some selection bias.

Am now applying two default sets of criteria for loan selection - a default minimiser and another that takes stable loans less likely to early repay. Both generate a return 3% higher than the data.

Funnily enough this E grade I normally wouldn't touch, met the conditions in my "default minimisation" set.

[Attachment 10080]

myles
16-10-2018, 01:12 PM
Time Lapse of the unique.csv loans. I may be able to back calculate the last payment date from the interest earned on those ~400 'Debt Sold' loans, which might let me generate a hazard curve. Will have a look later, but I'm hopeful...

http://i634.photobucket.com/albums/uu65/mylesau/Harmoney/unique_timeage.gif

myles
16-10-2018, 02:28 PM
One thing I've not noticed before is that the early loans ~ pre mid 2017 were being paid off earlier - around week 20, while ~post mid 2017 loans are being paid off at around 30-35 week mark. Could be something to do with scorecard changes or borrower selection or even a shift in grade selection by lenders. Just interesting...

Saffer
16-10-2018, 03:29 PM
The IRD had a discussion with Harmoney to determine if charged off principal is in fact a bad debt or a provisional bad debt. The IRD has now responded that it is a bad debt and can be written off against income if you are in the business of dealing in financial arrangements. Hope this is of help to anyone.

myles
16-10-2018, 11:37 PM
Fixed the time lapse for those 'Debt Sold' loans with overwritten last payment date. Much better appreciation for where the defaults occur now.

Turns out it's a very easy calculation to update the last payment date by adding [payments to date / (invested / loan total * monthly loan payment)] months to the start date - this is for loans with a status of 'Debt Sold' and a last payment date = '2018-02-08' (just short of 400 loans).
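The calculation above can be sketched in a few lines of Python. This is only an illustration: the function names, the rounding to whole months and the sample figures are assumptions, not values from unique.csv.

```python
from datetime import date

def add_months(start: date, months: int) -> date:
    """Shift a date forward by whole months, clamping the day to 28 so it is always valid."""
    total = start.month - 1 + months
    year = start.year + total // 12
    month = total % 12 + 1
    return date(year, month, min(start.day, 28))

def estimate_last_payment(start, payments_to_date, invested, loan_total, monthly_loan_payment):
    """Estimate the last payment date from what the investor has received so far."""
    # Investor's share of each monthly repayment on the whole loan.
    share_of_payment = invested / loan_total * monthly_loan_payment
    # Number of monthly payments implied by the amount received (assumed rounded to whole months).
    months_paid = round(payments_to_date / share_of_payment)
    return add_months(start, months_paid)

# Hypothetical loan: $25 invested in a $10,000 loan repaying $320 a month, $4.00 received to date.
print(estimate_last_payment(date(2017, 3, 15), 4.00, 25.0, 10_000.0, 320.0))
```

Whether payments to date should include interest only or interest plus principal depends on how the export defines that column, so check that before relying on the result.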

beacon
17-10-2018, 09:32 AM
[Attachment 10079]

This loan tells us that when the borrower moved into the new home (presumably with a new mortgage) s/he had a large Harmoney loan in place already committing near to 20% of after tax income. I have difficulty accepting that the bank knew about it as the total expense to income ratio would have been unacceptable. Why didn't s/he use the bank to refinance Harmoney? The loan will fill reasonably quickly even though it doesn't make sense to rate this B1.

While I don't deny there is value in understanding more about the greater Harmoney portfolio, keeping loans like this out of one's own is the first step to reducing defaults.

Goes to show that the programmed data mining is still a poor substitute for experience, but in the absence of the latter, the former is indispensable. Thanks BJ1. Brilliant insight

beacon
17-10-2018, 09:37 AM
I was looking at this one too, but what caught my eye was actually the level of income on a benefit. By my calculations, getting that level of after-tax income would require a yearly before-tax income of around $78,000. Are we really paying benefits that high? No wonder I pay so much tax (or is Harmoney's income data wrong again?)

That is equivalent of 1 person working full time for $37.50/hour or 2 people working full time at $18.75/hour or 91 hours of work/ week at minimum wage

Apparently the benefit rises with the number of kids a young family has. And with other add-ons, like the Accommodation Supplement for the city they live in etc. Sadly, for some, making babies can be more lucrative than getting a job.

beacon
17-10-2018, 09:44 AM
How useful/useless is this mapping of 2nd level loan criteria (https://www.dropbox.com/s/bqu2dmjaasu92c9/level_2_defaults.pdf?raw=1)?

This cross tabulation is brilliant, Myles. A bivariate analysis is much more useful than segregated filtration in a multivariate environment. At the very least, it proves again that risk falls with age, until you hit 60. :)

I found it interesting that Boarders remained risky in most situations, but risk for those living with parents seems to be mitigated in certain situations like if they were 50-59 or divorced. Thank you for producing and sharing this.

beacon
17-10-2018, 10:15 AM
Further to Myles' efforts, couldn't resist doing a multivariate model, comparing two loan sets:
1. Debt to income < 20%; 3 or fewer enquiries; 3 or more years in job; exclude new vehicle, new boat, tax, wedding expense and other loan purposes; exclude borrowers aged over 60; exclude loans having 2 or more defaults; exclude partial PP; exclude B and C grade borrowers who aren't homeowners; exclude all A and all F grades.
2. loans that don't match the above criteria

...

The results show that the model selection produced an average default of 3.6%, while the control had average default of 7.5%. Grade results, showed defaults were approx half in BCD, while E was only slightly improved compared to the control.

Brilliant work leesal. Thanks for proving again, that loan picking ISN'T simply happening for love, and so, ISN'T a sheer waste of time.


Am now applying two default sets of criteria for loan selection - a default minimiser and another that takes stable loans less likely to early repay. Both generate a return 3% higher than the data.

I'm assuming 1. is your default minimiser. That is an awful lot of filters, overridden by top level elimination/modification of filtered loans. Be interesting to know, what criteria you chose for your second default set.


I wonder if the 25,000 loans the data set is missing is causing some selection bias.

There is bound to be selection bias, since only 17 or so investors contributed to the data pool, versus the 8,800 odd investors Harmoney has on its books today. Hence the disclaimers Myles has put in the report. Still, we got input from loan pickers and index buyers to some extent, and I'm not holding my breath that Harmoney will (ever) publish default info by individual variables for its whole dataset. So, we've got the next best thing.

What I find interesting also, is that Myles' efforts at cleaning and reporting data put Harmoney's output-to-date to shame. Four years, and they still have the type of data errors humvee recently reported. They also lack in investor education and communication, especially in the area of protect loans. But overwriting/hiding rewritten status/info of defaulting loans, I find absolutely appalling and unforgivably misleading, bordering on fraud. I hope they desist and rectify the wrong they have done - as it impacts on investment decision making.

beacon
17-10-2018, 10:32 AM
The IRD had a discussion with Harmoney to determine if charged off principal is in fact a bad debt or a provisional bad debt. The IRD has now responded that it is a bad debt and can be written off against income if you are in the business of dealing in financial arrangements. Hope this is of help to anyone.

Thanks Saffer. Be useful if you can provide an external link/reference, if possible, to this IRD confirmation.

Saffer
17-10-2018, 12:17 PM
This was a telephone confirmation and I will ask for this to be put in writing.

myles
17-10-2018, 02:38 PM
Chart of Expected Defaults by grade (because I hate the Hazard Curve and often find annualised data misleading):

This is calculated from historical data (unique.csv) with points on the line representing the percent of loans that are expected to default sometime in the future for a given loan age. (this is not annualised!) I've applied a window (description in title) to remove some outlier detail that plays havoc with the graph.

To give an example of how to read the details:

C Grade loans that are 10 months old have a ~3% chance of defaulting sometime in the future.

[Attachment 10084]

The first point on the line is the expected default rate for all loans in that grade (based on historical data) - NOT annualised.

Some interesting crossovers and some rising and some falling grades...

Fix: Had a date constraint wrong which influenced the right hand side of the chart. Note that the right hand side is still incomplete, as the number of loans is still too low for meaningful detail at that age.
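For anyone wanting to reproduce a curve like this, here is a minimal sketch of one way to read the description above: for each loan age, take the loans that were still running at that age and count what share of them went on to default. The tuple layout and sample data are made up for illustration, and it assumes only finished loans (paid off or defaulted) are fed in; it is not necessarily how the chart was actually built.

```python
def expected_default_pct(loans, age_months):
    """Percent of loans still running at age_months that later defaulted.

    loans: list of (months_active, defaulted) tuples for finished loans.
    Returns None when no loan reached that age.
    """
    # Loans that were still running when they reached the given age.
    reached_age = [loan for loan in loans if loan[0] >= age_months]
    if not reached_age:
        return None
    later_defaults = sum(1 for _, defaulted in reached_age if defaulted)
    return 100 * later_defaults / len(reached_age)

# Hypothetical finished loans: (months active, ended in default?)
sample = [(3, False), (6, True), (12, False), (14, True), (20, False)]
for age in (0, 10, 15):
    print(age, expected_default_pct(sample, age))
```

At age 0 this is simply the overall default rate for the set, which matches the note that the first point on each line is the rate for all loans in the grade.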

myles
17-10-2018, 05:40 PM
A bivariate analysis is much more useful than segregated filtration in a multivariate environment. At the very least, it proves again that risk falls with age, until you hit 60.

If you liked that then you might like these even more - same but with grade as well :) pdf runs to 67 pages though...

l2_default.csv (https://www.dropbox.com/s/fh3n8vn1rs7kjnv/l2_default.csv?raw=1)

l2_default.pdf (https://www.dropbox.com/s/ultd5qkw3ubg56p/l2-default.pdf?raw=1)

The csv is particularly easy to use in a spreadsheet with simple column filters - lets you 'dig' into the data without having to get bogged down with pivot tables initially.

leesal
17-10-2018, 10:36 PM
Brilliant work leesal. Thanks for proving again, that loan picking ISN'T simply happening for love, and so, ISN'T a sheer waste of time.

I'm assuming 1. is your default minimiser. That is an awful lot of filters, overridden by top level elimination/modification of filtered loans. Be interesting to know, what criteria you chose for your second default set.

Thanks. Myles' data stimulated a lot of thought and good ideas... and as a loan picker, it's useful to have an overarching framework - and one backed up with historic data all the better!

I've progressed beyond where I was :) And now have a streamlined single "default minimiser" that can ideally be put into an autofilter. The criteria I have are B, C, D, E grades; age band 20-60 years; debt to income < 10%; time at residence 2+ years; all purposes except new vehicle, new boat, wedding expenses, tax, other. Those alone can boost return and halve default from the average 7.0% to 3.5%. Then there are items which cannot be autofiltered on that have a big impact - enquiries 3 or fewer, and debt to income < 5% for E grade loans - which bring the default down to 2.2%. I have gone further than that without sacrificing return, but it can get very complicated.

Will start off running the default minimiser, and will be handpicking others I like from gut feel; and track how they perform.
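The "default minimiser" criteria listed above could be expressed as a simple filter along these lines. A rough sketch only: the dictionary keys and the purpose labels are hypothetical stand-ins, not the actual column names or values in the data set.

```python
# Purposes excluded by the filter (labels are assumed, not taken from the data).
EXCLUDED_PURPOSES = {"New Vehicle", "New Boat", "Wedding Expenses", "Tax", "Other"}

def passes_default_minimiser(loan: dict, strict: bool = False) -> bool:
    """Apply the autofilter criteria; strict=True adds the two manual checks."""
    letter = loan["grade"][0]
    ok = (
        letter in "BCDE"
        and 20 <= loan["age"] <= 60
        and loan["debt_to_income"] < 0.10
        and loan["years_at_residence"] >= 2
        and loan["purpose"] not in EXCLUDED_PURPOSES
    )
    if ok and strict:
        # Checks that cannot be autofiltered: enquiries, and tighter DTI for E grade.
        ok = loan["enquiries"] <= 3 and (letter != "E" or loan["debt_to_income"] < 0.05)
    return ok

# Hypothetical listing.
listing = {"grade": "E2", "age": 34, "debt_to_income": 0.07,
           "years_at_residence": 3, "purpose": "Debt Consolidation", "enquiries": 2}
print(passes_default_minimiser(listing), passes_default_minimiser(listing, strict=True))
```

Keeping the manual checks behind a flag makes it easy to compare the autofilter-only set with the stricter set on the same data.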



There is bound to be selection bias, since only 17 or so investors contributed to the data pool, versus the 8,800 odd investors Harmoney has on its books today. Hence the disclaimers Myles has put in the report. Still, we got input from loan pickers and index buyers to some extent, and I'm not holding my breath that Harmoney will (ever) publish default info by individual variables for its whole dataset. So, we've got the next best thing.

What I find interesting also, is that Myles' efforts at cleaning and reporting data put Harmoney's output-to-date to shame. Four years, and they still have the type of data errors humvee recently reported. They also lack in investor education and communication, especially in the area of protect loans. But overwriting/hiding rewritten status/info of defaulting loans, I find absolutely appalling and unforgivably misleading, bordering on fraud. I hope they desist and rectify the wrong they have done - as it impacts on investment decision making.

I wonder whether it's possible that rewritten loan data can be fixed going forward. I recall a thread some time ago where someone mentioned that employment status changed part way through a loan, which means that the data we are viewing is taken from a live relational database - eg any of the "client data" links such as employment etc is subject to change at any point in time.

myles
18-10-2018, 10:04 AM
Hazard Curve for loans in unique.csv:

Notes:


The x-axis is the month where the default occurred - the Harmoney graph appears to work on when they declare a loan has defaulted. (Which is more correct? I prefer to know when it actually occurred and payments stopped?)
Individual grade curves can be quite different to the overall average, but in very general terms, the peaks are similar (e.g. peaks at the 6 monthly points tend to be obvious across most grades)
The peak at 6 months would not be reported as a default until month 10 - 12 (or later) but the actual default occurred at 6 months.


[Attachment 10087]

beacon
18-10-2018, 10:13 AM
Chart of Expected Defaults by grade ...

Some interesting crossovers and some rising and some falling grades...

Contrarian late life behaviors by C and D. Why I wonder? Maybe because they were the good E and Fs which rewrote and metamorphosed into C and D. This graph is worthy of a longitudinal analysis...

beacon
18-10-2018, 10:20 AM
If you liked that then you might like these even more - same but with grade as well :) pdf runs to 67 pages though...

l2_default.csv (https://www.dropbox.com/s/fh3n8vn1rs7kjnv/l2_default.csv?raw=1)

l2_default.pdf (https://www.dropbox.com/s/ultd5qkw3ubg56p/l2-default.pdf?raw=1)

The csv is particularly easy to use in a spreadsheet with simple column filters - lets you 'dig' into the data without having to get bogged down with pivot tables initially.

Just had a quick look Myles. This is marvellous. Looks like I may be exercising my grey matter much more this weekend. Hope I get the time :)

beacon
18-10-2018, 10:39 AM
Will start off running the default minimiser, and will be handpicking others I like from gut feel; and track how they perform.

You won't be able to attribute RAR improvement fully to your minimiser then, but RAR improvement will be welcome nonetheless. How did you discern a loan was unlikely to early repay though? :)


I wonder whether its possible that rewritten loan data can be fixed going forward.

Yes, it is possible to debug going forward as well as rectify retrospectively.


I recall a thread some time ago, someone mentioned that employment status changed part way through a loan, which means that the data we are viewing is taken from a live relational database - eg any of the "client data" links such as employment etc is subject to change at any point in time.

Maybe, but changing whether a loan was initially rewritten after a loan has defaulted? Some may call it covering the tracks, and it is unnecessary to do really. I remember Myles producing a graph for his own data for rewrites a few posts earlier (after rechecking his 50 defaults to date). It showed his rewritten loans were a slightly better risk than the new loans. Most likely, this will reflect in the composite set too, but overwriting it makes the aftermath impossible to analyse/troubleshoot

beacon
18-10-2018, 10:53 AM
Hazard Curve for loans in unique.csv:

Notes:


The x-axis is the month where the default occurred - the Harmoney graph appears to work on when they declare a loan has defaulted. (Which is more correct? I prefer to know when it actually occurred and payments stopped?)
Individual grade curves can be quite different to the overall average, but in very general terms, the peaks are similar (e.g. peaks at the 6 monthly points tend to be obvious across most grades)
The peak at 6 months would not be reported as a default until month 10 - 12 (or later) but the actual default occurred at 6 months.


Many thanks for this, Mylesy. So, if it survives the first 6 months, chances of its survival improve with age. In this light also, it is interesting to see the C and D behaving as contrarians in later life. By the way, like you, I'd prefer to know when a default actually occurred and payments stopped, if it doesn't inconvenience Harmoney to change the date they use to plot defaults.

myles
18-10-2018, 05:14 PM
So, if it survives the first 6 months, chances of its survival improve with age.

No (I think), exactly why I produced the Expected Defaults graph. The Hazard Curve (i.e. Harmoney Hazard Curve), is plotting defaults in a month vs total loans (all loans from all time). It's not wrong - it is a Hazard Curve as per:

Failure rate is the frequency with which an engineered system or component fails, expressed in failures per unit of time.

But it is not the survival rate:

The survival function is a function that gives the probability that a patient, device, or other object of interest will survive beyond any given specified time.

Perhaps I'm wrong about the way statistical survival works, but shouldn't it be the chance of survival of those that are still alive? i.e. if we have 1000 loans and 10 default in month 10 (10/1000), the Hazard Curve plots month = 10 vs 1%. But aren't we more interested that, if 600 of the loans have already defaulted or been paid off by month 10, the 'non-survival' (or chance of defaulting) is (number of defaults that have yet to occur)/400?

This is what I tried to plot with the Expected Default chart, which shows, in lower risk grades, a fairly flat default rate across time, in D grades a rise before a fall and for E and F a fall.

I've clearly been misunderstanding what the Hazard Curve was showing, up until now, and suspect I may not be the only one?

Clearly 'statistics', and its terminology, aren't my strong point ;), happy to take suggestions on the best way this should be represented. I'm just not sure it is the Hazard Curve?
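The difference between the two denominators in the 1000-loan example can be shown with a couple of lines. A small sketch using only the numbers from the post; the function name is made up. For what it's worth, in survival analysis the hazard rate is normally defined relative to those still at risk, which is the second figure here.

```python
def monthly_default_rates(total_loans, exits_before_month, defaults_in_month):
    """Return (rate per original loan, rate per loan still active) for one month."""
    still_active = total_loans - exits_before_month
    per_original = defaults_in_month / total_loans
    per_active = defaults_in_month / still_active
    return per_original, per_active

# 1000 loans funded, 600 already defaulted or paid off, 10 defaults in month 10.
per_original, per_active = monthly_default_rates(1000, 600, 10)
print(f"{per_original:.1%} of all loans vs {per_active:.1%} of loans still active")
```

Swapping defaults_in_month for the count of all defaults still to come gives the "expected defaults" style figure instead of the single-month rate.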

leesal
18-10-2018, 06:14 PM
Exactly right. Going to bump my rather verbose post from a month back....

Actual annual survival rates in some grades are 4 times greater than HM quoted default rates. Lower grades are still worth it, you just have to factor in early repayment.

EG in grade E3 - after early repayments of 50% pa, you're probably only going to rake in 30-35% interest over the term of the loan. After defaults of 4.5% x 5 = 22.5% you generate a return of approx 10%. In complete contrast to HM data, which suggests 27.99% - 4.5% = 23.49%!






HM annual average default is misleading. Rather I prefer to look at cohort default across the full term. Taking HM forecasted stats for Grade "E", their default forecasts are approx 4.5% per annum, or 22.5% across a 5 year term. To validate this, taking the 2014 E grade performance off the "historical annual default rate tool " https://www.harmoney.co.nz/investors/default-rates - shows that the cumulative default of E grade at 22.7% (and running to a similar place on 2015 and 2016 cohorts). Critically the definition of cumulative default is based on the number of loans originally funded, not the loans outstanding.

How is the cumulative default rate calculated?

The cumulative default rate is calculated by dividing the total number of defaults by the total number of loans funded. For example in 2015, for grade C3, 447 loans were funded and 17 loans defaulted to the end of 2017 creating a cumulative default rate of 3.8%.

How that reads to me, is early repayment is not factored in. If HM were to publish annual default based on time in lent, the number would be significantly different. ie If 22% of your E grade loans are going to default, and 78% remain good - how are your stats going to look if 40% of the good ones repay early in the first 12 months!
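To put numbers on the point about denominators, here is a small worked sketch. The first part reproduces Harmoney's own C3 example; the second uses the 22% / 78% / 40% figures from the paragraph above on a notional 100 loans. Dividing by the loans that did not repay early is just one illustrative way to show the effect, not Harmoney's method.

```python
def cumulative_default_rate(defaults, loans_funded):
    """Harmoney's stated definition: total defaults divided by total loans funded."""
    return defaults / loans_funded

# Harmoney's 2015 C3 example: 17 defaults out of 447 loans funded.
print(round(cumulative_default_rate(17, 447) * 100, 1))  # 3.8

# Notional E grade cohort of 100 loans (illustrative assumption).
funded = 100
defaults = 22
good = funded - defaults
early_repaid = 0.40 * good            # good loans gone in the first 12 months
still_lent = funded - early_repaid    # loans that stay on the book

rate_on_funded = defaults / funded
rate_on_still_lent = defaults / still_lent
print(round(rate_on_funded * 100, 1), round(rate_on_still_lent * 100, 1))
```

The same 22 defaults look much heavier once the early repayers are taken out of the denominator, which is the gap between the quoted rate and what a lender experiences on money actually lent.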

RMJH
18-10-2018, 08:58 PM
Slightly suffering from information overload so maybe I missed it but is there an analysis of defaults for those who borrow the maximum amount vs others in the same grade? I mostly avoid loans over $40k but admittedly it's pure guesswork.

leesal
18-10-2018, 09:14 PM
You won't be able to attribute RAR improvement fully to your minimiser then, but RAR improvement will be welcome nonetheless. How did you discern a loan was unlikely to early repay though? :)

Yep definitely, doesn't matter where the improvement comes from. However I'll have some reports on the default minimiser set, they just won't be RAR.

The "stable set" (eg less likely to repay early), was formed by selecting variables that maximised active loan days. iirc - the criteria included repay/loan < 8%, 1 enquiry max, in job 3+ years, married, exclude F loans, and excluded wedding exp, buy new car, buy new boat, tax bill. Pretty sure there was something else but can't recall. That set according to my scrawled notes gave 2.8% default rate (compared against the average 7% for the whole set through to Dec16), and extended the loan period an extra 2 months on average - to about 16.5 months. The return wasn't too different from the overall average though - mainly due to the set containing over 60% A and B grade loans. In contrast the default minimiser set contained 60% at CDE grade and turned in a lower default rate and higher return.





Yes, it is possible to debug going forward as well as rectify retrospectively.



Maybe, but changing whether a loan was initially rewritten after a loan has defaulted? Some may call it covering the tracks, and it is unnecessary to do really. I remember Myles producing a graph for his own data for rewrites a few posts earlier (after rechecking his 50 defaults to date). It showed his rewritten loans were a slightly better risk than the new loans. Most likely, this will reflect in the composite set too, but overwriting it makes the aftermath impossible to analyse/troubleshoot

I'd say it's just lazy programming. Easy enough to do without thinking. If somebody did mention it to HM, am sure they'd add it to their list of fixes!

leesal
18-10-2018, 10:22 PM
Slightly suffering from information overload so maybe I missed it but is there an analysis of defaults for those who borrow the maximum amount vs others in the same grade? I mostly avoid loans over $40k but admittedly it's pure guesswork.

Will help Myles out on this one.

[Attachment 10088]

Have to say I didn't expect that result. So I had a look at the data, and it appears that loans above the max have been invested on average at a later date. Cannot really draw a conclusion based on the whole dataset.






Grade|Ave date below max|Ave date at max
A|16/02/2017|14/11/2017
B|09/01/2017|20/11/2017
C|19/01/2017|21/11/2017
D|22/12/2016|17/02/2018
E|22/09/2016|18/02/2018
F|26/05/2016|10/03/2018

leesal
19-10-2018, 08:06 AM
Will help Myles out on this one.

[Attachment 10088]

Have to say I didn't expect that result. So I had a look at the data, and it appears that loans above the max have been invested on average at a later date. Cannot really draw a conclusion based on the whole dataset.






Grade|Ave date below max|Ave date at max
A|16/02/2017|14/11/2017
B|09/01/2017|20/11/2017
C|19/01/2017|21/11/2017
D|22/12/2016|17/02/2018
E|22/09/2016|18/02/2018
F|26/05/2016|10/03/2018




The loans above max would be a really interesting dataset - with the rewrite option not available early repayments should be much less.

Prior to August 2017 no max loans in grade DEF had been written. Looks like only one investor in the overall dataset has taken on the strategy - will be interesting to see how they perform

humvee
19-10-2018, 10:31 AM
The loans above max would be a really interesting dataset - with the rewrite option not available early repayments should be much less.

Prior to August 2017 no max loans in grade DEF had been written. Looks like only one investor in the overall dataset has taken on the strategy - will be interesting to see how they perform

That will probably be because on 3 August 2017 the maximum borrowing limits on many grades increased - prior to then, borrowing today's maximum amount would not have been possible.

Below is from the Harmoney email:


Hi ...........,
On Thursday, 3 August 2017, we are implementing our next generation borrower scorecard. This will not affect your current Harmoney portfolio, but may affect your lending strategy moving forward.
What is changing

From Thursday, 03 August 2017, all new loan applications will be assessed with the new generation scorecard.
The scorecard will increase the accuracy of pricing risk significantly from the first generation scorecard.
The probability of a default for each grade will change with the new scorecard.
Borrower interest rates will reduce across risk grades A1-F5. The minimum rate will be 6.99% p.a. and maximum rate will be 29.99% p.a. This reflects the underlying risk and improved ability to price risk.
The maximum loan limits are being increased in some grades. However, it is important to note that the affordability testing that is done will not change.

Why we are changing it

After 3 years and over 300,000 applications to date we continue to demonstrate our commitment to innovation, and creating value for both Lenders and Borrowers, with the launch of our next generation scorecard (http://go2.harmoney.com/JS3ZLH0KgX04I5N0Qy00050).

humvee
19-10-2018, 10:41 AM
Here are the pre August 2017 Limits (as at August 2016 to be exact)

A1 - A5 $35,000
B1 - B5 $30,000
C1 - C5 $25,000
D1 - D5 $25,000
E1 - E5 $10,000
F1 - F5 $5,000
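With these limits, flagging whether a pre-scorecard loan was at the maximum for its grade is straightforward. A minimal sketch: the function name and the "at or above the limit" test are assumptions, and it deliberately returns None for loans started on or after 3 August 2017 because the newer limits are not listed here.

```python
from datetime import date

# Limits quoted above (as at August 2016), keyed by grade letter.
PRE_AUG_2017_LIMITS = {"A": 35_000, "B": 30_000, "C": 25_000,
                       "D": 25_000, "E": 10_000, "F": 5_000}
SCORECARD_CHANGE = date(2017, 8, 3)

def is_max_loan(grade: str, amount: float, start: date):
    """True/False for loans before the scorecard change, None when limits are unknown."""
    if start >= SCORECARD_CHANGE:
        return None  # post-change limits are not covered by the table above
    # Treat anything at or above the limit as a max loan (assumption).
    return amount >= PRE_AUG_2017_LIMITS[grade[0]]

print(is_max_loan("E3", 10_000, date(2017, 5, 1)))
print(is_max_loan("C2", 12_000, date(2016, 9, 1)))
print(is_max_loan("D1", 30_000, date(2018, 1, 10)))
```

If the loan amounts in the data include fees or payment protect on top of what the borrower receives, the comparison would need a tolerance rather than a hard threshold.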

Cool Bear
19-10-2018, 10:41 AM
That will probably be because on 3 August 2017 the maximum borrowing limits on many grades increased - prior to then, borrowing today's maximum amount would not have been possible.

Below is from the Harmoney email:


Hi ...........,
On Thursday, 3 August 2017, we are implementing our next generation borrower scorecard. This will not affect your current Harmoney portfolio, but may affect your lending strategy moving forward.
What is changing



From Thursday, 03 August 2017, all new loan applications will be assessed with the new generation scorecard.
The scorecard will increase the accuracy of pricing risk significantly from the first generation scorecard.
The probability of a default for each grade will change with the new scorecard.
Borrower interest rates will reduce across risk grades A1-F5. The minimum rate will be 6.99% p.a. and maximum rate will be 29.99% p.a. This reflects the underlying risk and improved ability to price risk.
The maximum loan limits are being increased in some grades. However, it is important to note that the affordability testing that is done will not change.

Why we are changing it

After 3 years and over 300,000 applications to date we continue to demonstrate our commitment to innovation, and creating value for both Lenders and Borrowers, with the launch of our next generation scorecard (http://go2.harmoney.com/JS3ZLH0KgX04I5N0Qy00050).
Leesal, if you could kindly redo your graph with only loans since August 2017.
And please don't try to confuse us with swapping the colours between the two graphs :p.
Thanks.

Or, continuing on from Humvee's post above, do it from August 2016 but with the max set at different levels for the different dates - more complicated.

leesal
19-10-2018, 11:45 AM
Good to have that gem of information :)

Here goes. This shows a not insignificant difference in default frequency between max and non-max loans. There is a slight mitigant: max loans (particularly at the mid grades) stay current longer. The potential for more interest payments needs to be weighed up against default frequency.



[Attachment 10089]

leesal
19-10-2018, 11:49 AM
Further - above is drawn from pre Aug 2017 data only

humvee
19-10-2018, 12:27 PM
That graph is much more like what I would expect


Good to have that gem of information :)

Here goes. This shows a not insignificant difference in default frequency between max and non-max loans. There is a slight mitigant: max loans (particularly at the mid grades) stay current longer. The potential for more interest payments needs to be weighed up against default frequency.



[Attachment 10089]

Cool Bear
19-10-2018, 05:00 PM
Good to have that gem of information :)

Here goes. This shows a not insignificant difference in default frequency between max and non-max loans. There is a slight mitigant: max loans (particularly at the mid grades) stay current longer. The potential for more interest payments needs to be weighed up against default frequency.



[Attachment 10089]
Thanks Leesal

From your chart, it does look like the default rates of the max loans are twice those of the non-max loans (except for F). So it is significant indeed.

As for the max loans being more current, that would be because the non max loans may be paid off early to rewrite as max loans.

Thanks again (and for the colours too :))

myles
19-10-2018, 05:32 PM
I'd be a little cautious on those Max Loan values.

The maximums for each grade in the unique.csv are below. There are a couple of rogue (high) values that look almost like there is some flexibility in the ceiling value?

If I put in a few previous dates I get what looks like more than a couple of different sets. Not 100% sure, but it looks to me like those maximum loan amounts have changed quite a bit???

A1|79500.0
A2|79500.0
A3|77475.0
A4|77475.0
A5|77475.0
B1|79500.0
B2|77475.0
B3|55500.0
B4|55825.0
B5|55500.0
C1|55500.0
C2|56950.0
C3|45675.0
C4|55500.0
C5|44500.0
D1|34400.0
D2|34325.0
D3|34400.0
D4|33525.0
D5|33525.0
E1|22525.0
E2|22525.0
E3|23050.0
E4|22525.0
E5|22525.0
F1|11525.0
F2|11525.0
F3|11200.0
F4|11525.0
F5|10500.0

Cool Bear
19-10-2018, 06:02 PM
I'd be a little cautious on those Max Loan values.

The maximums for each grade in the unique.csv are below. There are a couple of rogue (high) values that look almost like there is some flexibility in the ceiling value?

If I put in a few previous dates I get what looks like more than a couple of different sets. Not 100% sure, but it looks to me like those maximum loan amounts have changed quite a bit???

A1|79500.0
A2|79500.0
A3|77475.0
A4|77475.0
A5|77475.0
B1|79500.0
B2|77475.0
B3|55500.0
B4|55825.0
B5|55500.0
C1|55500.0
C2|56950.0
C3|45675.0
C4|55500.0
C5|44500.0
D1|34400.0
D2|34325.0
D3|34400.0
D4|33525.0
D5|33525.0
E1|22525.0
E2|22525.0
E3|23050.0
E4|22525.0
E5|22525.0
F1|11525.0
F2|11525.0
F3|11200.0
F4|11525.0
F5|10500.0
I think the loan limits are what the borrower gets in the hand. The extras are the loan fees and payment protect which we lenders fund. Although that does not explain the "overflow" of the A limit to the B1 and B2, as well as the B limit to C1, C2 and C4

RMJH
20-10-2018, 08:30 AM
Good to have that gem of information :)

Here goes. This shows a not insignificant difference in default frequency between max and non-max loans. There is a slight mitigant: max loans (particularly at the mid grades) stay current longer. The potential for more interest payments needs to be weighed up against default frequency.



[Attachment 10089]
Big thank you Leesal (and Myles). Given the picking skills of some of those in our data group there could be bias in the sample but that said your analysis does seem to confirm common sense logic that those taking every penny that they can get are (over a large sample) more likely to be in an unsustainable financial situation.

Any thoughts on the pattern showing letter grade 1's seem to have higher defaults than letter grade 2's in most grades?

Cool Bear
20-10-2018, 08:55 AM
I think the loan limits are what the borrower gets in the hand. The extras are the loan fees and payment protect which we lenders fund. Although that does not explain the "overflow" of the A limit to the B1 and B2, as well as the B limit to C1, C2 and C4
Myles, sorry for my post above, which just sort of repeats what you said. I did not read your post properly - early signs of senior moments.

Either HM is flexible in their ceiling values (and interest rates) or the grade presented to us was a mistake? I remember, about two years ago, a case where the interest rate charged for one loan was not correct and was from another grade. I did not follow up on that with HM and just accepted that their system is far from perfect.

Wsp
20-10-2018, 02:20 PM
Has anyone looked at time of year and defaults? There has previously been speculation on this forum that loans taken out just before and after Xmas have a higher default rate.

myles
20-10-2018, 05:27 PM
Has anyone looked at time of year and defaults? There has previously been speculation on this forum that loans taken out just before and after Xmas have a higher default rate.

Good question (I've been the main speculator :p)

First chart shows the calendar month of when loans start vs default rate. There is a definite bias to the later months of the year. June stands out, not sure why - school holidays? Perhaps not as driven by Christmas as I suspected? However, I had a quick look at grade level, and there were some grades that showed a more distinct ramp-up to Christmas, in particular B Grade.

[Attachment 10093]
It's worthwhile considering what month defaults start in as well (not when the loan starts). Fairly flat with a bit of a bump in May? This could be influenced by the timing of when Harmoney started up, i.e. there have been more cycles in some months than others. There is a slight bias towards the beginning of the year, maybe an influence from the post-Christmas debt challenge...

[Attachment 10092]
Not as clear cut as I speculated :(
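
A rough sketch of how a "calendar month started vs default rate" figure could be tallied, for anyone wanting to repeat it on their own loans. The input layout (a list of start date and charged-off flag pairs) and the sample rows are made up for illustration; they are not the format or contents of the shared data set.

```python
from collections import defaultdict
from datetime import date

def default_rate_by_start_month(loans):
    """Return {calendar month: share of loans started that month that defaulted}.

    `loans` is a list of (start_date, was_charged_off) pairs. This layout is
    an assumption for illustration only.
    """
    started = defaultdict(int)
    defaulted = defaultdict(int)
    for start_date, was_charged_off in loans:
        started[start_date.month] += 1
        if was_charged_off:
            defaulted[start_date.month] += 1
    return {m: defaulted[m] / started[m] for m in sorted(started)}

# Hypothetical sample rows, not real loans.
sample = [
    (date(2016, 6, 3), True),
    (date(2017, 6, 20), False),
    (date(2017, 11, 5), True),
    (date(2017, 11, 18), True),
    (date(2018, 3, 9), False),
]
print(default_rate_by_start_month(sample))
```

Because years are pooled by calendar month, months that have come around more often since Harmoney started carry more loans, which is the same caveat raised above for the second chart.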

myles
20-10-2018, 08:04 PM
Just re-looking at that first Calendar Month Started chart - when I started, I dropped $100K into Harmoney in the months of March, April and May - bloody good luck I guess? :)

myles
22-10-2018, 02:31 PM
Any thoughts on the pattern showing that letter grade 1's seem to have higher defaults than letter grade 2's in most grades?

Had seen that at some stage and didn't know what it was about, so I took the time to plot it :)

It appears it only occurs in A's and B's, and with their volume, they influence the whole when looking at the entire data set. Why it occurs, I have no idea:

[Attachment 10095]
If you're picking A's or B's, A2 and B2 certainly look like a good pick...

A bit of a ripple in the Harmoney matrix? Grading system's got something wrong in those areas...

RMJH
22-10-2018, 07:11 PM
Had seen that at some stage and didn't know what it was about, so I took the time to plot it :)

It appears it only occurs in A's and B's, and with their volume, they influence the whole when looking at the entire data set. Why it occurs, I have no idea:

[Attachment 10095]
If you're picking A's or B's, A2 and B2 certainly look like a good pick...

A bit of a ripple in the Harmoney matrix? Grading system's got something wrong in those areas...
Thanks Myles. Doesn't it also occur in D's and F's? I wonder if bumping up is used as a sweetener by the sales force.

myles
22-10-2018, 07:54 PM
The change from 1 to 2 looks something like:



Grade | Change
A1 - A2 | 78% drop
B1 - B2 | 74% drop
C1 - C2 | 3% rise
D1 - D2 | 9% drop
E1 - E2 | 69% rise
F1 - F2 | 4% drop



So I'd say only A and B are showing significant drops from 1 to 2?

RMJH
23-10-2018, 09:19 AM
The change from 1 to 2 looks something like:



Grade | Change
A1 - A2 | 78% drop
B1 - B2 | 74% drop
C1 - C2 | 3% rise
D1 - D2 | 9% drop
E1 - E2 | 69% rise
F1 - F2 | 4% drop



So I'd say only A and B are showing significant drops from 1 to 2?
I'd agree, but the interest charged is also quite different. Possibly just statistical noise or bias in our sample. I won't be changing my filters for it.

beacon
23-10-2018, 10:09 AM
One thing I've not noticed before is that the early loans ~ pre mid 2017 were being paid off earlier - around week 20, while ~post mid 2017 loans are being paid off at around 30-35 week mark. Could be something to do with scorecard changes or borrower selection or even a shift in grade selection by lenders. Just interesting...

Could be change in Harmoney sales focus and strategy due to Commerce Commission actions/ruling...

beacon
23-10-2018, 11:26 AM
No (I think), exactly why I produced the Expected Defaults graph. The Hazard Curve (i.e. Harmoney Hazard Curve), is plotting defaults in a month vs total loans (all loans from all time). ... I'm just not sure [mine] is the Hazard Curve?

The way I'm reading it, your graph is a hazard curve too. It shows that 42% of the defaults in our data pool occurred within the first 15 months of the loan. The Harmoney hazard curve (to July 17) stated that almost 60% of the defaults that occurred (https://www.harmoney.co.nz/investors/investment-risks) were within the first 15 months of the loan. The Harmoney hazard curve was plotted on a maturity approx two-thirds of ours (time wise), which explains some of the improvement in percentage.

I'm getting the distinction you are making, as well as the goal you are aiming at, through Survival rate computation. But how to plot it, given that survival rate gets impacted not just by defaults but by early repaids too? Hmm...

beacon
23-10-2018, 12:20 PM
The return wasn't too different from the overall average though - mainly due to the set containing over 60% A and B grade loans. In contrast the default minimiser set contained 60% at CDE grade and turned in a lower default rate and higher return.

Thanks leesal, very interesting. Also your work on max vs non-max loans, which was supplemented by additional insights from Myles, humvee, Cool Bear and RMJH.
Also interesting was the "grade ceiling overflows" from A limits into the B1 and B2, and B limits into C1, C2 and C4. So, not a ceiling after all, more like a rough guideline...

beacon
23-10-2018, 02:00 PM
If you liked that then you might like these even more - same but with grade as well :) pdf runs to 67 pages though...

Some interesting observations emerged as I played around with your trivariate tables today, Myles. For example, while all except 50-59 year olds were high risk when buying new vehicles, all new vehicle buyers in AB grades were cool to loan to (despite small bases by age groups within grade).

Full owners in A grade taking holidays were cool, but in C were high risk. Etc.

While the above 60 wanting loans to buy a new vehicle were not at all cool overall (2 defaults /21 total), we may have to wait until the next data pool to analyze them by grade.

Thanks again for your time and effort in producing and sharing these nevertheless. :)

myles
25-10-2018, 12:18 PM
Not sure if this has been discussed before so just putting it out there - it's really a question: Is a loan paid off in a shorter period, higher, lower or the same risk/value as a loan paid off over a longer period?

Some numbers relating to variants and their average age in months (Paid Off loans only):



Variant | Description | Loans | Avg Mths
Age Band | 30-39 | 2852 | 9.82
Age Band | 40-49 | 3744 | 10.13
Age Band | 20-29 | 1200 | 10.58
Age Band | 50-59 | 3039 | 10.75
Age Band | Above 60 | 948 | 11.33
Loan Purpose | Purchase Boat | 42 | 8.31
Loan Purpose | Clear Overdraft | 307 | 8.51
Loan Purpose | Funeral Expenses | 116 | 8.53
Loan Purpose | Legal Fees | 38 | 8.86
Loan Purpose | Computer | 60 | 8.91
Loan Purpose | Medical Expenses | 150 | 9.17
Loan Purpose | Education Expenses | 210 | 9.20
Loan Purpose | Home Improvements | 1743 | 9.36
Loan Purpose | Holiday Expenses | 1191 | 9.96
Loan Purpose | Household Items | 385 | 10.06
Loan Purpose | Purchase Caravan | 32 | 10.37
Loan Purpose | Purchase New Vehicle | 141 | 10.46
Loan Purpose | Loan to Family Member | 150 | 10.49
Loan Purpose | Debt Consolidation | 4539 | 10.56
Loan Purpose | Business Cash Flow | 436 | 10.61
Loan Purpose | Wedding Expenses | 236 | 10.69
Loan Purpose | Tax Bill | 57 | 11.12
Loan Purpose | Purchase Used Vehicle | 679 | 11.22
Loan Purpose | Other | 1271 | 11.91
Residential Status | Owned - Paying Mortgage | 5721 | 9.88
Residential Status | Fully Owned - No Mortgage | 211 | 10.43
Residential Status | Living with Parents | 543 | 10.51
Residential Status | (blank) | 5 | 10.61
Residential Status | Renting | 4119 | 10.66
Residential Status | Boarding | 533 | 10.84
Residential Status | Other | 497 | 12.00
Residential Status | Supplied By Employer | 154 | 12.14
Term Months | 60 | 8398 | 9.73
Term Months | 36 | 3385 | 11.91



The fact that, on average, a 60 month term loan is paid off in less time than a 36 month term loan is just nuts? Perhaps it's human nature to take the extra time and perhaps most don't realise the extra interest paid...

Cool Bear
25-10-2018, 04:12 PM
My view is that early repayment does not affect the returns or risk. If you accept a certain risk/returns and if the loan is repaid early then you can always invest the proceeds into a similar loan.

RMJH
25-10-2018, 04:45 PM
Under the original fee structure, early repayment hurt. Not so much now but there is still a re-investment lag. Just to confirm. Your figures represent the average age at which early repaid loans were repaid rather than average loan life? I think that must be the case.

RMJH
25-10-2018, 04:50 PM
Forgot to say, I think the remainder of the original cohort will become a bit more risky with these early repayments as they are prior to the default peak and thus the bad loans are less diluted.

Cool Bear
25-10-2018, 06:56 PM
Under the original fee structure, early repayment hurt. Not so much now but there is still a re-investment lag. Just to confirm. Your figures represent the average age at which early repaid loans were repaid rather than average loan life? I think that must be the case.
Yes, those old loans hurt, as early repayment attracts a 1.25% fee on the amount repaid.

Currently, early repayments do hurt us if the loan has PP, as we have to pay the whole PP sales commission to HM. That is why some of our outstanding principals for fully paid loans are negative (meaning we owe HM money) instead of zero.

Vagabond47
25-10-2018, 11:42 PM
The fact that, on average, a 60 month term loan is paid off in less time than a 36 month term loan is just nuts? Perhaps it's human nature to take the extra time and perhaps most don't realise the extra interest paid...

When there are no barriers or penalties to repaying early I used to borrow money at the longest term, and make extra payment whenever I could, then if something unexpected cropped up I could just make the minimum payments for a while then get back to paying it down as fast as possible.

myles
26-10-2018, 01:35 AM
When there are no barriers or penalties to repaying early I used to borrow money at the longest term, and make extra payment whenever I could, then if something unexpected cropped up I could just make the minimum payments for a while then get back to paying it down as fast as possible.
I can see how that would be a more flexible way to do it - it's just that these 60 Month loans are being paid back, on average, in less than 10 months. I find it difficult to see why you wouldn't take the loan as a 36 Month loan and pay less interest - surely you would be able to make the slightly higher monthly payments if you have the ability to pay it off in 10 months?

Having said that - most of these are probably re-writes, so they aren't being paid off, just re-negotiated...

myles
26-10-2018, 01:50 AM
Sorry, got called away and didn't get back to this...


My view is that early repayment does not affect the returns or risk. If you accept a certain risk/returns and if the loan is repaid early then you can always invest the proceeds into a similar loan.

This has been my thinking, just thought it worth seeing how others see it.


Forgot to say, I think the remainder of the original cohort will become a bit more risky with these early repayments as they are prior to the default peak and thus the bad loans are less diluted.

I think the default peak, as per the Harmoney hazard curve, is somewhat misleading - yes, most defaults occur at the highest point on the hazard curve, but this is where most loans occur. The chances of a single loan defaulting later don't appear to change all that much (some grades appear to get a little worse, some get a little better). I'm still working on the best way to show this... I consider the hazard curve as showing the frequency of defaults, not the probability/likelihood of defaults - if that makes sense.
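
To make the frequency vs likelihood distinction concrete, here is a small sketch with made-up numbers (the loan counts are hypothetical, not taken from the data pool). It contrasts each month's share of all defaults with the default rate among the loans still open in that month.

```python
# Hypothetical counts per month of loan age (not real Harmoney data).
# at_risk[m]  = loans still open at the start of month m
# defaults[m] = loans charged off during month m
at_risk = [1000, 800, 500, 200]
defaults = [10, 8, 5, 2]

total_defaults = sum(defaults)

# "Frequency" view: share of all defaults that fall in each month.
share_of_defaults = [d / total_defaults for d in defaults]

# "Likelihood" view: chance that a loan still open in month m defaults in month m.
conditional_rate = [d / n for d, n in zip(defaults, at_risk)]

for m, (s, c) in enumerate(zip(share_of_defaults, conditional_rate), start=1):
    print(f"month {m}: share of defaults {s:.0%}, rate among open loans {c:.1%}")
```

In this toy case the share of defaults peaks early simply because that is when the most loans are open, while the chance of any one open loan defaulting is the same in every month.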


Currently, early repayments do hurt us if the loan has PP, as we have to pay the whole PP sales commission to HM. That is why some of our outstanding principals for fully paid loans are negative (meaning we owe HM money) instead of zero.

Yes, those pesky PP's do have some drawbacks. Really need to work through some examples and try to determine if they end up as a gain or a loss on the majority of short term loans...

myles
26-10-2018, 01:54 AM
Your figures represent the average age at which early repaid loans were repaid rather than average loan life? I think that must be the case.
Yes - average age of Paid Off loans.

Vagabond47
26-10-2018, 03:51 PM
I can see how that would be a more flexible way to do it - it's just that these 60 Month loans are being paid back, on average, in less than 10 months. I find it difficult to see why you wouldn't take the loan as a 36 Month loan and pay less interest - surely you would be able to make the slightly higher monthly payments if you have the ability to pay it off in 10 months?

Having said that - most of these are probably re-writes, so they aren't being paid off, just re-negotiated...

Yep, I suspect so, the number of loans that I look at that are <9 payments on the old loan and now looking for a rewrite. These people need educating on how to manage their finances.. but then how would we make money.. *shrug*

myles
28-10-2018, 10:18 AM
Not of huge value, more for conversation/reference:

Google Map of Defaults (https://drive.google.com/open?id=1FL2yaWwLh1hgnGSvGmFkll-yrGKkTPHx&usp=sharing)

Only locations with more than 10 loans are included, just to remove the odd ones in Australia, misspelled names, etc. [source: unique.csv].

IntheRearWithTheGear
28-10-2018, 10:33 AM
Cool, we can get out our pitch forks and pay them a visit....... who is with me ?

BJ1
29-10-2018, 09:55 AM
I like the pitchfork, but perhaps more useful to avoid loans like this:
[Attachment 10109]

Gill
01-11-2018, 12:38 PM
Am I missing something here? $45k for home improvements when the person is renting?

[Attachment 10113]

BJ1
01-11-2018, 02:00 PM
This is why my autolend has been turned off all 2018. The norm is to expect that housing will take 25%+ of income, although this is a very rough measure. Add 20% of after tax income for this debt and the large topup after only 6 months and I wouldn't touch it with your bargepole. Let the institutions take a portfolio approach - we individuals need to steer clear of these.
While you're at it have a look at 144078 - has just moved house with a new mortgage and needs to refinance $40k of non mortgage debt and is willing to pay a $4000 fee to protect him/herself. I'll bet the bank didn't know all the details when it approved the house loan.

icyfire
01-11-2018, 02:11 PM
Am I missing something here? $45k for home improvements when the person is renting?

[Attachment 10113]
The loan could be for their investment rental property. I know someone who lives in a rented property but owns two rental properties.

Gill
01-11-2018, 09:38 PM
If this is the case, then this is the vital information and should have been added to the comments. But without the information, there is no way to know.

Wsp
01-11-2018, 11:23 PM
Am I missing something here? $45k for home improvements when the person is renting?

[Attachment 10113]

One possibility is that they are renting temporarily while they undertake house renovations.

Having said that I still wouldn't touch it

RMJH
02-11-2018, 07:32 AM
Am I missing something here? $45k for home improvements when the person is renting?

[Attachment 10113]
Rewrite with large payment protect fee....

PennyPicker
02-11-2018, 10:33 AM
This is why my autolend has been turned off all 2018.
Does auto lend still even work? Is there a confirmed cash to loan ratio that is known to trigger the auto lend feature? (I have ~5% and it's not enough).

IntheRearWithTheGear
03-11-2018, 09:52 AM
How do other people calculate what a dollar invested in Harmoney might be worth theoretically at the end of 5 years if you could keep the capital/interest bouncing into other loans as it becomes available (taking into account taxes, Harmoney fees, etc.)?

I don't think it's correct to use an online compound interest calculator to do this.

However, this site could be a contender if you add the Harmoney fees percentage to the tax percentage.

https://www.calculator.net/interest-calculator.html?cstartingprinciple=1&cannualaddition=0&cmonthlyaddition=0&cadditionat1=beginning&cinterestrate=22.99&ccompound=monthly&cyears=5&ctaxtrate=32.5&cinflationrate=0&printit=0&x=65&y=33#interestresults


For example, $1 invested in a C2 loan at 22.99%, with a 15% Harmoney fee and 17.5% tax, could be worth 83 cents in interest, but according to the above site it's worth much more, i.e. $2.16, if compounding is used.

I wrote a program to do it using the VB Pmt functions and to recursively call the interest calculation - but then you run into issues with rounding, and I don't have the finance background to get over this issue.


Just interested in how others calculate it.

BJ1
03-11-2018, 10:50 AM
I have 3 family members piggy backing with small amounts included in my total Harmoney investment, so I need to be able to account accurately for annual results and to forecast future returns so they can make decisions to continue. I run a spreadsheet of all exposures with forecast returns taking into account all fees, RWT and writeoffs (the latter based on my personal expectation which to date over nearly four years is tracking on the money). While RAR to date is 13.58% (got hit a month ago with a big one which took it down a few points) my forecast return is currently 14.54% which is right on the monthly average projection for the past 2 years. I will add that I am expecting another hit to my RAR soon, of about 0.14%, but that is already in my forecast rate. Over the past few months I have altered my mix to fewer A and more B, shrunk my average term and almost halved my average commitment per loan, to reflect my concerns with the global / local economy.

myles
03-11-2018, 01:17 PM
For example, $1 invested in a C2 loan at 22.99%, with a 15% Harmoney fee and 17.5% tax, could be worth 83 cents in interest, but according to the above site it's worth much more, i.e. $2.16, if compounding is used.


There shouldn't be a problem calculating return rate for a month on a single (or group of) loan(s) and then plugging that into a compounding 'calculator'/calculation, however it won't take into consideration defaults unless you use some average value or rely on Harmoney's suggested default rates.

My current 'gauge' for the value of my current loans is to calculate the return for 1 day using interest, tax and fee rates on each loan, then apply a historical default rate window at grade level and annualise that. I believe it gives me a fair snapshot indication of the return/value of my current loans with as up-to-date default rates as I can confidently calculate.

At a portfolio level, in my opinion, an XIRR calculation is still the best calculation to use, as it takes into consideration all ins and outs, but it is total portfolio value from the start, so is 'slow' to show more recent changes and is not a prediction of future value - but that should be obvious.
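
For anyone without a spreadsheet handy, here is a minimal sketch of an XIRR-style calculation. It uses the usual definition (the annual rate at which the dated cash flows discount to zero on a 365-day basis) and solves it by plain bisection, which is a simplification; the dates and amounts in the example are made up.

```python
from datetime import date

def xirr(cash_flows, lo=-0.99, hi=10.0, tol=1e-9):
    """Annualised rate r such that sum(amount / (1 + r) ** (days / 365)) is zero.

    cash_flows: list of (date, amount); deposits negative, withdrawals and the
    current portfolio value positive. Solved by bisection, which assumes the
    net present value changes sign exactly once between lo and hi.
    """
    start = min(d for d, _ in cash_flows)

    def npv(rate):
        return sum(a / (1.0 + rate) ** ((d - start).days / 365.0)
                   for d, a in cash_flows)

    f_lo = npv(lo)
    for _ in range(200):
        mid = (lo + hi) / 2.0
        f_mid = npv(mid)
        if abs(f_mid) < tol:
            return mid
        # Keep the half-interval that still brackets the root.
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Made-up example: two deposits, then the portfolio value one year after the first.
flows = [
    (date(2017, 11, 1), -1000.0),
    (date(2018, 5, 1), -500.0),
    (date(2018, 11, 1), 1650.0),
]
rate = xirr(flows)
print(f"{rate:.2%}")
```

As noted above, the result covers the whole life of the portfolio, so it reacts slowly to recent changes and says nothing about future value.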

myles
03-11-2018, 01:30 PM
A partial sample of what I calculate - which shows annualised values and rates based on current loans:



Grade | Loans | Principal | Interest | Fee | Default | Tax | Income | CRAR
C5 | 93 | $9423.31 | $2061.83 | ($309.28) | ($205.58) | ($216.49) | $1330.49 | 14.12%
D1 | 100 | $12381.41 | $2855.35 | ($428.30) | ($133.39) | ($299.81) | $1993.85 | 16.10%
D2 | 91 | $11222.20 | $2719.02 | ($407.85) | ($95.96) | ($285.50) | $1929.71 | 17.20%



I ignore PP for simplicity, but may add it in at some stage. Runs a little low of actual due to PP which has a positive return for me.

I don't include tax deductions - I see those as a bonus :)
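
For reference, the arithmetic behind a row of the table above can be sketched as follows. The 15% fee and 10.5% tax, both taken on gross interest, are inferred from the C5 figures rather than stated anywhere, so treat them as assumptions; the function name is made up.

```python
def annualised_income(principal, interest, fee_rate=0.15, tax_rate=0.105,
                      expected_default=0.0):
    """Net annualised income and return rate for a group of loans.

    Assumes the fee and the tax are each a percentage of gross interest,
    which is what the C5 row in the table works out to.
    """
    fee = interest * fee_rate
    tax = interest * tax_rate
    income = interest - fee - expected_default - tax
    return income, income / principal

# C5 row: principal, annualised interest, expected default loss.
income, crar = annualised_income(9423.31, 2061.83, expected_default=205.58)
print(f"income ${income:.2f}, CRAR {crar:.2%}")
```

This lands within a cent of the C5 income figure (the difference is rounding) and on the same 14.12% rate.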

IntheRearWithTheGear
03-11-2018, 04:22 PM
Below is the sort of function I'm working on to get the true number - can anybody see an issue? Note the recursion.



static public double CalcInterest(double InterestRate, int Months, double LoanAmount)
{
    double TotalInterest = 0;

    for (int i = 1; i < Months + 1; i++)
    {
        double PmtValue = Financial.Pmt(InterestRate / 12, Months, -LoanAmount);
        double IPmtValue = Financial.IPmt(InterestRate / 12, i, Months, -LoanAmount);
        double PPmtValue = Financial.PPmt(InterestRate / 12, i, Months, -LoanAmount);

        TotalInterest += IPmtValue - ( (IPmtValue * 0.175) + (IPmtValue * 0.15) );

        if(PPmtValue + IPmtValue > 0.01)
        {
            TotalInterest = TotalInterest + CalcInterest(InterestRate, Months - i, IPmtValue + PPmtValue);
        }

        //Console.WriteLine("{0},{1},{2},{3}", i, PmtValue, PPmtValue, IPmtValue);
    }

    return TotalInterest;
}

myles
03-11-2018, 05:02 PM
Maybe I'm not getting what you're trying to do but it is a very simple calculation for compound interest.

A VB example here:
http://www.vb-helper.com/howto_calculate_interest.html

Snow Leopard
03-11-2018, 09:58 PM
Below is the sort of function I'm working on to get the true number - can anybody see an issue? Note the recursion.

static public double CalcInterest(double InterestRate, int Months, double LoanAmount)
{
    double TotalInterest = 0;

    for (int i = 1; i < Months + 1; i++)
    {
        double PmtValue = Financial.Pmt(InterestRate / 12, Months, -LoanAmount);
        double IPmtValue = Financial.IPmt(InterestRate / 12, i, Months, -LoanAmount);
        double PPmtValue = Financial.PPmt(InterestRate / 12, i, Months, -LoanAmount);

        TotalInterest += IPmtValue - ( (IPmtValue * 0.175) + (IPmtValue * 0.15) );

        if(PPmtValue + IPmtValue > 0.01)
        {
            TotalInterest = TotalInterest + CalcInterest(InterestRate, Months - i, IPmtValue + PPmtValue);
        }

        //Console.WriteLine("{0},{1},{2},{3}", i, PmtValue, PPmtValue, IPmtValue);
    }

    return TotalInterest;
}

Surely what you do is this?



static public double CalcInterest(double yearlyInterestRateAsPercentage, int monthLengthOfLoan, double originalLoanAmount)
{
    // 12: months, 100: percent to decimal
    double monthlyInterestRate = yearlyInterestRateAsPercentage/(12.0*100.0);

    double monthlyRepayment = Financial.Pmt(monthlyInterestRate, monthLengthOfLoan, -originalLoanAmount);

    double totalRepayments = monthlyRepayment * monthLengthOfLoan;

    double totalInterest = totalRepayments - originalLoanAmount;

    return(totalInterest);
}

myles
03-11-2018, 11:07 PM
Why use PMT?

It is a simple calculation:

[Attachment 10117]

Note: The ^ is to the power of, if it's not obvious?

IntheRearWithTheGear
03-11-2018, 11:13 PM
In reply to Snow Leopard

This one doesn't do the tax and fees for Harmoney. Where are you investing the returned capital and interest as the loan progresses? I.e. the totalRepayments could be another loan at the same investment settings starting up as soon as it's returned to your account.

I think a fault in mine is that each "recursive loan" should be out to a full 60 months instead of a diminishing loan duration, but the total interest from all subsequent loans should be counted to the end of the original 60 months.

My original code is coming from the idea of being a borrower and summing the interest payments to the lenders as our "in theory" net income (not taking defaults into account).

Food for thought.

myles
03-11-2018, 11:17 PM
No, the calculation is based on re-investment of the interest - since the Principal remains invested for the entire period it is equivalent to re-investing the Principal that would be returned (as per p2p lending)...

Added: I think your reply may have been to Snow Leopard?

The calculation I provided above is correct and as you can see it's quite simple, but ignores defaults. If you can determine a rate to use for defaults it can be easily added in - this is the difficult part, coming up with a reliable default rate.

[Interest for Period - as I have it, has tax and fees taken out as per the Effective Rate.]
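
As a rough check of the claim that re-lending every repayment is equivalent to plain compounding, the sketch below simulates it month by month and compares the result with the closed-form figure. It is only an illustration under stated assumptions: standard amortising loans, fee and tax both taken as a percentage of interest (the 15% and 17.5% from the earlier example), instant reinvestment, and no defaults. The function name is made up.

```python
def simulate_reinvestment(annual_rate, months, start=1.0, term=60,
                          fee_rate=0.15, tax_rate=0.175):
    """Lend `start`, re-lend every repayment straight away, return final value.

    Each loan is a standard amortising loan over `term` months. Fees and tax
    are taken as a percentage of the interest received. Defaults are ignored.
    """
    r = annual_rate / 12.0
    keep = 1.0 - fee_rate - tax_rate          # share of interest we keep
    loans = []                                # each loan: [balance, payment]

    def lend(amount):
        payment = amount * r / (1.0 - (1.0 + r) ** -term)
        loans.append([amount, payment])

    lend(start)
    for _ in range(months):
        cash = 0.0
        for loan in loans:
            balance, payment = loan
            interest = balance * r
            principal = min(payment - interest, balance)
            loan[0] = balance - principal
            cash += principal + interest * keep
        # Drop fully repaid loans, then reinvest everything received.
        loans[:] = [loan for loan in loans if loan[0] > 1e-12]
        lend(cash)
    return sum(balance for balance, _ in loans)

simulated = simulate_reinvestment(0.2299, 60)
compounded = (1.0 + 0.2299 / 12.0 * (1.0 - 0.15 - 0.175)) ** 60
print(f"simulated {simulated:.4f}, compounded {compounded:.4f}")
```

Both figures come out at roughly $2.16 per $1 after 60 months, so the length of each individual loan does not matter as long as every dollar is re-lent at the same rate as soon as it comes back.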

alundracloud
04-11-2018, 11:51 AM
Well I've sent harmoney a large list of loans that have extremely obvious reporting errors

Broken down into the following main areas

Showing as Paid off but still owe money = 67
Showing as Current but $0 owing = 54
Negative outstanding principal = 17


They have said "The team has been made aware of the loan IDs and problems with each as detailed in your email and look to have this corrected soon. I don't have a timeframe at this stage as the list of loans with errors is, as stated, large."

Only time will tell how long they will take to fix these

Hi Humvee,
Have you received any more correspondence regarding the errors you reported, or have you seen corrections come through in your reports section? Very curious to know how you got on.

Snow Leopard
04-11-2018, 12:21 PM
In reply to Snow Leopard

This one doesn't do the tax and fees for Harmoney. Where are you investing the returned capital and interest as the loan progresses? I.e. the totalRepayments could be another loan at the same investment settings starting up as soon as it's returned to your account.

...

If that is what you want to calculate then, as myles has said, it is the simple compounding interest formula you need.



// monthlyRateOfReturn: monthly interest rate adjusted down for fees, taxes etc

double totalInterest = (Math.Pow(1.0 + monthlyRateOfReturn, months) - 1.0) * originalLoanAmount;

leesal
05-11-2018, 01:13 PM
I'm intrigued to know - how does someone go about having 21 enquiries in the last 6 months??

Not trying to denigrate HM/the borrower. Genuinely curious to understand how this arises

[Attachment 10118]

RMJH
05-11-2018, 01:23 PM
I'm intrigued to know - how does someone go about having 21 enquiries in the last 6 months??

Not trying to denigrate HM/the borrower. Genuinely curious to understand how this arises

[Attachment 10118]
They shopped around and 20 other lenders said NO or wanted higher interest?!!

leesal
05-11-2018, 01:42 PM
They shopped around and 20 other lenders said NO or wanted higher interest?!!

Takes time to make an application doesn't it, and they wind up going for one at 26%? Is it even possible? And if so, why do they get a D4 grade living with Mum & Dad?

Somebody at HM with fat-fingers?

myles
05-11-2018, 01:42 PM
They shopped around and 20 other lenders said NO or wanted higher interest?!!

Yep, I reckon that's probably it.

I know some still think Harmoney rates are excessive, but there was a loan comment in the last week or so that indicated the borrower was moving from GEM to Harmoney because the GEM rate had increased to 49%!!!

leesal
05-11-2018, 02:13 PM
Yep, I reckon that's probably it.

I know some still think Harmoney rates are excessive, but there was a loan comment in the last week or so that indicated the borrower was moving from GEM to Harmoney because the GEM rate had increased to 49%!!!

Alec Baldwin "You can do better..... than 49%!"

I must be underestimating the industriousness/persistence of the applicants. Was on the verge of taking that loan, as I thought it was a typo and it otherwise met my filter conditions. Probably a good one to leave!

Loans having more than 12 enquiries from the dataset:
Number of instances = 93
Range: 13 to 43 enquiries
Invested = $5450
Interest earned = $1079
Charged off = $580

humvee
05-11-2018, 02:28 PM
Hi Humvee,
Have you received any more correspondence regarding the errors you reported, or have you seen corrections come through in your reports section? Very curious to know how you got on.


There has certainly been some improvement/reduction in the number of errors - however no real updates or explanations - particularly as to if we are out of pocket because of some of the errors eg completed loans that still owe money


Before
Showing as Paid off but still owe money = 67
Showing as Current but $0 owing = 54
Negative outstanding principal = 17


Now

Showing as Paid off but still owe money = 16
Showing as Current but $0 owing = 1
Negative outstanding principal = 13
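
For anyone wanting to run the same three checks on their own export, a minimal sketch follows. The field names ('id', 'status', 'outstanding_principal') and the sample rows are hypothetical; the real report columns may be named differently.

```python
def find_report_errors(loans):
    """Group loan IDs by the three kinds of inconsistency listed above.

    `loans` is a list of dicts with hypothetical keys 'id', 'status' and
    'outstanding_principal'.
    """
    errors = {
        "paid_off_but_owing": [],
        "current_but_zero": [],
        "negative_principal": [],
    }
    for loan in loans:
        owing = loan["outstanding_principal"]
        if owing < 0:
            errors["negative_principal"].append(loan["id"])
        elif loan["status"] == "Paid Off" and owing > 0:
            errors["paid_off_but_owing"].append(loan["id"])
        elif loan["status"] == "Current" and owing == 0:
            errors["current_but_zero"].append(loan["id"])
    return errors

# Made-up rows, one per category plus one healthy loan.
sample = [
    {"id": 1, "status": "Paid Off", "outstanding_principal": 12.50},
    {"id": 2, "status": "Current", "outstanding_principal": 0.0},
    {"id": 3, "status": "Paid Off", "outstanding_principal": -3.10},
    {"id": 4, "status": "Current", "outstanding_principal": 250.0},
]
print(find_report_errors(sample))
```

Not every negative balance is necessarily a reporting error: as mentioned earlier in the thread, early repayment of a loan with payment protect can leave the lender owing part of the commission.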

Wsp
05-11-2018, 06:17 PM
Somebody at HM with fat-fingers?

I believe Harmoney gets these figures automatically from credit agencies. So I doubt it's a data entry error by Harmoney.

RMJH
06-11-2018, 06:41 AM
My account balance has gone negative all on its own. How does that happen?

IntheRearWithTheGear
06-11-2018, 08:51 AM
My account balance has gone negative all on its own. How does that happen?

Happens all the time, fixes itself over time. Usually within the day.

humvee
06-11-2018, 02:42 PM
My account balance has gone negative all on its own. How does that happen?

Mine was fine until ~15 mins ago then it went from just under +$25 to -$89.45

[Attachment 10120]

humvee
06-11-2018, 02:46 PM
Something is even more strange than usual

-$89.45
then
-$83.81
then
-$133.81

[Attachment 10121]

[Attachment 10122]

humvee
06-11-2018, 02:50 PM
now +$166.18

[Attachment 10123]

Cool Bear
06-11-2018, 04:23 PM
Have updated the summary.pdf (https://www.dropbox.com/s/e02rj5kc8d6fgt9/summary.pdf?raw=1) document with quite a few characteristic by grade chart sets where they make sense. Some of these appear to offer some hints of how the Harmoney grading process may be working.

At this stage I'll take a break from making new charts unless someone asks for something specific or I think of something useful - if I've missed anything, let me know.

I'll put up the final summary document and unique.csv and raw.csv sometime late tonight. If you find any errors or have problems with these let me know.

Time to digest some of this info ;)
Myles, I have a bit of time on my hands today and decided to study your summary report in detail as well as play with the unique.csv file (which more than doubles my own data set). Your report is really helpful.

One item that puzzles me is your income ratio chart on page 13. The total number of charge-offs for the unique.csv is 1028 out of 22252 loans. The total charge-offs shown in your chart is only about 650+. Wonder if there is some error somewhere. Can you have a look please?

Much appreciated, CB

myles
06-11-2018, 05:01 PM
One item that puzzles me is your income ratio chart on page 13. The total number of charge-offs for the unique.csv is 1028 out of 22252 loans. The total charge-offs shown in your chart is only about 650+. Wonder if there is some error somewhere. Can you have a look please?

Good pickup CB - typo of 'Dept Sold' instead of 'Debt Sold' - now corrected so just download the pdf again.

Makes a significant difference to that graph, and a quite interesting one too!

Cool Bear
06-11-2018, 08:12 PM
Good pickup CB - typo of 'Dept Sold' instead of 'Debt Sold' - now corrected so just download the pdf again.

Makes a significant difference to that graph, and a quite interesting one too!
Thanks Myles.

Interesting indeed. We would have thought that a higher % would lead to higher defaults. Maybe it is due to the F grades, where the loan limits (and thus monthly repayment amounts) are low. When you can spare the time, can you do the same chart again with just A, B, C, D and E, leaving out the F loans? Thanks again

myles
06-11-2018, 08:43 PM
Colour change just so there is no confusion. Looks very much like the original which didn't have the 'Debt Sold' loans (probably a lot of F's in those 'Debt Sold'?):

[Attachment 10124]
In general the same shape though. Can't think of any obvious reasons why it is like it is?

Perhaps as you suggest, higher grade loans are only given out to those with lower income ratios, hence some influence even down a few more grades?

myles
06-11-2018, 08:53 PM
Ran it for just ABC grades and still a similar shape, though low number of defaults so a bit 'jagged'? Beats me why?

I'd bumped my Income Ratio test up to 20% (from 15%) when loans dropped off a while back - never changed it back, but I've bumped it to 21% now ;)

Cool Bear
06-11-2018, 10:48 PM
Colour change just so there is no confusion. Looks very much like the original which didn't have the 'Debt Sold' loans (probably a lot of F's in those 'Debt Sold'?):

[Attachment 10124]
In general the same shape though. Can't think of any obvious reasons why it is like it is?

Perhaps as you suggest, higher grade loans are only given out to those with lower income ratios, hence some influence even down a few more grades?

Thanks again Myles

Cool Bear
06-11-2018, 10:56 PM
Ran it for just ABC grades and still a similar shape, though low number of defaults so a bit 'jagged'? Beats me why?

I'd bumped my Income Ratio test up to 20% (from 15%) when loans dropped off a while back - never changed it back, but I've bumped it to 21% now ;)
I think if we do not invest in F, we might as well forget about this repayment to income ratio. Like in your case, why stop at 21%? The total default rate (from your latest chart) for income ratios over 22%, taken as a group, is only about 3.4%, which is about the same as or less than for your 0 to 21% range.

Cool Bear
06-11-2018, 11:05 PM
Colour change just so there is no confusion. Looks very much like the original which didn't have the 'Debt Sold' loans (probably a lot of F's in those 'Debt Sold'?):

10124
In general the same shape though. Can't think of any obvious reasons why it is like it is?

Perhaps as you suggest, higher grade loans are only given out to those with lower income ratios, hence some influence even down a few more grades?
Would it be too much to ask if you could run one for each grade, just to see the difference?

myles
06-11-2018, 11:13 PM
My thoughts are that anything over 0.25 is a bit 'random' and the numbers are perhaps a little low to be significant.

Leaving out 0.22-0.24 should reduce defaults if only by a little. Plus I think it just makes sense to limit to around that 20% range.

Otherwise agree - doesn't appear to be an overly useful metric for loan selection.
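
For anyone wanting to check the bucket comparison themselves, here is a minimal sketch of grouping loans by repayment-to-income ratio and reporting the charge-off rate alongside the loan count for each bucket. The bucket edges, the (ratio, charged_off) tuple layout and the sample rows are all made up for illustration; they are not the actual columns or figures from unique.csv.

```python
from bisect import bisect_right

# Hypothetical lower bounds of the income-ratio buckets (an assumption, not from the charts).
EDGES = [0.00, 0.10, 0.15, 0.20, 0.25]

def default_rate_by_bucket(loans, edges=EDGES):
    """Return one (lower_bound, n_loans, n_charged_off, rate) tuple per non-empty bucket.

    `loans` is an iterable of (income_ratio, charged_off) pairs.
    """
    totals = [0] * len(edges)
    bad = [0] * len(edges)
    for ratio, charged_off in loans:
        # Index of the last edge that is <= ratio; clamp anything below the first edge.
        i = max(bisect_right(edges, ratio) - 1, 0)
        totals[i] += 1
        if charged_off:
            bad[i] += 1
    return [(edges[i], totals[i], bad[i], bad[i] / totals[i])
            for i in range(len(edges)) if totals[i]]

# Made-up sample rows, purely to show the shape of the output.
sample = [(0.08, False), (0.12, False), (0.12, True), (0.18, False),
          (0.22, True), (0.22, False), (0.30, False)]
rows = default_rate_by_bucket(sample)
for lower, n, n_bad, rate in rows:
    print(f"{lower:.2f}+  loans={n}  charged_off={n_bad}  rate={rate:.1%}")
```

Printing the loan count next to each rate makes it easier to see when a bucket has too few loans for its rate to mean much, which is the concern with the buckets over 0.25.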

myles
06-11-2018, 11:23 PM
10125
10126
10127
10128
10129

myles
06-11-2018, 11:25 PM
10130
Have to include some detail or I can't post a single image...

myles
06-11-2018, 11:31 PM
Looks like E grade is the major influencing grade.

myles
07-11-2018, 08:48 AM
Income ratio may not be overly useful, but it appears that just Income (combined) is:

10134

Somewhat obvious I guess...

Cool Bear
07-11-2018, 10:01 AM
Loan to income ratio may not be overly useful, but it appears that just Income (combined) is:

10134

Somewhat obvious I guess...

Myles, thanks again.

One of my criteria was not to invest if the combined income is less than $2000. Maybe I should now bump that to $2500.

Cool Bear
07-11-2018, 10:10 AM
Thanks too for all the repayment to income charts. They are certainly very helpful.

myles
07-11-2018, 10:21 AM
Probably higher? Perhaps it is a case of having to have a minimum income to put food on the table and pay the bills before being able to reliably pay off a loan, irrespective of the loan/repayment size? Over $60,000pa looks to be a bit of a threshold point?

Cool Bear
07-11-2018, 01:17 PM
Probably higher? Perhaps it is a case of having to have a minimum income to put food on the table and pay the bills before being able to reliably pay off a loan, irrespective of the loan/repayment size? Over $60,000pa looks to be a bit of a threshold point?
The trouble with cutting it too high is the reduction in the number of loans available to invest in. You may also end up with just the safer grades of A and B with lower interest too. So hard finding the sweet spot (range).
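
The trade-off described here can be put into numbers with a rough sketch like the one below: for each candidate minimum-income cutoff, it reports what share of loans would still pass the filter and the charge-off rate among those. The function name, the (income, charged_off) layout, the cutoffs and the sample rows are all hypothetical, and income is assumed to be in the same units as the cutoff.

```python
def cutoff_tradeoff(loans, cutoffs):
    """For each cutoff return (cutoff, share_of_loans_kept, charge_off_rate_of_kept).

    `loans` is a list of (combined_income, charged_off) pairs.
    The rate is None when no loan passes the cutoff.
    """
    out = []
    for cutoff in cutoffs:
        kept = [bad for income, bad in loans if income >= cutoff]
        share = len(kept) / len(loans)
        rate = sum(kept) / len(kept) if kept else None
        out.append((cutoff, share, rate))
    return out

# Made-up sample rows, only to illustrate the calculation.
sample = [(1800, True), (2200, True), (2600, False), (3000, False),
          (4000, False), (5200, False), (2400, True), (6000, False)]
table = cutoff_tradeoff(sample, [2000, 2500, 5000])
for cutoff, share, rate in table:
    rate_text = "n/a" if rate is None else f"{rate:.1%}"
    print(f"min income {cutoff}: keeps {share:.1%} of loans, charge-off rate {rate_text}")
```

Run against real loan data, a table like this would show how quickly the pool of available loans shrinks as the cutoff rises, which helps when looking for the sweet spot.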

myles
09-11-2018, 08:05 AM
A positive story for Harmoney:

Licensed P2P lender Harmoney says it has facilitated loans across New Zealand and Australia valued at $1 billion (https://www.interest.co.nz/news/96775/licensed-p2p-lender-harmoney-says-it-has-facilitated-loans-across-new-zealand-and)

with the typical ill-informed/disgruntled Harmoney bashers coming out in the comments...

beacon
09-11-2018, 09:23 AM
A positive story for Harmoney:

Licensed P2P lender Harmoney says it has facilitated loans across New Zealand and Australia valued at $1 billion (https://www.interest.co.nz/news/96775/licensed-p2p-lender-harmoney-says-it-has-facilitated-loans-across-new-zealand-and)

with the typical ill-informed/disgruntled Harmoney bashers coming out in the comments...

https://www.chrislee.co.nz/taking-stock exhibits negative sentiments too. But we hope Harmoney goes from strength to strength as a P2P and alternative funding source for the community

myles
09-11-2018, 09:40 AM
https://www.chrislee.co.nz/taking-stock exhibits negative sentiments too. But we hope Harmoney goes from strength to strength as a P2P and alternative funding source for the community

Hmmm, a poorly researched opinion piece that perhaps has some bias...brokerage...