-
09-10-2018, 09:02 AM
#3771
Member
Originally Posted by IntheRearWithTheGear
You may wish to strip borrower information from it as well - use your own key.
i.e. LAI-00140637 becomes some sort of new ID tracked across multiple data sources - that way the loan is not identified.
I don't see the need to do this - Harmoney has already done this - the loan ID is already Harmoney's key that does not identify the borrower.
-
09-10-2018, 09:08 AM
#3772
Member
From my viewpoint.
If I had a loan which turned bad, and my Harmoney documents had LAI-00140633 written on them as the loan identity, somebody could google that at a later point and find the loan outcome from one of our uploaded datasets - maybe an ex-husband or something. At the moment an ex-husband can't really do that. So by stripping it out and replacing it with a number which does link to the Harmoney loan ID, we get the best of both worlds - anonymous but linked.
It's like a phone number: on its own it doesn't identify you, but once you have the phone number you can find where a person has worked, companies they own etc.
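A rough sketch of the "own key" idea, assuming whoever consolidates the data holds a secret key that is never published. The function name, the "P-" prefix and the key value are all made up for illustration; nothing here is something Harmoney or Myles actually uses.

```python
import hashlib
import hmac


def pseudonymise_loan_id(loan_id: str, secret_key: bytes, length: int = 12) -> str:
    """Map a loan ID to a stable stand-in ID using a keyed hash (HMAC-SHA256).

    The same loan ID and key always give the same stand-in, so data sets
    processed with the same key can still be linked. Without the key, the
    stand-in cannot be matched back to the loan ID by simply googling it.
    """
    digest = hmac.new(secret_key, loan_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # "P-" prefix and truncation length are arbitrary choices for this sketch.
    return "P-" + digest[:length]


# Hypothetical key - in practice it would be kept private by the consolidator.
EXAMPLE_KEY = b"not-a-real-key"
stand_in = pseudonymise_loan_id("LAI-00140633", EXAMPLE_KEY)
```

The catch is that everyone who wants to link later data sets needs the same key, which is the extra complexity being debated in this thread.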
Last edited by IntheRearWithTheGear; 09-10-2018 at 09:20 AM.
-
09-10-2018, 09:42 AM
#3773
Member
Originally Posted by IntheRearWithTheGear
From my viewpoint.
If I had a loan which turned bad, and my Harmoney documents had LAI-00140633 written on them as the loan identity, somebody could google that at a later point and find the loan outcome from one of our uploaded datasets - maybe an ex-husband or something. At the moment an ex-husband can't really do that. So by stripping it out and replacing it with a number which does link to the Harmoney loan ID, we get the best of both worlds - anonymous but linked.
It's like a phone number: on its own it doesn't identify you, but once you have the phone number you can find where a person has worked, companies they own etc.
Unlike a phone number, which is linked to a person or location, the loan ID relates to a single loan - not a person - i.e. all re-writes get a new loan ID. This means it has a maximum life expectancy of 3-5 years - but as past data has shown, in general this is more like 1-3 years.
The disadvantage of changing the loan ID is, to a small extent, extra work and complexity. But mostly it makes it more difficult for anyone else to build upon the data, add to it, or, if Myles ever stopped doing it for whatever reason, carry on the project.
I also see value in comparisons made in 6 months' time of things like what % of loans in arrears today are written off vs in arrears vs current 6 months later. To do this you need to link data sets 6 months apart, which may or may not be processed by the same person.
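To show why the unchanged loan ID matters for that 6-month comparison, here is a minimal sketch that links two snapshots by loan ID and reports where the loans that were in arrears ended up. The status labels and the dict-of-statuses layout are assumptions for illustration, not the actual export format.

```python
from collections import Counter


def arrears_outcomes(earlier, later, arrears_label="Arrears"):
    """Return the % breakdown of later statuses for loans in arrears earlier.

    earlier, later: dicts mapping loan ID -> status string, one per snapshot.
    Loans missing from the later snapshot cannot be linked and are skipped.
    """
    outcomes = Counter(
        later[loan_id]
        for loan_id, status in earlier.items()
        if status == arrears_label and loan_id in later
    )
    total = sum(outcomes.values())
    if total == 0:
        return {}
    return {status: 100.0 * count / total for status, count in outcomes.items()}
```

If the two snapshots used different stand-in IDs, the `loan_id in later` lookup would never match and the comparison could not be done at all.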
The only reason I really see for changing the loan ID would be to stop someone who did not contribute to the data pool from getting the data, then adding their own to it (and not sharing), then having a more complete data set than anyone else. But I'm putting my data into the pool and I still don't think this is a good enough reason to change the ID.
Last edited by humvee; 09-10-2018 at 09:49 AM.
-
09-10-2018, 09:47 AM
#3774
yeah, nah
Originally Posted by IntheRearWithTheGear
You may wish to strip borrower information from it as well - use your own key.
I tend to agree with Humvee - this has already been done by Harmoney when they give the details to us - the LAI-# would not be shown on any borrower records (i.e. link to borrower would be internal only - well it should be?).
Originally Posted by IntheRearWithTheGear
But at the end of the day, we can't beat what they provide, which is the loan grade as a loan risk. This loan grade takes into account all the hidden data points, i.e. credit score.
Some of us are doing much better than what they provide, that's the point of doing this.
Originally Posted by IntheRearWithTheGear
You would be wrong to associate risk with an individual loan dimension such as "board" or "living with parents"
Disagree with this, sure you've got to take multiple things into consideration, but you've only got to look at the re-write numbers to see that one dimension can be enough to lower default rates significantly.
If you want to slice and dice the data, a spreadsheet pivot table will do it for you. Nothing too flash is really needed once you have the data set to play with.
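For anyone who would rather script it than use a spreadsheet, this is roughly what a one-dimension pivot does. The column names ("STATUS" and whatever dimension you pass in) and the choice of which statuses count as a default are assumptions for the sketch, not the real export headers.

```python
from collections import defaultdict


def default_rate_by(rows, dimension, status_field="STATUS",
                    bad_statuses=("Charged Off", "Debt Sold")):
    """Group loans by one column and return (defaults, total, rate) per group.

    rows: iterable of dicts, one per loan, e.g. from csv.DictReader.
    """
    counts = defaultdict(lambda: [0, 0])  # group value -> [defaults, total]
    for row in rows:
        bucket = counts[row[dimension]]
        bucket[1] += 1
        if row[status_field] in bad_statuses:
            bucket[0] += 1
    return {
        group: (bad, total, bad / total)
        for group, (bad, total) in counts.items()
    }
```

Keeping the raw counts next to the rate makes it obvious when a group is too small for its rate to mean much.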
-
09-10-2018, 09:48 AM
#3775
yeah, nah
Originally Posted by beacon
Data labels on bar charts can be very helpful. Eg., in plotting default by coborrower, you currently include sample sizes on x-axis (1079 co-borrowers and 9909 single borrower). Both bars are between 0 to 5%. Could you also include labels on top of the bars like 3% or better still 32, if including them is no trouble to you.
Good idea, if I forget, please remind me.
-
09-10-2018, 10:02 AM
#3776
yeah, nah
Just to expand on what I suggested about timeliness before - the data set is really a snapshot in time. As a whole, it can't be built on because it will potentially contain 'stale' loan data. To try to explain: if someone uploads a particular loan that no one else has in the initial data set, and then never uploads again, then that particular loan will just sit with no further detail updated - it will become 'stale' and impact on the data set.
The dates stored with the data, e.g. LAST_PAYMENT_DATE, don't allow these 'stale' records to be removed with confidence (they may just be in arrears). So the best way to get a refreshed set of data will be to do this whole thing again - I think once every 6 months or so would be frequent enough - I don't see this as a 'continuous' process.
However, the detail of loans that are 'Paid Off', 'Charged Off', 'Debt Sold', will not change and could be built on. Something to think about a bit later on I think.
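A small sketch of that split, assuming each record carries a status column (called "STATUS" here, which is a guess at the header) holding one of the three final values named above:

```python
# Statuses that will not change once reached, as listed in the post above.
FINAL_STATUSES = {"Paid Off", "Charged Off", "Debt Sold"}


def split_by_finality(rows, status_field="STATUS"):
    """Split loan records into (final, still_open) lists.

    Final records can safely be carried forward into a later data set;
    still-open records may be stale and need a fresh upload.
    """
    final, still_open = [], []
    for row in rows:
        if row[status_field] in FINAL_STATUSES:
            final.append(row)
        else:
            still_open.append(row)
    return final, still_open
```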
Point I'm trying to make is that this is a one off at this stage, that could be repeated, but not too frequently.
-
09-10-2018, 10:12 AM
#3777
Member
Another option could be to just trust Myles - we all send him our raw data. He creates a Google Sheet with all the data dumps and then shares the sheet, with all data consolidated, with all worthy contributors. We can then create our own graphs, and if a graph is any good, publicise the summary data, which we reference for discussions or improve on.
So one source of data controlled by Myles, with many Google Sheets using that data as the source for pivot tables etc.
-
09-10-2018, 10:29 AM
#3778
Member
Myles - what is your cut-off date for receiving data? I've uploaded mine, but I can re-export and upload just before the cut-off to make the data more current if this is helpful.
-
09-10-2018, 10:41 AM
#3779
yeah, nah
Originally Posted by IntheRearWithTheGear
Another option could be to just trust Myles - we all send him our raw data. He creates a Google Sheet with all the data dumps and then shares the sheet, with all data consolidated, with all worthy contributors. We can then create our own graphs, and if a graph is any good, publicise the summary data, which we reference for discussions or improve on.
So one source of data controlled by Myles, with many Google Sheets using that data as the source for pivot tables etc.
Yeah, nah - too much work for me and I don't want control. Everyone will want to look at the data differently, that's why the easy solution of just providing the whole set back seemed to be the best way to go. I'll do up the basics for those who don't have the skills to generate graphs etc., which, with a bit of tidying up, I've already scripted, so no drama.
-
09-10-2018, 10:43 AM
#3780
yeah, nah
Originally Posted by humvee
Myles - what is your cut-off date for receiving data? I've uploaded mine, but I can re-export and upload just before the cut-off to make the data more current if this is helpful.
I had that in the original post - late Sunday (I'll close the link sometime Monday morning - probably early early am).