CFI.Org | Aim. Build. Leverage. Partner. Persevere: 5 Tips to Leverage Alternative Data to Bank the Unbanked

Alternative data can help FSPs reduce loan defaults and speed up the approval process, but pitfalls exist

Written by Rodrigo Sanabria, LenddoEFL



I have been rolling out alternative data initiatives for financial inclusion across Latin America for several years. At some point, my clients ask: “Is this going to work?” My usual answer is “I’ve failed enough times to have figured this out.”

This is a fairly new and not completely mature field. LenddoEFL has been doing this for over 10 years. While there is still a lot to learn, my team and I can share some wisdom.

In response to Accelerating Financial Inclusion with New Data, I recently wrote about the promise and challenge of using alternative data to bank the unbanked. We’ve learned a lot about applying alternative data and have identified five key success factors:


1. Aim at the pain
2. Build on top of your current business
3. Leverage the best data source for you
4. Partner with somebody that can handle multiple data sources
5. Persevere. Capture low-hanging fruit without losing sight of the big prize

We will tackle one at a time.

1. Aim at the pain

Some financial institutions come to us interested in “trying out” alternative data. Our usual question is “what problem are you trying to solve?” Sometimes they are not clear about what they want to solve, and sometimes they want to fix too many things at the same time. The whole approach for the initiative will depend on this understanding. Choose one pain, focus on it, and build the KPIs to measure success according to this.

Keep repeating to everybody the pain you are attempting to solve to make sure everybody shares the same understanding.

These are some examples from our experience:

• An MFI wanted to increase productivity per loan officer while maintaining default rates: reduce turnaround time, workload in the field, and complexity. Its client base was made up of unbanked and thin-file customers, so automation based on traditional scores was not an option. Solution: Collect psychometric information for credit scoring, which would allow a centralized, automated process.

• A non-traditional microlender wanted to obtain early warnings of clients that would likely fall into arrears on their next installment so that they could better focus pre-emptive collections efforts. By combining traditional repayment data with Android phone data, we are able to “rank” clients by the probability of next payment default. Now they can focus on the one-third that will create 75 percent of the defaults.

• A traditional financial institution was turning down about one-third of applicants due to lack of credit history, and not belonging to the “right” demographics. They decided to invite “rejects” to re-apply by providing psychometric information, which allowed us to “rescue” about half of those prospects without increasing the default rate.

• A home appliances retailer providing $200 loans to consumers was losing clients due to the time required to verify their identity. By leveraging social network data, they have been able to reduce the approval turnaround time from two days to a few minutes in most cases. They have been able to approve more clients, reduce the cost of identity verification, and reduce cases of fraud.
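The early-warning example above (ranking clients so that one-third of them accounts for 75 percent of defaults) can be checked with a simple capture-rate calculation. The following is a minimal sketch, not the actual model: the function name `capture_rate`, the scores, and the outcomes are all hypothetical, and it assumes a higher score means a higher probability of missing the next installment.

```python
def capture_rate(scores, defaulted, top_fraction):
    """Share of all defaults found among the riskiest `top_fraction` of clients.

    scores: risk scores, higher = more likely to miss the next installment
            (an assumption of this sketch).
    defaulted: 1 if the client missed the next installment, else 0.
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    # Rank clients from riskiest to safest.
    ranked = sorted(zip(scores, defaulted), key=lambda pair: pair[0], reverse=True)
    cutoff = max(1, round(len(ranked) * top_fraction))
    total_defaults = sum(d for _, d in ranked)
    if total_defaults == 0:
        return 0.0
    captured = sum(d for _, d in ranked[:cutoff])
    return captured / total_defaults


# Hypothetical toy portfolio of nine clients.
scores = [0.91, 0.85, 0.80, 0.40, 0.35, 0.30, 0.20, 0.15, 0.10]
defaulted = [1, 1, 1, 0, 1, 0, 0, 0, 0]

print(capture_rate(scores, defaulted, 1 / 3))  # 0.75 with this toy data
```

A collections team could use a number like this to decide how far down the ranked list its pre-emptive calls should go.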

2. Build on top of your current business

A good friend and a brilliant risk professional called me asking for help: “We are planning to launch a new product, for a new segment, in a new channel, so we need to use a new source of data to build an origination model.”

“Too many ‘news’ in the equation,” I told him. However, I joined his new venture.

You can guess how this adventure ended: slow volume uptake, lack of an actionable model after several months, and little enthusiasm to keep investing in order to capture value.

As we discussed in the first post, building models with alternative data is a numbers game. You need volume.

In the successful cases we mentioned before, we collected alternative data from a population that was already being served through a channel already established. This was to support a product with existing traction in the business. Innovation was concentrated in the data source and methodology to assess risk.

3. Leverage the best data source for you

Each source of data has advantages and drawbacks. On the front end, some sources may create more or less friction in client onboarding, depending on origination processes. On the back end, the “low-friction” data is usually not structured. Unstructured data is not organized in a predefined way, so using it to build a risk model is more challenging than using structured data.

Once you have identified the pain point, you may work out with your partner/vendor the tradeoffs considering your population and channel. Note the following tips:

• Highly digital populations already served through an online channel may be approached using digital data, but you must make sure that you can get the volumes required to build a model based on unstructured data (unstructured data requires more volume to build a model).

• People with whom you already have an ongoing relationship may be a good population to leverage mobile phone data, as they may perceive a benefit to downloading and keeping your mobile application.

• Less digitized populations, served through traditional channels (branches or field loan officers) may be better suited for psychometrics.

Avoid the pitfall of falling in love with a specific data source and then trying to figure out a use case within your business. Go the other way around: “Given my business need, what data source fits it best?”

4. Partner with somebody that can handle multiple data sources

“When you only have a hammer, all problems look like nails,” my first boss told me a long time ago. To avoid the pitfall described in recommendation three, you must partner with a vendor that can manage several data sources.

This will not only let you choose the right pain and business to focus on, but also give you flexibility as you roll out.

For example, while working with one client, we found that their customers would willingly share their email data. Unfortunately, they used email so rarely that we couldn’t score many of them. Now we are working with psychometrics in this population.

In another situation, we started using psychometrics to approve more people at a Mexican e-lender. In the meantime—while they were approving more clients—we collected digital data from these same applicants. After several months, we have been able to combine both sources of data to approve even more people.

5. Persevere

If you are like most of us and work for an organization that needs results in a few quarters, structure your initiative to collect early results that give you momentum while you go for the long-term prize.

We work with an institution that provides big loans. They do not have that much volume, but they invest heavily in each prospect. Big stakes, low volume is the most challenging environment to build an alternative data-based score. It took us almost 4 years, but now they are harvesting the fruits of their perseverance.

To deal with this issue, you need to be creative to identify secondary pain points that may be addressed quickly along the way.

For example, we worked for a retailer that wanted to increase approvals while keeping defaults in line by approving new-to-credit consumers. Loans had mostly 24- to 36-month terms, and most 60-day defaulters tended to recover. That was a challenging situation: we would have to wait 12 months for vintages to mature, and look for 90 or 120 days in arrears for the “bads” to profile. It looked like a 2- to 3-year project.

But we found a secondary pain: “straight rollers.” These were loan recipients who didn’t pay their first two or three installments and were eventually written off. We collected data on all their clients to quickly build a “straight rollers model.” We only needed 3 installments on each vintage to identify bads.
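The “straight roller” label described above needs only the first few installments of each loan. As a minimal sketch, assuming a window of three installments (the text mentions two or three) and a hypothetical payment history format, the labeling step might look like this. The function name, loan IDs, and data are illustrative only.

```python
def is_straight_roller(installments_paid, window=3):
    """Flag a loan whose first `window` installments were all missed.

    installments_paid: list of booleans in due-date order (True = paid).
    Returns None when fewer than `window` installments have come due,
    because the outcome cannot be observed yet.
    """
    if len(installments_paid) < window:
        return None
    return not any(installments_paid[:window])


# Hypothetical payment histories keyed by loan ID.
loans = {
    "A": [False, False, False, False],  # never paid: straight roller
    "B": [True, False, True],           # paid the first installment
    "C": [False, False],                # too young to label
    "D": [False, True, True, True],     # missed one, then recovered
}

labels = {loan_id: is_straight_roller(history) for loan_id, history in loans.items()}
print(labels)
```

Loans labeled `None` are simply left out of the training set until their third installment comes due, which is why such a model can be built within months rather than years.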

Along the way, we are collecting data that will be used to build an admission score to address the main pain.

In summary, building credit policies based on alternative data is challenging. Fortunately, there is enough learning accumulated in our community to avoid some pitfalls and we hope you find these tips useful.

See post in CFI.Org

CFI.Org | To Bank the Unbanked, Start Using Alternative Data

Capturing digital footprints using psychometrics can help FSPs reach the unbanked.

By Rodrigo Sanabria, Partner Success Director, Latin America, LenddoEFL

Originally posted on the Center for Financial Inclusion's Blog.

In a recent post on her report, Accelerating Financial Inclusion with New Data, Tess Johnson highlighted the huge opportunity that alternative data represents for the future of financial services. The simple fact that mobile and internet penetration have surpassed financial services penetration in most emerging markets hints at a big opportunity: many people who have had no meaningful access to formal financial services are creating digital footprints financial service providers can capture and analyze to reach them with commercially viable services that help them improve their lives. This prospect is also made possible thanks to machine learning and big data methods that were not available to us a few years ago.

Field team testing its psychometric credit assessment in Mexico. Credit: LenddoEFL


For those of us in the world of financial inclusion, these are very exciting times: the simultaneous emergence of online penetration and data analysis methods is generating an opportunity that our predecessors in this field couldn’t even have imagined.

The bad news is that harnessing digital footprint data using machine learning is not easy; it requires time, commitment and skills that are in short supply. However, the good news is that those with the vision and endurance to leverage this opportunity will build a competitive advantage that will be sustainable for years to come.

When developing an alternative credit score based on traditional information (e.g., demographics, repayment data), analysts usually have historical data to design and train models. Through back testing, the credit scoring model is applied to historical data to see how accurately it would have predicted the actual results (i.e., loan repayment). We can get a pretty good sense of how the model will perform in the future and set up a credit policy accordingly. Yet, when we cannot use such traditional data sources, we are entering into uncharted territory.
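To make the back-testing idea concrete, here is a minimal sketch that applies a score cutoff to historical loans and reports what the approval rate and default rate would have been. The function name `backtest_cutoff`, the scores, the outcomes, and the cutoff are all hypothetical, and the sketch assumes a higher score means lower risk.

```python
def backtest_cutoff(scores, defaulted, cutoff):
    """Replay a credit policy on historical loans.

    scores: model scores for past loans, higher = safer (an assumption here).
    defaulted: 1 if the loan actually defaulted, else 0.
    Returns (approval_rate, default_rate_among_approved).
    """
    if not scores:
        raise ValueError("no historical loans to back-test")
    # Outcomes of the loans the policy would have approved.
    approved = [d for s, d in zip(scores, defaulted) if s >= cutoff]
    approval_rate = len(approved) / len(scores)
    default_rate = sum(approved) / len(approved) if approved else 0.0
    return approval_rate, default_rate


# Hypothetical historical portfolio.
scores = [720, 680, 650, 600, 560, 500]
defaulted = [0, 0, 1, 0, 1, 1]

approval, bad_rate = backtest_cutoff(scores, defaulted, cutoff=600)
print(f"{approval:.0%} approved, {bad_rate:.0%} default rate")
```

Repeating the calculation for several cutoffs shows the trade-off between approving more applicants and keeping defaults within target.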

Lacking prior information about our current customers’ psychometric profile or digital footprint, we must build those data sets from scratch. Depending on the data source, we may need very large data sets to compensate for the lack of data structure (unstructured data is simply data that is not easily accessible in a format or structure, like an Excel spreadsheet, that is optimal for generating insights). Just as with all other artificial intelligence applications, the more data you collect, the more predictive and stable your algorithms become. LenddoEFL is an example of an organization that gathers data for these profiles and footprints. It is an alternative credit scoring and verification provider that uses psychometric and other data about a loan applicant to determine a credit score and verify identity.

Furthermore, even state-of-the-art alternative data sources do not necessarily allow you to build models that are stable and reliable across multiple segments of the market. Therefore, you need to build algorithms that are specific to your target population.

One of the most challenging issues when implementing alternative data scoring initiatives is showing the results that can be achieved within a given set of time and budgetary constraints. In the long run, after the portfolio has matured, you can show whether using alternative data allowed you to approve more applicants within your target default levels, controlling by business cycle. But if you are working with 24- to 36-month loans, it may take three or four years before you can fully assess the impact of using alternative data, by which time internal attention spans may have already run short.

To account for that, LenddoEFL uses early indicators of model performance. We set a target maturity and days in arrears according to the profile of the financial institution’s portfolio, for example, 60 days in arrears within the first 9 months. Then we calculate a Gini coefficient for the model as applied to that portfolio. The Gini is a measure of predictive power that helps a lender understand how good its credit score is at separating those who will repay from those who will default on a loan (not to be confused with the Gini coefficient that measures income inequality). For more details on how to use the Gini, check out the blog series from our risk and analytics team: Part 1, Part 2, Part 3.
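As an illustration of the Gini calculation, the sketch below uses the common credit-scoring definition Gini = 2 × AUC − 1, where AUC is the probability that a randomly chosen good loan scores higher than a randomly chosen bad one. Whether this is the exact formulation used in the blog series is an assumption, and the scores and “bad” flags (for example, 60 days in arrears within the first 9 months) are hypothetical.

```python
def gini(scores, is_bad):
    """Gini coefficient of a credit score, computed as 2 * AUC - 1.

    scores: model scores, higher = safer (an assumption of this sketch).
    is_bad: 1 if the loan hit the chosen arrears definition, else 0.
    """
    goods = [s for s, b in zip(scores, is_bad) if not b]
    bads = [s for s, b in zip(scores, is_bad) if b]
    if not goods or not bads:
        raise ValueError("need at least one good and one bad loan")
    # Count good/bad pairs where the good loan outscores the bad one;
    # ties count as half a win.
    wins = 0.0
    for g in goods:
        for b in bads:
            if g > b:
                wins += 1.0
            elif g == b:
                wins += 0.5
    auc = wins / (len(goods) * len(bads))
    return 2 * auc - 1


# Hypothetical portfolio: three good loans, three bad loans.
scores = [720, 680, 650, 600, 560, 500]
is_bad = [0, 0, 1, 0, 1, 1]

print(round(gini(scores, is_bad), 2))  # 0.78 with this toy data
```

A Gini of 0 means the score ranks borrowers no better than chance, and 1 means it separates goods from bads perfectly. The pairwise loop is fine for small samples; larger portfolios would call for a sort-based AUC.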

Is it too late to pursue an alternative credit scoring initiative? I would say no. Yes, there are plenty of companies already doing this (Te Creemos in Mexico, Mynt in the Philippines, and Business Partners in South Africa), but only a few lenders are utilizing alternative data in each market. You could be the first institution in your segment and country to implement such an initiative, and you can still take advantage of others’ experiences and learning.

The sooner you start collecting data and building models, the sooner you will be able to underwrite the unbanked segment better than your competition, and the longer the window of advantage will be. For those who start late, catching up with the early adopters will be a great challenge.

Read article on cfi-blog.org

Microfinance Gateway | Malaysia: Fintech Heavyweight CTOS Expands Services for A Better Financial Inclusion

CTOS, Malaysia’s largest credit reporting agency, has just announced a partnership with LenddoEFL to achieve a joint vision of financial inclusion for people who have had difficulty securing loans in Malaysia due to a lack of credit history.

Read article in MicroFinance Gateway website: https://www.microfinancegateway.org/announcement/malaysia-fintech-heavyweight-ctos-expands-services-better-financial-inclusion

The ASEAN Post | The potential of big data for microfinancing in Southeast Asia

"Microfinance is described by the Financial Times Lexicon as a service where financial institutions will back small start-ups and would-be entrepreneurs with small loans, in the poorest parts of the world. In Southeast Asia, the biggest microfinance players currently include Asia Pacific-based LenddoEFL, Singapore's CredoLab and the Philippines’ Lendr, for example..." Read full article.

Markets and Fintech | Big Data in Credit Risk Assessment (El Big Data en la evaluación del riesgo de crédito)

LenddoEFL, founded in 2011 by several professionals with technology backgrounds, was born with the mission of improving banking access for the emerging middle class in developing countries. With this goal in mind, it approached the main financial institutions in the United States with the idea of studying the data they held on its target population in order to build an alternative credit scoring algorithm. After the banks declined, it decided to make the journey alone.
Seven years later, Lenddo seems to have found something like the recipe for the perfect fruit tart. By analyzing a multitude of variables, from social media behavior and e-commerce habits to the speed at which application forms are filled out, it claims to reduce delinquency by 12%, increase approvals by 15%, and complete an assessment in under three minutes. Read full article.