BENGALURU, Karnataka - An Indian start-up that few outside the fintech industry would have heard of embedded tracking software inside popular apps, including one that streamed Sai Baba stories and another that streamed Ilaiyaraaja songs, to scoop up sensitive user data, HuffPost India has found. The data included GPS locations, personal contacts, business SMSes from ecommerce sites and banks (used to monitor spending activity), and much more.
CreditVidya, a Hyderabad-based fintech company, ran this snooping code (technically known as a Software Development Kit or SDK) for several months in 2017 until a new version of Google’s Android operating system made it harder to scrape such data. The data scooped up from users was used to power CreditVidya’s self-learning algorithms, which help lending companies determine the credit-worthiness of loan applicants. (Fintech is industry speak for financial technology, a fast-growing category of software firms.)
SDKs like the one developed by CreditVidya are called “Middleware”. If you assume an app is like a machine, middleware would be a component or a cog in that machine. As apps grow more complex, developers often rely on middleware developed by third parties, increasing the risk that user data is scraped and sold on for a fee.
Upon installing these apps, many of which were developed by a third-party app developer called Winjit, users would have been asked for access permissions that are increasingly common and intrusive, but would have had no idea that their personal data was being scraped and sold on in a manner that could affect their credit-worthiness.
“Even though there might not be proper notice / informed consent, at least it’s understandable that lending apps that user uses is downloaded consciously and some might have knowledge on the fact that app,” said Srikanth L., a contributor to Cashless Consumer, a collective studying digital payments and fintech businesses in India. “The Creditvidya SDK was also found in a Sai Baba app, Ilaiyaraaja Hits app and other music apps of popular record labels with its SDK where user is clueless about this background data collection.”
Thus a user could consent to an app collecting data without knowing how such data would be used.
CreditVidya, Srikanth said, “used the data from unsuspecting users as part of the huge database it uses to generate the trust score, but there is opaqueness about where this data comes from and how many data brokers were engaged in trading personal data with companies like CreditVidya.”
Worse, given that many of these algorithms are proprietary and hence un-auditable, it is unclear if these credit-rating apps even work. Users could find themselves denied credit, or charged high interest rates on the basis of purely arbitrary decision making by CreditVidya algorithms trained on data scraped on the sly.
“Given how untransparent the industry is,” said Frederike Kaltheuner, from the Data Exploitation Programme of Privacy International, a global non-profit organisation that investigates and advocates for user privacy, “it’s hard to say if this information is actually helping anyone get a loan. There are a lot of companies in this space now, but their algorithms are a black box, and the data they use is usually not clear either.”
CreditVidya and Winjit did not reply to HuffPost India’s emailed requests for comment. We will update this story if the companies share a response.
CreditVidya does not offer loans directly to consumers. Instead, the company offers its services to over 50 lenders, ranging from banks like Axis Bank, DBS and Yes Bank to financing companies like Tata Capital, TVS Credit and Hero FinCorp, according to CreditVidya’s website.
This means that when consumers approach these companies for loans, CreditVidya’s software helps determine whether the loan should be given. To do so, the company compares a given loan application with its giant database to arrive at something it calls a “Trust-score” that, the company claims, determines if the applicant is likely to pay back the loan.
The company raised Series A funding from Kalaari Capital, and Matrix Partners joined in its Series B round. It has raised a third round of funding as well, led by the Bharat Innovation Fund. One of the partners at the fund is Sanjay Jain, former Chief Product Officer at the UIDAI, and a volunteer at Bengaluru-based think-tank iSPIRT.
In a blog post, Kailash Nath, a Senior Associate at Bharat Innovation Fund, wrote that CreditVidya processes over 500GB of data every day. It uses data related to over 10,000 parameters to assess creditworthiness, and plugs its SDK into lenders’ apps to help decide whether to approve a loan. He added that the platform has processed over 25 million profiles so far. The post does not mention anything about the sources of this vast amount of data.
“It’s not necessary that the data is coming from nefarious means,” said Saravanan K., a Bengaluru-based security consultant. “There could be any number of ways in which the company has acquired this data, and a lot of it is above board — people aren’t always aware of what they are signing up for, where they are giving their data.”
“Your phone number acts as a unifying element, and then the amount of data that becomes available about you simply from offline sources will boggle your mind. But getting data directly from your phone can be very valuable, because it’s happening in real time and gives a very clear picture of what you are doing.”
The companies doing all this data gathering are keeping quiet about the matter. For example, Srikanth found CreditVidya’s SDK in a number of applications made by Winjit, which has developed a number of music apps, including for huge companies like Times Music. However, the nature of the relationship between the two companies is not clear, nor have they made any public statement on why Winjit’s music apps carried CreditVidya’s lending SDK.
When a user downloaded a Winjit app, the SDK would create a profile linked to their phone number and then keep updating it, analysis of the SDK by Cashless Consumer showed. APIs in the SDK revealed code for the user being initialised, and the data being updated.
A report by Aayush Rathi and Shweta Mohandas for the Centre for Internet and Society that researched the privacy commitments made by Indian fintech companies also goes over some of this ground.
“The unprecedented growth of this sector with a number of players that have an amorphous nature (not banking entities) has concomitantly come with regulatory challenges around inter alia privacy and security concerns,” Rathi and Mohandas say in their report. “For instance, a survey of 1,300 senior executives in the global financial services, and fintech industries revealed that 54% of respondents identified privacy and data protection as barriers to fintech innovation.”
They also noted that one study found that 79.4 percent of the surveyed participants said they did not read privacy policies, and only 11 percent said they understood them. Another study of the most popular apps in India, they wrote, observed that privacy policies were drafted to protect the service providers from liability, rather than to help consumers.
What’s in the SDK?
Analysis of the SDK by Srikanth suggests CreditVidya collected the following info:
- Mobile IMEI
- All contacts
- Frequency of SIM changes, to see whether the person frequently swaps SIMs
- GPS location
- Business SMS to monitor spending activity
- Wi-Fi on/off status
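To illustrate how business SMSes can reveal spending activity, here is a minimal Python sketch. It is not CreditVidya’s code, which has not been published; the message format, the sender IDs, the `DEBIT_PATTERN` expression and the function names are all assumptions made for illustration, and real bank alerts vary widely.

```python
import re
from collections import defaultdict

# Hypothetical pattern for bank-style debit alerts (an assumption, not taken
# from any real SDK): a currency marker, an amount, then "debited" or "spent".
DEBIT_PATTERN = re.compile(
    r"(?:Rs\.?|INR)\s*([\d,]+(?:\.\d{1,2})?)\s+(?:debited|spent)",
    re.IGNORECASE,
)


def extract_debit(sms_body):
    """Return the debited amount in rupees, or None if no debit is found."""
    match = DEBIT_PATTERN.search(sms_body)
    if match is None:
        return None
    # Strip thousands separators before converting to a number.
    return float(match.group(1).replace(",", ""))


def build_spending_profile(messages):
    """Sum debit amounts per sender from (sender, body) pairs."""
    totals = defaultdict(float)
    for sender, body in messages:
        amount = extract_debit(body)
        if amount is not None:
            totals[sender] += amount
    return dict(totals)


# Made-up sample messages; the sender IDs are placeholders.
sample_messages = [
    ("AX-BANKXX", "Rs. 1,250.00 debited from a/c XX1234 at a grocery store."),
    ("AX-BANKXX", "INR 499 spent on card XX9876 at an online shop."),
    ("VM-SHOPXX", "Your order has been shipped."),
]
profile = build_spending_profile(sample_messages)
```

Even this crude approach turns a handful of alerts into a per-sender spending total, which suggests why an app with permission to read SMSes can be so valuable to a company building financial profiles.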
Given that CreditVidya talks of over 10,000 data points, it’s safe to say that this is not all the information that the company is collecting about potential borrowers. What’s particularly worrying in this case though is how the information was being collected through applications that have nothing to do with lending.
“They are collecting user specific data, and also location specific data for demographic mapping,” said Srikanth L. of Cashless Consumer.
Kaltheuner, from Privacy International, said this kind of arrangement with SDKs is not uncommon.
“A lot of researchers have come across such arrangements,” said Kaltheuner, “but it is very hard to find actual evidence.” In that sense, the work done by Cashless Consumer is very important, she added, as it shows how companies are quietly collecting user data.
“But a bigger concern is the use of pre-installed applications for tracking,” she added. “These apps are installed by the phone manufacturers, or by the telecom companies, and that’s how you get very cheap smartphones being subsidised by third party trackers.”
“These pre-installed trackers often don’t need to ask you for permission before getting access to your data, and they can have access to deeper information than the third-party trackers,” she said. “This is made worse by how opaque the industry is; information flows in only one direction.”
“Middleware is very hard to track because there are a number of ways in which companies are going around regulations. Even if a developer doesn’t mean to take your data, it’s often very hard to know what all an SDK is going to do. This is a systemic problem in the industry, with a lot of reliance on third party software.”
Standard procedure in India
Although a number of developers who spoke to HuffPost India confirmed that practices like these are common in the Indian ecosystem, they refused to go on the record, explaining that this is normal business practice, and speaking out about it will lead to a loss of opportunities in the future.
“The big change was Google cracking down on this stuff, but otherwise it’s all over the place,” one developer based in Bengaluru said. “Like, there’s a company in Bombay whose business model is to offer its SDK for apps, and it basically gives you solutions like OTP capture — but it also keeps tracking SMS data afterwards, which is used to build a financial profile. And they offer a cut for doing this, so it subsidises the cost of developing the app.”
Another developer said that IBM’s analytics middleware has also created similar problems, but refused to give any details, fearing reprisals from the company, which has offered his startup projects in the past. IBM denied the allegation. A representative said that it would require more technical details from the developer to give a detailed response, but the developer refused to share further information.
But the problem is not limited to India. In May 2019, mobile app developer QuarkWorks found that one of its apps on the Google Play store was flagged and removed for violating store policies. According to Devun Schmutzler, Native Mobile Developer at QuarkWorks, Google said their app was violating Android’s advertising ID policy.
Google had identified that the app collected and transmitted the Android advertising identifier, which could be used to identify and target a user.
Except, according to Schmutzler, the app wasn’t collecting or transmitting any data as far as the developers were aware. At this point the team investigated the matter and found their app was using an old version of Fabric Crashlytics, middleware developed by a third party, which was embedded in the QuarkWorks app to analyze crashes and other software errors. The Crashlytics component was collecting this information without QuarkWorks’s knowledge.
But this was not the only bit of middleware they found tracking sensitive user information.
Firebase, a mobile and Web development platform acquired by Google, also does this, though it’s very easy to change the settings to stop sending this data, Schmutzler noted.
OneSignal, which is used for high-volume mobile and Web push notifications, also tracks this user information, and QuarkWorks had to tweak the app to limit the data being shared. These were just the ones found by one developer in a small app with limited libraries; given the scale of the industry, the number of providers that are collecting user data in an opaque manner is simply staggering.
Google and Apple have developed policies, which are available online, against apps sharing data in the background. Although the companies did not share details about the size of their teams in India that audit apps, privacy has become a big talking point for both platforms, with Apple highlighting it for multiple years now, and Google also talking strongly about privacy at the last Google I/O developer conference.
In India, though, companies like this are likely to soon get another tool to track and profile users: Aadhaar. The Aadhaar Amendment bill is expected to pass in the Lok Sabha, and once it becomes law, the use of Aadhaar by the private sector opens up again.
Once that happens, aside from your phone number, there will also be a permanent, immutable identity that can be used to track a person, or collate their information.
Is this data even useful?
It is possible that companies are compromising users’ privacy on a broad scale, yet coming up with results that are no more accurate than those of traditional lending.
HuffPost India reached out to several lending companies, which did not wish to comment on this story once we explained that it was about the covert collection of user data. In the past, however, some of these companies have commented on the use of data.
Speaking to this reporter in the past, Bala Parthasarathy, the Chairman and CEO of lending app MoneyTap said that “the data is not sophisticated enough. We use mostly traditional data. Right now, there are a lot of low hanging fruit whom the banks are too rigid for, and that’s where we can make a difference.”
“Typically, companies look at a number of different factors, so they’ll look at your account data, or they might read your SMS messages to track your spending,” he had said. “This is of course a privacy concern. But they read your transaction SMSes to understand your financial history. They might take a look at the apps on your phone, or your social media logins to see what kind of relationships you have, how strong a local circle you have, so they know you’re not going to disappear.”
MoneyTap, on the other hand, was mostly using user data only to make filling in the forms simpler, he said, since they had to be entered through the company’s app on the phone.
As Privacy International’s Kaltheuner pointed out, such algorithms being a black box means that there is no clarity on whether anyone is actually benefiting from such use of data, yet it’s quickly becoming the norm.