Alternative data: after the gold rush, what are the risks for investors?

As the use of alternative data grows and the number of alternative data sources multiplies, so will the risks for the investment sector.

Misuse of third-party data has been brought into the spotlight recently by the Facebook-Cambridge Analytica data scandal.

The use of large-scale data to inform commercial decisions is, of course, a practice carried out in various industries. In future it's likely to be even more critical to business success, so managing the related risks is vital.

In 2016, 16.1 zettabytes (ZB) of data were generated globally. This figure is forecast to grow tenfold by 2025, according to a study from analyst firm IDC and hard drive maker Seagate. Most of this new data will come from end-user devices, making it largely unstructured.

Data harvested from social media, consumer transactions and smartphones has gained popularity in the investment sector, and is commonly known as “alternative data”. Part of a big data boom that is flourishing in several industries, alternative data is thought to provide a "big picture" view of consumer trends and behaviours, adding greater insight to the investment process.

There are lessons to be learned from the recent Facebook scandal that hedge funds and investment firms ought to bear in mind, if they decide to make the most of alternative data in their investment strategies. Crucially, data is a powerful tool that can be used to influence outcomes but, if this occurs at the expense of security and regulation, it could be detrimental to a company’s reputation, integrity and finances.

What is alternative data?

The data provider Eagle Alpha defines alternative data as "non-traditional data that can be used in the investment process". One of the sector’s forerunners, Eagle Alpha has pointed out the potential for non-traditional data sets to revolutionise investments, much like expert networks did in the 1990s. As of July 2017, it had identified 482 datasets across 24 different categories.

Based on these categories, Eagle Alpha revealed that consumer transactions and geo-location have proved to be the most popular with clients.

So what is "non-traditional data"? It has been described in the press as "digital exhaust": information that is often a by-product of other online processes. This includes credit card transactions, app downloads, online bookings and smartphone location trails.

A rapidly growing number of data vendors assimilate this information to find evidence of consumer trends and industry changes. Being able to keep track of real-time developments has obvious advantages over more established information sources – such as networks, economic data releases and earnings reports – and creates a new, more agile approach to making investments.

Alternative data in use

Alternative data provision is a fast-growing niche: since 2010, the number of data vendors has more than doubled. Companies are willing to invest in a resource that is developing exponentially and in real time.

In part, data vendors have multiplied so quickly because they dedicate time and expertise to analysing vast amounts of information, something many hedge funds and investment companies simply lack the internal resources to do. In a survey of asset managers and hedge funds by Greenwich Associates about obstacles to alternative data adoption, 32% cited a "lack of time needed to evaluate data", and 17% cited "human capital needed for integration not available".

Citi Velocity added that "the 'rawness' or unstructured nature of the new datasets requires different skillsets that perhaps the wider investment community do not possess". New types of data require skills that would be far too costly for an organisation to acquire or provide training for internally, particularly if the tradable value of the data being analysed is not guaranteed.

While this rapid growth in data vendors reveals that many companies are clearly taking advantage of alternative data, it is unclear exactly who is profiting from it. Many vendors’ client bases are a closely guarded secret, and with good reason. Exclusivity is one of the main draws of data sets – being the first to market with information that will give businesses an edge over the competition.

However, accessing data with this kind of potential is costly. In the Greenwich Associates survey, the leading obstacle to alternative data adoption was "prohibitively high fees", at 38%. Until the value of alternative data has been proven over a longer period of time, with greater market accessibility (and as a result, more competitive fees), many organisations will not attempt to use it as part of their investment strategy.

Risks and legal challenges

The two key risks when using alternative data sets for investment strategy are exclusivity and the legal implications surrounding data sources. With both factors, it is the potential commercial benefit gained from this information that could leave a business exposed to litigation.

Lawyers have urged caution over the legality of businesses entering into exclusive agreements with data vendors. There are regulatory concerns over a hedge fund’s ability to gain an unfair market advantage over its peers by paying for information to which only it has access.

The second risk, regarding data permissions and the legality of data use, is more pertinent in the current environment, as many businesses prepare for or undergo changes to comply with GDPR.

The Financial Times reported that some hedge funds are concerned that the information they receive from data vendors has not been adequately "scrubbed" of sensitive information. In fact, many employ "dedicated internal teams to clean the data before it is used in the investment process".

Below are some of the main ways that data used in the investment process could be open to legal challenges:

1. Copyright
Just because information is available to the public does not mean it can be used for commercial benefit. This includes information taken from personal written and visual content such as photos, status updates, blogs and articles.

2. Value-added databases
Similar to copyright law, this involves taking publicly available information (for example, flight schedules) and creating a commercially valuable database.

3. Website scraping
While a common way of obtaining data, it is still important to check a website’s terms of use beforehand. Many websites are easily accessible without the need to accept their terms and conditions, but this does not guarantee that the information published there is available for commercial use.

4. Personally Identifiable Information (PII)
Using a database containing PII, even unwittingly, could lead a company to fall foul of the Data Protection Act 1998 and the EU General Data Protection Regulation (GDPR), which applies from May 2018, as regulations tighten. Ensure data such as geo-locations, consumer transactions and mobile app usage is anonymised before use.

5. Exclusive vs accessible information
Information can be widely available to the market, or it can be made exclusive. In the US, it is possible for a firm to purchase an exclusive or bespoke data set, whereas in the UK and EU, legislation is stricter. Particularly in a global firm, it is vital to comply with local data regulations.

While the risk of litigation is not great at the moment, this could change. Jonathan Streeter, Partner at the New York law firm Dechert, where he offers advice on data sets to his hedge fund clients, told the Financial Times: "I don’t know of a case that’s been brought, but everyone anticipates that there will be one soon and [prosecutors] would like to bring one. This is a hot topic."

Risk mitigation and protection

Effective insurance and risk management can provide protection for hedge funds that choose to use alternative data sets as part of their investment strategy.

Thorough knowledge of data sources and potential regulatory issues is vital. Ensure any sensitive data is cleaned before use, and that it complies with data protection, copyright and other appropriate regulations.
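As a rough illustration of what "cleaning" a dataset might involve in practice, the Python sketch below removes or coarsens obvious personal data in a single record. The field names (name, email, phone, device_id, lat, lon) and the scrub_record helper are hypothetical, chosen only for this example; a real vendor feed will have its own schema. Note that a salted hash is a pseudonym rather than true anonymisation, so a step like this does not by itself make data compliant and is no substitute for legal review.

```python
import hashlib
import re

# Hypothetical field names for illustration; real feeds use their own schema.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def scrub_record(record, salt, coord_decimals=2):
    """Return a copy of `record` with obvious PII dropped, hashed or coarsened."""
    cleaned = {}
    for key, value in record.items():
        if key in DIRECT_IDENTIFIERS:
            # Drop direct identifiers entirely.
            continue
        if key == "device_id":
            # Salted SHA-256: a pseudonym, not true anonymisation.
            digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
            cleaned[key] = digest[:16]
        elif key in ("lat", "lon"):
            # Coarsen location trails by rounding coordinates.
            cleaned[key] = round(float(value), coord_decimals)
        elif isinstance(value, str):
            # Redact email addresses embedded in free-text fields.
            cleaned[key] = EMAIL_PATTERN.sub("[redacted]", value)
        else:
            cleaned[key] = value
    return cleaned


if __name__ == "__main__":
    sample = {
        "name": "Jane Doe",
        "email": "jane@example.com",
        "device_id": "A1B2C3",
        "lat": 51.50735,
        "lon": -0.12776,
        "note": "receipt sent to jane@example.com",
        "amount": 12.5,
    }
    print(scrub_record(sample, salt="example-salt"))
```

Dropping fields outright is the safest of the three steps. Hashing and rounding keep some analytical value, but they can leave individuals re-identifiable when combined with other data, which is why the point above about knowing the data source still applies.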

In the event that the data used in an investment process was not compliant, certain insurance products could help to protect your firm, and the key decision-makers in it. Both professional indemnity (PI) and directors' and officers' (D&O) liability insurance can protect your business in the event of legal action being brought against you.

PI insurance can assist with the financial impact of legal action, covering legal fees, PR expenses and fines (excluding fines that are uninsurable by law). D&O insurance protects the key decision-makers in a firm, should they face legal action or reputational damage as individuals.

Additionally, a dedicated cyber insurance policy can provide financial support as well as security and recovery experts to mitigate damage, should a dataset be compromised. This applies both to datasets used for investment information and to those holding client and employee data.

While alternative data has great potential to revolutionise the way investments are made, it is essential to remain aware of the legal pitfalls and other associated risks. Big data in business is set to carry on developing, becoming not only more advanced, but also more ubiquitous. As it does, companies and legislation are bound to respond and evolve.

For more information, please contact Calvin Barnes on:

+44 (0)20 7933 2390