SmartInternz

Clustering Startups Based on Customer-Value Proposition

Project Complexity - Basic

Technology: Machine Learning | Business Sector: Industry

Project Description
Countless new startups are born every single day, and venture capitalists are always on the lookout for the next big thing. To find it, they must learn an incredible amount about the startups they want to invest in. One piece of information that is especially valuable to investors is the industry a startup operates in and the competition it faces there. As such, classifying startups by industry function is an important tool in investing; however, doing this by hand for the many thousands of startups that are formed every day is impossible. We thus want to use machine learning to cluster companies by customer value proposition, given nothing more than a short one- or two-line description of what each company does.

Solution 

The main purpose of this project is to take a text description of a startup and assign it to a cluster of related companies. K-means clustering is a popular unsupervised learning technique for grouping closely related data points. This project uses K-means to group startups into related industries based on their text descriptions. Additional techniques, such as dimensionality reduction via singular-value decomposition, are applied to the document-term matrix to shrink the feature space and improve the speed and efficiency of the algorithm.
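
The approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual implementation: it assumes scikit-learn is used, that the document-term matrix is built with TF-IDF weighting, and that three clusters and three SVD components are reasonable for a toy corpus. The sample descriptions and the build_model helper are hypothetical and exist only for this example.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer

# Hypothetical one-line startup descriptions, invented for illustration.
descriptions = [
    "Mobile app for ordering groceries with same-day delivery",
    "On-demand restaurant meal delivery for busy professionals",
    "Online platform that delivers fresh meal kits to your door",
    "Cloud software that automates invoicing for small businesses",
    "Payments API that lets online stores accept credit cards",
    "Accounting and payroll software for freelancers",
    "Wearable device that tracks heart rate and sleep quality",
    "Telemedicine service connecting patients with doctors by video",
    "App that reminds patients to take their medication on time",
]


def build_model(n_components=3, n_clusters=3, seed=0):
    """Assumed pipeline: TF-IDF -> truncated SVD -> L2 normalize -> K-means."""
    return make_pipeline(
        # Build a sparse document-term matrix, dropping common English words.
        TfidfVectorizer(stop_words="english"),
        # Reduce the term space to a few latent dimensions via SVD.
        TruncatedSVD(n_components=n_components, random_state=seed),
        # Rescale each reduced vector to unit length before clustering.
        Normalizer(copy=False),
        # Group the reduced vectors into n_clusters clusters.
        KMeans(n_clusters=n_clusters, n_init=10, random_state=seed),
    )


model = build_model()
labels = model.fit_predict(descriptions)

# Show which cluster each training description was assigned to.
for label, text in sorted(zip(labels, descriptions)):
    print(label, text)

# Assign a new, unseen description to one of the learned clusters.
new_label = model.predict(["Subscription service for weekly grocery delivery"])[0]
print("new description ->", new_label)
```

Cluster numbers are arbitrary identifiers, so they carry no industry meaning on their own; in practice one would inspect the descriptions in each cluster to name it. The number of clusters and SVD components would also need tuning on a real dataset.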