Machine Learning: The Event Horizon for Making Life Easier for People and Businesses? Or Is It Not?
A few decades ago, AI (Artificial Intelligence) emerged as the buzzword of the day, and all sorts of cartoons and movies showed us how much things were going to change. The issue was that we were only at the very start of what is possible today and in the decades to come. Our lives will take a different turn. Life as we know it already bears the impact of AI/ML: many of our habits and facial expressions are readily analyzed, and a pattern exists for each and every one of us. This is no longer science fiction but a reality that extends well beyond our everyday understanding. And the petabytes of data that exist on us as humanity are really only the edge of what is coming. We are slowly taking all of this for granted because it makes for an easy way of life.
However, none of this is beyond understanding, yet many people remain confused about the specifics of machine learning and predictive analytics. Although both are centered on efficient data processing, there are many differences between them.
Machine learning is considered a modern-day extension of predictive analytics. Efficient pattern recognition and self-learning are the backbones of ML models, which automatically evolve based on changing patterns in order to enable appropriate actions. Hundreds of existing and newly developed machine learning algorithms are applied to derive high-end predictions that guide real-time decisions with less reliance on human intervention.
One common, uncomplicated, yet successful business application of machine learning is measuring real-time employee satisfaction.
Machine learning applications can be highly complex, but one that’s both simple and very useful for business is an algorithm that compares employee satisfaction ratings to salaries. Instead of plotting a predictive satisfaction curve against salary figures for various employees, as predictive analytics would suggest, the algorithm assimilates large amounts of training data as it arrives, and any added training data refines the predictions, producing real-time accuracy and more helpful results.
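As a sketch of the idea (using scikit-learn, with invented salary and satisfaction figures purely for illustration), an incremental learner can fold new survey responses into the model as they arrive, rather than being rebuilt from scratch:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)

# Invented training data: salary (in units of $100k) vs. a 1-10 satisfaction rating.
salaries = rng.uniform(0.4, 1.2, size=(200, 1))
ratings = 2 + 6 * salaries.ravel() + rng.normal(0, 0.3, size=200)

model = SGDRegressor(max_iter=1000, random_state=0)
model.fit(salaries, ratings)

# As new survey responses arrive, partial_fit recalibrates the model
# incrementally instead of retraining on the full history.
new_salaries = rng.uniform(0.4, 1.2, size=(20, 1))
new_ratings = 2 + 6 * new_salaries.ravel() + rng.normal(0, 0.3, size=20)
model.partial_fit(new_salaries, new_ratings)

print(round(float(model.predict([[0.9]])[0]), 1))
```

The same `partial_fit` call can run every time a fresh batch of survey responses lands, which is what “real-time” recalibration means in practice here.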
Take it or leave it: 75% of business leaders cite ‘growth’ as the key source of value from analytics, yet only 60% of those leaders have predictive analytics capabilities. So what is preventing businesses from acquiring them? The major roadblock is applying the right set of tools, which can pull powerful insights from this stockpile of data. But first, a big data system requires identifying and storing digital information. Using machine learning and artificial intelligence algorithms, businesses can optimize their processes and uncover new statistical patterns, which form the backbone of predictive analytics.
Predictive analytics can be defined as the procedure of condensing huge volumes of data into information that humans can understand and use. Basic descriptive analytic techniques include averages and counts. Descriptive analytics based on obtaining information from past events has evolved into predictive analytics, which attempts to predict the future based on historical data.
This concept applies complex techniques of classical statistics, like regression and decision trees, to provide credible answers to queries such as: ‘’How exactly will my sales be influenced by a 10% increase in advertising expenditure?’’ This leads to simulations and “what-if” analyses for users to learn more.
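As a minimal sketch of that advertising question (scikit-learn, with made-up spend and sales figures), a fitted regression line turns the 10%-increase scenario into a one-line “what-if” simulation:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical historical data: monthly ad spend vs. sales (both in $1000s).
ad_spend = np.array([10, 12, 15, 18, 20, 24, 25, 30]).reshape(-1, 1)
sales = np.array([110, 118, 132, 140, 152, 165, 170, 188])

model = LinearRegression().fit(ad_spend, sales)

# "What-if" simulation: how would sales respond to a 10% increase in spend?
current_spend = 20.0
baseline = model.predict([[current_spend]])[0]
scenario = model.predict([[current_spend * 1.1]])[0]
print(f"Predicted lift: {scenario - baseline:.1f} thousand dollars")
```

Swapping in different spend values gives the full simulation curve the text describes; a real analysis would of course use far more data and check the model’s fit first.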
All predictive analytics applications involve three fundamental components:
- Data: The effectiveness of every predictive model strongly depends on the quality of the historical data it processes.
- Statistical modeling: Includes the various statistical techniques ranging from basic to complex functions used for the derivation of meaning, insight, and inference. Regression is the most commonly used statistical technique.
- Assumptions: The conclusions drawn from collected and analyzed data usually assume the future will follow a pattern related to the past.
In the past, valuable marketing campaign resources were wasted by businesses using instincts alone to try to capture market niches. Today, many predictive analytic strategies help businesses identify, engage, and secure suitable markets for their services and products, driving greater efficiency into marketing campaigns.
While businesses must understand the differences between machine learning and predictive analytics, it’s just as important to know how they are related. Basically, machine learning is a branch of predictive analytics. Despite having similar aims and processes, the two differ in two main ways:
- Machine learning works out predictions and recalibrates its models in real time, automatically, once designed. Predictive analytics, by contrast, works strictly on “cause” data and must be manually refreshed with “change” data.
- Unlike machine learning, predictive analytics still relies on human experts to work out and test the associations between cause and outcome.
Organizations with huge amounts of data can begin analytics work. Before starting, data scientists should make sure that predictive analytics fulfills their business goals and is appropriate for the big data environment.
A quick look at the three types of analytics brings up these areas of interest:
- Descriptive analytics – The basic form of analytics, which aggregates big data and provides useful insights into the past.
- Predictive analytics – The next step in data reduction; it uses various statistical modelling and machine learning techniques to analyze past data and predict future outcomes.
- Prescriptive analytics – A newer form of analytics that uses a combination of business rules, machine learning, and computational modelling to recommend the best course of action for any pre-specified outcome.
A neural network is a system of hardware and software modelled on the human central nervous system, used to estimate functions that depend on a huge number of unknown inputs. Neural networks are specified by three things – architecture, activity rule, and learning rule.
In short, neural networks are adaptive and modify themselves as they learn from subsequent inputs. Consider, for example, a neural network that performs image recognition for ‘humans’. The network is trained with a large number of sample human and non-human images. The resulting network works as a function that takes an image as input and outputs the label human or non-human.
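Those three ingredients can be illustrated with a toy classifier. In the sketch below (scikit-learn; random 8-dimensional vectors stand in for real image features), the architecture is one hidden layer, the activity rule is ReLU, and the learning rule is gradient-based optimization:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)

# Toy stand-in for image features: two clusters of 8-dimensional vectors,
# labelled 1 ("human") and 0 ("non-human").
human = rng.normal(loc=1.0, scale=0.5, size=(100, 8))
non_human = rng.normal(loc=-1.0, scale=0.5, size=(100, 8))
X = np.vstack([human, non_human])
y = np.array([1] * 100 + [0] * 100)

# Architecture: 8 inputs -> 16 hidden units -> 1 output.
# Activity rule: ReLU. Learning rule: gradient descent (Adam).
net = MLPClassifier(hidden_layer_sizes=(16,), activation="relu",
                    max_iter=500, random_state=0).fit(X, y)

# The trained network now acts as a function from feature vector to label.
print(net.predict([[1.0] * 8]), net.predict([[-1.0] * 8]))
```

A real face-recognition network would be far deeper and trained on pixel data, but the function-from-input-to-label behaviour is the same.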
As the machine learning and artificial intelligence landscape evolves, predictive analytics is finding its way into more business use cases. Coupled with business intelligence (BI) tools such as Domo and Tableau, it helps business executives make sense of big data.
Some prospective use cases for ML-based predictive analytics are:
- E-commerce – Using ML, businesses can predict customer churn and fraudulent transactions, as well as which product a customer will click on next.
- Marketing – There are many examples of ML in B2B marketing. A common use case is identifying and acquiring prospects with attributes similar to existing customers. ML can also prioritize known prospects, leads, and accounts based on their likelihood to take action.
- Customer service – Zendesk’s Satisfaction Prediction uses a machine learning algorithm to process the results of historical satisfaction surveys, learning from signals such as the total time to resolve a ticket, response delay, and the specific wording of tickets cross-referenced with customer satisfaction ratings.
- Medical diagnosis – Medical professionals can use a program modelled using ML to predict the likelihood of a particular illness. The model draws on a database of patient records and makes predictions based on the symptoms a patient exhibits.
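As an illustrative sketch of the churn case above (scikit-learn, with synthetic customer records invented for the example), a simple classifier can turn a few behavioural features into a churn probability:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Hypothetical customer features: [months_active, orders_last_90d, support_tickets].
n = 300
X = np.column_stack([
    rng.integers(1, 48, n),   # months active
    rng.integers(0, 10, n),   # recent orders
    rng.integers(0, 5, n),    # support tickets
])
# Synthetic labels: customers with few recent orders and many tickets churn more.
churn_logit = 1.5 - 0.6 * X[:, 1] + 0.8 * X[:, 2]
y = (rng.random(n) < 1 / (1 + np.exp(-churn_logit))).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)

# Score a customer: 24 months active, 1 recent order, 4 support tickets.
print(round(model.predict_proba([[24, 1, 4]])[0, 1], 2))
```

In production, the same pattern would be fed from real order and support-ticket history rather than simulated labels, and the probability would drive a retention campaign.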
Organizations and technology companies are employing machine learning based predictive analytics to gain an edge over the rest of the market. Machine learning advancements such as neural networks and deep learning algorithms can discover hidden patterns in unstructured data sets and uncover new information. But building a comprehensive data analysis and predictive analytics strategy requires big data and progressive IT systems.
For illustrative purposes, it will be helpful to list a number of well-established business use cases for machine learning so that you (the reader) can spark your own application ideas:
- Face detection: It’s incredibly difficult to write a set of “rules” to allow machines to detect faces (consider all the different skin tones, viewing angles, hair and facial hair, etc.), but an algorithm can be trained to detect faces, like those used at Facebook. Many tools for face detection and recognition are open source.
- Email spam filters – Some spam filtering can be done by rules (i.e., by overtly blocking IP addresses known explicitly for spam), but much of the filtering is contextual, based on the inbox content relevant to each specific user. A high volume of email and many users marking messages as “spam” (labeling the data) make for a good supervised learning problem.
- Product / music / movie recommendation – Each person’s preferences are different, and preferences change over time. Companies like Amazon, Netflix and Spotify use ratings and engagement from a huge volume of items (products, songs, etc) to predict what any given user might want to buy, watch, or listen to next.
- Speech recognition – There is no single combination of sounds to specifically signal human speech, and individual pronunciations differ widely – machine learning can identify patterns of speech and help to convert speech to text. Nuance Communications (maker of Dragon Dictation) is among the better-known speech recognition companies today.
- Real-time bidding (online advertising) – Facebook and Google could never write specific “rules” to determine which ads a given type of user is most likely to click on. Machine learning helps to identify patterns in user behavior and determine which individual advertisements are most likely to be relevant to which individual user.
- Credit card purchase fraud detection – Like email spam filters, only a small portion of fraud detection can be done using concrete rules. New fraud methods are constantly being used, and systems must adapt to detect these patterns in real time, coaxing out the common signals associated with fraud.
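The spam-filter case above is a classic supervised text problem. Here is a minimal sketch (scikit-learn, with a made-up six-message inbox) of the bag-of-words approach, where users’ “spam” labels are the training signal:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny hypothetical inbox; 1 = spam, 0 = ham, as labeled by the user.
emails = [
    "win cash now claim your free prize",
    "limited offer cheap meds buy now",
    "meeting moved to thursday at ten",
    "please review the attached quarterly report",
    "free prize winner click now",
    "lunch tomorrow with the design team",
]
labels = [1, 1, 0, 0, 1, 0]

# Count word occurrences, then apply naive Bayes over the counts.
spam_filter = make_pipeline(CountVectorizer(), MultinomialNB())
spam_filter.fit(emails, labels)

print(spam_filter.predict(["claim your free cash prize now"]))    # [1]
print(spam_filter.predict(["quarterly report review thursday"]))  # [0]
```

Real filters train on millions of messages per user and add many more signals, but the supervised structure is the same.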
“Clean data is better than big data” is a common phrase among experienced data science professionals. If you have reams of business data from years ago, it may have no relevance today, particularly in fields where the basic business processes change drastically year over year, such as mobile eCommerce. If you have reams of unstructured and disjointed data, you may have too much “cleaning” to do before you can ever get around to learning from the information collected.
While unsupervised learning allows for a wide range of applications in making sense of data without labels, it’s usually not advisable for companies to “jump into” ML with a first application in unsupervised learning. The low-hanging fruit for an ML use case is likely to come from a company’s historical, labelled data. Below are some examples that might help spark new ideas:
- Facebook had millions and millions of tagged human faces on its platform, faces that were already associated with an individual person. This gave Facebook the ability to train algorithms on a tremendous volume of labeled data, with millions of faces in all kinds of lighting conditions and from various angles, allowing the algorithms to be highly refined and attuned to identifying specific human faces.
- Google serves billions and billions of search results, and can gauge the usefulness and relevance of those results based on the click-through rate of its top listings, page-load time, time-on-page for a specific visitor, and many other factors. It would be impossible to find a set of hard and fast rules for showing the right search results, so Google’s algorithms learn what the best options will be based on real-time engagement from billions of daily searches.
- Credit card companies like CapitalOne are faced with a huge volume of chargebacks and reported fraudulent purchases each day. By finding connections and patterns across types of purchases, locations of purchases, and types of customers, CapitalOne can use the “labelled” instances of fraud to predict other transactions that are most likely to be fraudulent. Anomaly detection plays an important role in many such security and fraud applications.
- An eCommerce company with a massive volume of customer support emails will have a lengthy record of support tickets that were labelled “refund requests”, “technical issues”, “delivery issues”, among others. The company may choose to develop an ML system to instantly label incoming emails, transcribed phone calls, and chat requests with the proper support issue “type.”
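A rough sketch of such a ticket router (scikit-learn; the tickets and labels are invented for illustration) looks much like a spam filter, only with more than two classes:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labelled support tickets from the historical record.
tickets = [
    "i want my money back for this order",
    "please refund my purchase from last week",
    "the app crashes every time i log in",
    "error message when uploading a file",
    "my package never arrived",
    "tracking shows the parcel is stuck in transit",
]
issue_types = ["refund", "refund", "technical", "technical", "delivery", "delivery"]

# Weight words by TF-IDF, then classify into one of the issue types.
router = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
router.fit(tickets, issue_types)

print(router.predict(["can i get a refund on my last order"]))
```

Incoming emails, transcribed calls, and chats would all pass through the same `predict` call to receive an issue “type” on arrival; a production system would obviously train on thousands of tickets, not six.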
ML might be thought of as a kind of “skill”, in the same sense that one might apply the word to human beings. A skill that’s alive, adapting, growing and informed by experience. For this reason, an ML solution will often be incorrect a certain percentage of the time, especially when it’s informed by new or varied stimuli. If your task absolutely cannot allow for any error, ML is likely to be the wrong tool for the job.
Thinking through what information to “feed” your algorithm is not as easy as one might presume. While ML algorithms are adept in identifying correlations, they won’t understand the facts surrounding the data that might make it relevant or irrelevant. Here are some examples of how “context” could get in the way of developing an effective ML solution:
- Predicting eCommerce customer lifetime value: An algorithm could be given data about historical customer lifetime value, without taking into account that many of the customers with the highest lifetime value were contacted via a phone outreach program that ran for over two years but failed to break even, despite generating new sales. If such a telephone follow-up program will not be part of future eCommerce sales growth, then those sales shouldn’t have been fed to the machine.
- Determining medical recovery time: Data might be provided to a machine in order to determine treatment for people with first- or second-degree burns. The machine may predict that many second-degree burn victims will need only as much time as first-degree burn victims, because it doesn’t take into account the faster and more intensive care that second-degree burn victims received historically. The context was not in the data itself, so the machine simply assumes that second degree burns heal just as fast as first degree.
- Recommending related products: A recommendation engine for an eCommerce retailer over-recommends a specific product. Researchers only discover later that this product was promoted heavily over a year ago, so historical data showed a large uptick in sales from existing buyers; however, these promotional purchases were sold more based on the “deal” and the low price, and less so by the actual related intent of the customer.
Building an ML solution requires careful thinking and testing in selecting algorithms, selecting data, cleaning data, and testing in a live environment. There are no “out-of-the-box” machine learning solutions for unique and complex business use cases. Even for extremely common use cases (recommendation engines, predicting customer churn), each application will vary widely and require iteration and adjustment. If a company goes into an ML project without resources committed to an extended period of tinkering, it may never achieve a useful result.
ML doesn’t yet show up in a neat box; value is wrought by hard thinking, experimental design and – in some cases – hard mathematics. With a little time on Google and YouTube, you can get the hang of how to set up Dropbox for your business. Predicting churn rate across your customer segments with machine learning? Not the same game.
Preparing to derive business value from ML implies having trained talent, expert guidance, and an (often) tremendous “data cleansing” period – and none of it is guaranteed to be a win, as Dr. Martin states aptly above. If Google, Amazon, and Facebook could get their interns to set up ML systems, would they really be spending millions and millions of dollars to scoop the world’s top AI talent out of academics to work for them?
The ultimate question for executives remains: When can we have (a) the resources required to invest in machine learning seriously, and (b) a legitimate use case that started from trying to find real business value, not from “trying to find a way to kind of use machine learning.” That’s a thought process that can’t be done for you, but our hope is this article has helped to inform your perspective and give you resources to draw from in future.
With the rise of big data, machine learning has become a key technique for solving problems in areas such as:
- Computational finance, for credit scoring and algorithmic trading
- Image processing and computer vision, for face recognition, motion detection, and object detection
- Computational biology, for tumor detection, drug discovery, and DNA sequencing
- Energy production, for price and load forecasting
- Automotive, aerospace, and manufacturing, for predictive maintenance
- Natural language processing, for voice recognition applications
There is no single best method, no one size fits all. Finding the right algorithm is partly trial and error – even highly experienced data scientists can’t tell whether an algorithm will work without trying it out. But algorithm selection also depends on the size and type of data you’re working with, the insights you want to get from the data, and how those insights will be used.
Here are some guidelines on choosing between supervised and unsupervised machine learning:
- Choose supervised learning if you need to train a model to make a prediction (for example, the future value of a continuous variable, such as temperature or a stock price) or a classification (for example, identifying makes of cars from webcam video footage).
- Choose unsupervised learning if you need to explore your data and want to train a model to find a good internal representation, such as splitting data up into clusters.
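The two choices above can be contrasted in a few lines (scikit-learn, with synthetic data): a regression trained on labelled history on one side, a clustering run on unlabelled points on the other:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)

# Supervised: labelled history (hourly temperatures) -> predict the next value.
hours = np.arange(24).reshape(-1, 1)
temps = 15 + 0.3 * hours.ravel() + rng.normal(0, 0.4, 24)
reg = LinearRegression().fit(hours, temps)
next_temp = reg.predict([[24]])[0]

# Unsupervised: no labels at all -> discover structure by splitting into clusters.
points = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)

print(round(next_temp, 1), np.bincount(clusters))
```

Note the asymmetry: the regression needs a target column (`temps`) to learn from, while the clustering is given only the raw points and invents its own grouping.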
Expect a steady drumbeat of announcements throughout 2018 and beyond about ML applied to tasks including cleansing and combining data, discovering new data, and suggesting new combinations of data that could, in turn, uncover important insights. Non-technical business users will appreciate ML-powered suggestions on best-fit data visualizations. Automated modeling features, meanwhile, will help non-technical business users tap into the power of predictive analytics.
Of course, many business users are more interested in action and outcomes than interpreting reports, dashboards, and data visualization. These are the users more likely to take advantage of the growing list of smart, ML- and AI-powered prescriptive applications emerging. Here’s where the context of decisions is built into business applications for sales, marketing, HR, supply chain, logistics, and more. In these cases, the data analysis can be tuned to deliver recommended next steps or even to automate actions sure to lead to desired outcomes.
These emerging capabilities will make BI, analytics, and data-driven decision-making that much more accessible, understandable, and actionable for non-technical business users, but embracing the new won’t be as easy as waving a magic wand. Still, data analytics is no fad. In fact, the global market for data analytics has been predicted to exhibit a CAGR of 30.08% between 2017 and 2023, surpassing a valuation of USD 77.64 billion. A large part of this is due to the increased generation of data during the period, but far more is due to the increasing ability to use statistical algorithms and machine learning techniques to deliver actionable results for businesses.
“People used to say that information is power but that is no longer the case. It’s the analysis of the data, use of the data, digging into it — that is the power”.
N.B: The views and opinions expressed in this article are those of the author and do not necessarily reflect the official position of the African Academic Network on Internet Policy.