Getting Started With Machine Learning (No, "Practical" Machine Learning!) (Part 3)
Hello, and welcome!
In this blog post, we introduce standard Machine Learning methodologies and workflows, explain the importance of following a structured workflow, and show how to conduct a needs analysis for a Machine Learning project based on a fictitious business challenge, while pointing out the real-world challenges you may encounter. The rest of the series will walk you through a real-world project where you will see all of these in action.
There are no prerequisites for this particular post except the first part of this series of blog posts.
Conventions Used In This Post
"🔊 Audio: (COMING SOON!)" If you are very busy and would love to learn the main points from this article while doing your chores (or working out), we provide you with an explanation of the key points from the article. We also point out sections where you may have to go through the code snippet and try things out in your IDE, or at least look through the article to understand the point better.
"📺Video:" We complement this blog post with embedded videos on specific topics to help you blend your learning. The videos serve as complements to the articles that follow them.
"❗Main point": If you are in a hurry, you should read the text under this convention.
"💡Insight:" If you want to learn more about that specific topic, you can dive deep into the text under this convention.
Table of Contents
Introduction to Standard Machine Learning Methodologies.
Understanding the Machine Learning Workflow.
- There are no two ways to do Machine Learning; you either follow standard methodologies or you don't do Machine Learning.
- Machine Learning projects/systems are plagued with far more uncertainty than traditional software engineering projects. With traditional software, you programmed the software yourself, so you are quite certain how it will work in production (when your users or other systems use it). With Machine Learning programs, the computer constructed the program, so you are not 100% certain how it will work. This is why you have to follow standard methodologies: they let you measure performance using defined metrics that tell you how well or badly the system may perform when deployed to real-world use cases.
- The standard methodologies (or workflows) to follow for Machine Learning projects, based on Aurélien Géron's book, can be found in the handy sheet below, which you can save to your device.
Why You Should Follow Standard Machine Learning Methodologies.
When we posed the question "What qualities do you think are most important for someone to become a world-class Machine Learning engineer?" to Aurélien Géron in our recent interview with him, this was his response: "Learning all the time, getting your hands dirty, being patient and most importantly, having a good methodology. There are so many steps in ML projects, you need to make sure you follow a good methodology, and you don't just go in random directions."
Aurélien is a former product manager for YouTube and the best-selling author of the popular Machine Learning book "Hands-On Machine Learning with Scikit-Learn and TensorFlow". I am sure that's enough opening to convince you that following standard Machine Learning methodologies is not just key to successfully completing Machine Learning projects, but also key to becoming a world-class Machine Learning Engineer.
We are going to use this checklist to go over a Machine Learning project that involves solving a problem for a real-estate properties company "Tonye Daniels & Co. Properties" located somewhere on earth :), while taking into account best-practices that are guaranteed to work for any real-world project.
We will walk through the end-to-end Machine Learning workflow using this checklist while taking into consideration the challenges you will face as a Machine Learning Engineer, such as:
Dealing with data and algorithmic biases (building the ML system responsibly).
Taking into account a human-centered design approach.
Understanding what metrics to optimize for.
Data quality problems.
Model complexity trade-offs and model training challenges.
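To make "what metrics to optimize for" a little more concrete, here is a minimal sketch, assuming a task where the system predicts a number such as a price. The values are made up for illustration, and the two metrics shown (mean absolute error and root mean squared error) are common choices for this kind of task, not necessarily the ones this series will settle on.

```python
from math import sqrt


def mean_absolute_error(actual, predicted):
    """Average size of the errors, in the same unit as the target."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Like MAE, but large misses are penalized more heavily."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(actual))


# Hypothetical values, for illustration only.
actual_prices = [200_000, 350_000, 500_000, 150_000]
predicted_prices = [210_000, 340_000, 450_000, 150_000]

mae = mean_absolute_error(actual_prices, predicted_prices)
rmse = root_mean_squared_error(actual_prices, predicted_prices)
print(f"MAE: {mae:,.0f}  RMSE: {rmse:,.0f}")
```

Here the single large miss (50,000) pulls the RMSE well above the MAE. Which metric you optimize for depends on whether a few big errors hurt the business more than many small ones.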
Let's now walk through the steps of an ML workflow to solve the fictitious business problem of Tonye Daniels & Co. Properties. But wait! Heck, we don't even know what the problem we are trying to solve is. Of course, according to our previous article, before jumping onto the Machine Learning bandwagon, there are some steps we need to take to know if Machine Learning will be appropriate for the current business challenge. Let's do just that.
Can Machine Learning Solve Tonye Daniels & Co. Properties' Business Challenge?
The first thing you should do of course is to have a discussion with the stakeholders at the company to know what they want—basically the problem statement.
Just before the meeting, you were lucky to get an overview of the company (if you don't get one, you should ask for it) in preparation for the meeting. The overview is quite detailed, and for your purposes, you may want it to include the following (I am assuming you are a contractor or new to the company):
A brief history of the company.
The kind of business they are into.
What is their core business?
How do they go about their core business? (This should be primary to you.)
Although this might be asking too much, you may choose to go on further and ask what their primary sources of revenue are, their economic valuation, and any other information on their business process.
Do they have an in-house IT and/or analytics team?
The representative sends you a brief overview just before the meeting...
Tonye Daniels & Co. Properties is a real estate company located somewhere on earth. Tonye Daniels & Co. Properties have been around for about 10 years and have explored areas from marketing consultancy to print services, and only settled on offering real-estate services 6 years ago. They currently offer a wide range of real-estate services from rent agency to property acquisition. The core of their business lies in providing the most optimal price estimate on their real-estate properties for their clients.
The majority of their clients have given testimonials on how the company helped them save a lot of money on housing property acquisitions while also making sure they got value for money on these houses. They have been largely successful at this because they employ a lot of highly qualified property and pricing analysts (mostly on contract, with only 2 in-house) to help determine the most optimal price for a property.
You get to the meeting room and Mr. Dokubo, the company's representative at the meeting, gives you a brief presentation on the problem overview:
The problem here is that the stakeholders of Tonye Daniels & Co. Properties have not been very happy with the company's revenue performance (an average loss of $603,700 over 8 quarters!). The company has been running at a loss for the past 2 years because of the cost of paying highly qualified property and pricing analysts to help their customers. The stakeholders took out their frustration on the current CEO of Tonye Daniels & Co. Properties, Tonye Nnamdi. Under pressure, Mr. Tonye asked Mr. Dokubo, the operations manager, to look for a solution to the current problem. After some “little research”, Mr. Dokubo found out that Artificial Intelligence can “magically” solve the problems they are currently having.
Mr. Dokubo asked around and eventually, you were recommended by a couple of sources as the best person to meet to solve this problem.
After you went through the problem overview, Mr. Dokubo started the conversation as follows. (Note that this, of course, is a fictitious conversation and real-world conversations may be more difficult than this.)
Mr. Dokubo: Hello! Once again, my name is Mr. Dokubo and I am the operations manager at Tonye Daniels & Co. Properties. I guess you must be up-to-speed with what it's like in the company and the challenge we are currently facing.
You: I am perfectly fine with the content for now...
Mr. Dokubo: Great, Mr. You! We would want to bring you in to see if this technology can provide a solution to our current business challenge. Mr. You, we agree to all your standard contract terms, that is how desperate we are. So what do you say?
You: This seems feasible. It will be wise for me to take a look at your business's current needs and figure out the best path to look for a solution to the challenges your business is facing.
Mr. Dokubo: That will be fine by us, Mr. You. We will expect to hear back from you soon.
Mr. You firmly shake hands with Mr. Dokubo and 2 other reps at the meeting.
End of conversation
Build a Machine Learning system that can help complement the work of in-house pricing analysts by automating the process of estimating the optimal value of a housing property for clients. The stakeholders expect that this system will perform at a level that will help them reduce the cost of hiring property and pricing analysts. It should also improve the experience of customers on their website who are looking for fast and accurate pricing estimates: customers provide details on the type of house they want and get back the price of the house in real time.
Hey! Great move ending the conversation on that note. You gave them a glimmer of hope that a solution to their current challenges is feasible while giving yourself space to assess the problem properly in relation to their business needs. This is a very crucial step because sometimes, due to some form of buzzword hype or underwhelming business performance, stakeholders tend to search for "magical" quick fixes to their business challenges.
Intuitively, their specific problem seems feasible to solve with ML, but you'd want to make sure you are applying ML to the right area of the business, one that can give the needed return on investment (ROI), so they know what to expect. Here you are conducting a "needs analysis" and asking the needed questions about their proposed solution to know what they truly want and whether that is actually the proper thing for their business at that time.
Sometimes stakeholders' wants and expectations might be far-fetched when compared to their business needs. You need to make sure they are in touch with reality. All these will help you develop a proper AI strategy for the organization based on this challenge, if their project requires AI/ML technology at all.
Say you have conducted a proper needs analysis and found out that the best way to get the company out of its current crisis may be to build a Machine Learning system that can automate some expensive processes. Just to be sure the project will require ML, you conduct another analysis by asking the following questions, based on our previous article:
Frame Your Challenge As A Machine Learning Problem
Note: During your own projects, if you cannot provide a "yes" to the questions above, then you might want to re-evaluate your understanding of the problem or consider finding other solutions that do not require Machine Learning.
Based on our responses, we can definitely frame the challenge as an ML problem, but before we go on to hacking away at a solution, let us determine its feasibility and whether it is truly worth it to build a Machine Learning solution. Just because something can be done doesn't mean it should be done. Besides, looking at the common challenges of running an ML project we discussed in the first part of this blog post series, an alternative that can perform as well should perhaps be considered.
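One way to check whether a simpler alternative "can perform as well" is to measure a non-ML rule first. The snippet below is only a sketch under stated assumptions: the sales figures are invented, and the rule (median price per square metre times floor area) is a hypothetical stand-in for whatever heuristic the company's analysts might already use.

```python
from statistics import median

# Hypothetical past sales: (floor area in square metres, sale price in dollars).
past_sales = [(80, 160_000), (100, 210_000), (120, 240_000), (150, 315_000)]
# Hypothetical recent sales, held back to check the rule on unseen properties.
recent_sales = [(90, 180_000), (200, 420_000)]

# The "rule": one typical price per square metre learned from past sales.
price_per_sqm = median(price / area for area, price in past_sales)


def rule_based_estimate(area):
    """Estimate a price without any ML: area times the typical rate."""
    return price_per_sqm * area


def mean_absolute_percentage_error(sales):
    """Average error of the rule, as a percentage of the true sale price."""
    errors = [abs(price - rule_based_estimate(area)) / price for area, price in sales]
    return 100 * sum(errors) / len(errors)


baseline_mape = mean_absolute_percentage_error(recent_sales)
print(f"Rule-based baseline error: {baseline_mape:.2f}%")
```

If a rule like this already lands within the error the business can tolerate, an ML system has to beat that number by enough to justify its cost. If it does not, the baseline still gives you a reference point to compare any future model against.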
Assessing The Feasibility Of A Machine Learning Solution
Now that you have figured out the feasibility of an ML solution to the challenge by identifying the "sweet spot" we discussed in the previous article, you can go on to answer some other questions that may seem redundant...
Phew! We have gone through the long process of evaluating whether there is an actual need for an ML solution to the problem. It was long but necessary: it saves you a lot of headaches and, at worst, potentially your career. So I would say this sort of scrutiny is definitely worth it.
Notice that we did not take into account some other crucial real-world problems, like how to negotiate your contract pricing and terms, because, like it or not, how you price ML projects as a contractor is quite different from how you price traditional software development. This will be an article or talk for another day.
The next step is to delve right into building our solution using standard Machine Learning methodologies, guidelines, and best practices. We will walk through the ML project workflow in the next part of this series of blog posts.
In this relatively short blog post, we went through a handy overview of the standard Machine Learning workflow and understood why it is important for the success of your Machine Learning projects.
Following a good methodology is one of the qualities of a great Machine Learning Engineer, according to Aurélien Géron.
We went through a fictitious business challenge, and you could see how performing a needs analysis by asking the most relevant questions is crucial for the rest of the project.
In the next article, we will start implementing these standard methodologies to solve the challenge right in front of us. See you next time.
Aurélien's GitHub page here.
Let us know in the comment section if you have any feedback on this post (typos, things we missed, claims that are wrong, and so on). We take critiques and compliments well. :)
Till next time, stay safe. 💚💚
If you also enjoyed this post, do leave a reaction 🔥 to the story, hit the like button 👍🏽, and share 📩 it with your friends that may be interested in learning. See you soon!