Staying abreast of user behavior is crucial for SEO.  If you wait to see what customers think and only then get ready to respond, you’re already behind.  AI tools enable marketers to glean real-time insights and to optimize content, as needed, for each user’s individual needs.

Machine learning is integral to personalizing content.  ML technology is what enables Google to understand the intent behind any given query, and its algorithms continuously refine that understanding as new words and queries are introduced.  As algorithms get better at determining which content best meets the needs of each searcher, we are challenged to create content that meets those needs and to optimize it so that its relevance is clear.

Thus a key function of machine learning is ascertaining user intent.  While most SEO experts are well versed in techniques for content optimization, many are far less practiced at identifying intent.  This is why it is important to know how to scrape Google SERPs and optimize for user intent.  Custom SERP extractions (data mining, outreach list building, competitor intelligence) support SEO strategy in many ways, but determining intent is becoming the most pivotal.

Scraping useful data from SERPs has always been the bread and butter of SEO.  This is typically a matter of making a list of SERP URLs that you want to crawl (presumably the URLs that Google would display for the query you enter).  The same process can be used to scrape results from the ‘People Also Ask’ suggestions that Google displays for certain keywords.  There are already tools in place to glean such insights; for example, you can use Screaming Frog to extract the related searches that Google displays for a keyword.
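To make that first step concrete, here is a minimal sketch of a custom extraction, assuming the SERP HTML has already been saved (for example by a crawler or a headless browser).  Google’s markup changes frequently, so the inline sample HTML and the pattern below are illustrative only, not a description of Google’s actual page structure:

```python
import re

# Stand-in for SERP HTML you have already saved to disk; real Google
# markup is far more complex and changes often.
sample_serp_html = """
<div class="result"><a href="https://example.com/guide"><h3>A Guide</h3></a></div>
<div class="result"><a href="https://example.org/review"><h3>A Review</h3></a></div>
"""

def extract_result_urls(html: str) -> list[str]:
    """Pull the href of each result link out of saved SERP HTML."""
    return re.findall(r'<a href="(https?://[^"]+)">', html)

urls = extract_result_urls(sample_serp_html)
print(urls)  # → ['https://example.com/guide', 'https://example.org/review']
```

In practice you would feed in pages fetched by your crawling tool and point the extraction pattern at whatever element holds the data you want (result links, ‘People Also Ask’ questions, related searches).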

But how does one find the best training data?  A predictive model can be tested on, say, news headlines scraped from Google Trends.  But when building a model that takes intent into account, it is important to use a dataset of questions grouped by surmised intention.
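As a toy illustration of the shape such a dataset takes, here is a handful of questions paired with surmised intent labels.  The labels are hypothetical examples, not a standard taxonomy:

```python
# Each entry pairs a query with a surmised intent label.
labeled_queries = [
    ("how do featured snippets work", "informational"),
    ("buy ergonomic office chair", "transactional"),
    ("screaming frog log file analyser", "navigational"),
    ("best seo crawlers compared", "commercial investigation"),
]

# The set of distinct intent classes a model would learn to predict.
intents = sorted({intent for _, intent in labeled_queries})
print(intents)
# → ['commercial investigation', 'informational', 'navigational', 'transactional']
```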

AI tools can tell you which SERP features (local pack, featured snippets) are present for a keyword, but this isn’t always a reliable way to identify what types of pages Google favors in the organic results.  The presence of a local pack suggests local intent, yet the organic results for the same query might favor informational pages that are more broadly applicable.  Intent classification is still more of an art than a science, but the goal remains gaining insight into how Google interprets query intent, and clues are sometimes contained in the page titles and meta descriptions that Google displays.
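The kind of title-based clue described above can be sketched as a crude heuristic.  The keyword lists here are hypothetical and a trained model is far more robust, but it shows how intent signals surface in the titles Google displays:

```python
def classify_intent(title: str) -> str:
    """Roughly guess searcher intent from a SERP page title.

    A heuristic sketch only; the keyword lists are illustrative.
    """
    words = set(title.lower().split())
    if words & {"buy", "price", "deal", "coupon"}:
        return "transactional"
    if words & {"best", "top", "review", "reviewed", "vs"}:
        return "commercial investigation"
    if words & {"how", "what", "why", "guide"}:
        return "informational"
    return "navigational"

print(classify_intent("How to scrape Google SERPs"))  # → informational
print(classify_intent("Buy cheap crawl credits"))     # → transactional
```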

Custom extractions are commands given to a crawling tool to identify and extract information from a specific element on a webpage.  In this case, the webpage we want to crawl happens to be a SERP, and such data is invaluable for compiling an effective query classification dataset.  You can download a dataset from Kaggle (after logging in), then load it into a data frame in Google Colaboratory (“Colab”), which comes with TensorFlow pre-installed.
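A minimal sketch of that loading step, with a small inline CSV standing in for the Kaggle download.  The column names “query” and “intent” are assumptions; match them to your actual file:

```python
import io

import pandas as pd

# In Colab you would pass the path of the downloaded Kaggle CSV to
# pd.read_csv; here an inline CSV stands in for the download.
csv_data = io.StringIO(
    "query,intent\n"
    "how to scrape serps,informational\n"
    "buy seo software,transactional\n"
    "best rank trackers,commercial\n"
)
df = pd.read_csv(csv_data)
print(df.shape)               # → (3, 2)
print(df["intent"].tolist())  # → ['informational', 'transactional', 'commercial']
```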

The key, then, is using AI to build predictive models.  Uber recently released a tool called Ludwig that makes it possible to build and use predictive models with incredible ease.  Ludwig can be run from within Colab in order to use Google’s free GPU runtime; training deep learning models without a GPU can be the difference between waiting a few minutes and waiting many hours.  Instead of encoding the queries with a parallel CNN encoder, you can use the BERT encoder that was recently added to Ludwig.  This encoder leverages a pre-trained model, which enables it to perform well without requiring much input data.  Ludwig’s output also saves you manual work, as it automatically splits the dataset into training, validation, and test sets.
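A sketch of what the Ludwig model definition might look like, assuming a CSV with the hypothetical “query” and “intent” columns used above.  Depending on your Ludwig version, the BERT encoder may also require paths to a pre-trained checkpoint and vocabulary file:

```yaml
# model_definition.yaml — column names are hypothetical
input_features:
  - name: query
    type: text
    encoder: bert
output_features:
  - name: intent
    type: category
```

From a Colab cell you would then run something along the lines of `!ludwig experiment --data_csv queries.csv --model_definition_file model_definition.yaml` (check the CLI flags for your Ludwig version), and Ludwig handles the training, validation, and test split automatically.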

Research groups at, say, Google or Facebook train a classification model on massive datasets (such as ImageNet for images).  This process can take days and be very expensive.  But once the training is done, the final part of the trained model can be replaced and retrained on new data to perform similar but new tasks.  Fine-tuning on a new dataset is then a matter of grouping queries by their likely intent (often surmised by looking for queries with high search impressions but few clicks).  The training process is about removing noise (irrelevant data) from the input by trial and error, iterating until you are left with only pertinent data.
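The “high impressions, low clicks” heuristic mentioned above can be sketched with pandas, using hypothetical Search Console-style columns and thresholds:

```python
import pandas as pd

# Hypothetical query performance data (Search Console-style columns).
queries = pd.DataFrame({
    "query":       ["what is a serp", "buy crawler license", "serp scraper guide"],
    "impressions": [12000, 800, 9500],
    "clicks":      [90, 120, 610],
})

queries["ctr"] = queries["clicks"] / queries["impressions"]

# Flag queries with many impressions but a weak click-through rate:
# likely cases where the ranking page does not match searcher intent.
candidates = queries[(queries["impressions"] > 5000) & (queries["ctr"] < 0.02)]
print(candidates["query"].tolist())  # → ['what is a serp']
```

The thresholds (5,000 impressions, 2% CTR) are illustrative; you would tune them to your own site’s traffic.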

SEO professionals can no longer assume that visitors move along a linear path from awareness to decision.  Machine learning can help businesses understand and predict user intent in ways that simply aren’t possible manually, even with large staffs working around the clock.